From fa7d8df534df2e69be0dcbb7f424455ba7a55466 Mon Sep 17 00:00:00 2001
From: Ian Barwick
Date: Mon, 7 Jul 2014 11:39:26 +0900
Subject: [PATCH] Add a "quickstart" guide

Provides a succinct overview of the steps needed to get repmgr up and
running as.
---
 QUICKSTART.rst | 286 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 286 insertions(+)
 create mode 100644 QUICKSTART.rst

diff --git a/QUICKSTART.rst b/QUICKSTART.rst
new file mode 100644
index 00000000..95531bd7
--- /dev/null
+++ b/QUICKSTART.rst
@@ -0,0 +1,286 @@
+========================
+repmgr: Quickstart guide
+========================
+
+repmgr is an open-source tool suite for managing replication and failover
+among multiple PostgreSQL server nodes. It enhances PostgreSQL's built-in
+hot-standby capabilities with a set of administration tools for monitoring
+replication, setting up standby servers and performing failover/switchover
+operations.
+
+This quickstart guide assumes you are familiar with PostgreSQL replication
+setup and Linux/UNIX system administration. For a more detailed tutorial
+covering setup on a variety of different systems, see the README.rst file.
+
+Conceptual Overview
+===================
+
+repmgr provides two binaries:
+
+ - `repmgr`: a command-line client to manage replication and repmgr configuration
+ - `repmgrd`: an optional daemon process which runs on standby nodes to monitor
+   replication and node status
+
+Each PostgreSQL node requires a repmgr configuration file; additionally
+it must be "registered" using the repmgr command-line client. repmgr stores
+information about managed nodes in a custom schema on the node's current master
+database.
+
+
+Requirements
+============
+
+repmgr works with PostgreSQL 9.0 and later. All server nodes must be running the
+same PostgreSQL major version, and preferably should be running the same minor
+version.
+
+repmgr will work on any Linux or UNIX-like environment capable of running
+PostgreSQL.
+rsync must also be installed.
+
+
+Installation
+============
+
+repmgr must be installed on each PostgreSQL server node.
+
+* Packages
+  - RPM packages for Red Hat-based distributions are available from PGDG
+  - Debian/Ubuntu provide .deb packages.
+
+    It is also possible to build .deb packages directly from the repmgr source;
+    see README.rst for further details.
+
+* Source installation
+  - repmgr source code is hosted at GitHub (https://github.com/2ndQuadrant/repmgr);
+    tar.gz files can be downloaded from https://github.com/2ndQuadrant/repmgr/releases .
+
+    repmgr can be built easily using PGXS:
+
+       sudo make USE_PGXS=1 install
+
+
+Configuration
+-------------
+
+* Server configuration
+
+Password-less SSH logins must be enabled for the database system user (typically `postgres`)
+between all server nodes to enable repmgr to copy required files.
+
+* PostgreSQL configuration
+
+The master PostgreSQL node needs to be configured for replication with the
+following settings:
+
+    wal_level = 'hot_standby'      # minimal, archive, hot_standby, or logical
+    archive_mode = on              # allows archiving to be done
+    archive_command = 'cd .'       # command to use to archive a logfile segment
+    max_wal_senders = 10           # max number of walsender processes
+    wal_keep_segments = 5000       # in logfile segments, 16MB each; 0 disables
+    hot_standby = on               # "on" allows queries during recovery
+
+Note that repmgr expects a default of 5000 wal_keep_segments, although this
+value can be overridden when executing the `repmgr` client.
+
+Additionally, repmgr requires a dedicated PostgreSQL superuser account
+and a database in which to store monitoring and replication data. The
+database can in principle be any database, including the default postgres
+one; however, it is advisable to create a dedicated repmgr database.
+
+
+* repmgr configuration
+
+Each PostgreSQL node requires a repmgr configuration file containing
+identification and database connection information.
+
+    cluster=test
+    node=1
+    node_name=node1
+    conninfo='host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
+    pg_bindir=/path/to/postgres/bin
+
+* `cluster`: common name for the replication cluster; this must be the same on all nodes
+* `node`: a unique, arbitrary integer identifier
+* `node_name`: a unique, human-readable name
+* `conninfo`: standard conninfo string enabling repmgr to connect to the
+  control database; user and database name must be the same on all nodes, while
+  other parameters such as port may differ. The `host` parameter *must* be a
+  hostname resolvable by all nodes on the cluster.
+* `pg_bindir`: (optional) location of PostgreSQL binaries, if not in the default $PATH
+
+Note that the configuration file should not be stored inside the PostgreSQL
+data directory.
+
+Each node configuration needs to be registered with repmgr, either using the
+repmgr command line tool, or the repmgrd daemon; for details see below. Details
+about each node are inserted into the repmgr database (see "repmgr database
+schema" below).
+
+
+Replication setup and monitoring
+================================
+
+For the purposes of this guide, we'll assume the database user will be
+`repmgr_usr` and the database will be `repmgr_db`, and that the following
+environment variables are set on each node:
+
+ - $HOME: the PostgreSQL system user's home directory
+ - $PGDATA: the PostgreSQL data directory
+
+
+Master setup
+------------
+
+1. Configure PostgreSQL
+
+   - create user and database
+
+       CREATE ROLE repmgr_usr LOGIN SUPERUSER;
+       CREATE DATABASE repmgr_db OWNER repmgr_usr;
+
+   - configure postgresql.conf for replication (see above)
+
+   - update pg_hba.conf (columns are type, database, user, address, method):
+
+       host    repmgr_db       repmgr_usr  192.168.1.0/24         trust
+       host    replication     all         192.168.1.0/24         trust
+
+   Restart the PostgreSQL server after making these changes.
+
+
+2.
+   Create the repmgr configuration file:
+
+       $ cat $HOME/repmgr/repmgr.conf
+       cluster=test
+       node=1
+       node_name=node1
+       conninfo='host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
+       pg_bindir=/path/to/postgres/bin
+
+3. Register the master node with repmgr:
+
+       $ repmgr -f $HOME/repmgr/repmgr.conf --verbose master register
+       [2014-07-04 10:43:42] [INFO] repmgr mgr connecting to master database
+       [2014-07-04 10:43:42] [INFO] repmgr connected to master, checking its state
+       [2014-07-04 10:43:42] [INFO] master register: creating database objects inside the repmgr_test schema
+       [2014-07-04 10:43:43] [NOTICE] Master node correctly registered for cluster test with id 1 (conninfo: host=localhost user=repmgr_usr dbname=repmgr_db)
+
+   -d is the database defined in repmgr.conf file.
+
+Slave/standby setup
+-------------------
+
+1. Use repmgr to clone the master:
+
+       $ repmgr -f $HOME/repmgr/repmgr.conf -D $PGDATA -d repmgr_db -U repmgr_usr -R postgres --verbose standby clone 192.168.1.2
+       Opening configuration file: ./repmgr.conf
+       [2014-07-04 10:49:00] [ERROR] Did not find the configuration file './repmgr.conf', continuing
+       [2014-07-04 10:49:00] [INFO] repmgr connecting to master database
+       [2014-07-04 10:49:00] [INFO] repmgr connected to master, checking its state
+       [2014-07-04 10:49:00] [INFO] Successfully connected to primary. Current installation size is 1807 MB
+       [2014-07-04 10:49:00] [NOTICE] Starting backup...
+       [2014-07-04 10:49:00] [INFO] creating directory "/path/to/data/"...
+       (...)
+       [2014-07-04 10:53:19] [NOTICE] Finishing backup...
+       NOTICE: pg_stop_backup complete, all required WAL segments have been archived
+       [2014-07-04 10:53:21] [INFO] repmgr requires primary to keep WAL files 0000000100000000000000AD until at least 0000000100000000000000AD
+       [2014-07-04 10:53:21] [NOTICE] repmgr standby clone complete
+       [2014-07-04 10:53:21] [NOTICE] HINT: You can now start your postgresql server
+       [2014-07-04 10:53:21] [NOTICE] for example : /etc/init.d/postgresql start
+
+   -R is the database system user on the master node. At this point it does not matter
+   if the `repmgr.conf` file is not found.
+
+   This will clone the PostgreSQL database files from the master, and additionally
+   create an appropriate `recovery.conf` file.
+
+2. Start the PostgreSQL server
+
+3. Create the repmgr configuration file:
+
+       $ cat $HOME/repmgr/repmgr.conf
+       cluster=test
+       node=2
+       node_name=node2
+       conninfo='host=repmgr_node2 user=repmgr_usr dbname=repmgr_db'
+       pg_bindir=/path/to/postgres/bin
+
+4. Register the standby node with repmgr:
+
+       $ repmgr -f $HOME/repmgr/repmgr.conf --verbose standby register
+       Opening configuration file: /path/to/repmgr/repmgr.conf
+       [2014-07-04 11:48:13] [INFO] repmgr connecting to standby database
+       [2014-07-04 11:48:13] [INFO] repmgr connected to standby, checking its state
+       [2014-07-04 11:48:13] [INFO] repmgr connecting to master database
+       [2014-07-04 11:48:13] [INFO] finding node list for cluster 'test'
+       [2014-07-04 11:48:13] [INFO] checking role of cluster node 'host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
+       [2014-07-04 11:48:13] [INFO] repmgr connected to master, checking its state
+       [2014-07-04 11:48:13] [INFO] repmgr registering the standby
+       [2014-07-04 11:48:13] [INFO] repmgr registering the standby complete
+       [2014-07-04 11:48:13] [NOTICE] Standby node correctly registered for cluster test with id 2 (conninfo: host=localhost user=repmgr_usr dbname=repmgr_db)
+
+Monitoring
+----------
+
+`repmgrd` is a management and monitoring daemon which runs on standby nodes
+and which
+can automate remote actions. It can be started with, for example:
+
+    repmgrd -f $HOME/repmgr/repmgr.conf --verbose > $HOME/repmgr/repmgr.log 2>&1
+
+or alternatively
+
+    repmgrd -f $HOME/repmgr/repmgr.conf --verbose --monitoring-history > $HOME/repmgr/repmgrd.log 2>&1
+
+which will additionally record how far replication on each standby has
+advanced or lags behind in the `repl_monitor` table.
+
+Example log output:
+
+    [2014-07-04 11:55:17] [INFO] repmgrd Connecting to database 'host=localhost user=repmgr_usr dbname=repmgr_db'
+    [2014-07-04 11:55:17] [INFO] repmgrd Connected to database, checking its state
+    [2014-07-04 11:55:17] [INFO] repmgrd Connecting to primary for cluster 'test'
+    [2014-07-04 11:55:17] [INFO] finding node list for cluster 'test'
+    [2014-07-04 11:55:17] [INFO] checking role of cluster node 'host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
+    [2014-07-04 11:55:17] [INFO] repmgrd Checking cluster configuration with schema 'repmgr_test'
+    [2014-07-04 11:55:17] [INFO] repmgrd Checking node 2 in cluster 'test'
+    [2014-07-04 11:55:17] [INFO] Reloading configuration file and updating repmgr tables
+    [2014-07-04 11:55:17] [INFO] repmgrd Starting continuous standby node monitoring
+
+
+Failover
+--------
+
+To promote a standby to master, execute on the standby, for example:
+
+    repmgr -f $HOME/repmgr/repmgr.conf --verbose standby promote
+
+repmgr will attempt to connect to the current master to verify that it
+is not available (if it is, repmgr will not promote the standby).
+
+Other standby servers need to be told to follow the new master with:
+
+    repmgr -f $HOME/repmgr/repmgr.conf --verbose standby follow
+
+See file `autofailover_quick_setup.rst` for information on how to set up
+automated failover.
+
+
+repmgr database schema
+======================
+
+repmgr creates a small schema for its own use in the database specified in
+each node's conninfo configuration parameter. This database can in principle
+be any database.
+The schema name is the global `cluster` name prefixed
+with `repmgr_`, so for the example setup above the schema name is
+`repmgr_test`.
+
+The schema contains two tables:
+
+* `repl_nodes`
+  stores information about all registered servers in the cluster
+* `repl_monitor`
+  stores monitoring information about each node
+
+and one view, `repl_status`, which summarizes the latest monitoring information
+for each node.
+
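The schema naming rule described above (`repmgr_` followed by the `cluster` value) can be sketched as a small shell helper. This is an illustrative sketch only: `repmgr_schema_name` is a hypothetical function, not part of repmgr, and it assumes the configuration file contains a plain `cluster=<name>` line as in the example configuration shown in this guide.

```shell
#!/bin/sh
# Hypothetical helper (not part of repmgr): print the schema name that
# corresponds to the "cluster" setting in a repmgr.conf-style file.
# Assumes a simple "cluster=<name>" line, optionally quoted.
repmgr_schema_name() {
    conf_file=$1
    # Extract the value of the first "cluster=" line, then strip quotes
    # and trailing whitespace.
    cluster=$(sed -n 's/^[[:space:]]*cluster[[:space:]]*=[[:space:]]*//p' "$conf_file" \
        | head -n 1 | tr -d "'\"" | sed 's/[[:space:]]*$//')
    # Fail if no cluster name was found.
    [ -n "$cluster" ] || return 1
    printf 'repmgr_%s\n' "$cluster"
}

# Example usage (commented out; needs a running cluster and psql):
# schema=$(repmgr_schema_name "$HOME/repmgr/repmgr.conf")
# psql -h repmgr_node1 -U repmgr_usr -d repmgr_db -c "SELECT * FROM ${schema}.repl_status;"
```

For the example setup the helper prints the schema name given above, and the commented `psql` call would then read the `repl_status` view using the connection parameters from the example `conninfo` string.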