From 4d5ff146e02e665279f48da69a41b640c9433297 Mon Sep 17 00:00:00 2001
From: Ian Barwick
Date: Mon, 28 Aug 2017 17:07:14 +0900
Subject: [PATCH] Update README

---
 README.md | 321 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 308 insertions(+), 13 deletions(-)

diff --git a/README.md b/README.md
index 82bbb258..51c3af49 100644
--- a/README.md
+++ b/README.md
@@ -360,7 +360,7 @@ directory.
 
 ### repmgr configuration file
 
-Create a `repmgr.conf` file on the master server. The file must contain at
+Create a `repmgr.conf` file on the primary server. The file must contain at
 least the following parameters:
 
     node_id=1
@@ -372,7 +372,7 @@ least the following parameters:
   32 bit signed integer between 1 and 2147483647
 - `node_name`: a unique string identifying the node; we recommend a name
   specific to the server (e.g. 'server_1'); avoid names indicating the
-  current replication role like 'master' or 'standby' as the server's
+  current replication role like 'primary' or 'standby' as the server's
   role could change.
 - `conninfo`: a valid connection string for the `repmgr` database on the
   *current* server. (On the standby, the database will not yet exist, but
@@ -465,7 +465,7 @@ Clone the standby with:
     INFO: sufficient walsenders available on upstream node (2 required)
     INFO: successfully connected to source node
     DETAIL: current installation size is 29 MB
-    INFO: creating directory "/space/sda1/ibarwick/repmgr-test/node_2/data"...
+    INFO: creating directory "/var/lib/postgresql/data"...
     NOTICE: starting backup (using pg_basebackup)...
     HINT: this may take some time; consider using the -c/--fast-checkpoint option
     INFO: executing: 'pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h node1 -U repmgr -X stream '
@@ -544,27 +544,322 @@ then start the server.
 ### Verify replication is functioning
 
-Connect to the master server and execute:
+Connect to the primary server and execute:
 
     repmgr=# SELECT * FROM pg_stat_replication;
     -[ RECORD 1 ]----+------------------------------
-    pid              | 7704
+    pid              | 19111
     usesysid         | 16384
     usename          | repmgr
     application_name | node2
-    client_addr      | 192.168.1.2
+    client_addr      | ::1
     client_hostname  |
-    client_port      | 46196
-    backend_start    | 2016-01-07 17:32:58.322373+09
+    client_port      | 50378
+    backend_start    | 2017-08-28 15:14:19.851581+09
     backend_xmin     |
     state            | streaming
-    sent_location    | 0/3000220
-    write_location   | 0/3000220
-    flush_location   | 0/3000220
-    replay_location  | 0/3000220
+    sent_location    | 0/7000318
+    write_location   | 0/7000318
+    flush_location   | 0/7000318
+    replay_location  | 0/7000318
     sync_priority    | 0
     sync_state       | async
+
+### Register the standby
+
+Register the standby server with:
+
+    $ repmgr -f /etc/repmgr.conf standby register
+    NOTICE: standby node "node2" (id: 2) successfully registered
+
+Check that the node is registered by executing `repmgr cluster show` on the
+standby:
+
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Connection string
+    ----+-------+---------+-----------+----------+--------------------------------------
+      1 | node1 | primary | * running |          | host=node1 dbname=repmgr user=repmgr
+      2 | node2 | standby |   running | node1    | host=node2 dbname=repmgr user=repmgr
+
+The standby server now has a copy of the records for all servers in the
+replication cluster.
+
+* * *
+
+> *TIP*: depending on your environment and workload, it may take some time for
+> the standby's node record to propagate from the primary to the standby. Some
+> actions (such as starting `repmgrd`) require that the standby's node record
+> is present and up to date in order to function correctly. If the
+> `--wait-sync` option is passed to `repmgr standby register`, `repmgr` will
+> wait until the record is synchronised before exiting. An optional timeout (in
+> seconds) can be appended to this option (e.g. `--wait-sync=60`).
+
+* * *
+
+Under some circumstances you may wish to register a standby which is not
+yet running; this can be the case when using provisioning tools to create
+a complex replication cluster. In this case the standby can be registered
+by using the `-F/--force` option and providing the connection parameters
+of the primary server.
+
+Similarly, with cascading replication it may be necessary to register
+a standby whose upstream node has not yet been registered. In this case,
+using `-F/--force` will create an inactive placeholder record for the
+upstream node, which must itself later be registered with the
+`-F/--force` option.
+
+When used with `standby register`, take care that the `-F/--force`
+option does not result in an incorrectly configured cluster.
+
+### Using Barman to clone a standby
+
+`repmgr standby clone` can use Barman (the "Backup and Replication
+manager", https://www.pgbarman.org/) as a provider of both base backups
+and WAL files.
+
+Barman support provides the following advantages:
+
+- the primary node does not need to perform a new backup every time a
+  new standby is cloned;
+- a standby node can be disconnected for longer periods without losing
+  the ability to catch up, and without causing accumulation of WAL
+  files on the primary node;
+- therefore, `repmgr` does not need to use replication slots, and on the
+  primary node, `wal_keep_segments` does not need to be set.
+
+> *NOTE*: In view of the above, Barman support is incompatible with
+> the `use_replication_slots` setting in `repmgr.conf`.
+
+In order to enable Barman support for `repmgr standby clone`, the
+following prerequisites must be met:
+
+- the `barman_server` setting in `repmgr.conf` is the same as the
+  server configured in Barman;
+- the `barman_host` setting in `repmgr.conf` is set to the SSH
+  hostname of the Barman server;
+- the `restore_command` setting in `repmgr.conf` is configured to
+  use a copy of the `barman-wal-restore` script shipped with the
+  `barman-cli` package (see below);
+- the Barman catalogue includes at least one valid backup for this
+  server.
+
+> *NOTE*: Barman support is automatically enabled if `barman_server`
+> is set. Normally it is good practice to use Barman, for instance
+> when fetching a base backup while cloning a standby; in any case,
+> Barman mode can be disabled using the `--without-barman` command
+> line option.
+
+> *NOTE*: if you have a non-default SSH configuration on the Barman
+> server, e.g. using a port other than 22, then you can set those
+> parameters in a dedicated Host section in `~/.ssh/config`
+> corresponding to the value of `barman_host` in `repmgr.conf`. See
+> the "Host" section in `man 5 ssh_config` for more details.
+
+`barman-wal-restore` is a Python script provided by the Barman
+development team as part of the `barman-cli` package (Barman 2.0
+and later; for Barman 1.x the script is provided separately as
+`barman-wal-restore.py`).
+
+`restore_command` must then be set in `repmgr.conf` as follows:
+
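+The exact value was not included in this patch excerpt and depends on
+where the `barman-cli` package installs the script. As an illustrative
+sketch only (the path and the `barman-host` / `barman-server` values are
+placeholder assumptions, not values from this repository), following the
+script's `barman-wal-restore BARMAN_HOST SERVER_NAME WAL_NAME WAL_DEST`
+argument convention, the setting would look something like:
+
+    restore_command = '/usr/bin/barman-wal-restore barman-host barman-server %f %p'
+
+Here `%f` and `%p` are the standard PostgreSQL `restore_command`
+placeholders, expanded at recovery time to the required WAL file name
+and its destination path respectively.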