diff --git a/doc/cloning-standbys.sgml b/doc/cloning-standbys.sgml new file mode 100644 index 00000000..777b3677 --- /dev/null +++ b/doc/cloning-standbys.sgml @@ -0,0 +1,162 @@ + + Cloning standbys + + + + + Barman + Cloning a standby from Barman + + &repmgr; can use + 2ndQuadrant's + Barman application + to clone a standby (and also as a fallback source for WAL files). + + + + Barman (aka PgBarman) should be considered an integral part of any + PostgreSQL replication cluster. For more details see: + https://www.pgbarman.org/. + + + + Barman support provides the following advantages: + + + + the primary node does not need to perform a new backup every time a + new standby is cloned + + + + + a standby node can be disconnected for longer periods without losing + the ability to catch up, and without causing accumulation of WAL + files on the primary node + + + + + WAL management on the primary becomes much easier as there's no need + to use replication slots, and wal_keep_segments + does not need to be set. + + + + + + + Prerequisites for cloning from Barman + + In order to enable Barman support for repmgr standby clone, the following + prerequisites must be met: + + + + the barman_server setting in repmgr.conf is the same as the + server configured in Barman; + + + + + the barman_host setting in repmgr.conf is set to the SSH + hostname of the Barman server; + + + + + the restore_command setting in repmgr.conf is configured to + use a copy of the barman-wal-restore script shipped with the + barman-cli package (see below); + + + + + the Barman catalogue includes at least one valid backup for this server. + + + + + + + Barman support is automatically enabled if barman_server + is set. Normally it is good practice to use Barman, for instance + when fetching a base backup while cloning a standby; in any case, + Barman mode can be disabled using the --without-barman + command line option. + + + + + If you have a non-default SSH configuration on the Barman + server, e.g. 
using a port other than 22, then you can set those + parameters in a dedicated Host section in ~/.ssh/config + corresponding to the value of barman_host in + repmgr.conf. See the Host + section in man 5 ssh_config for more details. + + + + It's now possible to clone a standby from Barman, e.g.: + + NOTICE: using configuration file "/etc/repmgr.conf" + NOTICE: destination directory "/var/lib/postgresql/data" provided + INFO: connecting to Barman server to verify backup for test_cluster + INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/data" + INFO: creating directory "/var/lib/postgresql/data/repmgr"... + INFO: connecting to Barman server to fetch server parameters + INFO: connecting to upstream node + INFO: connected to source node, checking its state + INFO: successfully connected to source node + DETAIL: current installation size is 29 MB + NOTICE: retrieving backup from Barman... + receiving file list ... + (...) + NOTICE: standby clone (from Barman) complete + NOTICE: you can now start your PostgreSQL server + HINT: for example: pg_ctl -D /var/lib/postgresql/data start + + + + + Using Barman as a WAL file source + + As a fallback in case streaming replication is interrupted, PostgreSQL can optionally + retrieve WAL files from an archive, such as that provided by Barman. This is done by + setting restore_command in recovery.conf to + a valid shell command which can retrieve a specified WAL file from the archive. + + + barman-wal-restore is a Python script provided as part of the barman-cli + package (Barman 2.0 and later; for Barman 1.x the script is provided separately as + barman-wal-restore.py) which performs this function for Barman. 
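For illustration, the restore_command which ends up in recovery.conf when using barman-wal-restore might look like the following sketch (the Barman host barmansrv and server name somedb are placeholder names; %f and %p are expanded by PostgreSQL to the required WAL file name and destination path):

```
restore_command = '/usr/bin/barman-wal-restore barmansrv somedb %f %p'
```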
+ + + To use barman-wal-restore with &repmgr; + and assuming Barman is located on the barmansrv host + and that barman-wal-restore is located as an executable at + /usr/bin/barman-wal-restore, + repmgr.conf should include the following lines: + + barman_host=barmansrv + barman_server=somedb + restore_command=/usr/bin/barman-wal-restore barmansrv somedb %f %p + + + + barman-wal-restore supports command line switches to + control parallelism (--parallel=N) and compression + (--bzip2, --gzip). + + + + + To use a non-default Barman configuration file on the Barman server, + specify this in repmgr.conf with barman_config: + + barman_config=/path/to/barman.conf + + + + + diff --git a/doc/command-reference.sgml b/doc/command-reference.sgml new file mode 100644 index 00000000..4fd688cf --- /dev/null +++ b/doc/command-reference.sgml @@ -0,0 +1,147 @@ + + repmgr command reference + + + Overview of repmgr commands. + + + + repmgr standby clone + repmgr standby clone + + repmgr standby clone clones a PostgreSQL node from another + PostgreSQL node, typically the primary, but optionally from any other node in + the cluster or from Barman. It creates the recovery.conf file required + to attach the cloned node to the primary node (or another standby, if cascading replication + is in use). + + + + repmgr standby clone does not start the standby, and after cloning + repmgr standby register must be executed to notify &repmgr; of its presence. + + + + + + Handling configuration files + + + Note that by default, all configuration files in the source node's data + directory will be copied to the cloned node. Typically these will be + postgresql.conf, postgresql.auto.conf, + pg_hba.conf and pg_ident.conf. + These may require modification before the standby is started. + + + In some cases (e.g. on Debian or Ubuntu Linux installations), PostgreSQL's + configuration files are located outside of the data directory and will + not be copied by default. 
&repmgr; can copy these files, either to the same + location on the standby server (provided appropriate directory and file permissions + are available), or into the standby's data directory. This requires passwordless + SSH access to the primary server. Add the option --copy-external-config-files + to the repmgr standby clone command; by default files will be copied to + the same path as on the upstream server. Note that the user executing repmgr + must have write access to those directories. + + + To have the configuration files placed in the standby's data directory, specify + --copy-external-config-files=pgdata, but note that + any include directives in the copied files may need to be updated. + + + + For reliable configuration file management we recommend using a + configuration management tool such as Ansible, Chef, Puppet or Salt. + + + + + + Managing WAL during the cloning process + + When initially cloning a standby, you will need to ensure + that all required WAL files remain available while the cloning is taking + place. To ensure this happens when using the default pg_basebackup method, + &repmgr; will set pg_basebackup's --xlog-method + parameter to stream, + which will ensure all WAL files generated during the cloning process are + streamed in parallel with the main backup. Note that this requires two + replication connections to be available (&repmgr; will verify sufficient + connections are available before attempting to clone, and this can be checked + before performing the clone using the --dry-run option). + + + To override this behaviour, in repmgr.conf set + pg_basebackup's --xlog-method + parameter to fetch: + + pg_basebackup_options='--xlog-method=fetch' + + and ensure that wal_keep_segments is set to an appropriately high value. + See the + pg_basebackup documentation for details. + + + + + From PostgreSQL 10, pg_basebackup's + --xlog-method parameter has been renamed to + --wal-method. 
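On PostgreSQL 10 and later, the override shown above would therefore be written with the renamed option, e.g.:

```
pg_basebackup_options='--wal-method=fetch'
```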
+ + + + + + + repmgr standby register + repmgr standby register + + repmgr standby register adds a standby's information to + the &repmgr; metadata. This command needs to be executed to enable + promote/follow operations and to allow repmgrd to work with the node. + An existing standby can be registered using this command. Execute with the + --dry-run option to check what would happen without actually registering the + standby. + + + + Waiting for the registration to propagate to the standby + + Depending on your environment and workload, it may take some time for + the standby's node record to propagate from the primary to the standby. Some + actions (such as starting repmgrd) require that the standby's node record + is present and up-to-date to function correctly. + + + By providing the option --wait-sync to the + repmgr standby register command, &repmgr; will wait + until the record is synchronised before exiting. An optional timeout (in + seconds) can be added to this option (e.g. --wait-sync=60). + + + + + Registering an inactive node + + Under some circumstances you may wish to register a standby which is not + yet running; this can be the case when using provisioning tools to create + a complex replication cluster. In this case, by using the -F/--force + option and providing the connection parameters to the primary server, + the standby can be registered. + + + Similarly, with cascading replication it may be necessary to register + a standby whose upstream node has not yet been registered - in this case, + using -F/--force will result in the creation of an inactive placeholder + record for the upstream node, which will itself need to be registered later, + again using the -F/--force option. + + + When using repmgr standby register with the -F/--force option, take care + that this does not result in an incorrectly configured cluster. 
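For illustration, a forced registration of a standby which is not yet running might look like this (the host, user and database names are examples only; the connection parameters point to the primary):

```
$ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby register --force
```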
+ + + + diff --git a/doc/filelist.sgml b/doc/filelist.sgml index c11f0179..e4e2a6b8 100644 --- a/doc/filelist.sgml +++ b/doc/filelist.sgml @@ -39,4 +39,8 @@ + + + + diff --git a/doc/quickstart.sgml b/doc/quickstart.sgml index 3bc51d99..98df606b 100644 --- a/doc/quickstart.sgml +++ b/doc/quickstart.sgml @@ -119,10 +119,10 @@ - repmgr user and database + Create the repmgr user and database Create a dedicated PostgreSQL superuser account and a database for - the `repmgr` metadata, e.g. + the &repmgr; metadata, e.g. createuser -s repmgr @@ -147,12 +147,24 @@ overridden by specifying a separate replication user when registering each node. + + + + &repmgr; will install the repmgr extension, which creates a + repmgr schema containing &repmgr;'s metadata tables as + well as other functions and views. We also recommend that you set the + repmgr user's search path to include this schema name, e.g. + + ALTER USER repmgr SET search_path TO repmgr, "$user", public; + + + Configuring authentication in pg_hba.conf - Ensure the `repmgr` user has appropriate permissions in pg_hba.conf and + Ensure the repmgr user has appropriate permissions in pg_hba.conf and can connect in replication mode; pg_hba.conf should contain entries similar to the following: @@ -166,6 +178,7 @@ host repmgr repmgr 192.168.1.0/24 trust + Note that these are simple settings for testing purposes. Adjust according to your network environment and authentication requirements. @@ -176,7 +189,7 @@ On the standby, do not create a PostgreSQL instance, but do ensure the destination data directory (and any other directories which you want PostgreSQL to use) exist and are owned by the postgres system user. Permissions - should be set to 0700 (drwx------). + must be set to 0700 (drwx------). 
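As a minimal sketch, preparing the standby's data directory could look like the following shell snippet (to keep the example runnable anywhere, the default path is created under a temporary directory; in practice set PGDATA to your real data directory, e.g. /var/lib/postgresql/data, and run as the postgres system user):

```shell
# Example only: create an empty data directory for the standby and
# restrict it to mode 0700, as PostgreSQL requires.
PGDATA="${PGDATA:-$(mktemp -d)/data}"   # substitute your real data directory
mkdir -p "$PGDATA"
chmod 0700 "$PGDATA"
stat -c '%a' "$PGDATA"   # prints 700
```

The same stat check can be used to verify the permissions of an existing directory before cloning.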
Check the primary database is reachable from the standby using psql: @@ -208,16 +221,226 @@ data_directory='/var/lib/postgresql/data' - - - - repmgr.conf should not be stored inside the PostgreSQL data directory, as it could be overwritten when setting up or reinitialising the PostgreSQL server. See sections on and for further details about repmgr.conf. + + + For Debian-based distributions we recommend explicitly setting + pg_bindir to the directory where pg_ctl and other binaries + not in the standard path are located. For PostgreSQL 9.6 this would be /usr/lib/postgresql/9.6/bin/. + + + + + See the file + repmgr.conf.sample + for details of all available configuration parameters. + + + + + + + Register the primary server + + To enable &repmgr; to support a replication cluster, the primary node must + be registered with &repmgr;. This installs the repmgr + extension and metadata objects, and adds a metadata record for the primary server: + + + + $ repmgr -f /etc/repmgr.conf primary register + INFO: connecting to primary database... 
+ NOTICE: attempting to install extension "repmgr" + NOTICE: "repmgr" extension successfully installed + NOTICE: primary node record (id: 1) registered + + + Verify status of the cluster like this: + + + $ repmgr -f /etc/repmgr.conf cluster show + ID | Name | Role | Status | Upstream | Connection string + ----+-------+---------+-----------+----------+-------------------------------------------------------- + 1 | node1 | primary | * running | | host=node1 dbname=repmgr user=repmgr connect_timeout=2 + + + The record in the repmgr metadata table will look like this: + + + repmgr=# SELECT * FROM repmgr.nodes; + -[ RECORD 1 ]----+------------------------------------------------------- + node_id | 1 + upstream_node_id | + active | t + node_name | node1 + type | primary + location | default + priority | 100 + conninfo | host=node1 dbname=repmgr user=repmgr connect_timeout=2 + repluser | repmgr + slot_name | + config_file | /etc/repmgr.conf + + Each server in the replication cluster will have its own record. If repmgrd + is in use, the fields upstream_node_id, active and + type will be updated when the node's status or role changes. + + + + + Clone the standby server + + Create a repmgr.conf file on the standby server. 
It must contain at + least the same parameters as the primary's repmgr.conf, but with + the mandatory values node, node_name, conninfo + (and possibly data_directory) adjusted accordingly, e.g.: + + + node=2 + node_name=node2 + conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2' + data_directory='/var/lib/postgresql/data' + + + Use the --dry-run option to check the standby can be cloned: + + + $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run + NOTICE: using provided configuration file "/etc/repmgr.conf" + NOTICE: destination directory "/var/lib/postgresql/data" provided + INFO: connecting to source node + NOTICE: checking for available walsenders on source node (2 required) + INFO: sufficient walsenders available on source node (2 required) + NOTICE: standby will attach to upstream node 1 + HINT: consider using the -c/--fast-checkpoint option + INFO: all prerequisites for "standby clone" are met + + If no problems are reported, the standby can then be cloned with: + + + $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone + + NOTICE: using configuration file "/etc/repmgr.conf" + NOTICE: destination directory "/var/lib/postgresql/data" provided + INFO: connecting to source node + NOTICE: checking for available walsenders on source node (2 required) + INFO: sufficient walsenders available on source node (2 required) + INFO: creating directory "/var/lib/postgresql/data"... + NOTICE: starting backup (using pg_basebackup)... + HINT: this may take some time; consider using the -c/--fast-checkpoint option + INFO: executing: + pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h node1 -U repmgr -X stream + NOTICE: standby clone (using pg_basebackup) complete + NOTICE: you can now start your PostgreSQL server + HINT: for example: pg_ctl -D /var/lib/postgresql/data start + + + This has cloned the PostgreSQL data directory files from the primary node1 + using PostgreSQL's pg_basebackup utility. 
A recovery.conf + file containing the correct parameters to start streaming from this primary server will be created + automatically. + + + + By default, any configuration files in the primary's data directory will be + copied to the standby. Typically these will be postgresql.conf, + postgresql.auto.conf, pg_hba.conf and + pg_ident.conf. These may require modification before the standby + is started. + + + + Make any adjustments to the standby's PostgreSQL configuration files now, + then start the server. + + + For more details on repmgr standby clone, see the + command reference. + A more detailed overview of cloning options is available in the + administration manual. + + + + + Verify replication is functioning + + Connect to the primary server and execute: + + repmgr=# SELECT * FROM pg_stat_replication; + -[ RECORD 1 ]----+------------------------------ + pid | 19111 + usesysid | 16384 + usename | repmgr + application_name | node2 + client_addr | 192.168.1.12 + client_hostname | + client_port | 50378 + backend_start | 2017-08-28 15:14:19.851581+09 + backend_xmin | + state | streaming + sent_location | 0/7000318 + write_location | 0/7000318 + flush_location | 0/7000318 + replay_location | 0/7000318 + sync_priority | 0 + sync_state | async + This shows that the previously cloned standby (node2 shown in the field + application_name) has connected to the primary from IP address + 192.168.1.12. + + + From PostgreSQL 9.6 you can also use the view + + pg_stat_wal_receiver to check the replication status from the standby. + + + repmgr=# SELECT * FROM pg_stat_wal_receiver; + Expanded display is on. 
+ -[ RECORD 1 ]---------+-------------------------------------------------------------------------------- + pid | 18236 + status | streaming + receive_start_lsn | 0/3000000 + receive_start_tli | 1 + received_lsn | 0/7000538 + received_tli | 1 + last_msg_send_time | 2017-08-28 15:21:26.465728+09 + last_msg_receipt_time | 2017-08-28 15:21:26.465774+09 + latest_end_lsn | 0/7000538 + latest_end_time | 2017-08-28 15:20:56.418735+09 + slot_name | + conninfo | user=repmgr dbname=replication host=node1 application_name=node2 + + Note that the conninfo value is that generated in recovery.conf + and will differ slightly from the primary's conninfo as set in repmgr.conf - + among other things, it will contain the connecting node's name as application_name. + + + + + Register the standby + + Register the standby server with: + + $ repmgr -f /etc/repmgr.conf standby register + NOTICE: standby node "node2" (ID: 2) successfully registered + + + Check that the node is registered by executing repmgr cluster show on the standby: + + $ repmgr -f /etc/repmgr.conf cluster show + ID | Name | Role | Status | Upstream | Location | Connection string + ----+-------+---------+-----------+----------+----------+-------------------------------------- + 1 | node1 | primary | * running | | default | host=node1 dbname=repmgr user=repmgr + 2 | node2 | standby | running | node1 | default | host=node2 dbname=repmgr user=repmgr + + + Both nodes are now registered with &repmgr; and the records have been copied to the standby server. + diff --git a/doc/repmgr.sgml b/doc/repmgr.sgml index e84c2067..a8a8b055 100644 --- a/doc/repmgr.sgml +++ b/doc/repmgr.sgml @@ -23,10 +23,9 @@ - Thisis the official documentation of repmgr &repmgrversion; for - use with PostgreSQL 9.4 - PostgreSQL 10. - It describes all the functionality that the current version of repmgr officially - supports. + This is the official documentation of &repmgr; &repmgrversion; for + use with PostgreSQL 9.3 - PostgreSQL 10. 
+ It describes the functionality supported by the current version of &repmgr;. @@ -69,8 +68,14 @@ repmgr administration manual &configuration; + &cloning-standbys; + &command-reference; + &appendix-signatures; + + ]]> +