diff --git a/doc/command-reference.sgml b/doc/command-reference.sgml deleted file mode 100644 index 5ce831dd..00000000 --- a/doc/command-reference.sgml +++ /dev/null @@ -1,610 +0,0 @@ - - repmgr command reference - - - Overview of repmgr commands. - - - - repmgr primary register - repmgr primary register - - repmgr primary register registers a primary node in a - streaming replication cluster, and configures it for use with repmgr, including - installing the &repmgr; extension. This command needs to be executed before any - standby nodes are registered. - - - Execute with the --dry-run option to check what would happen without - actually registering the primary. - - - repmgr master register can be used as an alias for - repmgr primary register/ - - - - - repmgr primary unregister - repmgr primary unregister - - repmgr primary register unregisters an inactive primary node - from the `repmgr` metadata. This is typically when the primary has failed and is - being removed from the cluster after a new primary has been promoted. - - - Execute with the --dry-run option to check what would happen without - actually unregistering the node. - - - - repmgr master unregister can be used as an alias for - repmgr primary unregister/ - - - - - - repmgr standby clone - cloning - - repmgr standby clone - - repmgr standby clone clones a PostgreSQL node from another - PostgreSQL node, typically the primary, but optionally from any other node in - the cluster or from Barman. It creates the recovery.conf file required - to attach the cloned node to the primary node (or another standby, if cascading replication - is in use). - - - - repmgr standby clone does not start the standby, and after cloning - repmgr standby register must be executed to notify &repmgr; of its presence. - - - - - - Handling configuration files - - - Note that by default, all configuration files in the source node's data - directory will be copied to the cloned node. 
Typically these will be - postgresql.conf, postgresql.auto.conf, - pg_hba.conf and pg_ident.conf. - These may require modification before the standby is started. - - - In some cases (e.g. on Debian or Ubuntu Linux installations), PostgreSQL's - configuration files are located outside of the data directory and will - not be copied by default. &repmgr; can copy these files, either to the same - location on the standby server (provided appropriate directory and file permissions - are available), or into the standby's data directory. This requires passwordless - SSH access to the primary server. Add the option --copy-external-config-files - to the repmgr standby clone command; by default files will be copied to - the same path as on the upstream server. Note that the user executing repmgr - must have write access to those directories. - - - To have the configuration files placed in the standby's data directory, specify - --copy-external-config-files=pgdata, but note that - any include directives in the copied files may need to be updated. - - - - For reliable configuration file management we recommend using a - configuration management tool such as Ansible, Chef, Puppet or Salt. - - - - - - Managing WAL during the cloning process - - When initially cloning a standby, you will need to ensure - that all required WAL files remain available while the cloning is taking - place. To ensure this happens when using the default `pg_basebackup` method, - &repmgr; will set pg_basebackup's --xlog-method - parameter to stream, - which will ensure all WAL files generated during the cloning process are - streamed in parallel with the main backup. Note that this requires two - replication connections to be available (&repmgr; will verify sufficient - connections are available before attempting to clone, and this can be checked - before performing the clone using the --dry-run option). 
- - - To override this behaviour, in repmgr.conf set - pg_basebackup's --xlog-method - parameter to fetch: - - pg_basebackup_options='--xlog-method=fetch' - - and ensure that wal_keep_segments is set to an appropriately high value. - See the - pg_basebackup documentation for details. - - - - - From PostgreSQL 10, pg_basebackup's - --xlog-method parameter has been renamed to - --wal-method. - - - - - - - - repmgr standby register - repmgr standby register - - repmgr standby register adds a standby's information to - the &repmgr; metadata. This command needs to be executed to enable - promote/follow operations and to allow repmgrd to work with the node. - An existing standby can be registered using this command. Execute with the - --dry-run option to check what would happen without actually registering the - standby. - - - - Waiting for the registration to propagate to the standby - - Depending on your environment and workload, it may take some time for - the standby's node record to propagate from the primary to the standby. Some - actions (such as starting repmgrd) require that the standby's node record - is present and up-to-date to function correctly. - - - By providing the option --wait-sync to the - repmgr standby register command, &repmgr; will wait - until the record is synchronised before exiting. An optional timeout (in - seconds) can be added to this option (e.g. --wait-sync=60). - - - - - Registering an inactive node - - Under some circumstances you may wish to register a standby which is not - yet running; this can be the case when using provisioning tools to create - a complex replication cluster. In this case, by using the -F/--force - option and providing the connection parameters to the primary server, - the standby can be registered. 
- - - Similarly, with cascading replication it may be necessary to register - a standby whose upstream node has not yet been registered - in this case, - using -F/--force will result in the creation of an inactive placeholder - record for the upstream node, which will however later need to be registered - with the -F/--force option too. - - - When used with repmgr standby register, care should be taken that use of the - -F/--force option does not result in an incorrectly configured cluster. - - - - - - - repmgr standby unregister - repmgr standby unregister - - Unregisters a standby with `repmgr`. This command does not affect the actual - replication, just removes the standby's entry from the &repmgr; metadata. - - - To unregister a running standby, execute: - - repmgr standby unregister -f /etc/repmgr.conf - - - This will remove the standby record from &repmgr;'s internal metadata - table (repmgr.nodes). A standby_unregister - event notification will be recorded in the repmgr.events table. - - - If the standby is not running, the command can be executed on another - node by providing the id of the node to be unregistered using - the command line parameter --node-id, e.g. executing the following - command on the master server will unregister the standby with - id 3: - - repmgr standby unregister -f /etc/repmgr.conf --node-id=3 - - - - - - - repmgr standby promote - - repmgr standby promote - - Promotes a standby to a primary if the current primary has failed. This - command requires a valid repmgr.conf file for the standby, either - specified explicitly with -f/--config-file or located in a - default location; no additional arguments are required. - - - If the standby promotion succeeds, the server will not need to be - restarted. However any other standbys will need to follow the new server, - by using ; if repmgrd is active, it will - handle this automatically. 
- - - - - - - repmgr standby follow - - repmgr standby follow - - Attaches the standby to a new primary. This command requires a valid - repmgr.conf file for the standby, either specified - explicitly with -f/--config-file or located in a - default location; no additional arguments are required. - - - This command will force a restart of the standby server, which must be - running. It can only be used to attach a standby to a new primary node. - - - To re-add an inactive node to the replication cluster, see - - - - - - - - - - repmgr standby switchover - - repmgr standby switchover - - Promotes a standby to primary and demotes the existing primary to a standby. - This command must be run on the standby to be promoted, and requires a - passwordless SSH connection to the current primary. - - - If other standbys are connected to the demotion candidate, &repmgr; can instruct - these to follow the new primary if the option --siblings-follow - is specified. - - - Execute with the --dry-run option to test the switchover as far as - possible without actually changing the status of either node. - - - repmgrd should not be active on any nodes while a switchover is being - executed. This restriction may be lifted in a later version. - - - For more details see the section . - - - - - - - repmgr node status - - repmgr node status - - Displays an overview of a node's basic information and replication - status. This command must be run on the local node. - - - Sample output (execute repmgr node status): - - Node "node1": - PostgreSQL version: 10beta1 - Total data size: 30 MB - Conninfo: host=node1 dbname=repmgr user=repmgr connect_timeout=2 - Role: primary - WAL archiving: off - Archive command: (none) - Replication connections: 2 (of maximal 10) - Replication slots: 0 (of maximal 10) - Replication lag: n/a - - - - See to diagnose issues. - - - - - - repmgr node check - - repmgr node check - - Performs some health checks on a node from a replication perspective. 
- This command must be run on the local node. - - - Sample output (execute repmgr node check): - - Node "node1": - Server role: OK (node is primary) - Replication lag: OK (N/A - node is primary) - WAL archiving: OK (0 pending files) - Downstream servers: OK (2 of 2 downstream nodes attached) - Replication slots: OK (node has no replication slots) - - - - Additionally each check can be performed individually by supplying - an additional command line parameter, e.g.: - - $ repmgr node check --role - OK (node is primary) - - - - Parameters for individual checks are as follows: - - - - - --role: checks if the node has the expected role - - - - - - --replication-lag: checks if the node is lagging by more than - replication_lag_warning or replication_lag_critical - - - - - - --archive-ready: checks for WAL files which have not yet been archived - - - - - - --downstream: checks that the expected downstream nodes are attached - - - - - - --slots: checks there are no inactive replication slots - - - - - - - Individual checks can also be output in a Nagios-compatible format by additionally - providing the option --nagios. - - - - - - repmgr node rejoin - - repmgr node rejoin - - Enables a dormant (stopped) node to be rejoined to the replication cluster. - - - This can optionally use pg_rewind to re-integrate a node which has diverged - from the rest of the cluster, typically a failed primary. - - - - - - - repmgr cluster show - - repmgr cluster show - - Displays information about each active node in the replication cluster. This - command polls each registered server and shows its role (primary / - standby / bdr) and status. It polls each server - directly and can be run on any node in the cluster; this is also useful when analyzing - connectivity from a particular node. - - - This command requires either a valid repmgr.conf file or a database - connection string to one of the registered nodes; no additional arguments are needed. 
- - - - Example: - - $ repmgr -f /etc/repmgr.conf cluster show - - ID | Name | Role | Status | Upstream | Location | Connection string - ----+-------+---------+-----------+----------+----------+----------------------------------------- - 1 | node1 | primary | * running | | default | host=db_node1 dbname=repmgr user=repmgr - 2 | node2 | standby | running | node1 | default | host=db_node2 dbname=repmgr user=repmgr - 3 | node3 | standby | running | node1 | default | host=db_node3 dbname=repmgr user=repmgr - - - - To show database connection errors when polling nodes, run the command in - --verbose mode. - - - The `cluster show` command accepts an optional parameter --csv, which - outputs the replication cluster's status in a simple CSV format, suitable for - parsing by scripts: - - $ repmgr -f /etc/repmgr.conf cluster show --csv - 1,-1,-1 - 2,0,0 - 3,0,1 - - - The columns have following meanings: - - - - node ID - - - availability (0 = available, -1 = unavailable) - - - recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown) - - - - - - - Note that the availability is tested by connecting from the node where - repmgr cluster show is executed, and does not necessarily imply the node - is down. See and to get - a better overviews of connections between nodes. - - - - - - repmgr cluster matrix - - repmgr cluster matrix - - repmgr cluster matrix runs repmgr cluster show on each - node and arranges the results in a matrix, recording success or failure. - - - repmgr cluster matrix requires a valid repmgr.conf - file on each node. Additionally passwordless `ssh` connections are required between - all nodes. 
- - - Example 1 (all nodes up): - - $ repmgr -f /etc/repmgr.conf cluster matrix - - Name | Id | 1 | 2 | 3 - -------+----+----+----+---- - node1 | 1 | * | * | * - node2 | 2 | * | * | * - node3 | 3 | * | * | * - - - Example 2 (node1 and node2 up, node3 down): - - $ repmgr -f /etc/repmgr.conf cluster matrix - - Name | Id | 1 | 2 | 3 - -------+----+----+----+---- - node1 | 1 | * | * | x - node2 | 2 | * | * | x - node3 | 3 | ? | ? | ? - - - - Each row corresponds to one server, and indicates the result of - testing an outbound connection from that server. - - - Since node3 is down, all the entries in its row are filled with - ?, meaning that there we cannot test outbound connections. - - - The other two nodes are up; the corresponding rows have x in the - column corresponding to node3, meaning that inbound connections to - that node have failed, and `*` in the columns corresponding to - node1 and node2, meaning that inbound connections - to these nodes have succeeded. - - - Example 3 (all nodes up, firewall dropping packets originating - from node1 and directed to port 5432 on node3) - - running repmgr cluster matrix from node1 gives the following output: - - $ repmgr -f /etc/repmgr.conf cluster matrix - - Name | Id | 1 | 2 | 3 - -------+----+----+----+---- - node1 | 1 | * | * | x - node2 | 2 | * | * | * - node3 | 3 | ? | ? | ? - - - Note this may take some time depending on the connect_timeout - setting in the node conninfo strings; default is - 1 minute which means without modification the above - command would take around 2 minutes to run; see comment elsewhere about setting - connect_timeout) - - - The matrix tells us that we cannot connect from node1 to node3, - and that (therefore) we don't know the state of any outbound - connection from node3. - - - In this case, the command will produce a more - useful result. 
- - - - - - - repmgr cluster crosscheck - - repmgr cluster crosscheck - - repmgr cluster crosscheck is similar to , - but cross-checks connections between each combination of nodes. In "Example 3" in - we have no information about the state of node3. - However by running repmgr cluster crosscheck it's possible to get a better - overview of the cluster situation: - - $ repmgr -f /etc/repmgr.conf cluster crosscheck - - Name | Id | 1 | 2 | 3 - -------+----+----+----+---- - node1 | 1 | * | * | x - node2 | 2 | * | * | * - node3 | 3 | * | * | * - - - What happened is that repmgr cluster crosscheck merged its own - repmgr cluster matrix with the repmgr cluster matrix - output from node2; the latter is able to connect to node3 - and therefore determine the state of outbound connections from that node. - - - - - - repmgr cluster cleanup - - repmgr cluster cleanup - - Purges monitoring history from the repmgr.monitoring_history table to - prevent excessive table growth. Use the -k/--keep-history to specify the - number of days of monitoring history to retain. This command can be used - manually or as a cronjob. - - - This command requires a valid repmgr.conf file for the node on which it is - executed; no additional arguments are required. - - - - Monitoring history will only be written if repmgrd is active, and - monitoring_history is set to true in repmgr.conf. - - - - - diff --git a/doc/filelist.sgml b/doc/filelist.sgml index 13e0c8bf..c1bed9cb 100644 --- a/doc/filelist.sgml +++ b/doc/filelist.sgml @@ -43,7 +43,23 @@ - + + + + + + + + + + + + + + + + + diff --git a/doc/repmgr-cluster-cleanup.sgml b/doc/repmgr-cluster-cleanup.sgml new file mode 100644 index 00000000..bafc34f1 --- /dev/null +++ b/doc/repmgr-cluster-cleanup.sgml @@ -0,0 +1,22 @@ + + + repmgr cluster cleanup + + repmgr cluster cleanup + + Purges monitoring history from the repmgr.monitoring_history table to + prevent excessive table growth. 
Use the -k/--keep-history option to specify the + number of days of monitoring history to retain. This command can be used + manually or as a cronjob. + + + This command requires a valid repmgr.conf file for the node on which it is + executed; no additional arguments are required. + + + + Monitoring history will only be written if repmgrd is active, and + monitoring_history is set to true in repmgr.conf. + + + diff --git a/doc/repmgr-cluster-crosscheck.sgml b/doc/repmgr-cluster-crosscheck.sgml new file mode 100644 index 00000000..dd361883 --- /dev/null +++ b/doc/repmgr-cluster-crosscheck.sgml @@ -0,0 +1,28 @@ + + + repmgr cluster crosscheck + + repmgr cluster crosscheck + + repmgr cluster crosscheck is similar to , + but cross-checks connections between each combination of nodes. In "Example 3" in + we have no information about the state of node3. + However by running repmgr cluster crosscheck it's possible to get a better + overview of the cluster situation: + + $ repmgr -f /etc/repmgr.conf cluster crosscheck + + Name | Id | 1 | 2 | 3 + -------+----+----+----+---- + node1 | 1 | * | * | x + node2 | 2 | * | * | * + node3 | 3 | * | * | * + + + What happened is that repmgr cluster crosscheck merged its own + repmgr cluster matrix with the repmgr cluster matrix + output from node2; the latter is able to connect to node3 + and therefore determine the state of outbound connections from that node. + + + diff --git a/doc/repmgr-cluster-matrix.sgml b/doc/repmgr-cluster-matrix.sgml new file mode 100644 index 00000000..d37d1406 --- /dev/null +++ b/doc/repmgr-cluster-matrix.sgml @@ -0,0 +1,83 @@ + + + repmgr cluster matrix + + repmgr cluster matrix + + repmgr cluster matrix runs repmgr cluster show on each + node and arranges the results in a matrix, recording success or failure. + + + repmgr cluster matrix requires a valid repmgr.conf + file on each node. Additionally passwordless `ssh` connections are required between + all nodes.
+ + + Example 1 (all nodes up): + + $ repmgr -f /etc/repmgr.conf cluster matrix + + Name | Id | 1 | 2 | 3 + -------+----+----+----+---- + node1 | 1 | * | * | * + node2 | 2 | * | * | * + node3 | 3 | * | * | * + + + Example 2 (node1 and node2 up, node3 down): + + $ repmgr -f /etc/repmgr.conf cluster matrix + + Name | Id | 1 | 2 | 3 + -------+----+----+----+---- + node1 | 1 | * | * | x + node2 | 2 | * | * | x + node3 | 3 | ? | ? | ? + + + + Each row corresponds to one server, and indicates the result of + testing an outbound connection from that server. + + + Since node3 is down, all the entries in its row are filled with + ?, meaning that we cannot test outbound connections. + + + The other two nodes are up; the corresponding rows have x in the + column corresponding to node3, meaning that inbound connections to + that node have failed, and `*` in the columns corresponding to + node1 and node2, meaning that inbound connections + to these nodes have succeeded. + + + Example 3 (all nodes up, firewall dropping packets originating + from node1 and directed to port 5432 on node3) - + running repmgr cluster matrix from node1 gives the following output: + + $ repmgr -f /etc/repmgr.conf cluster matrix + + Name | Id | 1 | 2 | 3 + -------+----+----+----+---- + node1 | 1 | * | * | x + node2 | 2 | * | * | * + node3 | 3 | ? | ? | ? + + + Note this may take some time depending on the connect_timeout + setting in the node conninfo strings; the default is + 1 minute, which means that without modification the above + command would take around 2 minutes to run; see comment elsewhere about setting + connect_timeout. + + + The matrix tells us that we cannot connect from node1 to node3, + and that (therefore) we don't know the state of any outbound + connection from node3. + + + In this case, the command will produce a more + useful result.
+ + + diff --git a/doc/repmgr-cluster-show.sgml b/doc/repmgr-cluster-show.sgml new file mode 100644 index 00000000..f80e3dfa --- /dev/null +++ b/doc/repmgr-cluster-show.sgml @@ -0,0 +1,67 @@ + + + repmgr cluster show + + repmgr cluster show + + Displays information about each active node in the replication cluster. This + command polls each registered server and shows its role (primary / + standby / bdr) and status. It polls each server + directly and can be run on any node in the cluster; this is also useful when analyzing + connectivity from a particular node. + + + This command requires either a valid repmgr.conf file or a database + connection string to one of the registered nodes; no additional arguments are needed. + + + + Example: + + $ repmgr -f /etc/repmgr.conf cluster show + + ID | Name | Role | Status | Upstream | Location | Connection string + ----+-------+---------+-----------+----------+----------+----------------------------------------- + 1 | node1 | primary | * running | | default | host=db_node1 dbname=repmgr user=repmgr + 2 | node2 | standby | running | node1 | default | host=db_node2 dbname=repmgr user=repmgr + 3 | node3 | standby | running | node1 | default | host=db_node3 dbname=repmgr user=repmgr + + + + To show database connection errors when polling nodes, run the command in + --verbose mode. + + + The `cluster show` command accepts an optional parameter --csv, which + outputs the replication cluster's status in a simple CSV format, suitable for + parsing by scripts: + + $ repmgr -f /etc/repmgr.conf cluster show --csv + 1,-1,-1 + 2,0,0 + 3,0,1 + + + The columns have the following meanings: + + + + node ID + + + availability (0 = available, -1 = unavailable) + + + recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown) + + + + + + + Note that the availability is tested by connecting from the node where + repmgr cluster show is executed, and does not necessarily imply the node + is down.
See and to get + a better overview of connections between nodes. + + diff --git a/doc/repmgr-node-check.sgml b/doc/repmgr-node-check.sgml new file mode 100644 index 00000000..b9a5b5f2 --- /dev/null +++ b/doc/repmgr-node-check.sgml @@ -0,0 +1,70 @@ + + + repmgr node check + + repmgr node check + + Performs some health checks on a node from a replication perspective. + This command must be run on the local node. + + + Sample output (execute repmgr node check): + + Node "node1": + Server role: OK (node is primary) + Replication lag: OK (N/A - node is primary) + WAL archiving: OK (0 pending files) + Downstream servers: OK (2 of 2 downstream nodes attached) + Replication slots: OK (node has no replication slots) + + + + Additionally each check can be performed individually by supplying + an additional command line parameter, e.g.: + + $ repmgr node check --role + OK (node is primary) + + + + Parameters for individual checks are as follows: + + + + + --role: checks if the node has the expected role + + + + + + --replication-lag: checks if the node is lagging by more than + replication_lag_warning or replication_lag_critical + + + + + + --archive-ready: checks for WAL files which have not yet been archived + + + + + + --downstream: checks that the expected downstream nodes are attached + + + + + + --slots: checks there are no inactive replication slots + + + + + + + Individual checks can also be output in a Nagios-compatible format by additionally + providing the option --nagios. + + diff --git a/doc/repmgr-node-rejoin.sgml b/doc/repmgr-node-rejoin.sgml new file mode 100644 index 00000000..14d9f8b7 --- /dev/null +++ b/doc/repmgr-node-rejoin.sgml @@ -0,0 +1,13 @@ + + + repmgr node rejoin + + repmgr node rejoin + + Enables a dormant (stopped) node to be rejoined to the replication cluster. + + + This can optionally use pg_rewind to re-integrate a node which has diverged + from the rest of the cluster, typically a failed primary.
+ + diff --git a/doc/repmgr-node-status.sgml b/doc/repmgr-node-status.sgml new file mode 100644 index 00000000..8789b1ca --- /dev/null +++ b/doc/repmgr-node-status.sgml @@ -0,0 +1,29 @@ + + + + repmgr node status + + repmgr node status + + Displays an overview of a node's basic information and replication + status. This command must be run on the local node. + + + Sample output (execute repmgr node status): + + Node "node1": + PostgreSQL version: 10beta1 + Total data size: 30 MB + Conninfo: host=node1 dbname=repmgr user=repmgr connect_timeout=2 + Role: primary + WAL archiving: off + Archive command: (none) + Replication connections: 2 (of maximal 10) + Replication slots: 0 (of maximal 10) + Replication lag: n/a + + + + See to diagnose issues. + + diff --git a/doc/repmgr-primary-register.sgml b/doc/repmgr-primary-register.sgml new file mode 100644 index 00000000..08208e79 --- /dev/null +++ b/doc/repmgr-primary-register.sgml @@ -0,0 +1,18 @@ + + repmgr primary register + repmgr primary register + + repmgr primary register registers a primary node in a + streaming replication cluster, and configures it for use with repmgr, including + installing the &repmgr; extension. This command needs to be executed before any + standby nodes are registered. + + + Execute with the --dry-run option to check what would happen without + actually registering the primary. + + + repmgr master register can be used as an alias for + repmgr primary register. + + diff --git a/doc/repmgr-primary-unregister.sgml b/doc/repmgr-primary-unregister.sgml new file mode 100644 index 00000000..c09d05cb --- /dev/null +++ b/doc/repmgr-primary-unregister.sgml @@ -0,0 +1,18 @@ + + repmgr primary unregister + repmgr primary unregister + + repmgr primary unregister unregisters an inactive primary node + from the &repmgr; metadata. This is typically done when the primary has failed and is + being removed from the cluster after a new primary has been promoted.
+ + + Execute with the --dry-run option to check what would happen without + actually unregistering the node. + + + + repmgr master unregister can be used as an alias for + repmgr primary unregister. + + diff --git a/doc/repmgr-standby-clone.sgml b/doc/repmgr-standby-clone.sgml new file mode 100644 index 00000000..76f7e52b --- /dev/null +++ b/doc/repmgr-standby-clone.sgml @@ -0,0 +1,91 @@ + + + repmgr standby clone + cloning + + repmgr standby clone + + repmgr standby clone clones a PostgreSQL node from another + PostgreSQL node, typically the primary, but optionally from any other node in + the cluster or from Barman. It creates the recovery.conf file required + to attach the cloned node to the primary node (or another standby, if cascading replication + is in use). + + + + repmgr standby clone does not start the standby, and after cloning + repmgr standby register must be executed to notify &repmgr; of its presence. + + + + + + Handling configuration files + + + Note that by default, all configuration files in the source node's data + directory will be copied to the cloned node. Typically these will be + postgresql.conf, postgresql.auto.conf, + pg_hba.conf and pg_ident.conf. + These may require modification before the standby is started. + + + In some cases (e.g. on Debian or Ubuntu Linux installations), PostgreSQL's + configuration files are located outside of the data directory and will + not be copied by default. &repmgr; can copy these files, either to the same + location on the standby server (provided appropriate directory and file permissions + are available), or into the standby's data directory. This requires passwordless + SSH access to the primary server. Add the option --copy-external-config-files + to the repmgr standby clone command; by default files will be copied to + the same path as on the upstream server. Note that the user executing repmgr + must have write access to those directories.
+ + + To have the configuration files placed in the standby's data directory, specify + --copy-external-config-files=pgdata, but note that + any include directives in the copied files may need to be updated. + + + + For reliable configuration file management we recommend using a + configuration management tool such as Ansible, Chef, Puppet or Salt. + + + + + + Managing WAL during the cloning process + + When initially cloning a standby, you will need to ensure + that all required WAL files remain available while the cloning is taking + place. To ensure this happens when using the default `pg_basebackup` method, + &repmgr; will set pg_basebackup's --xlog-method + parameter to stream, + which will ensure all WAL files generated during the cloning process are + streamed in parallel with the main backup. Note that this requires two + replication connections to be available (&repmgr; will verify sufficient + connections are available before attempting to clone, and this can be checked + before performing the clone using the --dry-run option). + + + To override this behaviour, in repmgr.conf set + pg_basebackup's --xlog-method + parameter to fetch: + + pg_basebackup_options='--xlog-method=fetch' + + and ensure that wal_keep_segments is set to an appropriately high value. + See the + pg_basebackup documentation for details. + + + + + From PostgreSQL 10, pg_basebackup's + --xlog-method parameter has been renamed to + --wal-method. + + + + + diff --git a/doc/repmgr-standby-follow.sgml b/doc/repmgr-standby-follow.sgml new file mode 100644 index 00000000..3181cf77 --- /dev/null +++ b/doc/repmgr-standby-follow.sgml @@ -0,0 +1,21 @@ + + + repmgr standby follow + + repmgr standby follow + + Attaches the standby to a new primary. This command requires a valid + repmgr.conf file for the standby, either specified + explicitly with -f/--config-file or located in a + default location; no additional arguments are required. 
+ + + This command will force a restart of the standby server, which must be + running. It can only be used to attach a standby to a new primary node. + + + To re-add an inactive node to the replication cluster, see + + + + diff --git a/doc/repmgr-standby-promote.sgml b/doc/repmgr-standby-promote.sgml new file mode 100644 index 00000000..1c95d763 --- /dev/null +++ b/doc/repmgr-standby-promote.sgml @@ -0,0 +1,18 @@ + + + repmgr standby promote + + repmgr standby promote + + Promotes a standby to a primary if the current primary has failed. This + command requires a valid repmgr.conf file for the standby, either + specified explicitly with -f/--config-file or located in a + default location; no additional arguments are required. + + + If the standby promotion succeeds, the server will not need to be + restarted. However any other standbys will need to follow the new server, + by using ; if repmgrd + is active, it will handle this automatically. + + diff --git a/doc/repmgr-standby-register.sgml b/doc/repmgr-standby-register.sgml new file mode 100644 index 00000000..b4c77ce5 --- /dev/null +++ b/doc/repmgr-standby-register.sgml @@ -0,0 +1,50 @@ + + repmgr standby register + repmgr standby register + + repmgr standby register adds a standby's information to + the &repmgr; metadata. This command needs to be executed to enable + promote/follow operations and to allow repmgrd to work with the node. + An existing standby can be registered using this command. Execute with the + --dry-run option to check what would happen without actually registering the + standby. + + + + Waiting for the registration to propagate to the standby + + Depending on your environment and workload, it may take some time for + the standby's node record to propagate from the primary to the standby. Some + actions (such as starting repmgrd) require that the standby's node record + is present and up-to-date to function correctly. 
+ + + By providing the option --wait-sync to the + repmgr standby register command, &repmgr; will wait + until the record is synchronised before exiting. An optional timeout (in + seconds) can be added to this option (e.g. --wait-sync=60). + + + + + Registering an inactive node + + Under some circumstances you may wish to register a standby which is not + yet running; this can be the case when using provisioning tools to create + a complex replication cluster. In this case, by using the -F/--force + option and providing the connection parameters to the primary server, + the standby can be registered. + + + Similarly, with cascading replication it may be necessary to register + a standby whose upstream node has not yet been registered - in this case, + using -F/--force will result in the creation of an inactive placeholder + record for the upstream node, which will however later need to be registered + with the -F/--force option too. + + + When used with repmgr standby register, care should be taken that use of the + -F/--force option does not result in an incorrectly configured cluster. + + + diff --git a/doc/repmgr-standby-switchover.sgml b/doc/repmgr-standby-switchover.sgml new file mode 100644 index 00000000..102d6311 --- /dev/null +++ b/doc/repmgr-standby-switchover.sgml @@ -0,0 +1,27 @@ + + + repmgr standby switchover + + repmgr standby switchover + + Promotes a standby to primary and demotes the existing primary to a standby. + This command must be run on the standby to be promoted, and requires a + passwordless SSH connection to the current primary. + + + If other standbys are connected to the demotion candidate, &repmgr; can instruct + these to follow the new primary if the option --siblings-follow + is specified. + + + Execute with the --dry-run option to test the switchover as far as + possible without actually changing the status of either node. + + + repmgrd should not be active on any nodes while a switchover is being + executed. 
This restriction may be lifted in a later version. + + + For more details see the section . + + diff --git a/doc/repmgr-standby-unregister.sgml b/doc/repmgr-standby-unregister.sgml new file mode 100644 index 00000000..7fd6e2f9 --- /dev/null +++ b/doc/repmgr-standby-unregister.sgml @@ -0,0 +1,29 @@ + + repmgr standby unregister + repmgr standby unregister + + Unregisters a standby with `repmgr`. This command does not affect the actual + replication, just removes the standby's entry from the &repmgr; metadata. + + + To unregister a running standby, execute: + + repmgr standby unregister -f /etc/repmgr.conf + + + This will remove the standby record from &repmgr;'s internal metadata + table (repmgr.nodes). A standby_unregister + event notification will be recorded in the repmgr.events table. + + + If the standby is not running, the command can be executed on another + node by providing the id of the node to be unregistered using + the command line parameter --node-id, e.g. executing the following + command on the master server will unregister the standby with + id 3: + + repmgr standby unregister -f /etc/repmgr.conf --node-id=3 + + + + diff --git a/doc/repmgr.sgml b/doc/repmgr.sgml index 02f1a9af..1d498ae4 100644 --- a/doc/repmgr.sgml +++ b/doc/repmgr.sgml @@ -52,6 +52,7 @@ PostgreSQL replication asynchronous + HA high-availability @@ -72,9 +73,27 @@ &promoting-standby; &follow-new-primary; &switchover; - &command-reference; + + repmgr command reference + + &repmgr-primary-register; + &repmgr-primary-unregister; + &repmgr-standby-clone; + &repmgr-standby-register; + &repmgr-standby-unregister; + &repmgr-standby-promote; + &repmgr-standby-follow; + &repmgr-standby-switchover; + &repmgr-node-status; + &repmgr-node-check; + &repmgr-node-rejoin; + &repmgr-cluster-show; + &repmgr-cluster-matrix; + &repmgr-cluster-crosscheck; + &repmgr-cluster-cleanup; + &appendix-signatures;