Performing a switchover with repmgr

Performing a switchover with repmgr switchover A typical use-case for replication is a combination of primary and standby server, with the standby serving as a backup which can easily be activated in case of a problem with the primary. Such an unplanned failover would normally be handled by promoting the standby, after which an appropriate action must be taken to restore the old primary. In some cases however it's desirable to promote the standby in a planned way, e.g. so maintenance can be performed on the primary; this kind of switchover is supported by the command. repmgr standby switchover differs from other &repmgr; actions in that it also performs actions on other servers (the demotion candidate, and optionally any other servers which are to follow the new primary), which means passwordless SSH access is required to those servers from the one where repmgr standby switchover is executed. repmgr standby switchover performs a relatively complex series of operations on two servers, and should therefore be performed after careful preparation and with adequate attention. In particular you should be confident that your network environment is stable and reliable. Additionally you should be sure that the current primary can be shut down quickly and cleanly. In particular, access from applications should be minimalized or preferably blocked completely. Also be aware that if there is a backlog of files waiting to be archived, PostgreSQL will not shut down until archiving completes. We recommend running repmgr standby switchover at the most verbose logging level (--log-level=DEBUG --verbose) and capturing all output to assist troubleshooting any problems. Please also read carefully the sections and below. Preparing for switchover switchover preparation As mentioned in the previous section, success of the switchover operation depends on &repmgr; being able to shut down the current primary server quickly and cleanly. Ensure that the promotion candidate has sufficient free walsenders available (PostgreSQL configuration item max_wal_senders), and if replication slots are in use, at least one free slot is available for the demotion candidate ( PostgreSQL configuration item max_replication_slots). Ensure that a passwordless SSH connection is possible from the promotion candidate (standby) to the demotion candidate (current primary). If --siblings-follow will be used, ensure that passwordless SSH connections are possible from the promotion candidate to all nodes attached to the demotion candidate (including the witness server, if in use). &repmgr; expects to find the &repmgr; binary in the same path on the remote server as on the local server. Double-check which commands will be used to stop/start/restart the current primary; this can be done by e.g. executing repmgr node service on the current primary: repmgr -f /etc/repmgr.conf node service --list-actions --action=stop repmgr -f /etc/repmgr.conf node service --list-actions --action=start repmgr -f /etc/repmgr.conf node service --list-actions --action=restart These commands can be defined in repmgr.conf with , and . If &repmgr; is installed from a package. you should set these commands to use the appropriate service commands defined by the package/operating system as these will ensure PostgreSQL is stopped/started properly taking into account configuration and log file locations etc. If the options aren't defined, &repmgr; will fall back to using pg_ctl to stop/start/restart PostgreSQL, which may not work properly, particularly when executed on a remote server. For more details, see . On systemd systems we strongly recommend using the appropriate systemctl commands (typically run via sudo) to ensure systemd is informed about the status of the PostgreSQL service. If using sudo for the systemctl calls, make sure the sudo specification doesn't require a real tty for the user. If not set this way, repmgr will fail to stop the primary. See the documentation section for further details. Check that access from applications is minimalized or preferably blocked completely, so applications are not unexpectedly interrupted. If an exclusive backup is running on the current primary, or if WAL replay is paused on the standby, &repmgr; will not perform the switchover. Check there is no significant replication lag on standbys attached to the current primary. If WAL file archiving is set up, check that there is no backlog of files waiting to be archived, as PostgreSQL will not finally shut down until all of these have been archived. If there is a backlog exceeding archive_ready_warning WAL files, &repmgr; will emit a warning before attempting to perform a switchover; you can also check manually with repmgr node check --archive-ready. From repmgr 4.2, &repmgr; will instruct any running &repmgrd; instances to pause operations while the switchover is being carried out, to prevent &repmgrd; from unintentionally promoting a node. For more details, see . Users of &repmgr; versions prior to 4.2 should ensure that &repmgrd; is not running on any nodes while a switchover is being executed. Finally, consider executing repmgr standby switchover with the --dry-run option; this will perform any necessary checks and inform you about success/failure, and stop before the first actual command is run (which would be the shutdown of the current primary). Example output: $ repmgr standby switchover -f /etc/repmgr.conf --siblings-follow --dry-run NOTICE: checking switchover on node "node2" (ID: 2) in --dry-run mode INFO: SSH connection to host "node1" succeeded INFO: archive mode is "off" INFO: replication lag on this standby is 0 seconds INFO: all sibling nodes are reachable via SSH NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby INFO: following shutdown command would be run on node "node1": "pg_ctl -l /var/log/postgresql/startup.log -D '/var/lib/postgresql/data' -m fast -W stop" INFO: parameter "shutdown_check_timeout" is set to 60 seconds Be aware that checks the prerequisites for performing the switchover and some basic sanity checks on the state of the database which might effect the switchover operation (e.g. replication lag); it cannot however guarantee the switchover operation will succeed. In particular, if the current primary does not shut down cleanly, &repmgr; will not be able to reliably execute the switchover (as there would be a danger of divergence between the former and new primary nodes). See for a full list of available command line options and repmgr.conf settings relevant to performing a switchover. Switchover and pg_rewind pg_rewind using with "repmgr standby switchover" If the demotion candidate does not shut down smoothly or cleanly, there's a risk it will have a slightly divergent timeline and will not be able to attach to the new primary. To fix this situation without needing to reclone the old primary, it's possible to use the pg_rewind utility, which will usually be able to resync the two servers. To have &repmgr; execute pg_rewind if it detects this situation after promoting the new primary, add the option. If &repmgr; detects a situation where it needs to execute pg_rewind, it will execute a CHECKPOINT on the new primary before executing pg_rewind. For more details on pg_rewind, see: https://www.postgresql.org/docs/current/app-pgrewind.html. pg_rewind has been part of the core PostgreSQL distribution since version 9.5. Users of PostgreSQL 9.4 will need to manually install it; the source code is available here: https://github.com/vmware/pg_rewind. If the pg_rewind binary is not installed in the PostgreSQL bin directory, provide its full path on the demotion candidate with . Note that building the 9.4 version of pg_rewind requires the PostgreSQL source code. Executing the switchover command switchover execution To demonstrate switchover, we will assume a replication cluster with a primary (node1) and one standby (node2); after the switchover node2 should become the primary with node1 following it. The switchover command must be run from the standby which is to be promoted, and in its simplest form looks like this: $ repmgr -f /etc/repmgr.conf standby switchover NOTICE: executing switchover on node "node2" (ID: 2) INFO: searching for primary node INFO: checking if node 1 is primary INFO: current primary node is 1 INFO: SSH connection to host "node1" succeeded INFO: archive mode is "off" INFO: replication lag on this standby is 0 seconds NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby NOTICE: stopping current primary node "node1" (ID: 1) NOTICE: issuing CHECKPOINT DETAIL: executing server command "pg_ctl -l /var/log/postgres/startup.log -D '/var/lib/pgsql/data' -m fast -W stop" INFO: checking primary status; 1 of 6 attempts NOTICE: current primary has been cleanly shut down at location 0/3001460 NOTICE: promoting standby to primary DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' promote" server promoting NOTICE: STANDBY PROMOTE successful DETAIL: server "node2" (ID: 2) was successfully promoted to primary INFO: setting node 1's primary to node 2 NOTICE: starting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' restart" NOTICE: NODE REJOIN successful DETAIL: node 1 is now attached to node 2 NOTICE: switchover was successful DETAIL: node "node2" is now primary NOTICE: STANDBY SWITCHOVER is complete The old primary is now replicating as a standby from the new primary, and the cluster status will now look like this: $ repmgr -f /etc/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Connection string ----+-------+---------+-----------+----------+----------+-------------------------------------- 1 | node1 | standby | running | node2 | default | host=node1 dbname=repmgr user=repmgr 2 | node2 | primary | * running | | default | host=node2 dbname=repmgr user=repmgr If &repmgrd; is in use, it's worth double-checking that all nodes are unpaused by executing repmgr service status (&repmgr; 4.2 - 4.4: repmgr daemon status). Users of &repmgr; versions prior to 4.2 will need to manually restart &repmgrd; on all nodes after the switchover is completed. Caveats switchover caveats If using PostgreSQL 9.4, you should ensure that the shutdown command is configured to use PostgreSQL's fast shutdown mode (the default in 9.5 and later). If relying on pg_ctl to perform database server operations, you should include -m fast in pg_ctl_options in repmgr.conf. pg_rewind *requires* that either wal_log_hints is enabled, or that data checksums were enabled when the cluster was initialized. See the pg_rewind documentation for details. Troubleshooting switchover issues switchover troubleshooting As emphasised previously, performing a switchover is a non-trivial operation and there are a number of potential issues which can occur. While &repmgr; attempts to perform sanity checks, there's no guaranteed way of determining the success of a switchover without actually carrying it out. Demotion candidate (old primary) does not shut down &repmgr; may abort a switchover with a message like: ERROR: shutdown of the primary server could not be confirmed HINT: check the primary server status before performing any further actions This means the shutdown of the old primary has taken longer than &repmgr; expected, and it has given up waiting. In this case, check the PostgreSQL log on the primary server to see what is going on. It's entirely possible the shutdown process is just taking longer than the timeout set by the configuration parameter shutdown_check_timeout (default: 60 seconds), in which case you may need to adjust this parameter. Note that shutdown_check_timeout is set on the node where repmgr standby switchover is executed (promotion candidate); setting it on the demotion candidate (former primary) will have no effect. If the primary server has shut down cleanly, and no other node has been promoted, it is safe to restart it, in which case the replication cluster will be restored to its original configuration. Switchover aborts with an "exclusive backup" error &repmgr; may abort a switchover with a message like: ERROR: unable to perform a switchover while primary server is in exclusive backup mode HINT: stop backup before attempting the switchover This means an exclusive backup is running on the current primary; interrupting this will not only abort the backup, but potentially leave the primary with an ambiguous backup state. To proceed, either wait until the backup has finished, or cancel it with the command SELECT pg_stop_backup(). For more details see the PostgreSQL documentation section Making an exclusive low level backup.