Preparing for switchover
+
- As mentioned above, success of the switchover operation depends on &repmgr;
- being able to shut down the current primary server quickly and cleanly.
+ As mentioned in the previous section, the success of the switchover operation depends on
+ &repmgr; being able to shut down the current primary server quickly and cleanly.
+
Double-check which commands will be used to stop/start/restart the current
primary; on the primary execute:
repmgr -f /etc/repmgr.conf node service --list --action=stop
repmgr -f /etc/repmgr.conf node service --list --action=start
- repmgr -f /etc/repmgr.conf node service --list --action=restart
-
+ repmgr -f /etc/repmgr.conf node service --list --action=restart
+
+
+ These commands can be defined in repmgr.conf with
+ service_start_command, service_stop_command
+ and service_restart_command.
+
+
+
+
+ If &repmgr; is installed from a package, you should set these commands
+ to use the appropriate service commands defined by the package/operating
+ system, as these will ensure PostgreSQL is stopped/started properly,
+ taking into account configuration and log file locations etc.
+
+
+ If the options aren't defined, &repmgr; will
+ fall back to using pg_ctl to stop/start/restart
+ PostgreSQL, which may not work properly.
+
+
+
On systemd systems we strongly recommend using the appropriate
systemctl commands (typically run via sudo) to ensure
- systemd informed about the status of the PostgreSQL service.
+ systemd is informed about the status of the PostgreSQL service.
If using sudo for the systemctl calls, make sure the
@@ -79,25 +101,30 @@
this way, repmgr will fail to stop the primary.
+
Check that access from applications is minimalized or preferably blocked
completely, so applications are not unexpectedly interrupted.
+
Check there is no significant replication lag on standbys attached to the
current primary.
+
If WAL file archiving is set up, check that there is no backlog of files waiting
- to be archived, as PostgreSQL will not finally shut down until all these have been
+ to be archived, as PostgreSQL will not finally shut down until all of these have been
archived. If there is a backlog exceeding archive_ready_warning WAL files,
&repmgr; will emit a warning before attempting to perform a switchover; you can also check
manually with repmgr node check --archive-ready.
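As a rough manual cross-check of the archive backlog, the sketch below counts the `.ready` marker files PostgreSQL leaves in the `archive_status` directory for WAL segments not yet archived. It assumes PostgreSQL 10 or later, where the WAL directory is named `pg_wal`; the function name `count_ready_wal` and the example data directory are illustrative only and are not part of &repmgr;.

```shell
#!/bin/sh
# Hypothetical helper: print the number of WAL segments still waiting
# to be archived, given the path to a PostgreSQL data directory.
# Assumes the PostgreSQL 10+ layout (<data_dir>/pg_wal/archive_status).
count_ready_wal() {
    status_dir="$1/pg_wal/archive_status"
    # Each segment awaiting archiving has a matching "<segment>.ready" file.
    # A missing directory is reported as 0.
    find "$status_dir" -maxdepth 1 -type f -name '*.ready' 2>/dev/null \
        | wc -l | tr -d ' '
}
```

For example, `count_ready_wal /var/lib/postgresql/data` run as a user with read access to the data directory prints the backlog size, which can then be compared against archive_ready_warning.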
+
Ensure that repmgrd is *not* running anywhere to prevent it unintentionally
promoting a node.
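One simple way to confirm this on a given node is to look for a running repmgrd process. The sketch below uses `pgrep`, which is assumed to be available; the helper name `process_running` is illustrative. It only inspects the local machine, so it would need to be run on every node in the cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if a process with exactly
# the given name is running on this machine.
process_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

if process_running repmgrd; then
    echo "repmgrd is running on this node; stop it before the switchover"
else
    echo "no repmgrd process found on this node"
fi
```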
+
Finally, consider executing repmgr standby switchover with the
--dry-run option; this will perform any necessary checks and inform you about
@@ -115,6 +142,48 @@
"pg_ctl -l /var/log/postgresql/startup.log -D '/var/lib/postgresql/data' -m fast -W stop"
+
+
+
+ Be aware that --dry-run checks the prerequisites
+ for performing the switchover and some basic sanity checks on the
+ state of the database which might affect the switchover operation
+ (e.g. replication lag); it cannot, however, guarantee the switchover
+ operation will succeed. In particular, if the current primary
+ does not shut down cleanly, &repmgr; will not be able to reliably
+ execute the switchover (as there would be a danger of divergence
+ between the former and new primary nodes).
+
+
+
+
+ Note that the following parameters in repmgr.conf are relevant to the
+ switchover operation:
+
+
+
+ reconnect_attempts: number of times to check the original primary
+ for a clean shutdown after executing the shutdown command, before aborting
+
+
+
+
+ reconnect_interval: interval (in seconds) to check the original
+ primary for a clean shutdown after executing the shutdown command (up to a maximum
+ of reconnect_attempts tries)
+
+
+
+
+ replication_lag_critical:
+ if replication lag (in seconds) on the standby exceeds this value, the
+ switchover will be aborted (unless the -F/--force option
+ is provided)
+
+
+
+
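To get a feel for how the first two parameters interact, the following sketch estimates how long &repmgr; would keep checking for a clean shutdown before aborting. The values shown are illustrative assumptions, not recommendations; substitute the settings from your own repmgr.conf. The result is only an approximation, since each check takes some time in addition to the interval.

```shell
#!/bin/sh
# Illustrative values only; use the settings from your own repmgr.conf.
reconnect_attempts=6
reconnect_interval=10   # seconds

# Approximate upper bound on the time spent waiting for the original
# primary to shut down: number of checks multiplied by the interval.
max_wait=$((reconnect_attempts * reconnect_interval))
echo "shutdown checks will continue for roughly ${max_wait} seconds before aborting"
```

If the primary routinely needs longer than this to shut down (for example because of a large checkpoint), raising one of the two values gives it more time.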
+