diff --git a/doc/filelist.sgml b/doc/filelist.sgml
index 1e240de6..d3f8b5a0 100644
--- a/doc/filelist.sgml
+++ b/doc/filelist.sgml
@@ -58,7 +58,6 @@
-
diff --git a/doc/repmgr.sgml b/doc/repmgr.sgml
index 0f2b4888..39292d98 100644
--- a/doc/repmgr.sgml
+++ b/doc/repmgr.sgml
@@ -88,7 +88,6 @@
&repmgrd-witness-server;
&repmgrd-degraded-monitoring;
&repmgrd-monitoring;
- &repmgrd-notes;
&repmgrd-bdr;
diff --git a/doc/repmgrd-notes.sgml b/doc/repmgrd-notes.sgml
deleted file mode 100644
index 31910758..00000000
--- a/doc/repmgrd-notes.sgml
+++ /dev/null
@@ -1,38 +0,0 @@
-
-
-
- repmgrd
- notes
-
- repmgrd notes
-
-
-
- repmgrd
- paused WAL replay
-
-
- repmgrd and paused WAL replay
-
- If WAL replay has been paused (using pg_wal_replay_pause(),
- on PostgreSQL 9.6 and earlier pg_xlog_replay_pause()),
- in a failover situation repmgrd will
- automatically resume WAL replay.
-
-
- This is because if WAL replay is paused, but WAL is pending replay,
- PostgreSQL cannot be promoted until WAL replay is resumed.
-
-
-
- repmgr standby promote
- will refuse to promote a node in this state, as the PostgreSQL
- promote command will not be acted on until
- WAL replay is resumed, leaving the cluster in a potentially
- unstable state. In this case it is up to the user to
- decide whether to resume WAL replay.
-
-
-
-
-
diff --git a/doc/repmgrd-operation.sgml b/doc/repmgrd-operation.sgml
new file mode 100644
index 00000000..29a029b6
--- /dev/null
+++ b/doc/repmgrd-operation.sgml
@@ -0,0 +1,216 @@
+
+
+ repmgrd
+ operation
+
+
+ repmgrd operation
+
+
+
+
+
+ repmgrd
+ pausing
+
+
+
+ pausing repmgrd
+
+
+ Pausing repmgrd
+
+
+ In normal operation, repmgrd monitors the state of the
+ PostgreSQL node it is running on, and will take appropriate action if problems
+ are detected, e.g. (if so configured) promote the node to primary if the existing
+ primary has been determined to have failed.
+
+
+
+ However, repmgrd is unable to distinguish between
+ planned outages (such as performing a switchover
+ or installing a PostgreSQL maintenance release) and an actual server outage. In versions prior to
+ &repmgr; 4.2 it was necessary to stop repmgrd on all nodes (or at least
+ on all nodes where repmgrd is
+ configured for automatic failover)
+ to prevent repmgrd from making unintentional changes to the
+ replication cluster.
+
+
+
+ From &repmgr; 4.2, repmgrd
+ can be "paused", i.e. instructed not to take any action such as performing a failover.
+ This can be done from any node in the cluster, removing the need to stop/restart
+ each repmgrd individually.
+
+
+
+
+ For major PostgreSQL upgrades, e.g. from PostgreSQL 10 to PostgreSQL 11,
+ repmgrd should be shut down completely and only started up
+ once the &repmgr; packages for the new PostgreSQL major version have been installed.
+
+
+
+
+ Prerequisites for pausing repmgrd
+
+ To be able to pause/unpause repmgrd, the following
+ prerequisites must be met:
+
+
+
+ &repmgr; 4.2 or later must be installed on all nodes.
+
+
+
+ The same major &repmgr; version (e.g. 4.2) must be installed on all nodes (and preferably the same minor version).
+
+
+
+
+ PostgreSQL on all nodes must be accessible from the node where the
+ pause/unpause operation is executed, using the
+ conninfo string shown by repmgr cluster show.
+
+
+
+
+
+
+ These conditions are required for normal &repmgr; operation in any case.
+
+
+
+
+
+
+ Pausing/unpausing repmgrd
+
+ To pause repmgrd, execute repmgr daemon pause, e.g.:
+
+$ repmgr -f /etc/repmgr.conf daemon pause
+NOTICE: node 1 (node1) paused
+NOTICE: node 2 (node2) paused
+NOTICE: node 3 (node3) paused
+
+
+ The state of repmgrd on each node can be checked with
+ repmgr daemon status, e.g.:
+ $ repmgr -f /etc/repmgr.conf daemon status
+ ID | Name | Role | Status | repmgrd | PID | Paused?
+----+-------+---------+---------+---------+------+---------
+ 1 | node1 | primary | running | running | 7851 | yes
+ 2 | node2 | standby | running | running | 7889 | yes
+ 3 | node3 | standby | running | running | 7918 | yes
+
+
+
+
+ If executing a switchover with repmgr standby switchover,
+ &repmgr; will automatically pause/unpause repmgrd as part of the switchover process.
+
+
+
+
+ If the primary (in this example, node1) is stopped while repmgrd is paused, repmgrd
+ running on one of the standbys (here: node2) will react like this:
+
+[2018-09-20 12:22:21] [WARNING] unable to connect to upstream node "node1" (node ID: 1)
+[2018-09-20 12:22:21] [INFO] checking state of node 1, 1 of 5 attempts
+[2018-09-20 12:22:21] [INFO] sleeping 1 seconds until next reconnection attempt
+...
+[2018-09-20 12:22:24] [INFO] sleeping 1 seconds until next reconnection attempt
+[2018-09-20 12:22:25] [INFO] checking state of node 1, 5 of 5 attempts
+[2018-09-20 12:22:25] [WARNING] unable to reconnect to node 1 after 5 attempts
+[2018-09-20 12:22:25] [NOTICE] node is paused
+[2018-09-20 12:22:33] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in degraded state
+[2018-09-20 12:22:33] [DETAIL] repmgrd paused by administrator
+[2018-09-20 12:22:33] [HINT] execute "repmgr daemon unpause" to resume normal failover mode
+
+
+ If the primary becomes available again (e.g. following a software upgrade), repmgrd
+ will automatically reconnect, e.g.:
+
+[2018-09-20 13:12:41] [NOTICE] reconnected to upstream node 1 after 8 seconds, resuming monitoring
+
+
+
+ To unpause repmgrd, execute repmgr daemon unpause, e.g.:
+
+$ repmgr -f /etc/repmgr.conf daemon unpause
+NOTICE: node 1 (node1) unpaused
+NOTICE: node 2 (node2) unpaused
+NOTICE: node 3 (node3) unpaused
+
+
+
+
+ If the previous primary is no longer accessible when repmgrd
+ is unpaused, no failover action will be taken. Instead, a new primary must be manually promoted using
+ repmgr standby promote,
+ and any standbys must be attached to the new primary with
+ repmgr standby follow.
+
+
+ This is to prevent repmgr daemon unpause from
+ resulting in the automatic promotion of a new primary, which may be a problem particularly
+ in larger clusters, where repmgrd could select a different promotion
+ candidate from the one intended by the administrator.
+
+
+
+
+ Details on the repmgrd pausing mechanism
+
+
+ The pause state of each node is preserved across a PostgreSQL restart.
+
+
+
+ repmgr daemon pause and
+ repmgr daemon unpause can be
+ executed even if repmgrd is not running; in this case,
+ repmgrd will start up in whichever pause state has been set.
+
+
+
+ repmgr daemon pause and
+ repmgr daemon unpause
+ do not stop/start repmgrd.
+
+
+
+
+
+
+
+ repmgrd
+ paused WAL replay
+
+
+ repmgrd and paused WAL replay
+
+ If WAL replay has been paused (using pg_wal_replay_pause(),
+ or, on PostgreSQL 9.6 and earlier, pg_xlog_replay_pause()),
+ in a failover situation repmgrd will
+ automatically resume WAL replay.
+
+
+ This is because if WAL replay is paused, but WAL is pending replay,
+ PostgreSQL cannot be promoted until WAL replay is resumed.
+
+
+
+ repmgr standby promote
+ will refuse to promote a node in this state, as the PostgreSQL
+ promote command will not be acted on until
+ WAL replay is resumed, leaving the cluster in a potentially
+ unstable state. In this case it is up to the user to
+ decide whether to resume WAL replay.
+
+
+
+
+