doc: emphasise that BDR2 support is for BDR2 only

doc: add a link to the current documentation from the contents page
doc: clarify "cluster show" error codes
2026-03-23 15:16:29 +00:00 · 2019-04-05 11:16:14 +09:00 · 2019-04-03 10:45:26 +09:00 · 2019-03-18 10:51:04 +09:00 · 2019-03-15 15:08:19 +09:00 · 2019-03-15 14:02:59 +09:00
130 changed files with 25799 additions and 6883 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -39,6 +39,10 @@ lib*.pc
 # test output
 /results/
 /regression.diffs
 /regression.out
 /doc/Makefile
 # other
 /.lineno
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -2,7 +2,7 @@ License and Contributions
 =========================
 `repmgr` is licensed under the GPL v3.  All of its code and documentation is
-Copyright 2010-2017, 2ndQuadrant Limited.  See the files COPYRIGHT and LICENSE for
+Copyright 2010-2018, 2ndQuadrant Limited.  See the files COPYRIGHT and LICENSE for
 details.
 The development of repmgr has primarily been sponsored by 2ndQuadrant customers.
@@ -28,4 +28,3 @@ project. For more details see:
 Contributors should reformat their code similarly before submitting code to
 the project, in order to minimize merge conflicts with other work.
 >>>>>>> Add further documentation files
--- a/COPYRIGHT
+++ b/COPYRIGHT
@@ -1,4 +1,4 @@
-Copyright (c) 2010-2017, 2ndQuadrant Limited
+Copyright (c) 2010-2018, 2ndQuadrant Limited
 All rights reserved.
 This program is free software: you can redistribute it and/or modify
@@ -12,5 +12,5 @@ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.
 You should have received a copy of the GNU General Public License
-along with this program.  If not, see http://www.gnu.org/licenses/
+along with this program.  If not, see https://www.gnu.org/licenses/
 to obtain one.
--- a/FAQ.md
+++ b/FAQ.md
@@ -0,0 +1,10 @@
 FAQ - Frequently Asked Questions about repmgr
 =============================================
 The repmgr 4 FAQ is located here: [repmgr FAQ (Frequently Asked Questions)](https://repmgr.org/docs/current/appendix-faq.html "repmgr FAQ")
 The repmgr 3.x FAQ can be found here:
    https://github.com/2ndQuadrant/repmgr/blob/REL3_3_STABLE/FAQ.md
 Note that repmgr 3.x is no longer supported.
--- a/HISTORY
+++ b/HISTORY
@@ -1,2 +1,446 @@
-4.0     2017-09
+4.2     2018-10-24
        repmgr: add parameter "shutdown_check_timeout" for use by "standby switchover";
          GitHub #504 (Ian)
        repmgr: add "--node-id" option to "repmgr cluster cleanup"; GitHub #493 (Ian)
        repmgr: report unreachable nodes when running "repmgr cluster (matrix|crosscheck)";
          GitHub #246 (Ian)
        repmgr: add configuration file parameter "repmgr_bindir"; GitHub #246 (Ian)
        repmgr: fix "Missing replication slots" label in "node check"; GitHub #507 (Ian)
        repmgrd: fix parsing of -d/--daemonize option (Ian)
        repmgrd: support "pausing" of repmgrd (Ian)
 4.1.1   2018-09-05
        logging: explicitly log the text of failed queries as ERRORs to
          assist logfile analysis; GitHub #498
        repmgr: truncate version string, if necessary; GitHub #490 (Ian)
        repmgr: improve messages emitted during "standby promote" (Ian)
        repmgr: "standby clone" - don't copy external config files in --dry-run
          mode; GitHub #491 (Ian)
        repmgr: add "cluster_cleanup" event; GitHub #492 (Ian)
        repmgr: (standby switchover) improve detection of free walsenders;
          GitHub #495 (Ian)
        repmgr: (node rejoin) improve replication slot handling; GitHub #499 (Ian)
        repmgrd: ensure that sending SIGHUP always results in the log file
          being reopened; GitHub #485 (Ian)
        repmgrd: report version number *after* logger initialisation; GitHub #487 (Ian)
        repmgrd: fix startup on witness node when local data is stale; GitHub #488/#489 (Ian)
        repmgrd: improve cascaded standby failover handling; GitHub #480 (Ian)
        repmgrd: improve reconnection handling (Ian)
 4.1.0   2018-07-31
        repmgr: change default log_level to INFO, add documentation; GitHub #470 (Ian)
        repmgr: add "--missing-slots" check to "repmgr node check" (Ian)
        repmgr: improve command line error handling; GitHub #464 (Ian)
        repmgr: fix "standby register --wait-sync" when no timeout provided (Ian)
        repmgr: "cluster show" returns non-zero value if an issue encountered;
          GitHub #456 (Ian)
        repmgr: "node check" and "node status" return non-zero value if an issue
           encountered (Ian)
        repmgr: add CSV output mode to "cluster event"; GitHub #471 (Ian)
        repmgr: add -q/--quiet option to suppress non-error output; GitHub #468 (Ian)
        repmgr: "node status" returns non-zero value if an issue encountered (Ian)
        repmgr: enable "recovery_min_apply_delay" to be 0; GitHub #448 (Ian)
        repmgr: "cluster cleanup" - add missing help options; GitHub #461/#462 (gclough)
        repmgr: ensure witness node follows new primary after switchover;
          GitHub #453 (Ian)
        repmgr: fix witness node handling in "node check"/"node status";
          GitHub #451 (Ian)
        repmgr: fix "primary_slot_name" when using "standby clone" with --recovery-conf-only;
          GitHub #474 (Ian)
        repmgr: don't perform a switchover if an exclusive backup is running;
          GitHub #476 (Martín)
        repmgr: enable "witness unregister" to be run on any node; GitHub #472 (Ian)
        repmgrd: create a PID file by default; GitHub #457 (Ian)
        repmgrd: daemonize process by default; GitHub #458 (Ian)
 4.0.6   2018-06-14
        repmgr: (witness register) prevent registration of a witness server with the
          same name as an existing node (Ian)
        repmgr: (standby follow) check node has actually connected to new primary
          before reporting success; GitHub #444 (Ian)
        repmgr: (standby clone) improve handling of external configuration file copying,
          including consideration in --dry-run check; GitHub #443 (Ian)
        repmgr: (standby clone) don't require presence of "user" parameter in
          conninfo string; GitHub #437 (Ian)
        repmgr: (standby clone) improve documentation of --recovery-conf-only
          mode; GitHub #438 (Ian)
        repmgr: (node rejoin) fix bug when parsing --config-files parameter;
          GitHub #442 (Ian)
        repmgr: when using --dry-run, force log level to INFO to ensure output
          will always be displayed; GitHub #441 (Ian)
        repmgr: (cluster matrix/crosscheck) return non-zero exit code if node
           connection issues detected; GitHub #447 (Ian)
        repmgrd: ensure local node is counted as quorum member; GitHub #439 (Ian)
 4.0.5   2018-05-02
        repmgr: poll demoted primary after restart as a standby during a
          switchover operation; GitHub #408 (Ian)
        repmgr: add configuration parameter "config_directory"; GitHub #424 (Ian)
        repmgr: add "dbname=replication" to all replication connection strings;
          GitHub #421 (Ian)
        repmgr: add sanity check if --upstream-node-id not supplied when executing
          "standby register"; GitHub #395 (Ian)
        repmgr: enable provision of "archive_cleanup_command" in recovery.conf;
          GitHub #416 (Ian)
        repmgr: actively check for node to rejoin cluster; GitHub #415 (Ian)
        repmgr: enable pg_rewind to be used with PostgreSQL 9.3/9.4; GitHub #413 (Ian)
        repmgr: fix minimum accepted value for "degraded_monitoring_timeout";
          GitHub #411 (Ian)
        repmgr: fix superuser password handling; GitHub #400 (Ian)
        repmgr: fix parsing of "archive_ready_critical" configuration file
          parameter; GitHub #426 (Ian)
        repmgr: fix display of conninfo parsing error messages (Ian)
        repmgr: fix "repmgr cluster crosscheck" output; GitHub #389 (Ian)
        repmgrd: prevent standby connection handle from going stale (Ian)
        repmgrd: fix memory leaks in witness code; GitHub #402 (AndrzejNowicki, Martín)
        repmgrd: handle "pg_ctl promote" timeout; GitHub #425 (Ian)
        repmgrd: handle failover situation with only two nodes in the primary
          location, and at least one node in another location; GitHub #407 (Ian)
        repmgrd: set "connect_timeout=2" when pinging a server (Ian)
 4.0.4   2018-03-09
        repmgr: add "standby clone --recovery-conf-only" option; GitHub #382 (Ian)
        repmgr: make "standby promote" timeout values configurable; GitHub #387 (Ian)
        repmgr: improve replication slot warnings generated by "node status";
          GitHub #385 (Ian)
        repmgr: remove restriction on replication slots when cloning from
          a Barman server; GitHub #379 (Ian)
        repmgr: ensure "node rejoin" honours "--dry-run" option; GitHub #383 (Ian)
        repmgr: fix --superuser handling when cloning a standby; GitHub #380 (Ian)
        repmgr: update various help options; GitHub #391, #392 (hasegeli)
        repmgrd: add event "repmgrd_shutdown"; GitHub #393 (Ian)
        repmgrd: improve detection of status change from primary to standby (Ian)
        repmgrd: improve log output in various situations (Ian)
        repmgrd: improve reconnection to the local node after a failover (Ian)
        repmgrd: ensure witness server connects to new primary after a failover (Ian)
 4.0.3   2018-02-15
        repmgr: improve switchover handling when "pg_ctl" used to control the
          server and logging output is not explicitly redirected (Ian)
        repmgr: improve switchover log messages and exit code when old primary could
          not be shut down cleanly (Ian)
        repmgr: check demotion candidate can make a replication connection to the
          promotion candidate before executing a switchover; GitHub #370 (Ian)
        repmgr: add check for sufficient walsenders/replication slots before executing
          a switchover; GitHub #371 (Ian)
        repmgr: add --dry-run mode to "repmgr standby follow"; GitHub #368 (Ian)
        repmgr: provide information about the primary node for "standby_register" and
          "standby_follow" event notifications; GitHub #375 (Ian)
        repmgr: add "standby_register_sync" event notification; GitHub #374 (Ian)
        repmgr: output any connection error messages in "cluster show"'s list of
          warnings; GitHub #369 (Ian)
        repmgr: ensure an inactive data directory can be deleted; GitHub #366 (Ian)
        repmgr: fix upstream node display in "repmgr node status"; GitHub #363 (fanf2)
        repmgr: improve/clarify documentation and update --help output for
          "primary unregister"; GitHub #373 (Ian)
        repmgr: allow replication slots when Barman is configured; GitHub #379 (Ian)
        repmgr: fix parsing of "pg_basebackup_options"; GitHub #376 (Ian)
        repmgr: ensure "pg_subtrans" directory is created when cloning a standby in
          Barman mode (Ian)
        repmgr: fix primary node check in "witness register"; GitHub #377 (Ian)
 4.0.2   2018-01-18
        repmgr: add missing -W option to getopt_long() invocation; GitHub #350 (Ian)
        repmgr: automatically create slot name if missing; GitHub #343 (Ian)
        repmgr: fixes to parsing output of remote repmgr invocations; GitHub #349 (Ian)
        repmgr: BDR support - create missing connection replication set
          if required; GitHub #347 (Ian)
        repmgr: handle missing node record in "repmgr node rejoin"; GitHub #358 (Ian)
        repmgr: enable documentation to be built as a single HTML file; GitHub #353 (fanf2)
        repmgr: recognize "--terse" option for "repmgr cluster event"; GitHub #360 (Ian)
        repmgr: add "--wait-start" option for "repmgr standby register"; GitHub #356 (Ian)
        repmgr: add "%p" event notification parameter for "repmgr standby switchover"
          containing the node ID of the demoted primary (Ian)
        docs: various fixes and updates (Ian, Daymel, Martín, ams)
 4.0.1   2017-12-13
        repmgr: ensure "repmgr node check --action=" returns appropriate return
          code; GitHub #340 (Ian)
        repmgr: add missing schema qualification in get_all_node_records_with_upstream()
          query GitHub #341 (Martín)
        repmgr: initialise "voting_term" table in application, not extension SQL;
          GitHub #344 (Ian)
        repmgr: delete any replication slots copied by pg_rewind; GitHub #334 (Ian)
        repmgr: fix configuration file sanity check; GitHub #342 (Ian)
 4.0.0   2017-11-21
        Complete rewrite with many changes; for details see the repmgr 4.0.0 release
        notes at: https://repmgr.org/docs/4.0/release-4.0.0.html
 3.3.2   2017-06-01
        Add support for PostgreSQL 10 (Ian)
        repmgr: ensure --replication-user option is honoured when passing database
          connection parameters as a conninfo string (Ian)
        repmgr: improve detection of pg_rewind on remote server (Ian)
        repmgr: add DETAIL log output for additional clarification of error messages (Ian)
        repmgr: suppress various spurious error messages in `standby follow` and
          `standby switchover` (Ian)
        repmgr: add missing `-P` option (Ian)
        repmgrd: monitoring statistic reporting fixes (Ian)
 3.3.1   2017-03-13
        repmgrd: prevent invalid apply lag value being written to the
          monitoring table (Ian)
        repmgrd: fix error in XLogRecPtr conversion when calculating
          monitoring statistics (Ian)
        repmgr: if replication slots in use, where possible delete slot on old
          upstream node after following new upstream (Ian)
        repmgr: improve logging of rsync actions (Ian)
        repmgr: improve `standby clone` when synchronous replication in use (Ian)
        repmgr: stricter checking of allowed node id values
        repmgr: enable `master register --force` when there is a foreign key
          dependency from a standby node (Ian)
 3.3     2016-12-27
        repmgr: always log to STDERR even if log facility defined (Ian)
        repmgr: add --log-to-file to log repmgr output to the defined
          log facility (Ian)
        repmgr: improve handling of command line parameter errors (Ian)
        repmgr: add option --upstream-conninfo to explicitly set
          'primary_conninfo' in recovery.conf (Ian)
        repmgr: enable a standby to be registered which isn't running (Ian)
        repmgr: enable `standby register --force` to update a node record
          with cascaded downstream node records (Ian)
        repmgr: add option `--no-conninfo-password` (Abhijit, Ian)
        repmgr: add initial support for PostgreSQL 10.0 (Ian)
        repmgr: escape values in primary_conninfo if needed (Ian)
 3.2.1   2016-10-24
        repmgr: require a valid repmgr cluster name unless -F/--force
          supplied (Ian)
        repmgr: check master server is registered with repmgr before
          cloning (Ian)
        repmgr: ensure data directory defaults to that of the source node (Ian)
        repmgr: various fixes to Barman cloning mode (Gianni, Ian)
        repmgr: fix `repmgr cluster crosscheck` output (Ian)
 3.2     2016-10-05
        repmgr: add support for cloning from a Barman backup (Gianni)
        repmgr: add commands `cluster matrix` and `cluster crosscheck` (Gianni)
        repmgr: suppress connection error display in `repmgr cluster show`
          unless `--verbose` supplied (Ian)
        repmgr: add commands `witness register` and `witness unregister` (Ian)
        repmgr: enable `standby unregister` / `witness unregister` to be
          executed for a node which is not running (Ian)
        repmgr: remove deprecated command line options --initdb-no-pwprompt and
           -l/--local-port (Ian)
        repmgr: before cloning with pg_basebackup, check that sufficient free
           walsenders are available (Ian)
        repmgr: add option `--wait-sync` for `standby register` which causes
           repmgr to wait for the registered node record to synchronise to
           the standby (Ian)
        repmgr: add option `--copy-external-config-files` for files outside
           of the data directory (Ian)
        repmgr: only require `wal_keep_segments` to be set in certain corner
           cases (Ian)
        repmgr: better support cloning from a node other than the one to
           stream from (Ian)
        repmgrd: add configuration options to override the default pg_ctl
           commands (Jarkko Oranen, Ian)
        repmgrd: don't start if node is inactive and failover=automatic (Ian)
        packaging: improve "repmgr-auto" Debian package (Gianni)
 3.1.5   2016-08-15
        repmgrd: in a failover situation, prevent endless looping when
          attempting to establish the status of a node with
          `failover=manual` (Ian)
        repmgrd: improve handling of failover events on standbys with
          `failover=manual`, and create a new event notification
          for this, `standby_disconnect_manual` (Ian)
        repmgr: add further event notifications (Gianni)
        repmgr: when executing `standby switchover`, don't collect remote
          command output unless required (Gianni, Ian)
        repmgrd: improve standby monitoring query (Ian, based on suggestion
          from  Álvaro)
        repmgr: various command line handling improvements (Ian)
 3.1.4   2016-07-12
        repmgr: new configuration option for setting "restore_command"
          in the recovery.conf file generated by repmgr (Martín)
        repmgr: add --csv option to "repmgr cluster show" (Gianni)
        repmgr: enable provision of a conninfo string as the -d/--dbname
          parameter, similar to other PostgreSQL utilities (Ian)
        repmgr: during switchover operations improve detection of
          demotion candidate shutdown (Ian)
        various bugfixes and documentation updates (Ian, Martín)
 3.1.3   2016-05-17
        repmgrd: enable monitoring when a standby is catching up by
          replaying archived WAL (Ian)
        repmgrd: when upstream_node_id is NULL, assume upstream node
          to be current master (Ian)
        repmgrd: check for reappearance of the master node if standby
          promotion fails (Ian)
        improve handling of rsync failure conditions (Martín)
 3.1.2   2016-04-12
        Fix pg_ctl path generation in do_standby_switchover() (Ian)
        Regularly sync witness server repl_nodes table (Ian)
        Documentation improvements (Gianni, dhyannataraj)
        (Experimental) ensure repmgr handles failover slots when copying
          in rsync mode (Craig, Ian)
        rsync mode handling fixes (Martín)
        Enable repmgr to compile against 9.6devel (Ian)
 3.1.1   2016-02-24
        Add '-P/--pwprompt' option for "repmgr create witness" (Ian)
        Prevent repmgr/repmgrd running as root (Ian)
 3.1.0   2016-02-01
        Add "repmgr standby switchover" command (Ian)
        Revised README file (Ian)
        Remove requirement for 'archive_mode' to be enabled (Ian)
        Improve -?/--help output, showing default values if relevant (Ian)
        Various bugfixes to command line/configuration parameter handling (Ian)
 3.0.3   2016-01-04
        Create replication slot if required before base backup is run (Abhijit)
        standby clone: when using rsync, clean up "pg_replslot" directory (Ian)
        Improve --help output (Ian)
        Improve config file parsing (Ian)
        Various logging output improvements, including explicit HINTS (Ian)
        Add --log-level to explicitly set log level on command line (Ian)
        Repurpose --verbose to display extra log output (Ian)
        Add --terse to hide hints and other non-critical output (Ian)
        Reference internal functions with explicit catalog path (Ian)
        When following a new primary, have repmgr (not repmgrd) create the new slot (Ian)
        Add /etc/repmgr.conf as a default configuration file location (Ian)
        Prevent repmgrd's -v/--verbose option expecting a parameter (Ian)
        Prevent invalid replication_lag values being written to the monitoring table (Ian)
        Improve repmgrd behaviour when monitored standby node is temporarily
          unavailable (Martín)
 3.0.2   2015-10-02
        Improve handling of --help/--version options; and improve help output (Ian)
        Improve handling of situation where logfile can't be opened (Ian)
        Always pass -D/--pgdata option to pg_basebackup (Ian)
        Bugfix: standby clone --force does not empty pg_xlog (Gianni)
        Bugfix: autofailover with reconnect_attempts > 1 (Gianni)
        Bugfix: ignore comments after values (soxwellfb)
        Bugfix: handle string values in 'node' parameter correctly (Gregory Duchatelet)
        Allow repmgr to be compiled with a newer libpq (Marco)
        Bugfix: call update_node_record_set_upstream() for STANDBY FOLLOW (Tomas)
        Update `repmgr --help` output (per GitHub report from renard)
        Update tablespace remapping in --rsync-only mode for 9.5 and later (Ian)
        Deprecate `-l/--local-port` option - the port can be extracted
          from the conninfo string in repmgr.conf (Ian)
        Add STANDBY UNREGISTER (Vik Fearing)
        Don't fail with error when registering master if schema already defined (Ian)
        Fixes to whitespace handling when parsing config file (Ian)
 3.0.1   2015-04-16
        Prevent repmgrd from looping infinitely if node was not registered (Ian)
        When promoting a standby, have repmgr (not repmgrd) handle metadata updates (Ian)
        Re-use replication slot if it already exists (Ian)
        Prevent a test SSH connection being made when not needed (Ian)
        Correct monitoring table column names (Ian)
 3.0     2015-03-27
        Require PostgreSQL 9.3 or later (Ian)
        Use `pg_basebackup` by default (instead of `rsync`) to clone standby servers (Ian)
        Use `pg_ctl promote` to promote a standby to primary
        Enable tablespace remapping using `pg_basebackup` (in PostgreSQL 9.3 with `rsync`) (Ian)
        Support cascaded standbys (Ian)
        "pg_bindir" no longer required as a configuration parameter (Ian)
        Enable replication slots to be used (PostgreSQL 9.4 and later) (Ian)
        Command line option "--check-upstream-config" (Ian)
        Add event logging table and option to execute an external program when an event occurs (Ian)
        General usability and logging message improvements (Ian)
        Code consolidation and cleanup (Ian)
 2.0.3   2015-04-16
        Add -S/--superuser option for witness database creation (Ian)
        Add -c/--fast-checkpoint option for cloning (Christoph)
        Add option "--initdb-no-pwprompt" (Ian)
 2.0.2   2015-02-17
        Add "--checksum" in rsync when using "--force" (Jaime)
        Use createdb/createuser instead of psql (Jaime)
        Fixes to witness creation and monitoring (wamonite)
        Use default master port if none supplied (Martín)
        Documentation fixes and improvements (Ian)
 2.0.1   2014-07-16
        Documentation fixes and new QUICKSTART file (Ian)
        Explicitly specify directories to ignore when cloning (Ian)
        Fix log level for some log messages (Ian)
        RHEL/CentOS specfile, init script and Makefile fixes (Nathan Van Overloop)
        Debian init script and config file documentation fixes (József Kószó)
        Typo fixes (Riegie Godwin Jeyaranchen, PriceChild)
 2.0stable 2014-01-30
        Documentation fixes (Christian)
        General refactoring, code quality improvements and stabilization work (Christian)
        Added proper daemonizing (-d/--daemonize) (Christian)
        Added PID file handling (-p/--pid-file) (Christian)
        New config option: monitor_interval_secs (Christian)
        New config option: retry_promote_interval (Christian)
        New config option: logfile (Christian)
        New config option: pg_bindir (Christian)
        New config option: pgctl_options (Christian)
 2.0beta2 2013-12-19
        Improve autofailover logic and algorithms (Jaime, Andres)
        Ignore pg_log when cloning (Jaime)
        Add timestamps to log line in stderr (Christian)
        Correctly check wal_keep_segments (Jay Taylor)
        Add a ssh_options parameter (Jay Taylor)
 2.0beta1 2012-07-27
        Make CLONE command try to make an exact copy including $PGDATA location (Cedric)
        Add detection of master failure (Jaime)
        Add the notion of a witness server (Jaime)
        Add autofailover capabilities (Jaime)
        Add a configuration parameter to indicate the script to execute on failover or follow (Jaime)
        Make the monitoring optional and turned off by default; it can be turned on with the --monitoring-history switch (Jaime)
        Add tunables to specify number of retries to reconnect to master and the time between them (Jaime)
 1.2.0   2012-07-27
        Test ssh connection before trying to rsync (Cédric)
        Add CLUSTER SHOW command (Carlo)
        Add CLUSTER CLEANUP command (Jaime)
        Add function write_primary_conninfo (Marco)
        Teach repmgr how to get tablespace's location in different pg version (Jaime)
        Improve version message (Carlo)
 1.1.1   2012-04-18
        Add --ignore-rsync-warning (Cédric)
        Add strnlen for compatibility with OS X (Greg)
        Improve performance of the repl_status view (Jaime)
        Remove last argument from log_err (Jaime, Reported by Jeroen Dekkers)
        Complete documentation about possible error conditions (Jaime)
        Document how to clean history (Jaime)
 1.1.0   2011-03-09
        Make options -U, -R and -p not mandatory (Jaime)
 1.1.0b1 2011-02-24
        Fix missing "--force" option in help (Greg Smith)
        Correct warning message for wal_keep_segments (Bas van Oostveen)
        Add Debian build/usage docs (Bas, Hannu Krosing, Cedric Villemain)
        Add Debian .deb packaging (Hannu)
        Move configuration data into a structure (Bas, Gabriele Bartolini)
        Make rsync options configurable (Bas)
        Add syslog as alternate logging destination (Gabriele)
        Change from using malloc to static memory allocations (Gabriele)
        Add debugging messages after every query (Gabriele)
        Parameterize schema name used for repmgr (Gabriele)
        Avoid buffer overruns by using snprintf etc. (Gabriele)
        Fix use of database query after close (Gabriele)
        Add information about progress during "standby clone" (Gabriele)
        Fix double free errors in repmgrd (Charles Duffy, Greg)
        Make repmgr exit with an error code when encountering an error (Charles)
        Standardize on error return codes, use in repmgrd too (Greg)
        Add [un]install actions/SQL like most contrib modules (Daniel Farina)
        Wrap all string construction and produce error on overflow (Daniel)
        Correct freeing of memory from first_wal_segment (Daniel)
        Allow creating recovery.conf file with a password (Daniel)
        Inform when STANDBY CLONE sees an unused config file (Daniel)
        Use 64-bit computation for WAL apply_lag (Greg)
        Add info messages for database and general work done (Greg)
        Map old verbose flag into a useful setting for the new logger (Greg)
        Document repmgrd startup restrictions and log info about them (Greg)
 1.0.0   2010-12-05
        First public release
--- a/Makefile.in
+++ b/Makefile.in
@@ -11,7 +11,11 @@ EXTENSION = repmgr
 DATA = \
  repmgr--unpackaged--4.0.sql \
-  repmgr--4.0.sql
+  repmgr--4.0.sql \
  repmgr--4.0--4.1.sql \
  repmgr--4.1.sql \
  repmgr--4.1--4.2.sql \
  repmgr--4.2.sql
 REGRESS = repmgr_extension
@@ -26,20 +30,26 @@ all: \
 PG_CPPFLAGS = -std=gnu89 -I$(includedir_internal) -I$(libpq_srcdir) -Wall -Wmissing-prototypes -Wmissing-declarations $(EXTRA_CFLAGS)
 SHLIB_LINK = $(libpq)
-HEADERS = $(wildcard *.h)
+
 OBJS = \
 	repmgr.o
 include Makefile.global
 ifeq ($(vpath_build),yes)
 	HEADERS = $(wildcard *.h)
 else
 	HEADERS_built = $(wildcard *.h)
 endif
 $(info Building against PostgreSQL $(MAJORVERSION))
 REPMGR_CLIENT_OBJS = repmgr-client.o \
-	repmgr-action-primary.o repmgr-action-standby.o repmgr-action-bdr.o repmgr-action-cluster.o repmgr-action-node.o \
+	repmgr-action-primary.o repmgr-action-standby.o repmgr-action-witness.o \
 	repmgr-action-bdr.o repmgr-action-cluster.o repmgr-action-node.o repmgr-action-daemon.o \
 	configfile.o log.o strutil.o controldata.o dirutil.o compat.o dbutils.o
-REPMGRD_OBJS = repmgrd.o repmgrd-physical.o repmgrd-bdr.o configfile.o log.o dbutils.o strutil.o controldata.o
+REPMGRD_OBJS = repmgrd.o repmgrd-physical.o repmgrd-bdr.o configfile.o log.o dbutils.o strutil.o controldata.o compat.o
 DATE=$(shell date "+%Y-%m-%d")
 repmgr_version.h: repmgr_version.h.in
@@ -63,6 +73,12 @@ Makefile: Makefile.in config.status configure
 Makefile.global: Makefile.global.in config.status configure
 	./config.status $@
 doc:
 	$(MAKE) -C doc all
 install-doc:
 	$(MAKE) -C doc install
 clean: additional-clean
 maintainer-clean: additional-maintainer-clean
@@ -71,9 +87,11 @@ additional-clean:
 	rm -f repmgr-client.o
 	rm -f repmgr-action-primary.o
 	rm -f repmgr-action-standby.o
 	rm -f repmgr-action-witness.o
 	rm -f repmgr-action-bdr.o
 	rm -f repmgr-action-node.o
 	rm -f repmgr-action-cluster.o
 	rm -f repmgr-action-daemon.o
 	rm -f repmgrd.o
 	rm -f repmgrd-physical.o
 	rm -f repmgrd-bdr.o
--- a/README.md
+++ b/README.md
--- a/TODO.md
+++ b/TODO.md
@@ -0,0 +1,20 @@
 TODO
 ====
 This file contains a list of improvements which are desirable and/or have
 been requested, and which we aim to address/implement when time and resources
 permit.
 It is *not* a roadmap and there's no guarantee of any item being implemented
 within any given timeframe.
 Enable suspension of repmgrd failover
 -------------------------------------
 When performing maintenance, e.g. a switchover, it's necessary to stop all
 repmgrd nodes to prevent unintended failover; this is obviously inconvenient.
 We'll need to implement some way of notifying each repmgrd to suspend automatic
 failover until further notice.
 Requested in GitHub #410 ( https://github.com/2ndQuadrant/repmgr/issues/410 )
--- a/compat.c
+++ b/compat.c
@@ -6,7 +6,7 @@
 *    supported PostgreSQL versions. They're unlikely to change but
 *    it would be worth keeping an eye on them for any fixes/improvements.
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
--- a/compat.h
+++ b/compat.h
@@ -1,6 +1,6 @@
 /*
 * compat.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
--- a/config.h.in
+++ b/config.h.in
@@ -1,4 +1,2 @@
 /* config.h.in.  Generated from configure.in by autoheader.  */
 /* Only build repmgr for BDR */
 #undef BDR_ONLY
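Reviewer note on the configfile.c change in this diff: load_config() now converts a relative configuration file path to an absolute one, preferring $PWD and falling back to getcwd() only when $PWD is not set. The standalone sketch below illustrates that lookup order outside of repmgr. The helper name `make_absolute_path`, its return convention and the 4096-byte buffer are assumptions made for illustration; they are not repmgr code. It also deviates from the patch in two ways: it uses snprintf() rather than PQExpBuffer and strncpy(), and it ignores a $PWD value that does not start with '/'.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of repmgr: write an absolute version of
 * "path" into "out" (capacity "outlen" bytes).
 *
 * Returns 0 on success, -1 if the working directory cannot be determined
 * or the result does not fit into "out".
 */
static int
make_absolute_path(const char *path, char *out, size_t outlen)
{
	const char *base = NULL;
	char		cwd[4096] = "";
	int			written;

	/* already absolute: copy unchanged */
	if (path[0] == '/')
	{
		written = snprintf(out, outlen, "%s", path);
		return (written < 0 || (size_t) written >= outlen) ? -1 : 0;
	}

	/*
	 * Prefer $PWD: getcwd() typically returns a path with symlinks
	 * resolved, which may stop being valid if mountpoints change.
	 */
	base = getenv("PWD");

	/* assumption beyond the patch: ignore a $PWD which is not absolute */
	if (base == NULL || base[0] != '/')
	{
		/* $PWD not usable: fall back to getcwd() */
		if (getcwd(cwd, sizeof(cwd)) == NULL)
			return -1;

		base = cwd;
	}

	written = snprintf(out, outlen, "%s/%s", base, path);

	/* treat truncation as an error rather than returning a partial path */
	return (written < 0 || (size_t) written >= outlen) ? -1 : 0;
}
```

A caller would pass the user-supplied path and a MAXPGPATH-sized buffer, and treat -1 the way the patch treats a getcwd() failure, i.e. as a configuration error. The patch itself additionally runs canonicalize_path() on the result, which this sketch omits.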
--- a/configfile.c
+++ b/configfile.c
@@ -1,7 +1,7 @@
 /*
 * config.c - parse repmgr.conf and other configuration-related functionality
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -28,10 +28,8 @@ char		config_file_path[MAXPGPATH] = "";
 static bool config_file_provided = false;
 bool		config_file_found = false;
 static void parse_config(t_configuration_options *options, bool terse);
 static void _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *warning_list);
 static bool parse_bool(const char *s,
 		   const char *config_item,
 		   ItemList *error_list);
 static void _parse_line(char *buf, char *name, char *value);
 static void parse_event_notifications_list(t_configuration_options *options, const char *arg);
@@ -73,6 +71,56 @@ load_config(const char *config_file, bool verbose, bool terse, t_configuration_o
 		strncpy(config_file_path, config_file, MAXPGPATH);
 		canonicalize_path(config_file_path);
 		/* relative path supplied - convert to absolute path */
 		if (config_file_path[0] != '/')
 		{
 			PQExpBufferData fullpath;
 			char *pwd = NULL;
 			initPQExpBuffer(&fullpath);
 			/*
 			 * we'll attempt to use $PWD to derive the effective path; getcwd()
 			 * will likely resolve symlinks, which may result in a path which
 			 * isn't permanent (e.g. if filesystem mountpoints change).
 			 */
 			pwd = getenv("PWD");
 			if (pwd != NULL)
 			{
 				appendPQExpBufferStr(&fullpath, pwd);
 			}
 			else
 			{
 				/* $PWD not available - fall back to getcwd() */
 				char cwd[MAXPGPATH] = "";
 				if (getcwd(cwd, MAXPGPATH) == NULL)
 				{
 					log_error(_("unable to execute getcwd()"));
 					log_detail("%s", strerror(errno));
 					termPQExpBuffer(&fullpath);
 					exit(ERR_BAD_CONFIG);
 				}
 				appendPQExpBufferStr(&fullpath, cwd);
 			}
 			appendPQExpBuffer(&fullpath,
 							  "/%s", config_file_path);
 			log_debug("relative configuration file converted to:\n  \"%s\"",
 					  fullpath.data);
 			strncpy(config_file_path, fullpath.data, MAXPGPATH);
 			termPQExpBuffer(&fullpath);
 			canonicalize_path(config_file_path);
 		}
 		if (stat(config_file_path, &stat_config) != 0)
 		{
 			log_error(_("provided configuration file \"%s\" not found: %s"),
@@ -81,6 +129,7 @@ load_config(const char *config_file, bool verbose, bool terse, t_configuration_o
 			exit(ERR_BAD_CONFIG);
 		}
 		if (verbose == true)
 		{
 			log_notice(_("using provided configuration file \"%s\""), config_file);
@@ -187,7 +236,7 @@ end_search:
 }
-void
+static void
 parse_config(t_configuration_options *options, bool terse)
 {
 	/* Collate configuration file errors here for friendlier reporting */
@@ -234,7 +283,9 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 	memset(options->node_name, 0, sizeof(options->node_name));
 	memset(options->conninfo, 0, sizeof(options->conninfo));
 	memset(options->data_directory, 0, sizeof(options->data_directory));
 	memset(options->config_directory, 0, sizeof(options->config_directory));
 	memset(options->pg_bindir, 0, sizeof(options->pg_bindir));
 	memset(options->repmgr_bindir, 0, sizeof(options->repmgr_bindir));
 	options->replication_type = REPLICATION_TYPE_PHYSICAL;
 	/*-------------
@@ -249,7 +300,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 	options->log_status_interval = DEFAULT_LOG_STATUS_INTERVAL;
 	/*-----------------------
-	 * standby action settings
+	 * standby clone settings
 	 *------------------------
 	 */
 	options->use_replication_slots = false;
@@ -260,7 +311,30 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 	options->tablespace_mapping.tail = NULL;
 	memset(options->recovery_min_apply_delay, 0, sizeof(options->recovery_min_apply_delay));
 	options->recovery_min_apply_delay_provided = false;
 	memset(options->archive_cleanup_command, 0, sizeof(options->archive_cleanup_command));
 	options->use_primary_conninfo_password = false;
 	memset(options->passfile, 0, sizeof(options->passfile));
 	/*-------------------------
 	 * standby promote settings
 	 *-------------------------
 	 */
 	options->promote_check_timeout = DEFAULT_PROMOTE_CHECK_TIMEOUT;
 	options->promote_check_interval = DEFAULT_PROMOTE_CHECK_INTERVAL;
 	/*------------------------
 	 * standby follow settings
 	 *------------------------
 	 */
 	options->primary_follow_timeout = DEFAULT_PRIMARY_FOLLOW_TIMEOUT;
 	options->standby_follow_timeout = DEFAULT_STANDBY_FOLLOW_TIMEOUT;
 	/*------------------------
 	 * standby switchover settings
 	 *------------------------
 	 */
 	options->shutdown_check_timeout = DEFAULT_SHUTDOWN_CHECK_TIMEOUT;
 	options->standby_reconnect_timeout = DEFAULT_STANDBY_RECONNECT_TIMEOUT;
 	/*-----------------
 	 * repmgrd settings
@@ -281,7 +355,14 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 	options->degraded_monitoring_timeout = -1;
 	options->async_query_timeout = DEFAULT_ASYNC_QUERY_TIMEOUT;
 	options->primary_notification_timeout = DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT;
-	options->primary_follow_timeout = DEFAULT_PRIMARY_FOLLOW_TIMEOUT;
+	options->repmgrd_standby_startup_timeout = -1; /* defaults to "standby_reconnect_timeout" if not set */
 	memset(options->repmgrd_pid_file, 0, sizeof(options->repmgrd_pid_file));
 	/*-------------
 	 * witness settings
 	 *-------------
 	 */
 	options->witness_sync_interval = DEFAULT_WITNESS_SYNC_INTERVAL;
 	/*-------------
 	 * BDR settings
@@ -394,6 +475,9 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 			strncpy(options->conninfo, value, MAXLEN);
 		else if (strcmp(name, "data_directory") == 0)
 			strncpy(options->data_directory, value, MAXPGPATH);
 		else if (strcmp(name, "config_directory") == 0)
 			strncpy(options->config_directory, value, MAXPGPATH);
 		else if (strcmp(name, "replication_user") == 0)
 		{
 			if (strlen(value) < NAMEDATALEN)
@@ -404,6 +488,8 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 		}
 		else if (strcmp(name, "pg_bindir") == 0)
 			strncpy(options->pg_bindir, value, MAXPGPATH);
 		else if (strcmp(name, "repmgr_bindir") == 0)
 			strncpy(options->repmgr_bindir, value, MAXPGPATH);
 		else if (strcmp(name, "replication_type") == 0)
 		{
@@ -439,13 +525,40 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 			parse_time_unit_parameter(name, value, options->recovery_min_apply_delay, error_list);
 			options->recovery_min_apply_delay_provided = true;
 		}
 		else if (strcmp(name, "archive_cleanup_command") == 0)
 			strncpy(options->archive_cleanup_command, value, MAXLEN);
 		else if (strcmp(name, "use_primary_conninfo_password") == 0)
 			options->use_primary_conninfo_password = parse_bool(value, name, error_list);
 		else if (strcmp(name, "passfile") == 0)
 			strncpy(options->passfile, value, sizeof(options->passfile));
 		/* standby promote settings */
 		else if (strcmp(name, "promote_check_timeout") == 0)
 			options->promote_check_timeout = repmgr_atoi(value, name, error_list, 1);
 		else if (strcmp(name, "promote_check_interval") == 0)
 			options->promote_check_interval = repmgr_atoi(value, name, error_list, 1);
 		/* standby follow settings */
 		else if (strcmp(name, "primary_follow_timeout") == 0)
 			options->primary_follow_timeout = repmgr_atoi(value, name, error_list, 0);
 		else if (strcmp(name, "standby_follow_timeout") == 0)
 			options->standby_follow_timeout = repmgr_atoi(value, name, error_list, 0);
 		/* standby switchover settings */
 		else if (strcmp(name, "shutdown_check_timeout") == 0)
 			options->shutdown_check_timeout = repmgr_atoi(value, name, error_list, 0);
 		else if (strcmp(name, "standby_reconnect_timeout") == 0)
 			options->standby_reconnect_timeout = repmgr_atoi(value, name, error_list, 0);
 		/* node rejoin settings */
 		else if (strcmp(name, "node_rejoin_timeout") == 0)
 			options->node_rejoin_timeout = repmgr_atoi(value, name, error_list, 0);
 		/* node check settings */
 		else if (strcmp(name, "archive_ready_warning") == 0)
 			options->archive_ready_warning = repmgr_atoi(value, name, error_list, 1);
-		else if (strcmp(name, "archive_ready_critcial") == 0)
+		else if (strcmp(name, "archive_ready_critical") == 0)
 			options->archive_ready_critical = repmgr_atoi(value, name, error_list, 1);
 		else if (strcmp(name, "replication_lag_warning") == 0)
 			options->replication_lag_warning = repmgr_atoi(value, name, error_list, 1);
@@ -486,13 +599,19 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 		else if (strcmp(name, "monitoring_history") == 0)
 			options->monitoring_history = parse_bool(value, name, error_list);
 		else if (strcmp(name, "degraded_monitoring_timeout") == 0)
-			options->degraded_monitoring_timeout = repmgr_atoi(value, name, error_list, 1);
+			options->degraded_monitoring_timeout = repmgr_atoi(value, name, error_list, -1);
 		else if (strcmp(name, "async_query_timeout") == 0)
 			options->async_query_timeout = repmgr_atoi(value, name, error_list, 0);
 		else if (strcmp(name, "primary_notification_timeout") == 0)
 			options->primary_notification_timeout = repmgr_atoi(value, name, error_list, 0);
-		else if (strcmp(name, "primary_follow_timeout") == 0)
+		else if (strcmp(name, "repmgrd_standby_startup_timeout") == 0)
-			options->primary_follow_timeout = repmgr_atoi(value, name, error_list, 0);
+			options->repmgrd_standby_startup_timeout = repmgr_atoi(value, name, error_list, 0);
 		else if (strcmp(name, "repmgrd_pid_file") == 0)
 			strncpy(options->repmgrd_pid_file, value, MAXPGPATH);
 		/* witness settings */
 		else if (strcmp(name, "witness_sync_interval") == 0)
 			options->witness_sync_interval = repmgr_atoi(value, name, error_list, 1);
 		/* BDR settings */
 		else if (strcmp(name, "bdr_local_monitoring_only") == 0)
@@ -604,7 +723,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 		 * Raise an error if a known parameter is provided with an empty
 		 * value. Currently there's no reason why empty parameters are needed;
 		 * if we want to accept those, we'd need to add stricter default
-		 * checking, as currently e.g. an empty `node` value will be converted
+		 * checking, as currently e.g. an empty `node_id` value will be converted
 		 * to '0'.
 		 */
 		if (known_parameter == true && !strlen(value))
@@ -670,6 +789,17 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 		PQconninfoFree(conninfo_options);
 	}
 	/* set values for parameters which default to other parameters */
 	/*
 	 * From 4.1, "repmgrd_standby_startup_timeout" replaces "standby_reconnect_timeout"
 	 * in repmgrd; fall back to "standby_reconnect_timeout" if no value explicitly provided
 	 */
 	if (options->repmgrd_standby_startup_timeout == -1)
 	{
 		options->repmgrd_standby_startup_timeout = options->standby_reconnect_timeout;
 	}
 	/* add warning about changed "barman_" parameter meanings */
 	if ((options->barman_host[0] == '\0' && options->barman_server[0] != '\0') ||
 		(options->barman_host[0] != '\0' && options->barman_server[0] == '\0'))
@@ -677,7 +807,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 		item_list_append(error_list,
 						 _("use \"barman_host\" for the hostname of the Barman server"));
 		item_list_append(error_list,
-						 _("use \"barman_server\" for the name of the [server] section in the Barman configururation file"));
+						 _("use \"barman_server\" for the name of the [server] section in the Barman configuration file"));
 	}
@@ -686,13 +816,19 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 	if (options->archive_ready_warning >= options->archive_ready_critical)
 	{
 		item_list_append(error_list,
-						 _("\archive_ready_critical\" must be greater than  \"archive_ready_warning\""));
+						 _("\"archive_ready_critical\" must be greater than \"archive_ready_warning\""));
 	}
 	if (options->replication_lag_warning >= options->replication_lag_critical)
 	{
 		item_list_append(error_list,
-						 _("\replication_lag_critical\" must be greater than  \"replication_lag_warning\""));
+						 _("\"replication_lag_critical\" must be greater than \"replication_lag_warning\""));
 	}
 	if (options->standby_reconnect_timeout < options->node_rejoin_timeout)
 	{
 		item_list_append(error_list,
 						 _("\"standby_reconnect_timeout\" must be equal to or greater than \"node_rejoin_timeout\""));
 	}
 }
@@ -858,12 +994,11 @@ parse_time_unit_parameter(const char *name, const char *value, char *dest, ItemL
 	char	   *ptr = NULL;
 	int			targ = strtol(value, &ptr, 10);
-	if (targ < 1)
+	if (targ < 0)
 	{
 		if (errors != NULL)
 		{
 			item_list_append_format(
 									errors,
 									_("invalid value provided for \"%s\""),
 									name);
 		}
@@ -917,13 +1052,16 @@ parse_time_unit_parameter(const char *name, const char *value, char *dest, ItemL
 * - promote_delay
 * - reconnect_attempts
 * - reconnect_interval
 * - repmgrd_standby_startup_timeout
 * - retry_promote_interval_secs
 *
- * non-changeable options
+ * non-changeable options (repmgrd references these from the "repmgr.nodes"
 * table, not the configuration file)
 *
 * - node_id
 * - node_name
 * - data_directory
 * - location
 * - priority
 * - replication_type
 *
@@ -932,7 +1070,7 @@ parse_time_unit_parameter(const char *name, const char *value, char *dest, ItemL
 */
 bool
-reload_config(t_configuration_options *orig_options)
+reload_config(t_configuration_options *orig_options, t_server_type server_type)
 {
 	PGconn	   *conn;
 	t_configuration_options new_options = T_CONFIGURATION_OPTIONS_INITIALIZER;
@@ -942,17 +1080,50 @@ reload_config(t_configuration_options *orig_options)
 	static ItemList config_errors = {NULL, NULL};
 	static ItemList config_warnings = {NULL, NULL};
 	PQExpBufferData errors;
 	log_info(_("reloading configuration file"));
 	_parse_config(&new_options, &config_errors, &config_warnings);
 	if (server_type == PRIMARY || server_type == STANDBY)
 	{
 		if (new_options.promote_command[0] == '\0')
 		{
 			item_list_append(&config_errors, _("\"promote_command\": required parameter was not found"));
 		}
 		if (new_options.follow_command[0] == '\0')
 		{
 			item_list_append(&config_errors, _("\"follow_command\": required parameter was not found"));
 		}
 	}
 	if (config_errors.head != NULL)
 	{
-		/* XXX dump errors to log */
+		ItemListCell *cell = NULL;
 		log_warning(_("unable to parse new configuration, retaining current configuration"));
 		initPQExpBuffer(&errors);
 		appendPQExpBufferStr(&errors,
 							 "following errors were detected:\n");
 		for (cell = config_errors.head; cell; cell = cell->next)
 		{
 			appendPQExpBuffer(&errors,
 							  "  %s\n", cell->string);
 		}
 		log_detail("%s", errors.data);
 		termPQExpBuffer(&errors);
 		return false;
 	}
 	/* The following options cannot be changed */
 	if (new_options.node_id != orig_options->node_id)
@@ -961,7 +1132,7 @@ reload_config(t_configuration_options *orig_options)
 		return false;
 	}
-	if (strcmp(new_options.node_name, orig_options->node_name) != 0)
+	if (strncmp(new_options.node_name, orig_options->node_name, MAXLEN) != 0)
 	{
 		log_warning(_("\"node_name\" cannot be changed, keeping current configuration"));
 		return false;
@@ -1005,7 +1176,7 @@ reload_config(t_configuration_options *orig_options)
 	}
 	/* conninfo */
-	if (strcmp(orig_options->conninfo, new_options.conninfo) != 0)
+	if (strncmp(orig_options->conninfo, new_options.conninfo, MAXLEN) != 0)
 	{
 		/* Test conninfo string works */
 		conn = establish_db_connection(new_options.conninfo, false);
@@ -1032,7 +1203,7 @@ reload_config(t_configuration_options *orig_options)
 	}
 	/* event_notification_command */
-	if (strcmp(orig_options->event_notification_command, new_options.event_notification_command) != 0)
+	if (strncmp(orig_options->event_notification_command, new_options.event_notification_command, MAXLEN) != 0)
 	{
 		strncpy(orig_options->event_notification_command, new_options.event_notification_command, MAXLEN);
 		log_info(_("\"event_notification_command\" is now \"%s\""), new_options.event_notification_command);
@@ -1041,7 +1212,7 @@ reload_config(t_configuration_options *orig_options)
 	}
 	/* event_notifications */
-	if (strcmp(orig_options->event_notifications_orig, new_options.event_notifications_orig) != 0)
+	if (strncmp(orig_options->event_notifications_orig, new_options.event_notifications_orig, MAXLEN) != 0)
 	{
 		strncpy(orig_options->event_notifications_orig, new_options.event_notifications_orig, MAXLEN);
 		log_info(_("\"event_notifications\" is now \"%s\""), new_options.event_notifications_orig);
@@ -1061,7 +1232,7 @@ reload_config(t_configuration_options *orig_options)
 	}
 	/* follow_command */
-	if (strcmp(orig_options->follow_command, new_options.follow_command) != 0)
+	if (strncmp(orig_options->follow_command, new_options.follow_command, MAXLEN) != 0)
 	{
 		strncpy(orig_options->follow_command, new_options.follow_command, MAXLEN);
 		log_info(_("\"follow_command\" is now \"%s\""), new_options.follow_command);
@@ -1098,7 +1269,7 @@ reload_config(t_configuration_options *orig_options)
 	/* promote_command */
-	if (strcmp(orig_options->promote_command, new_options.promote_command) != 0)
+	if (strncmp(orig_options->promote_command, new_options.promote_command, MAXLEN) != 0)
 	{
 		strncpy(orig_options->promote_command, new_options.promote_command, MAXLEN);
 		log_info(_("\"promote_command\" is now \"%s\""), new_options.promote_command);
@@ -1106,7 +1277,7 @@ reload_config(t_configuration_options *orig_options)
 		config_changed = true;
 	}
-	/* promote_delay */
+	/* promote_delay (for testing use only; not documented) */
 	if (orig_options->promote_delay != new_options.promote_delay)
 	{
 		orig_options->promote_delay = new_options.promote_delay;
@@ -1133,23 +1304,32 @@ reload_config(t_configuration_options *orig_options)
 		config_changed = true;
 	}
 	/* repmgrd_standby_startup_timeout */
 	if (orig_options->repmgrd_standby_startup_timeout != new_options.repmgrd_standby_startup_timeout)
 	{
 		orig_options->repmgrd_standby_startup_timeout = new_options.repmgrd_standby_startup_timeout;
 		log_info(_("\"repmgrd_standby_startup_timeout\" is now \"%i\""), new_options.repmgrd_standby_startup_timeout);
 		config_changed = true;
 	}
 	/*
 	 * Handle changes to logging configuration
 	 */
 	/* log_facility */
-	if (strcmp(orig_options->log_facility, new_options.log_facility) != 0)
+	if (strncmp(orig_options->log_facility, new_options.log_facility, MAXLEN) != 0)
 	{
-		strcpy(orig_options->log_facility, new_options.log_facility);
+		strncpy(orig_options->log_facility, new_options.log_facility, MAXLEN);
 		log_info(_("\"log_facility\" is now \"%s\""), new_options.log_facility);
 		log_config_changed = true;
 	}
 	/* log_file */
-	if (strcmp(orig_options->log_file, new_options.log_file) != 0)
+	if (strncmp(orig_options->log_file, new_options.log_file, MAXLEN) != 0)
 	{
-		strcpy(orig_options->log_file, new_options.log_file);
+		strncpy(orig_options->log_file, new_options.log_file, MAXLEN);
 		log_info(_("\"log_file\" is now \"%s\""), new_options.log_file);
 		log_config_changed = true;
@@ -1157,9 +1337,9 @@ reload_config(t_configuration_options *orig_options)
 	/* log_level */
-	if (strcmp(orig_options->log_level, new_options.log_level) != 0)
+	if (strncmp(orig_options->log_level, new_options.log_level, MAXLEN) != 0)
 	{
-		strcpy(orig_options->log_level, new_options.log_level);
+		strncpy(orig_options->log_level, new_options.log_level, MAXLEN);
 		log_info(_("\"log_level\" is now \"%s\""), new_options.log_level);
 		log_config_changed = true;
@@ -1225,13 +1405,23 @@ exit_with_config_file_errors(ItemList *config_errors, ItemList *config_warnings,
 void
-exit_with_cli_errors(ItemList *error_list)
+exit_with_cli_errors(ItemList *error_list, const char *repmgr_command)
 {
 	fprintf(stderr, _("The following command line errors were encountered:\n"));
 	print_item_list(error_list);
-	fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname());
+	if (repmgr_command != NULL)
 	{
 		fprintf(stderr, _("Try \"%s --help\" or \"%s %s --help\" for more information.\n"),
 				progname(),
 				progname(),
 				repmgr_command);
 	}
 	else
 	{
 		fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname());
 	}
 	exit(ERR_BAD_CONFIG);
 }
@@ -1336,11 +1526,14 @@ repmgr_atoi(const char *value, const char *config_item, ItemList *error_list, in
 *
 *   https://www.postgresql.org/docs/current/static/config-setting.html
 */
-static bool
+bool
 parse_bool(const char *s, const char *config_item, ItemList *error_list)
 {
 	PQExpBufferData errors;
 	if (s == NULL)
 		return true;
 	if (strcasecmp(s, "0") == 0)
 		return false;
@@ -1533,31 +1726,112 @@ clear_event_notification_list(t_configuration_options *options)
 }
-bool
+int
-parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_options *backup_options, int server_version_num, ItemList *error_list)
+parse_output_to_argv(const char *string, char ***argv_array)
 {
 	int			options_len = 0;
 	char	   *options_string = NULL;
 	char	   *options_string_ptr = NULL;
 	int			c = 1,
 	   			argc_item = 1;
 	char	   *argv_item = NULL;
 	char	  **local_argv_array = NULL;
 	ItemListCell *cell;
 	/*
 	 * Add parsed options to this list, then copy to an array to pass to
 	 * getopt
 	 */
-	static ItemList option_argv = {NULL, NULL};
+	ItemList option_argv = {NULL, NULL};
-	char	   *argv_item = NULL;
+	options_len = strlen(string) + 1;
-	int			c,
+	options_string = pg_malloc0(options_len);
-				argc_item = 1;
+	options_string_ptr = options_string;
 	/* Copy the string before operating on it with strtok() */
 	strncpy(options_string, string, options_len);
 	/* Extract arguments into a list and keep a count of the total */
 	while ((argv_item = strtok(options_string_ptr, " ")) != NULL)
 	{
 		item_list_append(&option_argv, trim(argv_item));
 		argc_item++;
 		if (options_string_ptr != NULL)
 			options_string_ptr = NULL;
 	}
 	pfree(options_string);
 	/*
 	 * Array of argument values to pass to getopt_long - this will need to
 	 * include an empty string as the first value (normally this would be the
 	 * program name)
 	 */
 	local_argv_array = pg_malloc0(sizeof(char *) * (argc_item + 2));
 	/* Insert a blank dummy program name at the start of the array */
 	local_argv_array[0] = pg_malloc0(1);
 	/*
 	 * Copy the previously extracted arguments from our list to the array
 	 */
 	for (cell = option_argv.head; cell; cell = cell->next)
 	{
 		int			argv_len = strlen(cell->string) + 1;
 		local_argv_array[c] = (char *)pg_malloc0(argv_len);
 		strncpy(local_argv_array[c], cell->string, argv_len);
 		c++;
 	}
 	local_argv_array[c] = NULL;
 	item_list_free(&option_argv);
 	*argv_array = local_argv_array;
 	return argc_item;
 }
 void
 free_parsed_argv(char ***argv_array)
 {
 	char	  **local_argv_array = *argv_array;
 	int			i = 0;
 	while (local_argv_array[i] != NULL)
 	{
 		pfree((char *)local_argv_array[i]);
 		i++;
 	}
 	pfree((char **)local_argv_array);
 	*argv_array = NULL;
 }
 bool
 parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_options *backup_options, int server_version_num, ItemList *error_list)
 {
 	bool		backup_options_ok = true;
 	int			c = 0,
 				argc_item = 0;
 	char	  **argv_array = NULL;
 	ItemListCell *cell = NULL;
 	int			optindex = 0;
 	struct option *long_options = NULL;
-	bool		backup_options_ok = true;
 	/* We're only interested in these options */
 	static struct option long_options_9[] =
@@ -1583,56 +1857,12 @@ parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_opti
 	if (!strlen(pg_basebackup_options))
 		return backup_options_ok;
-	options_len = strlen(pg_basebackup_options) + 1;
-	options_string = pg_malloc(options_len);
-	options_string_ptr = options_string;
 	if (server_version_num >= 100000)
 		long_options = long_options_10;
 	else
 		long_options = long_options_9;
-	/* Copy the string before operating on it with strtok() */
+	argc_item = parse_output_to_argv(pg_basebackup_options, &argv_array);
-	strncpy(options_string, pg_basebackup_options, options_len);
-	/* Extract arguments into a list and keep a count of the total */
-	while ((argv_item = strtok(options_string_ptr, " ")) != NULL)
-	{
-		item_list_append(&option_argv, argv_item);
-		argc_item++;
-		if (options_string_ptr != NULL)
-			options_string_ptr = NULL;
-	}
-	/*
-	 * Array of argument values to pass to getopt_long - this will need to
-	 * include an empty string as the first value (normally this would be the
-	 * program name)
-	 */
-	argv_array = pg_malloc0(sizeof(char *) * (argc_item + 2));
-	/* Insert a blank dummy program name at the start of the array */
-	argv_array[0] = pg_malloc0(1);
-	c = 1;
-	/*
-	 * Copy the previously extracted arguments from our list to the array
-	 */
-	for (cell = option_argv.head; cell; cell = cell->next)
-	{
-		int			argv_len = strlen(cell->string) + 1;
-		argv_array[c] = pg_malloc0(argv_len);
-		strncpy(argv_array[c], cell->string, argv_len);
-		c++;
-	}
-	argv_array[c] = NULL;
 	/* Reset getopt's optind variable */
 	optind = 0;
@@ -1676,15 +1906,7 @@ parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_opti
 		backup_options_ok = false;
 	}
-	pfree(options_string);
+	free_parsed_argv(&argv_array);
-	{
-		int			i;
-		for (i = 0; i < argc_item + 2; i++)
-			pfree(argv_array[i]);
-	}
-	pfree(argv_array);
 	return backup_options_ok;
 }
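
Review note: the new `parse_output_to_argv()` / `free_parsed_argv()` pair in the configfile.c changes above can be summarised by the following minimal, standalone sketch. It is not the repmgr implementation: the names `split_to_argv`, `free_argv` and `copy_string` are hypothetical, and plain `malloc`/`calloc` stand in for `pg_malloc0`, so the sketch reports allocation failure with -1 rather than exiting.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of "s", or NULL. */
static char *
copy_string(const char *s)
{
	size_t		len = strlen(s) + 1;
	char	   *out = malloc(len);

	if (out != NULL)
		memcpy(out, s, len);
	return out;
}

/* Free a NULL-terminated array produced by split_to_argv(). */
void
free_argv(char **argv)
{
	int			i;

	if (argv == NULL)
		return;
	for (i = 0; argv[i] != NULL; i++)
		free(argv[i]);
	free(argv);
}

/*
 * Split "string" on spaces into a NULL-terminated, argv-style array.
 * Element 0 is an empty placeholder for the program name, as getopt_long()
 * expects. Returns the element count including the placeholder, or -1 if
 * an allocation fails.
 */
int
split_to_argv(const char *string, char ***argv_out)
{
	/* strtok() modifies its input, so tokenise a private copy */
	char	   *copy = copy_string(string);
	/* strlen + 2 slots always covers placeholder, tokens and terminator */
	char	  **argv = calloc(strlen(string) + 2, sizeof(char *));
	char	   *token = NULL;
	int			argc = 0;

	if (copy == NULL || argv == NULL)
		goto fail;

	if ((argv[argc] = copy_string("")) == NULL)
		goto fail;
	argc++;

	for (token = strtok(copy, " "); token != NULL; token = strtok(NULL, " "))
	{
		if ((argv[argc] = copy_string(token)) == NULL)
			goto fail;
		argc++;
	}

	free(copy);
	*argv_out = argv;
	return argc;

fail:
	free(copy);
	free_argv(argv);
	return -1;
}
```

For example, splitting `"-X stream --label=x"` yields a count of 4, with an empty string in slot 0 and `NULL` in slot 4. As in the diff, the caller owns the array and must release it with the matching free function.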
--- a/configfile.h
+++ b/configfile.h
@@ -1,7 +1,7 @@
 /*
 * configfile.h
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 *
 * This program is free software: you can redistribute it and/or modify
@@ -73,7 +73,9 @@ typedef struct
 	char		conninfo[MAXLEN];
 	char		replication_user[NAMEDATALEN];
 	char		data_directory[MAXPGPATH];
 	char		config_directory[MAXPGPATH];
 	char		pg_bindir[MAXPGPATH];
 	char		repmgr_bindir[MAXPGPATH];
 	int			replication_type;
 	/* log settings */
@@ -82,14 +84,31 @@ typedef struct
 	char		log_file[MAXLEN];
 	int			log_status_interval;
-	/* standby action settings */
+	/* standby clone settings */
 	bool		use_replication_slots;
 	char		pg_basebackup_options[MAXLEN];
 	char		restore_command[MAXLEN];
 	TablespaceList tablespace_mapping;
 	char		recovery_min_apply_delay[MAXLEN];
 	bool		recovery_min_apply_delay_provided;
 	char		archive_cleanup_command[MAXLEN];
 	bool		use_primary_conninfo_password;
 	char		passfile[MAXPGPATH];
 	/* standby promote settings */
 	int			promote_check_timeout;
 	int			promote_check_interval;
 	/* standby follow settings */
 	int			primary_follow_timeout;
 	int			standby_follow_timeout;
 	/* standby switchover settings */
 	int			shutdown_check_timeout;
 	int			standby_reconnect_timeout;
 	/* node rejoin settings */
 	int			node_rejoin_timeout;
 	/* node check settings */
 	int			archive_ready_warning;
@@ -97,6 +116,9 @@ typedef struct
 	int			replication_lag_warning;
 	int			replication_lag_critical;
 	/* witness settings */
 	int			witness_sync_interval;
 	/* repmgrd settings */
 	failover_mode_opt failover;
 	char		location[MAXLEN];
@@ -110,7 +132,8 @@ typedef struct
 	int			degraded_monitoring_timeout;
 	int			async_query_timeout;
 	int			primary_notification_timeout;
-	int			primary_follow_timeout;
+	int			repmgrd_standby_startup_timeout;
 	char		repmgrd_pid_file[MAXPGPATH];
 	/* BDR settings */
 	bool		bdr_local_monitoring_only;
@@ -149,14 +172,26 @@ typedef struct
 #define T_CONFIGURATION_OPTIONS_INITIALIZER { \
 		/* node information */ \
-		UNKNOWN_NODE_ID, "", "", "", "", "", REPLICATION_TYPE_PHYSICAL,	\
+		UNKNOWN_NODE_ID, "", "", "", "", "", "", "", REPLICATION_TYPE_PHYSICAL,	\
 		/* log settings */ \
 		"", "", "", DEFAULT_LOG_STATUS_INTERVAL,	\
-		/* standby action settings */ \
+		/* standby clone settings */ \
-		false, "", "", { NULL, NULL }, "", false, false, \
+		false, "", "", { NULL, NULL }, "", false, "", false, "", \
 		/* standby promote settings */ \
 		DEFAULT_PROMOTE_CHECK_TIMEOUT, DEFAULT_PROMOTE_CHECK_INTERVAL, \
 		/* standby follow settings */ \
 		DEFAULT_PRIMARY_FOLLOW_TIMEOUT,	\
 		DEFAULT_STANDBY_FOLLOW_TIMEOUT,	\
 		/* standby switchover settings */ \
 		DEFAULT_SHUTDOWN_CHECK_TIMEOUT, \
 		DEFAULT_STANDBY_RECONNECT_TIMEOUT, \
 		/* node rejoin settings */ \
 		DEFAULT_NODE_REJOIN_TIMEOUT, \
 		/* node check settings */ \
 		DEFAULT_ARCHIVE_READY_WARNING, DEFAULT_ARCHIVE_READY_CRITICAL, \
 		DEFAULT_REPLICATION_LAG_WARNING, DEFAULT_REPLICATION_LAG_CRITICAL, \
 		/* witness settings */ \
 		DEFAULT_WITNESS_SYNC_INTERVAL, \
 		/* repmgrd settings */ \
 		FAILOVER_MANUAL, DEFAULT_LOCATION, DEFAULT_PRIORITY, "", "", \
 		DEFAULT_MONITORING_INTERVAL, \
@@ -165,7 +200,7 @@ typedef struct
        false, -1, \
 		DEFAULT_ASYNC_QUERY_TIMEOUT, \
 		DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT,	\
-		DEFAULT_PRIMARY_FOLLOW_TIMEOUT,	\
+		-1, "", \
 		/* BDR settings */ \
 		false, DEFAULT_BDR_RECOVERY_TIMEOUT, \
 		/* service settings */ \
@@ -241,30 +276,36 @@ typedef struct
 	"", "", "", "" \
 }
-
+#include "dbutils.h"
 void		set_progname(const char *argv0);
 const char *progname(void);
 void		load_config(const char *config_file, bool verbose, bool terse, t_configuration_options *options, char *argv0);
-void		parse_config(t_configuration_options *options, bool terse);
-bool		reload_config(t_configuration_options *orig_options);
+bool		reload_config(t_configuration_options *orig_options, t_server_type server_type);
 bool		parse_recovery_conf(const char *data_dir, t_recovery_conf *conf);
 bool		parse_bool(const char *s,
 					   const char *config_item,
 					   ItemList *error_list);
 int repmgr_atoi(const char *s,
 			const char *config_item,
 			ItemList *error_list,
 			int minval);
 bool parse_pg_basebackup_options(const char *pg_basebackup_options,
 							t_basebackup_options *backup_options,
 							int server_version_num,
 							ItemList *error_list);
 int parse_output_to_argv(const char *string, char ***argv_array);
 void free_parsed_argv(char ***argv_array);
 /* called by repmgr-client and repmgrd */
-void		exit_with_cli_errors(ItemList *error_list);
+void		exit_with_cli_errors(ItemList *error_list, const char *repmgr_command);
 void		print_item_list(ItemList *item_list);
 #endif							/* _REPMGR_CONFIGFILE_H_ */
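
Review note: the initializer above sets `repmgrd_standby_startup_timeout` to -1, and `_parse_config()` later replaces that sentinel with `standby_reconnect_timeout`. A minimal sketch of that fallback follows, assuming a reduced stand-in struct; `timeout_options`, `TIMEOUT_UNSET` and `resolve_timeout_defaults` are hypothetical names, not part of repmgr.

```c
/* Hypothetical name for the sentinel; the diff uses a bare -1. */
#define TIMEOUT_UNSET (-1)

/* Hypothetical, reduced stand-in for t_configuration_options. */
typedef struct
{
	int			standby_reconnect_timeout;
	int			repmgrd_standby_startup_timeout;
} timeout_options;

/*
 * If no explicit value was provided for the repmgrd timeout, inherit the
 * value of "standby_reconnect_timeout". An explicit value, including 0,
 * is left untouched.
 */
void
resolve_timeout_defaults(timeout_options *options)
{
	if (options->repmgrd_standby_startup_timeout == TIMEOUT_UNSET)
		options->repmgrd_standby_startup_timeout = options->standby_reconnect_timeout;
}
```

The sentinel works here because the diff parses the parameter with a minimum of 0, so -1 can never come from the configuration file.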
--- a/configure
+++ b/configure
@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.69 for repmgr 4.0.
+# Generated by GNU Autoconf 2.69 for repmgr 4.2.
 #
 # Report bugs to <pgsql-bugs@postgresql.org>.
 #
@@ -11,7 +11,7 @@
 # This configure script is free software; the Free Software Foundation
 # gives unlimited permission to copy, distribute and modify it.
 #
-# Copyright (c) 2010-2017, 2ndQuadrant Ltd.
+# Copyright (c) 2010-2018, 2ndQuadrant Ltd.
 ## -------------------- ##
 ## M4sh Initialization. ##
 ## -------------------- ##
@@ -582,8 +582,8 @@ MAKEFLAGS=
 # Identity of this package.
 PACKAGE_NAME='repmgr'
 PACKAGE_TARNAME='repmgr'
-PACKAGE_VERSION='4.0'
+PACKAGE_VERSION='4.2'
-PACKAGE_STRING='repmgr 4.0'
+PACKAGE_STRING='repmgr 4.2'
 PACKAGE_BUGREPORT='pgsql-bugs@postgresql.org'
 PACKAGE_URL='https://2ndquadrant.com/en/resources/repmgr/'
@@ -633,7 +633,6 @@ SHELL'
 ac_subst_files=''
 ac_user_opts='
 enable_option_checking
-with_bdr_only
 '
      ac_precious_vars='build_alias
 host_alias
@@ -1179,7 +1178,7 @@ if test "$ac_init_help" = "long"; then
  # Omit some internal or obsolete options to make the list less imposing.
  # This message is too long to be a string in the A/UX 3.1 sh.
  cat <<_ACEOF
-\`configure' configures repmgr 4.0 to adapt to many kinds of systems.
+\`configure' configures repmgr 4.2 to adapt to many kinds of systems.
 Usage: $0 [OPTION]... [VAR=VALUE]...
@@ -1240,15 +1239,10 @@ fi
 if test -n "$ac_init_help"; then
  case $ac_init_help in
-     short | recursive ) echo "Configuration of repmgr 4.0:";;
+     short | recursive ) echo "Configuration of repmgr 4.2:";;
   esac
  cat <<\_ACEOF
-Optional Packages:
- --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]
- --without-PACKAGE       do not use PACKAGE (same as --with-PACKAGE=no)
- --with-bdr-only         BDR-only build
 Some influential environment variables:
  PG_CONFIG   Location to find pg_config for target PostgreSQL (default PATH)
@@ -1319,14 +1313,14 @@ fi
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
  cat <<\_ACEOF
-repmgr configure 4.0
+repmgr configure 4.2
 generated by GNU Autoconf 2.69
 Copyright (C) 2012 Free Software Foundation, Inc.
 This configure script is free software; the Free Software Foundation
 gives unlimited permission to copy, distribute and modify it.
-Copyright (c) 2010-2017, 2ndQuadrant Ltd.
+Copyright (c) 2010-2018, 2ndQuadrant Ltd.
 _ACEOF
  exit
 fi
@@ -1338,7 +1332,7 @@ cat >config.log <<_ACEOF
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.
-It was created by repmgr $as_me 4.0, which was
+It was created by repmgr $as_me 4.2, which was
 generated by GNU Autoconf 2.69.  Invocation command line was
  $ $0 $@
@@ -1694,20 +1688,6 @@ ac_config_headers="$ac_config_headers config.h"
 # Check whether --with-bdr_only was given.
 if test "${with_bdr_only+set}" = set; then :
  withval=$with_bdr_only;
 fi
 if test "x$with_bdr_only" != "x"; then :
 $as_echo "#define BDR_ONLY \"1\"" >>confdefs.h
 fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for a sed that does not truncate output" >&5
 $as_echo_n "checking for a sed that does not truncate output... " >&6; }
 if ${ac_cv_path_SED+:} false; then :
@@ -1871,6 +1851,8 @@ ac_config_files="$ac_config_files Makefile"
 ac_config_files="$ac_config_files Makefile.global"
 ac_config_files="$ac_config_files doc/Makefile"
 cat >confcache <<\_ACEOF
 # This file is a shell script that caches the results of configure
 # tests run on this system so they can be shared between configure
@@ -2377,7 +2359,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by repmgr $as_me 4.0, which was
+This file was extended by repmgr $as_me 4.2, which was
 generated by GNU Autoconf 2.69.  Invocation command line was
  CONFIG_FILES    = $CONFIG_FILES
@@ -2440,7 +2422,7 @@ _ACEOF
 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
 ac_cs_version="\\
-repmgr config.status 4.0
+repmgr config.status 4.2
 configured by $0, generated by GNU Autoconf 2.69,
  with options \\"\$ac_cs_config\\"
@@ -2564,6 +2546,7 @@ do
    "config.h") CONFIG_HEADERS="$CONFIG_HEADERS config.h" ;;
    "Makefile") CONFIG_FILES="$CONFIG_FILES Makefile" ;;
    "Makefile.global") CONFIG_FILES="$CONFIG_FILES Makefile.global" ;;
    "doc/Makefile") CONFIG_FILES="$CONFIG_FILES doc/Makefile" ;;
  *) as_fn_error $? "invalid argument: \`$ac_config_target'" "$LINENO" 5;;
  esac
--- a/configure.in
+++ b/configure.in
@@ -1,17 +1,11 @@
-AC_INIT([repmgr], [4.0], [pgsql-bugs@postgresql.org], [repmgr], [https://2ndquadrant.com/en/resources/repmgr/])
+AC_INIT([repmgr], [4.2], [pgsql-bugs@postgresql.org], [repmgr], [https://2ndquadrant.com/en/resources/repmgr/])
-AC_COPYRIGHT([Copyright (c) 2010-2017, 2ndQuadrant Ltd.])
+AC_COPYRIGHT([Copyright (c) 2010-2018, 2ndQuadrant Ltd.])
 AC_CONFIG_HEADER(config.h)
 AC_ARG_VAR([PG_CONFIG], [Location to find pg_config for target PostgreSQL (default PATH)])
 AC_ARG_WITH([bdr_only], [AS_HELP_STRING([--with-bdr-only], [BDR-only build])])
 AS_IF([test "x$with_bdr_only" != "x"],
    [AC_DEFINE([BDR_ONLY], ["1"], [Only build repmgr for BDR])]
 )
 AC_PROG_SED
 if test -z "$PG_CONFIG"; then
@@ -65,5 +59,6 @@ AC_SUBST(vpath_build)
 AC_CONFIG_FILES([Makefile])
 AC_CONFIG_FILES([Makefile.global])
 AC_CONFIG_FILES([doc/Makefile])
 AC_OUTPUT
--- a/controldata.c
+++ b/controldata.c
@@ -1,6 +1,6 @@
 /*
 * controldata.c
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
@@ -37,13 +37,8 @@ get_system_identifier(const char *data_directory)
 	uint64		system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
 	control_file_info = get_controlfile(data_directory);
-	if (control_file_info->control_file_processed == true)
-		system_identifier = control_file_info->control_file->system_identifier;
-	else
-		system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
+	system_identifier = control_file_info->system_identifier;
-	pfree(control_file_info->control_file);
 	pfree(control_file_info);
 	return system_identifier;
@@ -57,13 +52,8 @@ get_db_state(const char *data_directory)
 	control_file_info = get_controlfile(data_directory);
-	if (control_file_info->control_file_processed == true)
-		state = control_file_info->control_file->state;
-	else
-		/* if we were unable to parse the control file, assume DB is shut down */
-		state = DB_SHUTDOWNED;
+	state = control_file_info->state;
-	pfree(control_file_info->control_file);
 	pfree(control_file_info);
 	return state;
@@ -78,12 +68,8 @@ get_latest_checkpoint_location(const char *data_directory)
 	control_file_info = get_controlfile(data_directory);
-	if (control_file_info->control_file_processed == false)
-		return InvalidXLogRecPtr;
-	checkPoint = control_file_info->control_file->checkPoint;
+	checkPoint = control_file_info->checkPoint;
-	pfree(control_file_info->control_file);
 	pfree(control_file_info);
 	return checkPoint;
@@ -98,16 +84,8 @@ get_data_checksum_version(const char *data_directory)
 	control_file_info = get_controlfile(data_directory);
-	if (control_file_info->control_file_processed == false)
-	{
-		data_checksum_version = -1;
-	}
-	else
-	{
-		data_checksum_version = (int) control_file_info->control_file->data_checksum_version;
-	}
+	data_checksum_version = (int) control_file_info->data_checksum_version;
-	pfree(control_file_info->control_file);
 	pfree(control_file_info);
 	return data_checksum_version;
@@ -139,33 +117,109 @@ describe_db_state(DBState state)
 /*
- * we maintain our own version of get_controlfile() as we need cross-version
+ * We maintain our own version of get_controlfile() as we need cross-version
 * compatibility, and also don't care if the file isn't readable.
 */
 static ControlFileInfo *
 get_controlfile(const char *DataDir)
 {
 	ControlFileInfo *control_file_info;
-	int			fd;
+	FILE	   *fp = NULL;
 	int			fd, ret, version_num;
 	char		PgVersionPath[MAXPGPATH] = "";
 	char		ControlFilePath[MAXPGPATH] = "";
 	char		file_version_string[64] = "";
 	long		file_major, file_minor;
 	char	   *endptr = NULL;
 	void	   *ControlFileDataPtr = NULL;
 	int			expected_size = 0;
 	control_file_info = palloc0(sizeof(ControlFileInfo));
 	/* set default values */
 	control_file_info->control_file_processed = false;
-	control_file_info->control_file = palloc0(sizeof(ControlFileData));
+	control_file_info->system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
 	control_file_info->state = DB_SHUTDOWNED;
 	control_file_info->checkPoint = InvalidXLogRecPtr;
 	control_file_info->data_checksum_version = -1;
 	/*
 	 * Read PG_VERSION, as we'll need to determine which struct to read
 	 * the control file contents into
 	 */
 	snprintf(PgVersionPath, MAXPGPATH, "%s/PG_VERSION", DataDir);
 	fp = fopen(PgVersionPath, "r");
 	if (fp == NULL)
 	{
 		log_warning(_("could not open file \"%s\" for reading"),
 					PgVersionPath);
 		log_detail("%s", strerror(errno));
 		return control_file_info;
 	}
 	file_version_string[0] = '\0';
 	ret = fscanf(fp, "%63s", file_version_string);
 	fclose(fp);
 	if (ret != 1 || endptr == file_version_string)
 	{
 		log_warning(_("unable to determine major version number from PG_VERSION"));
 		return control_file_info;
 	}
 	file_major = strtol(file_version_string, &endptr, 10);
 	file_minor = 0;
 	if (*endptr == '.')
 		file_minor = strtol(endptr + 1, NULL, 10);
 	version_num = ((int) file_major * 10000) + ((int) file_minor * 100);
 	if (version_num < 90300)
 	{
 		log_warning(_("Data directory appears to be initialised for %s"), file_version_string);
 		return control_file_info;
 	}
 	snprintf(ControlFilePath, MAXPGPATH, "%s/global/pg_control", DataDir);
 	if ((fd = open(ControlFilePath, O_RDONLY | PG_BINARY, 0)) == -1)
 	{
-		log_debug("could not open file \"%s\" for reading: %s",
+		log_warning(_("could not open file \"%s\" for reading"),
-				  ControlFilePath, strerror(errno));
+					ControlFilePath);
 		log_detail("%s", strerror(errno));
 		return control_file_info;
 	}
-	if (read(fd, control_file_info->control_file, sizeof(ControlFileData)) != sizeof(ControlFileData))
+
 	if (version_num >= 90500)
 	{
-		log_debug("could not read file \"%s\": %s",
+		expected_size = sizeof(ControlFileData95);
-				  ControlFilePath, strerror(errno));
+		ControlFileDataPtr = palloc0(expected_size);
 	}
 	else if (version_num >= 90400)
 	{
 		expected_size = sizeof(ControlFileData94);
 		ControlFileDataPtr = palloc0(expected_size);
 	}
 	else if (version_num >= 90300)
 	{
 		expected_size = sizeof(ControlFileData93);
 		ControlFileDataPtr = palloc0(expected_size);
 	}
 	if (read(fd, ControlFileDataPtr, expected_size) != expected_size)
 	{
 		log_warning(_("could not read file \"%s\""),
 					ControlFilePath);
 		log_detail("%s", strerror(errno));
 		return control_file_info;
 	}
@@ -173,6 +227,41 @@ get_controlfile(const char *DataDir)
 	control_file_info->control_file_processed = true;
 	if (version_num >= 110000)
 	{
 		ControlFileData11 *ptr = (struct ControlFileData11 *)ControlFileDataPtr;
 		control_file_info->system_identifier = ptr->system_identifier;
 		control_file_info->state = ptr->state;
 		control_file_info->checkPoint = ptr->checkPoint;
 		control_file_info->data_checksum_version = ptr->data_checksum_version;
 	}
 	else if (version_num >= 90500)
 	{
 		ControlFileData95 *ptr = (struct ControlFileData95 *)ControlFileDataPtr;
 		control_file_info->system_identifier = ptr->system_identifier;
 		control_file_info->state = ptr->state;
 		control_file_info->checkPoint = ptr->checkPoint;
 		control_file_info->data_checksum_version = ptr->data_checksum_version;
 	}
 	else if (version_num >= 90400)
 	{
 		ControlFileData94 *ptr = (struct ControlFileData94 *)ControlFileDataPtr;
 		control_file_info->system_identifier = ptr->system_identifier;
 		control_file_info->state = ptr->state;
 		control_file_info->checkPoint = ptr->checkPoint;
 		control_file_info->data_checksum_version = ptr->data_checksum_version;
 	}
 	else if (version_num >= 90300)
 	{
 		ControlFileData93 *ptr = (struct ControlFileData93 *)ControlFileDataPtr;
 		control_file_info->system_identifier = ptr->system_identifier;
 		control_file_info->state = ptr->state;
 		control_file_info->checkPoint = ptr->checkPoint;
 		control_file_info->data_checksum_version = ptr->data_checksum_version;
 	}
 	pfree(ControlFileDataPtr);
 	/*
 	 * We don't check the CRC here as we're potentially checking a pg_control
 	 * file from a different PostgreSQL version to the one repmgr was compiled
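Note on the PG_VERSION handling in get_controlfile() above: the version string is reduced to a single integer so that it can be compared against thresholds such as 90300. The following is a minimal standalone sketch of that arithmetic, not repmgr code. The helper name pg_version_string_to_num is hypothetical, and unlike the hunk above it tests endptr only after strtol() has set it.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of repmgr): convert the contents of
 * PG_VERSION, e.g. "9.6" or "11", into a number comparable with 90300.
 * Returns -1 if the string does not start with digits.
 */
static int
pg_version_string_to_num(const char *version_string)
{
	char	   *endptr = NULL;
	long		file_major;
	long		file_minor = 0;

	assert(version_string != NULL);

	file_major = strtol(version_string, &endptr, 10);

	/* strtol() leaves endptr at the start if it consumed no digits */
	if (endptr == version_string)
		return -1;

	/* pre-10 releases use "major.minor", e.g. "9.6" */
	if (*endptr == '.')
		file_minor = strtol(endptr + 1, NULL, 10);

	return ((int) file_major * 10000) + ((int) file_minor * 100);
}
```

With this helper, "9.6" maps to 90600 and "11" maps to 110000, so a single comparison such as version_num >= 110000 is enough to pick the matching struct layout.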
--- a/controldata.h
+++ b/controldata.h
@@ -1,6 +1,6 @@
 /*
 * controldata.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
@@ -12,12 +12,326 @@
 #include "postgres_fe.h"
 #include "catalog/pg_control.h"
 /*
 * A simplified representation of pg_control containing only those fields
 * required by repmgr.
 */
 typedef struct
 {
 	bool		control_file_processed;
-	ControlFileData *control_file;
+	uint64		system_identifier;
 	DBState		state;
 	XLogRecPtr	checkPoint;
 	uint32		data_checksum_version;
 } ControlFileInfo;
 /* Same for 9.3, 9.4 */
 typedef struct CheckPoint93
 {
 	XLogRecPtr	redo;			/* next RecPtr available when we began to
 								 * create CheckPoint (i.e. REDO start point) */
 	TimeLineID	ThisTimeLineID; /* current TLI */
 	TimeLineID	PrevTimeLineID; /* previous TLI, if this record begins a new
 								 * timeline (equals ThisTimeLineID otherwise) */
 	bool		fullPageWrites; /* current full_page_writes */
 	uint32		nextXidEpoch;	/* higher-order bits of nextXid */
 	TransactionId nextXid;		/* next free XID */
 	Oid			nextOid;		/* next free OID */
 	MultiXactId nextMulti;		/* next free MultiXactId */
 	MultiXactOffset nextMultiOffset;	/* next free MultiXact offset */
 	TransactionId oldestXid;	/* cluster-wide minimum datfrozenxid */
 	Oid			oldestXidDB;	/* database with minimum datfrozenxid */
 	MultiXactId oldestMulti;	/* cluster-wide minimum datminmxid */
 	Oid			oldestMultiDB;	/* database with minimum datminmxid */
 	pg_time_t	time;			/* time stamp of checkpoint */
 	TransactionId oldestActiveXid;
 } CheckPoint93;
 /* Same for 9.5, 9.6, 10, HEAD */
 typedef struct CheckPoint95
 {
 	XLogRecPtr	redo;			/* next RecPtr available when we began to
 								 * create CheckPoint (i.e. REDO start point) */
 	TimeLineID	ThisTimeLineID; /* current TLI */
 	TimeLineID	PrevTimeLineID; /* previous TLI, if this record begins a new
 								 * timeline (equals ThisTimeLineID otherwise) */
 	bool		fullPageWrites; /* current full_page_writes */
 	uint32		nextXidEpoch;	/* higher-order bits of nextXid */
 	TransactionId nextXid;		/* next free XID */
 	Oid			nextOid;		/* next free OID */
 	MultiXactId nextMulti;		/* next free MultiXactId */
 	MultiXactOffset nextMultiOffset;	/* next free MultiXact offset */
 	TransactionId oldestXid;	/* cluster-wide minimum datfrozenxid */
 	Oid			oldestXidDB;	/* database with minimum datfrozenxid */
 	MultiXactId oldestMulti;	/* cluster-wide minimum datminmxid */
 	Oid			oldestMultiDB;	/* database with minimum datminmxid */
 	pg_time_t	time;			/* time stamp of checkpoint */
 	TransactionId oldestCommitTsXid;	/* oldest Xid with valid commit
 										 * timestamp */
 	TransactionId newestCommitTsXid;	/* newest Xid with valid commit
 										 * timestamp */
 	TransactionId oldestActiveXid;
 } CheckPoint95;
 typedef struct ControlFileData93
 {
 	uint64		system_identifier;
 	uint32		pg_control_version;		/* PG_CONTROL_VERSION */
 	uint32		catalog_version_no;		/* see catversion.h */
 	DBState		state;			/* see enum above */
 	pg_time_t	time;			/* time stamp of last pg_control update */
 	XLogRecPtr	checkPoint;		/* last check point record ptr */
 	XLogRecPtr	prevCheckPoint; /* previous check point record ptr */
 	CheckPoint93	checkPointCopy; /* copy of last check point record */
 	XLogRecPtr	unloggedLSN;	/* current fake LSN value, for unlogged rels */
 	XLogRecPtr	minRecoveryPoint;
 	TimeLineID	minRecoveryPointTLI;
 	XLogRecPtr	backupStartPoint;
 	XLogRecPtr	backupEndPoint;
 	bool		backupEndRequired;
 	int			wal_level;
 	int			MaxConnections;
 	int			max_prepared_xacts;
 	int			max_locks_per_xact;
 	uint32		maxAlign;		/* alignment requirement for tuples */
 	double		floatFormat;	/* constant 1234567.0 */
 	uint32		blcksz;			/* data block size for this DB */
 	uint32		relseg_size;	/* blocks per segment of large relation */
 	uint32		xlog_blcksz;	/* block size within WAL files */
 	uint32		xlog_seg_size;	/* size of each WAL segment */
 	uint32		nameDataLen;	/* catalog name field width */
 	uint32		indexMaxKeys;	/* max number of columns in an index */
 	uint32		toast_max_chunk_size;	/* chunk size in TOAST tables */
 	/* flag indicating internal format of timestamp, interval, time */
 	bool		enableIntTimes; /* int64 storage enabled? */
 	/* flags indicating pass-by-value status of various types */
 	bool		float4ByVal;	/* float4 pass-by-value? */
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 	/* Are data pages protected by checksums? Zero if no checksum version */
 	uint32		data_checksum_version;
 } ControlFileData93;
 /*
 * Following fields added since 9.3:
 *
 * 	bool		wal_log_hints;
 * 	int			max_worker_processes;
 * 	uint32		loblksize;
 *
 */
 typedef struct ControlFileData94
 {
 	uint64		system_identifier;
 	uint32		pg_control_version;		/* PG_CONTROL_VERSION */
 	uint32		catalog_version_no;		/* see catversion.h */
 	DBState		state;			/* see enum above */
 	pg_time_t	time;			/* time stamp of last pg_control update */
 	XLogRecPtr	checkPoint;		/* last check point record ptr */
 	XLogRecPtr	prevCheckPoint; /* previous check point record ptr */
 	CheckPoint93	checkPointCopy; /* copy of last check point record */
 	XLogRecPtr	unloggedLSN;	/* current fake LSN value, for unlogged rels */
 	XLogRecPtr	minRecoveryPoint;
 	TimeLineID	minRecoveryPointTLI;
 	XLogRecPtr	backupStartPoint;
 	XLogRecPtr	backupEndPoint;
 	bool		backupEndRequired;
 	int			wal_level;
 	bool		wal_log_hints;
 	int			MaxConnections;
 	int			max_worker_processes;
 	int			max_prepared_xacts;
 	int			max_locks_per_xact;
 	uint32		maxAlign;		/* alignment requirement for tuples */
 	double		floatFormat;	/* constant 1234567.0 */
 	uint32		blcksz;			/* data block size for this DB */
 	uint32		relseg_size;	/* blocks per segment of large relation */
 	uint32		xlog_blcksz;	/* block size within WAL files */
 	uint32		xlog_seg_size;	/* size of each WAL segment */
 	uint32		nameDataLen;	/* catalog name field width */
 	uint32		indexMaxKeys;	/* max number of columns in an index */
 	uint32		toast_max_chunk_size;	/* chunk size in TOAST tables */
 	uint32		loblksize;		/* chunk size in pg_largeobject */
 	bool		enableIntTimes; /* int64 storage enabled? */
 	bool		float4ByVal;	/* float4 pass-by-value? */
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 	/* Are data pages protected by checksums? Zero if no checksum version */
 	uint32		data_checksum_version;
 } ControlFileData94;
 /*
 * Following field added since 9.4:
 *
 *	bool		track_commit_timestamp;
 *
 * Unchanged in 9.6
 *
 * In 10, following field appended *after* "data_checksum_version":
 *
 *	char		mock_authentication_nonce[MOCK_AUTH_NONCE_LEN];
 *
 * (but we don't care about that)
 */
 typedef struct ControlFileData95
 {
 	uint64		system_identifier;
 	uint32		pg_control_version;		/* PG_CONTROL_VERSION */
 	uint32		catalog_version_no;		/* see catversion.h */
 	DBState		state;			/* see enum above */
 	pg_time_t	time;			/* time stamp of last pg_control update */
 	XLogRecPtr	checkPoint;		/* last check point record ptr */
 	XLogRecPtr	prevCheckPoint; /* previous check point record ptr */
 	CheckPoint95	checkPointCopy; /* copy of last check point record */
 	XLogRecPtr	unloggedLSN;	/* current fake LSN value, for unlogged rels */
 	XLogRecPtr	minRecoveryPoint;
 	TimeLineID	minRecoveryPointTLI;
 	XLogRecPtr	backupStartPoint;
 	XLogRecPtr	backupEndPoint;
 	bool		backupEndRequired;
 	int			wal_level;
 	bool		wal_log_hints;
 	int			MaxConnections;
 	int			max_worker_processes;
 	int			max_prepared_xacts;
 	int			max_locks_per_xact;
 	bool		track_commit_timestamp;
 	uint32		maxAlign;		/* alignment requirement for tuples */
 	double		floatFormat;	/* constant 1234567.0 */
 	uint32		blcksz;			/* data block size for this DB */
 	uint32		relseg_size;	/* blocks per segment of large relation */
 	uint32		xlog_blcksz;	/* block size within WAL files */
 	uint32		xlog_seg_size;	/* size of each WAL segment */
 	uint32		nameDataLen;	/* catalog name field width */
 	uint32		indexMaxKeys;	/* max number of columns in an index */
 	uint32		toast_max_chunk_size;	/* chunk size in TOAST tables */
 	uint32		loblksize;		/* chunk size in pg_largeobject */
 	bool		enableIntTimes; /* int64 storage enabled? */
 	bool		float4ByVal;	/* float4 pass-by-value? */
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 	uint32		data_checksum_version;
 } ControlFileData95;
 /*
 * Following field removed in 11:
 *
 *  XLogRecPtr	prevCheckPoint;
 *
 * In 10, following field appended *after* "data_checksum_version":
 *
 * 	char		mock_authentication_nonce[MOCK_AUTH_NONCE_LEN];
 *
 * (but we don't care about that)
 */
 typedef struct ControlFileData11
 {
 	uint64		system_identifier;
 	uint32		pg_control_version;		/* PG_CONTROL_VERSION */
 	uint32		catalog_version_no;		/* see catversion.h */
 	DBState		state;			/* see enum above */
 	pg_time_t	time;			/* time stamp of last pg_control update */
 	XLogRecPtr	checkPoint;		/* last check point record ptr */
 	CheckPoint95	checkPointCopy; /* copy of last check point record */
 	XLogRecPtr	unloggedLSN;	/* current fake LSN value, for unlogged rels */
 	XLogRecPtr	minRecoveryPoint;
 	TimeLineID	minRecoveryPointTLI;
 	XLogRecPtr	backupStartPoint;
 	XLogRecPtr	backupEndPoint;
 	bool		backupEndRequired;
 	int			wal_level;
 	bool		wal_log_hints;
 	int			MaxConnections;
 	int			max_worker_processes;
 	int			max_prepared_xacts;
 	int			max_locks_per_xact;
 	bool		track_commit_timestamp;
 	uint32		maxAlign;		/* alignment requirement for tuples */
 	double		floatFormat;	/* constant 1234567.0 */
 	uint32		blcksz;			/* data block size for this DB */
 	uint32		relseg_size;	/* blocks per segment of large relation */
 	uint32		xlog_blcksz;	/* block size within WAL files */
 	uint32		xlog_seg_size;	/* size of each WAL segment */
 	uint32		nameDataLen;	/* catalog name field width */
 	uint32		indexMaxKeys;	/* max number of columns in an index */
 	uint32		toast_max_chunk_size;	/* chunk size in TOAST tables */
 	uint32		loblksize;		/* chunk size in pg_largeobject */
 	bool		enableIntTimes; /* int64 storage enabled? */
 	bool		float4ByVal;	/* float4 pass-by-value? */
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 	uint32		data_checksum_version;
 } ControlFileData11;
 extern DBState get_db_state(const char *data_directory);
 extern const char *describe_db_state(DBState state);
 extern int	get_data_checksum_version(const char *data_directory);
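Why controldata.h carries one struct per layout: the fields repmgr reads sit at different byte offsets once a field is added or removed ahead of them, as with prevCheckPoint in 11. The following is a minimal sketch with deliberately simplified structs. LayoutOld, LayoutNew and read_checksum_version are hypothetical stand-ins, not the real pg_control definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins, NOT the real pg_control layouts */
typedef struct LayoutOld
{
	uint64_t	system_identifier;
	uint64_t	checkPoint;
	uint64_t	prevCheckPoint;	/* present in the older layout only */
	uint32_t	data_checksum_version;
} LayoutOld;

typedef struct LayoutNew
{
	uint64_t	system_identifier;
	uint64_t	checkPoint;
	uint32_t	data_checksum_version;
} LayoutNew;

/*
 * Read the same logical field through whichever layout matches the
 * buffer. Using the wrong struct would read from the wrong offset.
 */
static uint32_t
read_checksum_version(const void *buf, int is_new_layout)
{
	assert(buf != NULL);

	if (is_new_layout)
		return ((const LayoutNew *) buf)->data_checksum_version;

	return ((const LayoutOld *) buf)->data_checksum_version;
}
```

In this simplified case, removing one 8-byte field moves data_checksum_version from offset 24 to offset 16. This is the reason the extraction code casts the raw buffer to a version-specific struct before copying values into ControlFileInfo.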
--- a/dbutils.c
+++ b/dbutils.c
--- a/dbutils.h
+++ b/dbutils.h
@@ -1,7 +1,7 @@
 /*
 * dbutils.h
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -28,8 +28,10 @@
 #include "strutil.h"
 #include "voting.h"
-#define REPMGR_NODES_COLUMNS "node_id, type, upstream_node_id, node_name, conninfo, repluser, slot_name, location, priority, active, config_file, '' AS upstream_node_name "
+#define REPMGR_NODES_COLUMNS "n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo, n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file, '' AS upstream_node_name "
-#define BDR_NODES_COLUMNS "node_sysid, node_timeline, node_dboid, node_status, node_name, node_local_dsn, node_init_from_dsn, node_read_only, node_seq_id"
+#define BDR2_NODES_COLUMNS "node_sysid, node_timeline, node_dboid, node_name, node_local_dsn, ''"
 #define BDR3_NODES_COLUMNS "ns.node_id, 0, 0, ns.node_name, ns.interface_connstr, ns.peer_state_name"
 #define ERRBUFF_SIZE 512
@@ -38,12 +40,14 @@ typedef enum
 	UNKNOWN = 0,
 	PRIMARY,
 	STANDBY,
 	WITNESS,
 	BDR
 } t_server_type;
 typedef enum
 {
 	REPMGR_INSTALLED = 0,
 	REPMGR_OLD_VERSION_INSTALLED,
 	REPMGR_AVAILABLE,
 	REPMGR_UNAVAILABLE,
 	REPMGR_UNKNOWN
@@ -73,17 +77,18 @@ typedef enum
 {
 	NODE_STATUS_UNKNOWN = -1,
 	NODE_STATUS_UP,
 	NODE_STATUS_SHUTTING_DOWN,
 	NODE_STATUS_DOWN,
 	NODE_STATUS_UNCLEAN_SHUTDOWN
 } NodeStatus;
 typedef enum
 {
-	VR_VOTE_REFUSED = -1,
+	CONN_UNKNOWN = -1,
-	VR_POSITIVE_VOTE,
+	CONN_OK,
-	VR_NEGATIVE_VOTE
+	CONN_BAD,
-} VoteRequestResult;
+	CONN_ERROR
-
+} ConnectionStatus;
 typedef enum
 {
@@ -92,6 +97,28 @@ typedef enum
 	SLOT_ACTIVE
 } ReplSlotStatus;
 typedef enum
 {
 	BACKUP_STATE_UNKNOWN = -1,
 	BACKUP_STATE_IN_BACKUP,
 	BACKUP_STATE_NO_BACKUP
 } BackupState;
 /*
 * Struct to store extension version information
 */
 typedef struct s_extension_versions {
 	char		default_version[8];
 	char		installed_version[8];
 } t_extension_versions;
 #define T_EXTENSION_VERSIONS_INITIALIZER { \
 	"", \
 	"", \
 }
 /*
 * Struct to store node information
 */
@@ -181,11 +208,13 @@ typedef struct s_event_info
 {
 	char	   *node_name;
 	char	   *conninfo_str;
 	int			node_id;
 } t_event_info;
 #define T_EVENT_INFO_INITIALIZER { \
 	NULL, \
- 	NULL \
+	NULL, \
 	UNKNOWN_NODE_ID \
 }
@@ -233,18 +262,14 @@ typedef struct s_bdr_node_info
 	char		node_sysid[MAXLEN];
 	uint32		node_timeline;
 	uint32		node_dboid;
 	char		node_status;
 	char		node_name[MAXLEN];
 	char		node_local_dsn[MAXLEN];
-	char		node_init_from_dsn[MAXLEN];
+	char		peer_state_name[MAXLEN];
 	bool		read_only;
 	uint32		node_seq_id;
 } t_bdr_node_info;
 #define T_BDR_NODE_INFO_INITIALIZER { \
 	"", InvalidOid, InvalidOid, \
-	'?', "", "", "", \
+    "", "", "" \
    false, -1 \
 }
@@ -317,6 +342,21 @@ typedef struct
    UNKNOWN_TIMELINE_ID, \
 	InvalidXLogRecPtr \
 }
 typedef struct RepmgrdInfo {
 	int node_id;
 	int pid;
 	char pid_text[MAXLEN];
 	char pid_file[MAXLEN];
 	bool pg_running;
 	char pg_running_text[MAXLEN];
 	bool running;
 	char repmgrd_running[MAXLEN];
 	bool paused;
 } RepmgrdInfo;
 /* global variables */
 extern int	server_version_num;
@@ -336,26 +376,22 @@ __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
 bool		atobool(const char *value);
 /* connection functions */
-PGconn *establish_db_connection(const char *conninfo,
+PGconn	   *establish_db_connection(const char *conninfo,
 						const bool exit_on_error);
 PGconn	   *establish_db_connection_quiet(const char *conninfo);
-PGconn *establish_db_connection_as_user(const char *conninfo,
+PGconn	   *establish_db_connection_by_params(t_conninfo_param_list *param_list,
 								const char *user,
 								const bool exit_on_error);
 PGconn *establish_db_connection_by_params(t_conninfo_param_list *param_list,
 								  const bool exit_on_error);
-PGconn *establish_primary_db_connection(PGconn *conn,
+PGconn	   *establish_primary_db_connection(PGconn *conn,
 								const bool exit_on_error);
 PGconn	   *get_primary_connection(PGconn *standby_conn, int *primary_id, char *primary_conninfo_out);
 PGconn	   *get_primary_connection_quiet(PGconn *standby_conn, int *primary_id, char *primary_conninfo_out);
 bool		is_superuser_connection(PGconn *conn, t_connection_user *userinfo);
 void		close_connection(PGconn **conn);
 /* conninfo manipulation functions */
 bool		get_conninfo_value(const char *conninfo, const char *keyword, char *output);
-
+bool		get_conninfo_default_value(const char *param, char *output, int maxlen);
 void		initialize_conninfo_params(t_conninfo_param_list *param_list, bool set_defaults);
 void		free_conninfo_params(t_conninfo_param_list *param_list);
 void		copy_conninfo_params(t_conninfo_param_list *dest_list, t_conninfo_param_list *source_list);
@@ -363,22 +399,21 @@ void		conn_to_param_list(PGconn *conn, t_conninfo_param_list *param_list);
 void		param_set(t_conninfo_param_list *param_list, const char *param, const char *value);
 void		param_set_ine(t_conninfo_param_list *param_list, const char *param, const char *value);
 char	   *param_get(t_conninfo_param_list *param_list, const char *param);
-bool		parse_conninfo_string(const char *conninfo_str, t_conninfo_param_list *param_list, char *errmsg, bool ignore_local_params);
+bool		parse_conninfo_string(const char *conninfo_str, t_conninfo_param_list *param_list, char **errmsg, bool ignore_local_params);
 char	   *param_list_to_string(t_conninfo_param_list *param_list);
 bool		has_passfile(void);
 /* transaction functions */
 bool		begin_transaction(PGconn *conn);
 bool		commit_transaction(PGconn *conn);
 bool		rollback_transaction(PGconn *conn);
 bool		check_cluster_schema(PGconn *conn);
 /* GUC manipulation functions */
 bool		set_config(PGconn *conn, const char *config_param, const char *config_value);
 bool		set_config_bool(PGconn *conn, const char *config_param, bool state);
-int guc_set(PGconn *conn, const char *parameter, const char *op,
+int		    guc_set(PGconn *conn, const char *parameter, const char *op, const char *value);
-		const char *value);
+int			guc_set_typed(PGconn *conn, const char *parameter, const char *op, const char *value, const char *datatype);
-int guc_set_typed(PGconn *conn, const char *parameter, const char *op,
-			  const char *value, const char *datatype);
 bool		get_pg_setting(PGconn *conn, const char *setting, char *output);
 /* server information functions */
@@ -386,13 +421,19 @@ bool		get_cluster_size(PGconn *conn, char *size);
 int			get_server_version(PGconn *conn, char *server_version);
 RecoveryType get_recovery_type(PGconn *conn);
 int			get_primary_node_id(PGconn *conn);
 bool		can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *reason);
 int			get_ready_archive_files(PGconn *conn, const char *data_directory);
 bool		identify_system(PGconn *repl_conn, t_system_identification *identification);
 bool		repmgrd_set_local_node_id(PGconn *conn, int local_node_id);
 int			repmgrd_get_local_node_id(PGconn *conn);
 BackupState	server_in_exclusive_backup_mode(PGconn *conn);
 void		repmgrd_set_pid(PGconn *conn, pid_t repmgrd_pid, const char *pidfile);
 pid_t		repmgrd_get_pid(PGconn *conn);
 bool		repmgrd_is_running(PGconn *conn);
 bool		repmgrd_is_paused(PGconn *conn);
 bool		repmgrd_pause(PGconn *conn, bool pause);
 /* extension functions */
-ExtensionStatus get_repmgr_extension_status(PGconn *conn);
+ExtensionStatus get_repmgr_extension_status(PGconn *conn, t_extension_versions *extversions);
 /* node management functions */
 void		checkpoint(PGconn *conn);
@@ -404,27 +445,35 @@ t_server_type parse_node_type(const char *type);
 const char *get_node_type_string(t_server_type type);
 RecordStatus get_node_record(PGconn *conn, int node_id, t_node_info *node_info);
 RecordStatus get_node_record_with_upstream(PGconn *conn, int node_id, t_node_info *node_info);
 RecordStatus get_node_record_by_name(PGconn *conn, const char *node_name, t_node_info *node_info);
 t_node_info *get_node_record_pointer(PGconn *conn, int node_id);
 bool		get_local_node_record(PGconn *conn, int node_id, t_node_info *node_info);
 bool		get_primary_node_record(PGconn *conn, t_node_info *node_info);
-void		get_all_node_records(PGconn *conn, NodeInfoList *node_list);
+bool		get_all_node_records(PGconn *conn, NodeInfoList *node_list);
 void		get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *nodes);
 void		get_active_sibling_node_records(PGconn *conn, int node_id, int upstream_node_id, NodeInfoList *node_list);
 void		get_node_records_by_priority(PGconn *conn, NodeInfoList *node_list);
-void		get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list);
+bool		get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list);
 bool		get_downstream_nodes_with_missing_slot(PGconn *conn, int this_node_id, NodeInfoList *node_list);
 bool		create_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info);
 bool		update_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info);
 bool		delete_node_record(PGconn *conn, int node);
 bool		truncate_node_records(PGconn *conn);
 bool		update_node_record_set_active(PGconn *conn, int this_node_id, bool active);
 bool		update_node_record_set_primary(PGconn *conn, int this_node_id);
 bool		update_node_record_set_active_standby(PGconn *conn, int this_node_id);
 bool		update_node_record_set_upstream(PGconn *conn, int this_node_id, int new_upstream_node_id);
 bool		update_node_record_status(PGconn *conn, int this_node_id, char *type, int upstream_node_id, bool active);
 bool		update_node_record_conn_priority(PGconn *conn, t_configuration_options *options);
 bool		update_node_record_slot_name(PGconn *primary_conn, int node_id, char *slot_name);
 bool		witness_copy_node_records(PGconn *primary_conn, PGconn *witness_conn);
 void		clear_node_info_list(NodeInfoList *nodes);
@@ -438,11 +487,15 @@ void		config_file_list_add(t_configfile_list *list, const char *file, const char
 bool		create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
 bool		create_event_notification(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
 bool		create_event_notification_extended(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details, t_event_info *event_info);
 PGresult   *get_event_records(PGconn *conn, int node_id, const char *node_name, const char *event, bool all, int limit);
 /* replication slot functions */
 void		create_slot_name(char *slot_name, int node_id);
 bool		create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg);
 bool		drop_replication_slot(PGconn *conn, char *slot_name);
 RecordStatus get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record);
 int			get_free_replication_slot_count(PGconn *conn);
 int			get_inactive_replication_slots(PGconn *conn, KeyValueList *list);
 /* tablespace functions */
 bool		get_tablespace_name_by_location(PGconn *conn, const char *location, char *name);
@@ -453,6 +506,8 @@ int			wait_connection_availability(PGconn *conn, long long timeout);
 /* node availability functions */
 bool		is_server_available(const char *conninfo);
 bool		is_server_available_params(t_conninfo_param_list *param_list);
 ExecStatusType	connection_ping(PGconn *conn);
 /* monitoring functions  */
 void
@@ -468,15 +523,15 @@ add_monitoring_record(PGconn *primary_conn,
 					  long long unsigned int apply_lag_bytes
 );
-int			get_number_of_monitoring_records_to_delete(PGconn *primary_conn, int keep_history);
+int			get_number_of_monitoring_records_to_delete(PGconn *primary_conn, int keep_history, int node_id);
-bool		delete_monitoring_records(PGconn *primary_conn, int keep_history);
+bool		delete_monitoring_records(PGconn *primary_conn, int keep_history, int node_id);
 /* node voting functions */
-NodeVotingStatus get_voting_status(PGconn *conn);
+void		initialize_voting_term(PGconn *conn);
-VoteRequestResult request_vote(PGconn *conn, t_node_info *this_node, t_node_info *other_node, int electoral_term);
+int			get_current_term(PGconn *conn);
-int			set_voting_status_initiated(PGconn *conn);
+void		increment_current_term(PGconn *conn);
 bool		announce_candidature(PGconn *conn, t_node_info *this_node, t_node_info *other_node, int electoral_term);
 void		notify_follow_primary(PGconn *conn, int primary_node_id);
 bool		get_new_primary(PGconn *conn, int *primary_node_id);
@@ -487,24 +542,32 @@ XLogRecPtr	get_current_wal_lsn(PGconn *conn);
 XLogRecPtr	get_last_wal_receive_location(PGconn *conn);
 bool		get_replication_info(PGconn *conn, ReplInfo *replication_info);
 int			get_replication_lag_seconds(PGconn *conn);
-void		get_node_replication_stats(PGconn *conn, t_node_info *node_info);
+void		get_node_replication_stats(PGconn *conn, int server_version_num, t_node_info *node_info);
 bool		is_downstream_node_attached(PGconn *conn, char *node_name);
 /* BDR functions */
 int			get_bdr_version_num(void);
 void		get_all_bdr_node_records(PGconn *conn, BdrNodeInfoList *node_list);
 RecordStatus get_bdr_node_record_by_name(PGconn *conn, const char *node_name, t_bdr_node_info *node_info);
 bool		is_bdr_db(PGconn *conn, PQExpBufferData *output);
 bool		is_bdr_db_quiet(PGconn *conn);
 bool		is_active_bdr_node(PGconn *conn, const char *node_name);
 bool		is_bdr_repmgr(PGconn *conn);
 char	   *get_default_bdr_replication_set(PGconn *conn);
 bool		is_table_in_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
 bool		add_table_to_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
 void		add_extension_tables_to_bdr_replication_set(PGconn *conn);
-
+bool		bdr_node_name_matches(PGconn *conn, const char *node_name, PQExpBufferData *bdr_local_node_name);
 bool		bdr_node_exists(PGconn *conn, const char *node_name);
 ReplSlotStatus get_bdr_node_replication_slot_status(PGconn *conn, const char *node_name);
 void		get_bdr_other_node_name(PGconn *conn, int node_id, char *name_buf);
 bool		am_bdr_failover_handler(PGconn *conn, int node_id);
 void		unset_bdr_failover_handler(PGconn *conn);
 bool		bdr_node_has_repmgr_set(PGconn *conn, const char *node_name);
 bool		bdr_node_set_repmgr_set(PGconn *conn, const char *node_name);
 /* miscellaneous debugging functions */
 const char *print_node_status(NodeStatus node_status);
 const char *print_pqping_status(PGPing ping_status);
 #endif							/* _REPMGR_DBUTILS_H_ */
--- a/dirutil.c
+++ b/dirutil.c
@@ -3,7 +3,7 @@
 * dirmod.c
 *	  directory handling functions
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -21,6 +21,7 @@
 #include <unistd.h>
 #include <dirent.h>
 #include <signal.h>
 #include <sys/stat.h>
 #include <errno.h>
 #include <stdio.h>
@@ -34,34 +35,33 @@
 #include "dirutil.h"
 #include "strutil.h"
 #include "log.h"
 #include "controldata.h"
 static int	unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf);
 /* PID can be negative if backend is standalone */
 typedef long pgpid_t;
 /*
- * make sure the directory either doesn't exist or is empty
+ * Check if a directory exists, and if so whether it is empty.
 * we use this function to check the new data directory and
 * the directories for tablespaces
 *
- * This is the same check initdb does on the new PGDATA dir
+ * This function is used for checking both the data directory
- *
+ * and tablespace directories.
 * Returns 0 if nonexistent, 1 if exists and empty, 2 if not empty,
 * or -1 if trouble accessing directory
 */
-int
+DataDirState
 check_dir(char *path)
 {
-	DIR		   *chkdir;
+	DIR		   *chkdir = NULL;
-	struct dirent *file;
+	struct dirent *file = NULL;
-	int			result = 1;
+	int			result = DIR_EMPTY;
 	errno = 0;
 	chkdir = opendir(path);
 	if (!chkdir)
-		return (errno == ENOENT) ? 0 : -1;
+		return (errno == ENOENT) ? DIR_NOENT : DIR_ERROR;
 	while ((file = readdir(chkdir)) != NULL)
 	{
@@ -73,25 +73,15 @@ check_dir(char *path)
 		}
 		else
 		{
-			result = 2;			/* not empty */
+			result = DIR_NOT_EMPTY;
 			break;
 		}
 	}
 #ifdef WIN32
 	/*
 	 * This fix is in mingw cvs (runtime/mingwex/dirent.c rev 1.4), but not in
 	 * released version
 	 */
 	if (GetLastError() == ERROR_NO_MORE_FILES)
 		errno = 0;
 #endif
 	closedir(chkdir);
 	if (errno != 0)
-		return -1;				/* some kind of I/O error? */
+		return DIR_ERROR;				/* some kind of I/O error? */
 	return result;
 }
@@ -106,12 +96,13 @@ create_dir(char *path)
 	if (mkdir_p(path, 0700) == 0)
 		return true;
-	log_error(_("unable to create directory \"%s\": %s"),
+	log_error(_("unable to create directory \"%s\""), path);
-			  path, strerror(errno));
+	log_detail("%s", strerror(errno));
 	return false;
 }
 bool
 set_dir_permissions(char *path)
 {
@@ -146,26 +137,6 @@ mkdir_p(char *path, mode_t omode)
 	oumask = 0;
 	retval = 0;
 #ifdef WIN32
 	/* skip network and drive specifiers for win32 */
 	if (strlen(p) >= 2)
 	{
 		if (p[0] == '/' && p[1] == '/')
 		{
 			/* network drive */
 			p = strstr(p + 2, "/");
 			if (p == NULL)
 				return 1;
 		}
 		else if (p[1] == ':' &&
 				 ((p[0] >= 'a' && p[0] <= 'z') ||
 				  (p[0] >= 'A' && p[0] <= 'Z')))
 		{
 			/* local drive */
 			p += 2;
 		}
 	}
 #endif
 	if (p[0] == '/')			/* Skip leading '/'. */
 		++p;
@@ -242,17 +213,91 @@ is_pg_dir(char *path)
 	return false;
 }
 /*
 * Attempt to determine if a PostgreSQL data directory is in use
 * by reading the pidfile. This is the same mechanism used by
 * "pg_ctl".
 *
 * This function will abort with appropriate log messages if a file error
 * is encountered, as the user will need to address the situation before
 * any further useful progress can be made.
 */
 PgDirState
 is_pg_running(char *path)
 {
 	long		pid;
 	FILE	   *pidf;
 	char pid_file[MAXPGPATH];
 	/* it's reasonable to assume the pidfile name will not change */
 	snprintf(pid_file, MAXPGPATH, "%s/postmaster.pid", path);
 	pidf = fopen(pid_file, "r");
 	if (pidf == NULL)
 	{
 		/*
 		 * No PID file - PostgreSQL shouldn't be running. From 9.3 (the
 		 * earliest version we care about) removal of the PID file will
 		 * cause the postmaster to shut down, so it's highly unlikely
 		 * that PostgreSQL will still be running.
 		 */
 		if (errno == ENOENT)
 		{
 			return PG_DIR_NOT_RUNNING;
 		}
 		else
 		{
 			log_error(_("unable to open PostgreSQL PID file \"%s\""), pid_file);
 			log_detail("%s", strerror(errno));
 			exit(ERR_BAD_CONFIG);
 		}
 	}
 	/*
 	 * In the unlikely event we're unable to extract a PID from the PID file,
 	 * log a warning but assume we're not dealing with a running instance
 	 * as PostgreSQL should have shut itself down in these cases anyway.
 	 */
 	if (fscanf(pidf, "%ld", &pid) != 1)
 	{
 		/* Is the file empty? */
 		if (ftell(pidf) == 0 && feof(pidf))
 		{
 			log_warning(_("PostgreSQL PID file \"%s\" is empty"), pid_file);
 		}
 		else
 		{
 			log_warning(_("invalid data in PostgreSQL PID file \"%s\""), pid_file);
 		}
 		/* close the PID file before returning to avoid leaking the handle */
 		fclose(pidf);
 		return PG_DIR_NOT_RUNNING;
 	}
 	fclose(pidf);
 	if (pid == getpid())
 		return PG_DIR_NOT_RUNNING;
 	if (pid == getppid())
 		return PG_DIR_NOT_RUNNING;
 	if (kill(pid, 0) == 0)
 		return PG_DIR_RUNNING;
 	return PG_DIR_NOT_RUNNING;
 }
 bool
 create_pg_dir(char *path, bool force)
 {
-	bool		pg_dir = false;
+	/* Check this directory can be used as a PGDATA dir */
 	/* Check this directory could be used as a PGDATA dir */
 	switch (check_dir(path))
 	{
-		case 0:
+		case DIR_NOENT:
-			/* dir not there, must create it */
+			/* directory does not exist, attempt to create it */
 			log_info(_("creating directory \"%s\"..."), path);
 			if (!create_dir(path))
@@ -262,55 +307,62 @@ create_pg_dir(char *path, bool force)
 				return false;
 			}
 			break;
-		case 1:
+		case DIR_EMPTY:
-			/* Present but empty, fix permissions and use it */
+			/* exists but empty, fix permissions and use it */
-			log_info(_("checking and correcting permissions on existing directory %s"),
+			log_info(_("checking and correcting permissions on existing directory \"%s\""),
 					 path);
 			if (!set_dir_permissions(path))
 			{
-				log_error(_("unable to change permissions of directory \"%s\":\n  %s"),
+				log_error(_("unable to change permissions of directory \"%s\""), path);
-						  path, strerror(errno));
+				log_detail("%s", strerror(errno));
 				return false;
 			}
 			break;
-		case 2:
+		case DIR_NOT_EMPTY:
-			/* Present and not empty */
+			/* exists but is not empty */
 			log_warning(_("directory \"%s\" exists but is not empty"),
 						path);
-			pg_dir = is_pg_dir(path);
+			if (is_pg_dir(path))
 			if (pg_dir && force)
 			{
-				/* TODO: check DB state, if not running overwrite */
+				if (force == true)
 				if (false)
 				{
-					log_notice(_("deleting existing data directory \"%s\""), path);
+					log_notice(_("-F/--force provided - deleting existing data directory \"%s\""), path);
 					nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
 					return true;
 				}
-				/* Let it continue */
+
 				break;
 			}
 			else if (pg_dir && !force)
 			{
 				log_hint(_("This looks like a PostgreSQL directory.\n"
 						   "If you are sure you want to clone here, "
 						   "please check there is no PostgreSQL server "
 						   "running and use the -F/--force option"));
 				return false;
 			}
-
+			else
-			return false;
+			{
-		default:
+				if (force == true)
 				{
 					log_notice(_("deleting existing directory \"%s\""), path);
 					nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
 					return true;
 				}
 				return false;
 			}
 			break;
 		case DIR_ERROR:
 			log_error(_("could not access directory \"%s\": %s"),
 					  path, strerror(errno));
 			return false;
 	}
 	return true;
 }
 int
 rmdir_recursive(char *path)
 {
 	return nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
 }
 static int
 unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf)
 {
--- a/dirutil.h
+++ b/dirutil.h
@@ -1,6 +1,6 @@
 /*
 * dirutil.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -19,12 +19,29 @@
 #ifndef _DIRUTIL_H_
 #define _DIRUTIL_H_
 typedef enum
 {
 	DIR_ERROR = -1,
 	DIR_NOENT,
 	DIR_EMPTY,
 	DIR_NOT_EMPTY
 } DataDirState;
 typedef enum
 {
 	PG_DIR_ERROR = -1,
 	PG_DIR_NOT_RUNNING,
 	PG_DIR_RUNNING
 } PgDirState;
 extern int	mkdir_p(char *path, mode_t omode);
 extern bool set_dir_permissions(char *path);
-extern int	check_dir(char *path);
+extern DataDirState	check_dir(char *path);
 extern bool create_dir(char *path);
 extern bool is_pg_dir(char *path);
 extern PgDirState is_pg_running(char *path);
 extern bool create_pg_dir(char *path, bool force);
 extern int rmdir_recursive(char *path);
 #endif
--- a/doc/.gitignore
+++ b/doc/.gitignore
@@ -0,0 +1,7 @@
 HTML.index
 bookindex.sgml
 html-stamp
 html/
 nochunks.dsl
 repmgr.html
 version.sgml
--- a/doc/Makefile.in
+++ b/doc/Makefile.in
@@ -0,0 +1,76 @@
 repmgr_subdir = doc
 repmgr_top_builddir = ..
 include $(repmgr_top_builddir)/Makefile.global
 ifndef JADE
 JADE = $(missing) jade
 endif
 SGMLINCLUDE = -D . -D ${srcdir}
 SPFLAGS += -wall -wno-unused-param -wno-empty -wfully-tagged
 JADE.html.call = $(JADE) $(JADEFLAGS) $(SPFLAGS) $(SGMLINCLUDE) $(CATALOG) -t sgml -i output-html
 ALLSGML := $(wildcard $(srcdir)/*.sgml)
 # to build bookindex
 ALMOSTALLSGML := $(filter-out %bookindex.sgml,$(ALLSGML))
 GENERATED_SGML = version.sgml bookindex.sgml
 Makefile: Makefile.in
 	cd $(repmgr_top_builddir) && ./config.status doc/Makefile
 all: html
 html: html-stamp
 html-stamp: repmgr.sgml $(ALLSGML) $(GENERATED_SGML) stylesheet.dsl website-docs.css
 	$(MKDIR_P) html
 	$(JADE.html.call) -d stylesheet.dsl -i include-index $<
 	cp $(srcdir)/stylesheet.css $(srcdir)/website-docs.css html/
 	touch $@
 repmgr.html: repmgr.sgml $(ALLSGML) $(GENERATED_SGML) stylesheet.dsl website-docs.css
 	sed '/html-index-filename/a\
 (define nochunks  #t)' <stylesheet.dsl >nochunks.dsl
 	$(JADE.html.call) -d nochunks.dsl -i include-index $< >repmgr.html
 version.sgml: ${repmgr_top_builddir}/repmgr_version.h
 	{ \
 	  echo "<!ENTITY repmgrversion \"$(REPMGR_VERSION)\">"; \
 	} > $@
 HTML.index: repmgr.sgml $(ALMOSTALLSGML) stylesheet.dsl
 	@$(MKDIR_P) html
 	$(JADE.html.call) -d stylesheet.dsl -V html-index $<
 website-docs.css:
 	@$(MKDIR_P) html
 	curl http://www.postgresql.org/media/css/docs.css > ${srcdir}/website-docs.css
 bookindex.sgml: HTML.index
 ifdef COLLATEINDEX
 	LC_ALL=C $(PERL) $(COLLATEINDEX) -f -g -i 'bookindex' -o $@ $<
 else
 	@$(missing) collateindex.pl $< $@
 endif
 clean:
 	rm -f html-stamp
 	rm -f HTML.index $(GENERATED_SGML)
 maintainer-clean:
 	rm -rf html
 	rm -rf Makefile
 zip: html
 	cp -r html repmgr-docs-$(REPMGR_VERSION)
 	zip -r repmgr-docs-$(REPMGR_VERSION).zip repmgr-docs-$(REPMGR_VERSION)
 	rm -rf repmgr-docs-$(REPMGR_VERSION)
 install: html
 	@$(MKDIR_P) $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
 	@$(INSTALL_DATA) $(wildcard html/*.html) $(wildcard html/*.css) $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
 	@echo Installed docs to $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
 .PHONY: html all
--- a/doc/appendix-faq.sgml
+++ b/doc/appendix-faq.sgml
@@ -0,0 +1,439 @@
 <appendix id="appendix-faq" xreflabel="FAQ">
 <indexterm>
  <primary>FAQ (Frequently Asked Questions)</primary>
 </indexterm>
 <title>FAQ (Frequently Asked Questions)</title>
 <sect1 id="faq-general" xreflabel="General">
  <title>General</title>
  <sect2 id="faq-xrepmgr-version-diff" xreflabel="Version differences">
    <title>What's the difference between the repmgr versions?</title>
    <para>
      &repmgr; 4 is a complete rewrite of the existing &repmgr; code base
      and implements &repmgr; as a PostgreSQL extension. It
      supports all PostgreSQL versions from 9.3 (although some &repmgr;
      features are not available for PostgreSQL 9.3 and 9.4).
     </para>
     <para>
      &repmgr; 3.x builds on the improved replication facilities added
      in PostgreSQL 9.3, as well as improved automated failover support
      via <application>repmgrd</application>, and is not compatible with PostgreSQL 9.2
      and earlier. We recommend upgrading to &repmgr; 4, as the &repmgr; 3.x
      series is no longer maintained.
     </para>
     <para>
      &repmgr; 2.x supports PostgreSQL 9.0 to 9.3. While it is compatible
      with PostgreSQL 9.3, we recommend using &repmgr; 4.x. &repmgr; 2.x is
      no longer maintained.
     </para>
     <para>
       See also <link linkend="install-compatibility-matrix">&repmgr; compatibility matrix</link>
       and <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
     </para>
  </sect2>
  <sect2 id="faq-replication-slots-advantage" xreflabel="Advantages of replication slots">
   <title>What's the advantage of using replication slots?</title>
   <para>
    Replication slots, introduced in PostgreSQL 9.4, ensure that the
    primary server will retain WAL files until they have been consumed
    by all standby servers. This means standby servers should never
    fail due to not being able to retrieve required WAL files from the
    primary.
   </para>
   <para>
    However, this does mean that if a standby is no longer connected to the
    primary, the presence of the replication slot will cause WAL files
    to be retained indefinitely, and eventually lead to disk space
    exhaustion.
   </para>
   <tip>
     <para>
       2ndQuadrant's recommended configuration is to configure
       <ulink url="https://www.pgbarman.org/">Barman</ulink> as a fallback
       source of WAL files, rather than maintain replication slots for
       each standby. See also: <link linkend="cloning-from-barman-restore-command">Using Barman as a WAL file source</link>.
     </para>
   </tip>
  </sect2>
  <sect2 id="faq-replication-slots-number" xreflabel="Number of replication slots">
   <title>How many replication slots should I define in <varname>max_replication_slots</varname>?</title>
   <para>
    Normally at least the same number as the number of standbys which will connect
    to the node. Note that changes to <varname>max_replication_slots</varname> require a server
    restart to take effect, and as there is no particular penalty for unused
    replication slots, setting a higher figure will make adding new nodes
    easier.
   </para>
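   <para>
    As a minimal sketch, assuming a primary with three attached standbys plus
    headroom for nodes added later (the figure <literal>10</literal> is purely
    illustrative, not a recommendation), <filename>postgresql.conf</filename>
    might contain:
    <programlisting>
    max_replication_slots = 10</programlisting>
   </para>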
  </sect2>
  <sect2 id="faq-hash-index" xreflabel="Hash indexes">
   <title>Does &repmgr; support hash indexes?</title>
   <para>
    Before PostgreSQL 10, hash indexes were not WAL logged and are therefore not suitable
    for use in streaming replication in PostgreSQL 9.6 and earlier. See the
    <ulink url="https://www.postgresql.org/docs/9.6/static/sql-createindex.html#AEN80279">PostgreSQL documentation</ulink>
    for details.
   </para>
   <para>
    From PostgreSQL 10, this restriction has been lifted and hash indexes can be used
    in a streaming replication cluster.
   </para>
  </sect2>
  <sect2 id="faq-upgrades" xreflabel="Upgrading PostgreSQL with repmgr">
   <title>Can &repmgr; assist with upgrading a PostgreSQL cluster?</title>
   <para>
     For <emphasis>minor</emphasis> version upgrades, e.g. from 9.6.7 to 9.6.8, a common
     approach is to upgrade a standby to the latest version, perform a
     <link linkend="performing-switchover">switchover</link> promoting it to a primary,
     then upgrade the former primary.
   </para>
   <para>
     For <emphasis>major</emphasis> version upgrades (e.g. from PostgreSQL 9.6 to PostgreSQL 10),
     the traditional approach is to "reseed" a cluster by upgrading a single
     node with <ulink url="https://www.postgresql.org/docs/current/static/pgupgrade.html">pg_upgrade</ulink>
     and recloning standbys from this.
   </para>
   <para>
     To minimize downtime during major upgrades, for more recent PostgreSQL
     versions (PostgreSQL 9.4 and later),
     <ulink url="https://www.2ndquadrant.com/en/resources/pglogical/">pglogical</ulink>
     can be used to set up a parallel cluster using the newer PostgreSQL version,
     which can be kept in sync with the existing production cluster until the
     new cluster is ready to be put into production.
   </para>
  </sect2>
  <sect2 id="faq-libdir-repmgr-error">
   <title>What does this error mean: <literal>ERROR: could not access file "$libdir/repmgr"</literal>?</title>
   <para>
     It means the &repmgr; extension code is not installed in the
     PostgreSQL application directory. This typically happens when using PostgreSQL
     packages provided by a third-party vendor, which often have different
     filesystem layouts.
   </para>
   <para>
     Where possible, use PostgreSQL packages provided by the community or 2ndQuadrant; if this
     is not possible, contact your vendor for assistance.
   </para>
  </sect2>
  <sect2 id="faq-old-packages">
   <title>How can I obtain old versions of &repmgr; packages?</title>
   <para>
     See appendix <xref linkend="packages-old-versions"> for details.
   </para>
  </sect2>
  <sect2 id="faq-repmgr-required-for-replication">
    <title>Is &repmgr; required for streaming replication?</title>
    <para>
      No.
    </para>
    <para>
     &repmgr; (together with <application>repmgrd</application>) assists with
     <emphasis>managing</emphasis> replication. It does not actually perform replication, which
     is part of the core PostgreSQL functionality.
    </para>
  </sect2>
  <sect2 id="faq-what-if-repmgr-uninstalled">
   <title>Will replication stop working if &repmgr; is uninstalled?</title>
   <para>
     No. See preceding question.
   </para>
  </sect2>
  <sect2 id="faq-version-mix">
   <title>Does it matter if different &repmgr; versions are present in the replication cluster?</title>
   <para>
     Yes. If different &quot;major&quot; &repmgr; versions (e.g. 3.3.x and 4.1.x) are present,
     &repmgr; (in particular <application>repmgrd</application>)
     may fail to run or may run incorrectly; in the worst case (if different <application>repmgrd</application>
     versions are running and their failover implementations differ) it may break
     your replication cluster.
   </para>
   <para>
     If different &quot;minor&quot; &repmgr; versions (e.g. 4.1.1 and 4.1.6) are installed,
     &repmgr; will function, but we strongly recommend always running the same version
     to ensure there are no unexpected surprises, e.g. a newer version behaving slightly
     differently to the older version.
   </para>
   <para>
     See also <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
   </para>
  </sect2>
  <sect2 id="faq-upgrade-repmgr">
    <title>Should I upgrade &repmgr;?</title>
    <para>
      Yes.
    </para>
    <para>
      We don't release new versions for fun, you know. Upgrading may require a little effort,
      but running an older &repmgr; version with bugs which have since been fixed may end up
      costing you more effort. The same applies to PostgreSQL itself.
    </para>
  </sect2>
  <sect2 id="faq-repmgr-conf-data-directory">
    <title>Why do I need to specify the data directory location in repmgr.conf?</title>
    <para>
      In some circumstances &repmgr; may need to access a PostgreSQL data
      directory while the PostgreSQL server is not running, e.g. to confirm
      it shut down cleanly during a <link linkend="performing-switchover">switchover</link>.
    </para>
    <para>
      Additionally, this provides support when using &repmgr; on PostgreSQL 9.6 and
      earlier, where the <literal>repmgr</literal> user is not a superuser; in that
      case the <literal>repmgr</literal> user will not be able to access the
      <literal>data_directory</literal> configuration setting, access to which is restricted
      to superusers. (In PostgreSQL 10 and later, non-superusers can be added to the
      group <option>pg_read_all_settings</option> which will enable them to read this setting).
    </para>
  </sect2>
 </sect1>
 <sect1 id="faq-repmgr" xreflabel="repmgr">
  <title><command>repmgr</command></title>
  <sect2 id="faq-register-existing-node" xreflabel="registering an existing node">
   <title>Can I register an existing PostgreSQL server with repmgr?</title>
   <para>
    Yes, any existing PostgreSQL server which is part of the same replication
    cluster can be registered with &repmgr;. There's no requirement for a
    standby to have been cloned using &repmgr;.
   </para>
  </sect2>
  <sect2 id="faq-repmgr-clone-other-source" >
   <title>Can I use a standby not cloned by &repmgr; as a &repmgr; node?</title>
   <para>
     For a standby which has been manually cloned or recovered from an external
     backup manager such as Barman, the command
     <command><link linkend="repmgr-standby-clone">repmgr standby clone --recovery-conf-only</link></command>
     can be used to create the correct <filename>recovery.conf</filename> file for
     use with &repmgr; (and will create a replication slot if required). Once this has been done,
     <link linkend="repmgr-standby-register">register the node</link> as usual.
   </para>
  </sect2>
  <sect2 id="faq-repmgr-recovery-conf" >
    <title>What does &repmgr; write in <filename>recovery.conf</filename>, and what options can be set there?</title>
   <para>
     See section <link linkend="repmgr-standby-clone-recovery-conf">Customising recovery.conf</link>.
   </para>
  </sect2>
  <sect2 id="faq-repmgr-failed-primary-standby" xreflabel="Reintegrate a failed primary as a standby">
   <title>How can a failed primary be re-added as a standby?</title>
   <para>
    This is a two-stage process. First, the failed primary's data directory
    must be re-synced with the current primary; secondly the failed primary
    needs to be re-registered as a standby.
   </para>
   <para>
    It's possible to use <command>pg_rewind</command> to re-synchronise the existing data
    directory, which will usually be much
    faster than re-cloning the server. However <command>pg_rewind</command> can only
    be used if PostgreSQL either has <varname>wal_log_hints</varname> enabled, or
    data checksums were enabled when the cluster was initialized.
   </para>
   <para>
     Note that <command>pg_rewind</command> is available as part of the core PostgreSQL
     distribution from PostgreSQL 9.5, and as a third-party utility for PostgreSQL 9.3 and 9.4.
   </para>
   <para>
    &repmgr; provides the command <command>repmgr node rejoin</command> which can
    optionally execute <command>pg_rewind</command>; see the <xref linkend="repmgr-node-rejoin">
    documentation for details, in particular the section <xref linkend="repmgr-node-rejoin-pg-rewind">.
   </para>
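   <para>
    As a rough sketch, assuming the configuration file is at
    <filename>/etc/repmgr.conf</filename> and the current primary is reachable as
    <literal>node1</literal> (both hypothetical values), a rejoin using
    <command>pg_rewind</command> could first be tested with <literal>--dry-run</literal>;
    check the <command>repmgr node rejoin</command> documentation for the options
    available in your version:
    <programlisting>
    $ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node1 dbname=repmgr user=repmgr' \
        --force-rewind --verbose --dry-run</programlisting>
   </para>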
   <para>
    If <command>pg_rewind</command> cannot be used, then the data directory will need
    to be re-cloned from scratch.
   </para>
  </sect2>
  <sect2 id="faq-repmgr-check-configuration" xreflabel="Check PostgreSQL configuration">
   <title>Is there an easy way to check my primary server is correctly configured for use with &repmgr;?</title>
   <para>
    Execute <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>
    with the <literal>--dry-run</literal> option; this will report any configuration problems
    which need to be rectified.
   </para>
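   <para>
    For example, assuming the primary is reachable as <literal>node1</literal> and the
    configuration file is at <filename>/etc/repmgr.conf</filename> (both hypothetical
    values), the check could be run from the prospective standby like this:
    <programlisting>
    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run</programlisting>
   </para>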
  </sect2>
  <sect2 id="faq-repmgr-clone-skip-config-files" xreflabel="">
   <title>When cloning a standby, how can I get &repmgr; to copy
     <filename>postgresql.conf</filename> and <filename>pg_hba.conf</filename> from the PostgreSQL configuration
     directory in <filename>/etc</filename>?</title>
   <para>
    Use the command line option <literal>--copy-external-config-files</literal>. For more details
    see <xref linkend="repmgr-standby-clone-config-file-copying">.
   </para>
  </sect2>
  <sect2 id="faq-repmgr-shared-preload-libaries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
    <title>Do I need to include <literal>shared_preload_libraries = 'repmgr'</literal>
      in <filename>postgresql.conf</filename> if I'm not using <application>repmgrd</application>?</title>
   <para>
    No, the <literal>repmgr</literal> shared library is only needed when running <application>repmgrd</application>.
    If you later decide to run <application>repmgrd</application>, you just need to add
    <literal>shared_preload_libraries = 'repmgr'</literal> and restart PostgreSQL.
   </para>
  </sect2>
  <sect2 id="faq-repmgr-permissions" xreflabel="Replication permission problems">
   <title>I've provided replication permission for the <literal>repmgr</literal> user in <filename>pg_hba.conf</filename>
     but <command>repmgr</command>/<application>repmgrd</application> complains it can't connect to the server... Why?</title>
   <para>
    <command>repmgr</command> and <application>repmgrd</application> need to be able to connect to the repmgr database
    with a normal connection to query metadata. The <literal>replication</literal> connection
    permission is for PostgreSQL's streaming replication (and the replication user doesn't necessarily need to be the <literal>repmgr</literal> user).
   </para>
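   <para>
    As an illustrative sketch, <filename>pg_hba.conf</filename> would need both
    kinds of entry. The subnet and the <literal>trust</literal> authentication method shown
    here are assumptions for the sake of the example; use whatever suits your environment:
    <programlisting>
    # streaming replication connections
    host    replication   repmgr      192.168.1.0/24          trust
    # normal connections to the repmgr database
    host    repmgr        repmgr      192.168.1.0/24          trust</programlisting>
   </para>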
  </sect2>
  <sect2 id="faq-repmgr-clone-provide-primary-conninfo" xreflabel="Providing primary connection parameters">
   <title>When cloning a standby, why do I need to provide the connection parameters
     for the primary server on the command line, not in the configuration file?</title>
   <para>
    Cloning a standby is a one-time action; the role of the server being cloned
    from could change, so fixing it in the configuration file would create
    confusion. If &repmgr; needs to establish a connection to the primary
    server, it can retrieve this from the <literal>repmgr.nodes</literal> table on the local
    node, and if necessary scan the replication cluster until it locates the active primary.
   </para>
  </sect2>
  <sect2 id="faq-repmgr-clone-waldir-xlogdir" xreflabel="Providing a custom WAL directory">
   <title>When cloning a standby, how do I ensure the WAL files are placed in a custom directory?</title>
   <para>
     Provide the option <literal>--waldir</literal> (<literal>--xlogdir</literal> in PostgreSQL 9.6
     and earlier) with the absolute path to the WAL directory in <varname>pg_basebackup_options</varname>.
     For more details see <xref linkend="cloning-advanced-pg-basebackup-options">.
   </para>
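   <para>
     A minimal sketch for PostgreSQL 10 and later, where the path shown is a
     placeholder only, would be this line in <filename>repmgr.conf</filename>:
     <programlisting>
     pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
   </para>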
  </sect2>
  <sect2 id="faq-repmgr-events-no-fkey" xreflabel="No foreign key on node_id in repmgr.events">
   <title>Why is there no foreign key on the <literal>node_id</literal> column in the <literal>repmgr.events</literal>
     table?</title>
   <para>
     Under some circumstances event notifications can be generated for servers
     which have not yet been registered; it's also useful to retain a record
     of events which includes servers removed from the replication cluster
     which no longer have an entry in the <literal>repmgr.nodes</literal> table.
   </para>
  </sect2>
  <sect2 id="faq-repmgr-recovery-conf-quoted-values" xreflabel="Quoted values in recovery.conf">
    <title>Why are some values in <filename>recovery.conf</filename> surrounded by pairs of single quotes?</title>
    <para>
      This is to ensure that user-supplied values which are written as parameter values in <filename>recovery.conf</filename>
      are escaped correctly and do not cause errors when <filename>recovery.conf</filename> is parsed.
    </para>
    <para>
      The escaping is performed by an internal PostgreSQL routine, which leaves strings consisting
      of digits and alphabetical characters only as-is, but wraps everything else in pairs of single quotes,
      even if the string does not contain any characters which need escaping.
    </para>
  </sect2>
 </sect1>
 <sect1 id="faq-repmgrd" xreflabel="repmgrd">
  <title><application>repmgrd</application></title>
  <sect2 id="faq-repmgrd-prevent-promotion" xreflabel="Prevent standby from being promoted to primary">
   <title>How can I prevent a node from ever being promoted to primary?</title>
   <para>
     In <filename>repmgr.conf</filename>, set its priority to a value of <literal>0</literal>; apply the changed setting with
    <command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>.
   </para>
   <para>
    Additionally, if <varname>failover</varname> is set to <literal>manual</literal>, the node will never
    be considered as a promotion candidate.
   </para>
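   <para>
    As a minimal sketch, assuming the node's configuration file is
    <filename>/etc/repmgr.conf</filename> (an assumption; adjust the path for your installation),
    the relevant line in <filename>repmgr.conf</filename> would be:
    <programlisting>
  priority=0</programlisting>
    and the changed setting could then be applied with:
    <programlisting>
  repmgr -f /etc/repmgr.conf standby register --force</programlisting>
   </para>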
  </sect2>
  <sect2 id="faq-repmgrd-delayed-standby" xreflabel="Delayed standby support">
   <title>Does <application>repmgrd</application> support delayed standbys?</title>
   <para>
    <application>repmgrd</application> can monitor delayed standbys - those set up with
    <varname>recovery_min_apply_delay</varname> set to a non-zero value
    in <filename>recovery.conf</filename> - but as it's not currently possible
    to directly examine the value applied to the standby, <application>repmgrd</application>
    may not be able to properly evaluate the node as a promotion candidate.
   </para>
   <para>
    We recommend that delayed standbys are explicitly excluded from promotion
    by setting <varname>priority</varname> to <literal>0</literal> in
    <filename>repmgr.conf</filename>.
   </para>
   <para>
    Note that after registering a delayed standby, <application>repmgrd</application> will only start
    once the metadata added in the primary node has been replicated.
   </para>
  </sect2>
  <sect2 id="faq-repmgrd-logfile-rotate" xreflabel="repmgrd logfile rotation">
   <title>How can I get <application>repmgrd</application> to rotate its logfile?</title>
   <para>
     Configure your system's <literal>logrotate</literal> service to do this; see <xref linkend="repmgrd-log-rotation">.
   </para>
  </sect2>
  <sect2 id="faq-repmgrd-recloned-no-start" xreflabel="repmgrd not restarting after node cloned">
   <title>I've recloned a failed primary as a standby, but <application>repmgrd</application> refuses to start?</title>
   <para>
    Check that you registered the standby after recloning it. An unregistered standby
    cannot be considered as a promotion candidate even if <varname>failover</varname> is set to
    <literal>automatic</literal>, which is probably not what you want. <application>repmgrd</application> will start if
    <varname>failover</varname> is set to <literal>manual</literal>, so the node's replication status can still
    be monitored if desired.
   </para>
  </sect2>
  <sect2 id="faq-repmgrd-pg-bindir" xreflabel="repmgrd does not apply pg_bindir to promote_command or follow_command">
    <title>
      <application>repmgrd</application> ignores pg_bindir when executing <varname>promote_command</varname> or <varname>follow_command</varname>
    </title>
    <para>
      <varname>promote_command</varname> and <varname>follow_command</varname> can be user-defined scripts,
      so &repmgr; will not apply <option>pg_bindir</option> to them, even if they execute &repmgr; itself. Always provide the full
      path; see <xref linkend="repmgrd-automatic-failover-configuration"> for more details.
    </para>
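    <para>
      For illustration only, assuming the CentOS 7 / PostgreSQL 10 package layout
      (binaries in <filename>/usr/pgsql-10/bin</filename>, configuration file
      <filename>/etc/repmgr/10/repmgr.conf</filename>; both paths are assumptions to be adjusted
      for your installation), the settings in <filename>repmgr.conf</filename> might look like this:
      <programlisting>
  promote_command='/usr/pgsql-10/bin/repmgr standby promote -f /etc/repmgr/10/repmgr.conf --log-to-file'
  follow_command='/usr/pgsql-10/bin/repmgr standby follow -f /etc/repmgr/10/repmgr.conf --log-to-file --upstream-node-id=%n'</programlisting>
    </para>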
  </sect2>
  <sect2 id="faq-repmgrd-startup-no-upstream" xreflabel="repmgrd does not start if upstream node is not running">
    <title>
      <application>repmgrd</application> aborts startup with the error "<literal>upstream node must be running before repmgrd can start</literal>"
    </title>
    <para>
      <application>repmgrd</application> does this to avoid starting up on a replication cluster
      which is not in a healthy state. If the upstream is unavailable, <application>repmgrd</application>
      may initiate a failover immediately after starting up, which could have unintended side-effects,
      particularly if <application>repmgrd</application> is not running on other nodes.
    </para>
    <para>
      In particular, it's possible that the node's local copy of the <literal>repmgr.nodes</literal> table
      is out-of-date, which may lead to incorrect failover behaviour.
    </para>
    <para>
      The onus is therefore on the administrator to manually set the cluster to a stable, healthy state before
      starting <application>repmgrd</application>.
    </para>
  </sect2>
 </sect1>
 </appendix>
--- a/doc/appendix-packages.sgml
+++ b/doc/appendix-packages.sgml
@@ -0,0 +1,562 @@
 <appendix id="appendix-packages" xreflabel="Package details">
  <indexterm>
    <primary>packages</primary>
  </indexterm>
  <title>&repmgr; package details</title>
  <para>
    This section provides technical details about various &repmgr; binary
    packages, such as location of the installed binaries and
    configuration files.
  </para>
  <sect1 id="packages-centos" xreflabel="CentOS packages">
    <title>CentOS Packages</title>
    <indexterm>
      <primary>packages</primary>
      <secondary>CentOS packages</secondary>
    </indexterm>
    <indexterm>
      <primary>CentOS</primary>
      <secondary>package information</secondary>
    </indexterm>
    <para>
      Currently, &repmgr; RPM packages are provided for versions 6.x and 7.x of CentOS. These should also
      work on matching versions of Red Hat Enterprise Linux, Scientific Linux and Oracle Enterprise Linux;
      together with CentOS, these are the same RedHat-based distributions for which the main community project
      (PGDG) provides packages (see the <ulink url="https://yum.postgresql.org/">PostgreSQL RPM Building Project</ulink>
      page for details).
    </para>
    <para>
      Note these &repmgr; RPM packages are not designed to work with SuSE/OpenSuSE.
    </para>
    <note>
      <para>
        &repmgr; packages are designed to be compatible with community-provided PostgreSQL packages.
        They may not work with vendor-specific packages such as those provided by RedHat for RHEL
        customers, as the filesystem layout may be different to the community RPMs.
        Please contact your support vendor for assistance.
      </para>
    </note>
    <sect2 id="packages-centos-repositories">
      <title>CentOS repositories</title>
      <para>
        &repmgr; packages are available from the public 2ndQuadrant repository, and also the
        PostgreSQL community repository. The 2ndQuadrant repository is updated immediately
        after each &repmgr; release.
      </para>
      <table id="centos-2ndquadrant-repository">
        <title>2ndQuadrant public repository</title>
        <tgroup cols="2">
          <tbody>
            <row>
              <entry>Repository URL:</entry>
              <entry><ulink url="https://dl.2ndquadrant.com/">https://dl.2ndquadrant.com/</ulink></entry>
            </row>
            <row>
              <entry>Repository documentation:</entry>
              <entry><ulink url="https://repmgr.org/docs/current/installation-packages.html#INSTALLATION-PACKAGES-REDHAT-2NDQ">https://repmgr.org/docs/current/installation-packages.html#INSTALLATION-PACKAGES-REDHAT-2NDQ</ulink></entry>
            </row>
          </tbody>
        </tgroup>
      </table>
      <table id="centos-pgdg-repository">
        <title>PostgreSQL community repository (PGDG)</title>
        <tgroup cols="2">
          <tbody>
            <row>
              <entry>Repository URL:</entry>
              <entry><ulink url="https://yum.postgresql.org/repopackages.php">https://yum.postgresql.org/repopackages.php</ulink></entry>
            </row>
            <row>
              <entry>Repository documentation:</entry>
              <entry><ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink></entry>
            </row>
          </tbody>
        </tgroup>
      </table>
    </sect2>
    <sect2 id="packages-centos-details">
      <title>CentOS package details</title>
      <para>
        The two tables below list relevant information, paths, commands etc. for the &repmgr; packages on
        CentOS 7 (with systemd) and CentOS 6 (no systemd). Substitute the appropriate PostgreSQL major
        version number for your installation.
      </para>
      <note>
        <para>
          For PostgreSQL 9.6 and lower, the CentOS packages use a mixture of <literal>9.6</literal>
          and <literal>96</literal> in various places to designate the major version; e.g. the
          package name is <literal>repmgr96</literal>, but the data directory is
          <filename>/var/lib/pgsql/9.6/data</filename>.
        </para>
        <para>
          From PostgreSQL 10, the first part of the version number (e.g. <literal>10</literal>) is
          the major version, so there is more consistency in file/path/package naming
          (package <literal>repmgr10</literal>, data directory <filename>/var/lib/pgsql/10/data</filename>).
        </para>
      </note>
  <table id="centos-7-packages">
   <title>CentOS 7 packages</title>
   <tgroup cols="2">
    <tbody>
     <row>
      <entry>Package name example:</entry>
      <entry><filename>repmgr10-4.0.4-1.rhel7.x86_64</filename></entry>
     </row>
     <row>
      <entry>Metapackage:</entry>
      <entry>(none)</entry>
     </row>
     <row>
      <entry>Installation command:</entry>
      <entry><literal>yum install repmgr10</literal></entry>
     </row>
     <row>
      <entry>Binary location:</entry>
      <entry><filename>/usr/pgsql-10/bin</filename></entry>
     </row>
     <row>
      <entry>repmgr in default path:</entry>
      <entry>NO</entry>
     </row>
     <row>
      <entry>Configuration file location:</entry>
      <entry><filename>/etc/repmgr/10/repmgr.conf</filename></entry>
     </row>
     <row>
      <entry>Data directory:</entry>
      <entry><filename>/var/lib/pgsql/10/data</filename></entry>
     </row>
     <row>
      <entry>repmgrd service command:</entry>
      <entry><command>systemctl [start|stop|restart|reload] repmgr10</command></entry>
     </row>
     <row>
      <entry>repmgrd service file location:</entry>
      <entry><filename>/usr/lib/systemd/system/repmgr10.service</filename></entry>
     </row>
     <row>
      <entry>repmgrd log file location:</entry>
      <entry>(not specified by package; set in <filename>repmgr.conf</filename>)</entry>
     </row>
    </tbody>
   </tgroup>
  </table>
  <table id="centos-6-packages">
   <title>CentOS 6 packages</title>
   <tgroup cols="2">
    <tbody>
     <row>
      <entry>Package name example:</entry>
      <entry><filename>repmgr96-4.0.4-1.rhel6.x86_64</filename></entry>
     </row>
     <row>
      <entry>Metapackage:</entry>
      <entry>(none)</entry>
     </row>
     <row>
      <entry>Installation command:</entry>
      <entry><literal>yum install repmgr96</literal></entry>
     </row>
     <row>
      <entry>Binary location:</entry>
      <entry><filename>/usr/pgsql-9.6/bin</filename></entry>
     </row>
     <row>
      <entry>repmgr in default path:</entry>
      <entry>NO</entry>
     </row>
     <row>
      <entry>Configuration file location:</entry>
      <entry><filename>/etc/repmgr/9.6/repmgr.conf</filename></entry>
     </row>
     <row>
      <entry>Data directory:</entry>
      <entry><filename>/var/lib/pgsql/9.6/data</filename></entry>
     </row>
     <row>
      <entry>repmgrd service command:</entry>
      <entry><literal>service repmgr-9.6 [start|stop|restart|reload]</literal></entry>
     </row>
     <row>
      <entry>repmgrd service file location:</entry>
      <entry><literal>/etc/init.d/repmgr-9.6</literal></entry>
     </row>
     <row>
      <entry>repmgrd log file location:</entry>
      <entry><filename>/var/log/repmgr/repmgrd-9.6.log</filename></entry>
     </row>
    </tbody>
   </tgroup>
  </table>
    </sect2>
 </sect1>
  <sect1 id="packages-debian-ubuntu" xreflabel="Debian/Ubuntu packages">
    <title>Debian/Ubuntu Packages</title>
    <indexterm>
      <primary>packages</primary>
      <secondary>Debian/Ubuntu packages</secondary>
    </indexterm>
    <indexterm>
      <primary>Debian/Ubuntu</primary>
      <secondary>package information</secondary>
    </indexterm>
    <para>
      &repmgr; <literal>.deb</literal> packages are provided via the
      PostgreSQL Community APT repository, and are available for each community-supported
      PostgreSQL version, currently supported Debian releases, and currently supported
      Ubuntu LTS releases.
    </para>
    <sect2 id="packages-apt-repository">
      <title>APT repository</title>
      <para>
        &repmgr; packages are available from the PostgreSQL Community APT repository,
        which is updated immediately after each &repmgr; release.
      </para>
      <table id="apt-2ndquadrant-repository">
        <title>2ndQuadrant public repository</title>
        <tgroup cols="2">
          <tbody>
            <row>
              <entry>Repository URL:</entry>
              <entry><ulink url="https://dl.2ndquadrant.com/">https://dl.2ndquadrant.com/</ulink></entry>
            </row>
            <row>
              <entry>Repository documentation:</entry>
              <entry><ulink url="https://repmgr.org/docs/current/installation-packages.html#INSTALLATION-PACKAGES-DEBIAN">https://repmgr.org/docs/current/installation-packages.html#INSTALLATION-PACKAGES-DEBIAN</ulink></entry>
            </row>
          </tbody>
        </tgroup>
      </table>
      <table id="apt-repository">
        <title>PostgreSQL Community APT repository (PGDG)</title>
        <tgroup cols="2">
          <tbody>
            <row>
              <entry>Repository URL:</entry>
              <entry><ulink url="http://apt.postgresql.org/">http://apt.postgresql.org/</ulink></entry>
            </row>
            <row>
              <entry>Repository documentation:</entry>
              <entry><ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink></entry>
            </row>
          </tbody>
        </tgroup>
      </table>
    </sect2>
   <sect2 id="packages-debian-details">
      <title>Debian/Ubuntu package details</title>
      <para>
        The table below lists relevant information, paths, commands etc. for the &repmgr; packages on
        Debian 9.x ("Stretch"). Substitute the appropriate PostgreSQL major
        version number for your installation.
      </para>
      <para>
        See also <xref linkend="repmgrd-configuration-debian-ubuntu"> for some specifics related
        to configuring the <application>repmgrd</application> daemon.
      </para>
      <table id="debian-9-packages">
        <title>Debian 9.x packages</title>
        <tgroup cols="2">
          <tbody>
            <row>
              <entry>Package name example:</entry>
              <entry><filename>postgresql-10-repmgr</filename></entry>
            </row>
            <row>
              <entry>Metapackage:</entry>
              <entry><filename>repmgr-common</filename></entry>
            </row>
            <row>
              <entry>Installation command:</entry>
              <entry><literal>apt-get install postgresql-10-repmgr</literal></entry>
            </row>
            <row>
              <entry>Binary location:</entry>
              <entry><filename>/usr/lib/postgresql/10/bin</filename></entry>
            </row>
            <row>
              <entry>repmgr in default path:</entry>
              <entry>Yes (via wrapper script <filename>/usr/bin/repmgr</filename>)</entry>
            </row>
            <row>
              <entry>Configuration file location:</entry>
              <entry>(not set by package)</entry>
            </row>
            <row>
              <entry>Data directory:</entry>
              <entry><filename>/var/lib/postgresql/10/main</filename></entry>
            </row>
            <row>
              <entry>PostgreSQL service command:</entry>
              <entry><command>systemctl [start|stop|restart|reload] postgresql@10-main</command></entry>
            </row>
            <row>
              <entry>repmgrd service command:</entry>
              <entry><command>systemctl [start|stop|restart|reload] repmgrd</command></entry>
            </row>
            <row>
              <entry>repmgrd service file location:</entry>
              <entry><filename>/etc/init.d/repmgrd</filename> (defaults in: <filename>/etc/default/repmgrd</filename>)</entry>
            </row>
            <row>
              <entry>repmgrd log file location:</entry>
              <entry>(not specified by package; set in <filename>repmgr.conf</filename>)</entry>
            </row>
          </tbody>
        </tgroup>
      </table>
      <note>
        <para>
          Instead of using the <application>systemd</application> service command directly,
          it's recommended to execute <command>pg_ctlcluster</command> (as <literal>root</literal>,
          either directly or via <command>sudo</command>), e.g.:
          <programlisting>
            <command>pg_ctlcluster 10 main [start|stop|restart|reload]</command></programlisting>
        </para>
        <para>
          For pre-<application>systemd</application> systems, <command>pg_ctlcluster</command>
          can be executed directly by the <literal>postgres</literal> user.
        </para>
      </note>
   </sect2>
  </sect1>
  <sect1 id="packages-snapshot" xreflabel="Snapshot packages">
    <title>Snapshot packages</title>
    <indexterm>
      <primary>snapshot packages</primary>
    </indexterm>
    <indexterm>
      <primary>packages</primary>
      <secondary>snapshots</secondary>
    </indexterm>
    <para>
      For testing new features and bug fixes, from time to time 2ndQuadrant provides
      so-called &quot;snapshot packages&quot; via its public repository. These packages
      are built from the &repmgr; source at a particular point in time, and are not formal
      releases.
    </para>
    <note>
      <para>
        We do not recommend installing these packages in a production environment
        unless specifically advised.
      </para>
    </note>
    <para>
      To install a snapshot package, first install the 2ndQuadrant public snapshot repository by
      following the instructions at <ulink url="https://dl.2ndquadrant.com/default/release/site/">https://dl.2ndquadrant.com/default/release/site/</ulink>, replacing <literal>release</literal> with <literal>snapshot</literal>
      in the appropriate URL.
    </para>
    <para>
      For example, to install the snapshot RPM repository for PostgreSQL 9.6, execute (as <literal>root</literal>):
      <programlisting>
 curl https://dl.2ndquadrant.com/default/snapshot/get/9.6/rpm | bash</programlisting>
      or as a normal user with root sudo access:
      <programlisting>
 curl https://dl.2ndquadrant.com/default/snapshot/get/9.6/rpm | sudo bash</programlisting>
    </para>
    <para>
      Alternatively you can browse the repository here:
      <ulink url="https://dl.2ndquadrant.com/default/snapshot/browse/">https://dl.2ndquadrant.com/default/snapshot/browse/</ulink>.
    </para>
    <para>
      Once the repository is installed, installing or updating &repmgr; will result in the latest snapshot
      package being installed.
    </para>
    <para>
      The package name will be formatted like this:
      <programlisting>
 repmgr96-4.1.1-0.0git320.g5113ab0.1.el7.x86_64.rpm</programlisting>
      containing the snapshot build number (here: <literal>320</literal>) and the hash
      of the <application>git</application> commit it was built from (here: <literal>g5113ab0</literal>).
    </para>
    <para>
      Note that the next formal release (in the above example <literal>4.1.1</literal>), once available,
      will install in place of any snapshot builds.
    </para>
  </sect1>
  <sect1 id="packages-old-versions" xreflabel="Installing old package versions">
    <title>Installing old package versions</title>
    <indexterm>
      <primary>old packages</primary>
    </indexterm>
    <indexterm>
      <primary>packages</primary>
      <secondary>old versions</secondary>
    </indexterm>
    <sect2 id="packages-old-versions-debian" xreflabel="old Debian package versions">
      <title>Debian/Ubuntu</title>
      <para>
        An archive of old packages (<literal>3.3.2</literal> and later) for Debian/Ubuntu-based systems is available here:
        <ulink url="http://atalia.postgresql.org/morgue/r/repmgr/">http://atalia.postgresql.org/morgue/r/repmgr/</ulink>
      </para>
    </sect2>
    <sect2 id="packages-old-versions-rhel-centos" xreflabel="old RHEL/CentOS package versions">
      <title>RHEL/CentOS</title>
      <para>
        Old RPM packages (<literal>3.2</literal> and later) can be retrieved from the
        (deprecated) 2ndQuadrant repository at
        <ulink url="http://packages.2ndquadrant.com/">http://packages.2ndquadrant.com/</ulink>
        by installing the appropriate repository RPM:
      </para>
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara>
            <ulink url="http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-fedora-1.0-1.noarch.rpm">http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-fedora-1.0-1.noarch.rpm</ulink>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <ulink url="http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm">http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm</ulink>
          </simpara>
        </listitem>
      </itemizedlist>
      <para>
        Old versions can be located with e.g.:
        <programlisting>
          yum --showduplicates list repmgr96</programlisting>
        (substitute the appropriate package name; see <xref linkend="packages-centos">) and installed with:
        <programlisting>
          yum install {package_name}-{version}</programlisting>
        where <literal>{package_name}</literal> is the base package name (e.g. <literal>repmgr96</literal>)
        and <literal>{version}</literal> is the version listed by the
        <command>yum --showduplicates list ...</command> command, e.g. <literal>4.0.6-1.rhel6</literal>.
      </para>
      <para>For example:
        <programlisting>
          yum install repmgr96-4.0.6-1.rhel6</programlisting>
      </para>
    </sect2>
  </sect1>
  <sect1 id="packages-packager-info" xreflabel="Information for packagers">
    <title>Information for packagers</title>
    <indexterm>
      <primary>packages</primary>
      <secondary>information for packagers</secondary>
    </indexterm>
    <para>
      When building the package, we recommend patching the following parameters
      to provide built-in default values for user convenience.
      These values can still be overridden by the user, if desired.
    </para>
    <itemizedlist>
      <listitem>
        <para>
          Configuration file location: the default configuration file location
          can be hard-coded by patching <varname>package_conf_file</varname>
          in <filename>configfile.c</filename>:
          <programlisting>
 		/* packagers: if feasible, patch configuration file path into "package_conf_file" */
 		char		package_conf_file[MAXPGPATH] = "";</programlisting>
        </para>
        <para>
          See also: <xref linkend="configuration-file">
        </para>
      </listitem>
      <listitem>
        <para>
          PID file location: the default <application>repmgrd</application> PID file
          location can be hard-coded by patching <varname>package_pid_file</varname>
          in <filename>repmgrd.c</filename>:
          <programlisting>
 		/* packagers: if feasible, patch PID file path into "package_pid_file" */
 		char		package_pid_file[MAXPGPATH] = "";</programlisting>
        </para>
        <para>
          See also: <xref linkend="repmgrd-pid-file">
        </para>
      </listitem>
    </itemizedlist>
  </sect1>
 </appendix>
--- a/doc/appendix-release-notes.sgml
+++ b/doc/appendix-release-notes.sgml
--- a/doc/appendix-signatures.sgml
+++ b/doc/appendix-signatures.sgml
@@ -0,0 +1,37 @@
 <appendix id="appendix-signatures" xreflabel="Verifying digital signatures">
 <title>Verifying digital signatures</title>
 <sect1 id="repmgr-source-key" xreflabel="repmgr source key">
   <title>repmgr source code signing key</title>
   <para>
     The signing key ID used for <application>repmgr</application> source code bundles is:
     <ulink url="https://repmgr.org/download/SOURCE-GPG-KEY-repmgr">
       <literal>0x297F1DCC</literal></ulink>.
   </para>
   <para>
     To download the <application>repmgr</application> source key to your computer:
     <programlisting>
       curl -s https://repmgr.org/download/SOURCE-GPG-KEY-repmgr | gpg --import
       gpg --fingerprint 0x297F1DCC
     </programlisting>
     then verify that the fingerprint is the expected value:
     <programlisting>
       085A BE38 6FD9 72CE 6365  340D 8365 683D 297F 1DCC</programlisting>
   </para>
   <para>
     For checking tarballs, first download and import the <application>repmgr</application>
     source signing key as shown above. Then download both the source tarball and the detached
     signature (e.g. <filename>repmgr-4.0beta1.tar.gz</filename> and
     <filename>repmgr-4.0beta1.tar.gz.asc</filename>) from
     <ulink url="https://repmgr.org/download/">https://repmgr.org/download/</ulink>
     and use <application>gpg</application> to verify the signature, e.g.:
     <programlisting>
       gpg --verify repmgr-4.0beta1.tar.gz.asc</programlisting>
   </para>
 </sect1>
 </appendix>
--- a/doc/bdr-failover.md
+++ b/doc/bdr-failover.md
@@ -1,288 +1,8 @@
 BDR failover with repmgrd
 =========================
-`repmgr 4` provides support for monitoring BDR nodes and taking action in case
+This document has been integrated into the main `repmgr` documentation
-one of the nodes fails.
+and is now located here:
-    *NOTE* Due to the nature of BDR, it's only safe to use this solution for
+> [BDR failover with repmgrd](https://repmgr.org/docs/current/repmgrd-bdr.html)
    a two-node scenario. Introducing additional nodes will create an inherent
    risk of node desynchronisation if a node goes down without being cleanly
    removed from the cluster.
 In contrast to streaming replication, there's no concept of "promoting" a new
 primary node with BDR. Instead, "failover" involves monitoring both nodes
 with `repmgrd` and redirecting queries from the failed node to the remaining
 active node. This can be done by using the event notification script generated by
 `repmgrd` to dynamically reconfigure a proxy server/connection pooler such
 as PgBouncer.
 Prerequisites
 -------------
 `repmgr 4` requires PostgreSQL 9.6 with the BDR 2 extension enabled and
 configured for a two-node BDR network. `repmgr 4` packages
 must be installed on each node before attempting to configure repmgr.
    *NOTE* `repmgr 4` will refuse to install if it detects more than two
    BDR nodes.
 Application database connections *must* be passed through a proxy server/
 connection pooler such as PgBouncer, and it must be possible to dynamically
 reconfigure that from `repmgrd`. The example demonstrated in this document
 will use PgBouncer.
 The proxy server / connection poolers must not be installed on the database
 servers.
 For this example, it's assumed password-less SSH connections are available
 from the PostgreSQL servers to the servers where PgBouncer runs, and
 that the user on those servers has permission to alter the PgBouncer
 configuration files.
 PostgreSQL connections must be possible between each node, and each node
 must be able to connect to each PgBouncer instance.
 Configuration
 -------------
 Sample configuration for `repmgr.conf`:
    node_id=1
    node_name='node1'
    conninfo='host=node1 dbname=bdrtest user=repmgr connect_timeout=2'
    replication_type='bdr'
    event_notifications=bdr_failover
    event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a" >> /tmp/bdr-failover.log 2>&1'
    # repmgrd options
    monitor_interval_secs=5
    reconnect_attempts=6
    reconnect_interval=5
 Adjust settings as appropriate; copy and adjust for the second node (particularly
 the values `node_id`, `node_name` and `conninfo`).
 Note that the values provided for the `conninfo` string must be valid for
 connections from *both* nodes in the cluster. The database must be the BDR
 database.
 If defined, `event_notifications` will restrict execution of `event_notification_command`
 to the specified events.
 `event_notification_command` is the script which does the actual "heavy lifting"
 of reconfiguring the proxy server/ connection pooler. It is fully user-definable;
 a sample implementation is documented below.
 repmgr user permissions
 -----------------------
 `repmgr` will create an extension in the BDR database containing objects
 for administering `repmgr` metadata. The user defined in the `conninfo`
 setting must be able to access all objects. Additionally, superuser permissions
 are required to install the `repmgr` extension. The easiest way to do this
 is to create the `repmgr` user as a superuser; however, if this is not
 desirable, the `repmgr` user can be created as a normal user and a
 superuser specified with `--superuser` when registering a BDR node.
 repmgr setup
 ------------
 Register both nodes:
    $ repmgr -f /etc/repmgr.conf bdr register
    NOTICE: attempting to install extension "repmgr"
    NOTICE: "repmgr" extension successfully installed
    NOTICE: node record created for node 'node1' (ID: 1)
    NOTICE: BDR node 1 registered (conninfo: host=localhost dbname=bdrtest user=repmgr port=5501)
    $ repmgr -f /etc/repmgr.conf bdr register
    NOTICE: node record created for node 'node2' (ID: 2)
    NOTICE: BDR node 2 registered (conninfo: host=localhost dbname=bdrtest user=repmgr port=5502)
 The `repmgr` extension will be automatically created when the first
 node is registered, and will be propagated to the second node.
    *IMPORTANT* ensure the repmgr package is available on both nodes before
    attempting to register the first node
 At this point the metadata for both nodes has been created; executing
 `repmgr cluster show` (on either node) should produce output like this:
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role | Status    | Upstream | Connection string
    ----+-------+------+-----------+----------+--------------------------------------------------------
     1  | node1 | bdr  | * running |          | host=node1 dbname=bdrtest user=repmgr connect_timeout=2
     2  | node2 | bdr  | * running |          | host=node2 dbname=bdrtest user=repmgr connect_timeout=2
 Additionally it's possible to see a log of significant events; so far
 this will only record the two node registrations (in reverse chronological order):
     Node ID | Event        | OK | Timestamp           | Details
    ---------+--------------+----+---------------------+----------------------------------------------
     2       | bdr_register | t  | 2017-07-27 17:51:48 | node record created for node 'node2' (ID: 2)
     1       | bdr_register | t  | 2017-07-27 17:51:00 | node record created for node 'node1' (ID: 1)
 Defining the "event_notification_command"
 -----------------------------------------
 Key to "failover" execution is the `event_notification_command`, a
 user-definable script which should reconfigure the proxy server/
 connection pooler.
 Each time `repmgr` (or `repmgrd`) records an event, it can optionally
 execute the script defined in `event_notification_command` to
 take further action; details of the event will be passed as parameters.
 The following placeholders are available to the script:
    %n - node ID
    %e - event type
    %s - success (1 or 0)
    %t - timestamp
    %d - details
    %c - conninfo string of the next available node
    %a - name of the next available node
 Note that `%c` and `%a` will only be provided during `bdr_failover`
 events, which is what is of interest here.
 The provided sample script (`scripts/bdr-pgbouncer.sh`) is configured like
 this:
    event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a"'
 and parses the configured parameters like this:
    NODE_ID=$1
    EVENT_TYPE=$2
    SUCCESS=$3
    NEXT_CONNINFO=$4
    NEXT_NODE_NAME=$5
 It also contains some hard-coded values about the PgBouncer configuration for
 both nodes; these will need to be adjusted for your local environment of course
 (ideally the scripts would be maintained as templates and generated by some
 kind of provisioning system).
 The script performs the following steps:
 - pauses PgBouncer on all nodes
 - recreates the PgBouncer configuration file on each node using the information
   provided by `repmgrd` (mainly the `conninfo` string) to configure PgBouncer
   to point to the remaining node
 - reloads the PgBouncer configuration
 - resumes PgBouncer
 From that point, any connections to PgBouncer on the failed BDR node will be redirected
 to the active node.
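 The steps above can be sketched as a dry run which only prints the planned
 actions. This is an illustration under stated assumptions, not the sample
 script itself: `conninfo_host` and `plan_failover` are hypothetical helpers.
 `PAUSE`, `RELOAD` and `RESUME` are PgBouncer admin console commands, but how
 they are issued and how the configuration file is regenerated depends on the
 local setup.

```shell
#!/bin/sh
# Hypothetical helper: print the value of "host" from a space-separated
# conninfo string (does not handle quoted values containing spaces).
conninfo_host() {
    for pair in $1; do
        case "$pair" in
            host=*) echo "${pair#host=}"; return 0 ;;
        esac
    done
    return 1
}

# Hypothetical dry run: list the actions in order instead of executing them.
plan_failover() {
    target_host=$(conninfo_host "$1") || return 1
    echo "PAUSE"
    echo "rewrite PgBouncer configuration to point to $target_host"
    echo "RELOAD"
    echo "RESUME"
}
```

 Calling `plan_failover "$NEXT_CONNINFO"` with a conninfo string lacking a
 `host` entry returns a non-zero status without printing any steps, so a
 malformed event cannot leave PgBouncer paused.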
 repmgrd
 -------
 Node monitoring and failover
 ----------------------------
 At the intervals specified by `monitor_interval_secs` in `repmgr.conf`, `repmgrd`
 will ping each node to check if it's available. If a node isn't available,
 `repmgrd` will enter failover mode and check `reconnect_attempts` times
 at intervals of `reconnect_interval` to confirm the node is definitely unreachable.
 This buffer period is necessary to avoid false positives caused by transient
 network outages.
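 The confirmation phase can be pictured as a bounded retry loop. The sketch
 below is not `repmgrd`'s implementation: `confirm_unreachable` is a
 hypothetical function and the check command is supplied by the caller. Assuming
 `reconnect_attempts=5` and `reconnect_interval=1`, a failure would be confirmed
 roughly five seconds (5 x 1) after it is first detected.

```shell
#!/bin/sh
# Hypothetical sketch: returns 0 if the node stays unreachable for all
# attempts, 1 as soon as one check succeeds.
# usage: confirm_unreachable ATTEMPTS INTERVAL_SECS CHECK_COMMAND [ARGS...]
confirm_unreachable() {
    attempts=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 1    # node answered: transient outage, no failover
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 0            # still down after all attempts
}
```

 For example, `confirm_unreachable 5 1 pg_isready -q -h node2` would use
 PostgreSQL's `pg_isready` utility as the check; the host name here is taken
 from the example cluster.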
 If the node is still unavailable, `repmgrd` will enter failover mode and execute
 the script defined in `event_notification_command`; an entry will be logged
 in the `repmgr.events` table and `repmgrd` will (unless otherwise configured)
 resume monitoring of the node in "degraded" mode until it reappears.
 `repmgrd` logfile output during a failover event will look something like this
 on one node (usually the node which has failed, here "node2"):
    ...
    [2017-07-27 21:08:39] [INFO] starting continuous BDR node monitoring
    [2017-07-27 21:08:39] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
    [2017-07-27 21:08:55] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
    [2017-07-27 21:09:11] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
    [2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
    [2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
    [2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
    [2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
    [2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
    [2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
    [2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
    [2017-07-27 21:09:28] [NOTICE] setting node record for node 2 to inactive
    [2017-07-27 21:09:28] [INFO] executing notification command for event "bdr_failover"
    [2017-07-27 21:09:28] [DETAIL] command is:
      /path/to/bdr-pgbouncer.sh 2 bdr_failover 1 "host=host=node1 dbname=bdrtest user=repmgr connect_timeout=2" "node1"
    [2017-07-27 21:09:28] [INFO] node 'node2' (ID: 2) detected as failed; next available node is 'node1' (ID: 1)
    [2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
    [2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
    ...
 Output on the other node ("node1") during the same event will look like this:
    [2017-07-27 21:08:35] [INFO] starting continuous BDR node monitoring
    [2017-07-27 21:08:35] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
    [2017-07-27 21:08:51] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
    [2017-07-27 21:09:07] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
    [2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
    [2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
    [2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
    [2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
    [2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
    [2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
    [2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
    [2017-07-27 21:09:28] [NOTICE] other node's repmgrd is handling failover
    [2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
    [2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
 This assumes only the PostgreSQL instance on "node2" has failed. In this case the
 `repmgrd` instance running on "node2" has performed the failover. However if
 the entire server becomes unavailable, `repmgrd` on "node1" will perform
 the failover.
 Node recovery
 -------------
 Following failure of a BDR node, if the node subsequently becomes available again,
 a `bdr_recovery` event will be generated. This could potentially be used to
 reconfigure PgBouncer automatically to bring the node back into the available pool,
 however it would be prudent to manually verify the node's status before
 exposing it to the application.
 If the failed node comes back up and connects correctly, output similar to this
 will be visible in the `repmgrd` log:
    [2017-07-27 21:25:30] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
    [2017-07-27 21:25:46] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
    [2017-07-27 21:25:46] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
    [2017-07-27 21:25:55] [INFO] active replication slot for node "node1" found after 1 seconds
    [2017-07-27 21:25:55] [NOTICE] node "node2" (ID: 2) has recovered after 986 seconds
 Shutdown of both nodes
 ----------------------
 If both PostgreSQL instances are shut down, `repmgrd` will try to handle the
 situation as gracefully as possible, though with no failover candidates available
 there's not much it can do. Should this case ever occur, we recommend shutting
 down `repmgrd` on both nodes and restarting it once the PostgreSQL instances
 are running properly.
--- a/doc/changes-in-repmgr4.md
+++ b/doc/changes-in-repmgr4.md
@@ -1,106 +1,7 @@
 Changes in repmgr 4
 ===================
-Standardisation on `primary`
+This document has been integrated into the main `repmgr` documentation
----------------------------
+and is now located here:
-To standardise terminology, `primary` is used to denote the read/write
+> [Release notes](https://repmgr.org/docs/current/release-4.0.html)
 node in a streaming replication cluster. `master` is still accepted
 as a synonym (e.g. `repmgr master register`).
 New command line options
 ------------------------
 - `--dry-run`: repmgr will attempt to perform the action as far as possible
   without making any changes to the database
 - `--upstream-node-id`: when cloning a standby, specifies the upstream node the
  standby will subsequently stream from. This replaces the configuration
  file parameter `upstream_node`: the upstream node is set when the standby
  is initially cloned, but can change over the lifetime of an installation (due
  to failovers, switchovers etc.), so keeping the original value in the
  configuration file would be pointless and confusing.
 Changed command line options
 ----------------------------
 ### repmgr
 - `--replication-user` has been deprecated; it has been replaced by the
  configuration file option `replication_user`.  The value (which defaults
  to the user in the `conninfo` string) will be stored in the repmgr metadata
  for use by standby clone/follow.
 - `--recovery-min-apply-delay` is now a configuration file parameter
  `recovery_min_apply_delay`, to ensure the setting does not get lost when
  a standby follows a new upstream.
 ### repmgrd
 - `--monitoring-history` is deprecated and has been replaced by the
  configuration file option `monitoring_history`. This enables the
  setting to be changed without having to modify system service files.
 Changes to repmgr commands
 --------------------------
 ### `repmgr cluster show`
 This now displays the role of each node (e.g. `primary`, `standby`)
 and its status in separate columns.
 The `--csv` option now emits a third column indicating the recovery
 status of the node.
 Configuration file changes
 --------------------------
 ### Required settings
 The following 4 parameters are mandatory in `repmgr.conf`:
 - `node_id`
 - `node_name`
 - `conninfo`
 - `data_directory`
 ### Renamed settings
 Some settings have been renamed for clarity and consistency:
 - `node`: now `node_id`
 - `name`: now `node_name`
 - `master_response_timeout`: now `async_query_timeout` to better indicate its
   purpose
 - The following configuration file parameters have been renamed for consistency
  with other parameters (and conform to the pattern used by PostgreSQL itself,
  which uses the prefix `log_` for logging parameters):
  - `loglevel` has been renamed to `log_level`
  - `logfile` has been renamed to `log_file`
  - `logfacility` has been renamed to `log_facility`
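 When upgrading an existing `repmgr.conf`, the simple renames listed above could
 be applied mechanically. The following is a sketch only: `rename_settings` is a
 hypothetical helper which reads a configuration file on standard input and
 assumes one `name=value` setting per line. It does not handle removed settings,
 so the result should be reviewed by hand.

```shell
#!/bin/sh
# Hypothetical helper: rewrite renamed repmgr 3 setting names on stdin.
# Patterns are anchored to the start of the line, so e.g. "node_name="
# is left untouched by the "node" rule.
rename_settings() {
    sed -E \
        -e 's/^([[:space:]]*)node([[:space:]]*=)/\1node_id\2/' \
        -e 's/^([[:space:]]*)name([[:space:]]*=)/\1node_name\2/' \
        -e 's/^([[:space:]]*)loglevel([[:space:]]*=)/\1log_level\2/' \
        -e 's/^([[:space:]]*)logfile([[:space:]]*=)/\1log_file\2/' \
        -e 's/^([[:space:]]*)logfacility([[:space:]]*=)/\1log_facility\2/'
}
```

 Usage would be along the lines of `rename_settings < repmgr.conf.old > repmgr.conf`
 (file names are placeholders); running it on an already converted file leaves
 the file unchanged.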
 ### Removed settings
 - `cluster`: has been removed
 - `upstream_node`: see note about `--upstream-node-id` above.
 - `retry_promote_interval_secs`: this is now redundant due to changes in the
   failover/promotion mechanism; the new equivalent is `primary_notification_timeout`
 ### Logging changes
 - default value for `log_level` is `INFO` rather than `NOTICE`.
 - new parameter `log_status_interval`, which causes `repmgrd` to emit a status log
  line at the specified interval
 repmgrd
 -------
 The `repmgr` shared library has been renamed from `repmgr_funcs` to `repmgr`,
 meaning `shared_preload_libraries` needs to be updated to the new name:
    shared_preload_libraries = 'repmgr'
--- a/doc/cloning-standbys.sgml
+++ b/doc/cloning-standbys.sgml
@@ -0,0 +1,469 @@
 <chapter id="cloning-standbys" xreflabel="cloning standbys">
 <title>Cloning standbys</title>
 <sect1 id="cloning-from-barman" xreflabel="Cloning from Barman">
   <indexterm>
    <primary>cloning</primary>
    <secondary>from Barman</secondary>
   </indexterm>
   <indexterm>
    <primary>Barman</primary>
    <secondary>cloning a standby</secondary>
   </indexterm>
   <title>Cloning a standby from Barman</title>
   <para>
    <xref linkend="repmgr-standby-clone"> can use
    <ulink url="https://www.2ndquadrant.com/">2ndQuadrant</ulink>'s
    <ulink url="https://www.pgbarman.org/">Barman</ulink> application
    to clone a standby (and also as a fallback source for WAL files).
   </para>
   <tip>
    <simpara>
     Barman (aka PgBarman) should be considered as an integral part of any
     PostgreSQL replication cluster. For more details see:
     <ulink url="https://www.pgbarman.org/">https://www.pgbarman.org/</ulink>.
    </simpara>
   </tip>
   <para>
    Barman support provides the following advantages:
    <itemizedlist spacing="compact" mark="bullet">
     <listitem>
      <para>
       the primary node does not need to perform a new backup every time a
       new standby is cloned
      </para>
     </listitem>
     <listitem>
      <para>
       a standby node can be disconnected for longer periods without losing
       the ability to catch up, and without causing accumulation of WAL
       files on the primary node
      </para>
     </listitem>
     <listitem>
      <para>
       WAL management on the primary becomes much easier as there's no need
       to use replication slots, and <varname>wal_keep_segments</varname>
       does not need to be set.
     </para>
    </listitem>
   </itemizedlist>
   </para>
  <sect2 id="cloning-from-barman-prerequisites">
   <title>Prerequisites for cloning from Barman</title>
   <para>
    In order to enable Barman support for <command>repmgr standby clone</command>, the following
    prerequisites must be met:
   <itemizedlist spacing="compact" mark="bullet">
     <listitem>
      <para>
        the <varname>barman_server</varname> setting in <filename>repmgr.conf</filename> is the same as the
        server configured in Barman;
      </para>
     </listitem>
     <listitem>
      <para>
        the <varname>barman_host</varname> setting in <filename>repmgr.conf</filename> is set to the SSH
        hostname of the Barman server;
      </para>
     </listitem>
     <listitem>
      <para>
        the <varname>restore_command</varname> setting in <filename>repmgr.conf</filename> is configured to
        use a copy of the <command>barman-wal-restore</command> script shipped with the
        <literal>barman-cli</literal> package (see section <xref linkend="cloning-from-barman-restore-command">
        below).
      </para>
     </listitem>
     <listitem>
      <para>
        the Barman catalogue includes at least one valid backup for this server.
      </para>
     </listitem>
   </itemizedlist>
   </para>
   <note>
    <simpara>
     Barman support is automatically enabled if <varname>barman_server</varname>
     is set. Normally it is good practice to use Barman, for instance
     when fetching a base backup while cloning a standby; in any case,
     Barman mode can be disabled using the <literal>--without-barman</literal>
     command line option.
    </simpara>
   </note>
   <tip>
    <simpara>
      If you have a non-default SSH configuration on the Barman
      server, e.g. using a port other than 22, then you can set those
      parameters in a dedicated Host section in <filename>~/.ssh/config</filename>
      corresponding to the value of <varname>barman_host</varname> in
      <filename>repmgr.conf</filename>. See the <literal>Host</literal>
      section in <command>man 5 ssh_config</command> for more details.
    </simpara>
   </tip>
   <para>
    It's now possible to clone a standby from Barman, e.g.:
    <programlisting>
    NOTICE: using configuration file "/etc/repmgr.conf"
    NOTICE: destination directory "/var/lib/postgresql/data" provided
    INFO: connecting to Barman server to verify backup for test_cluster
    INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/data"
    INFO: creating directory "/var/lib/postgresql/data/repmgr"...
    INFO: connecting to Barman server to fetch server parameters
    INFO: connecting to upstream node
    INFO: connected to source node, checking its state
    INFO: successfully connected to source node
    DETAIL: current installation size is 29 MB
    NOTICE: retrieving backup from Barman...
    receiving file list ...
    (...)
    NOTICE: standby clone (from Barman) complete
    NOTICE: you can now start your PostgreSQL server
    HINT: for example: pg_ctl -D /var/lib/postgresql/data start</programlisting>
   </para>
  </sect2>
  <sect2 id="cloning-from-barman-restore-command" xreflabel="Using Barman as a WAL file source">
  <indexterm>
    <primary>Barman</primary>
    <secondary>fetching archived WAL</secondary>
   </indexterm>
   <title>Using Barman as a WAL file source</title>
   <para>
    As a fallback in case streaming replication is interrupted, PostgreSQL can optionally
    retrieve WAL files from an archive, such as that provided by Barman. This is done by
    setting <varname>restore_command</varname> in <filename>recovery.conf</filename> to
    a valid shell command which can retrieve a specified WAL file from the archive.
   </para>
   <para>
     <command>barman-wal-restore</command> is a Python script provided as part of the <literal>barman-cli</literal>
     package (Barman 2.0 and later; for Barman 1.x the script is provided separately as
     <command>barman-wal-restore.py</command>) which performs this function for Barman.
   </para>
   <para>
    To use <command>barman-wal-restore</command> with &repmgr;
    and assuming Barman is located on the <literal>barmansrv</literal> host
    and that <command>barman-wal-restore</command> is located as an executable at
    <filename>/usr/bin/barman-wal-restore</filename>,
    <filename>repmgr.conf</filename> should include the following lines:
    <programlisting>
    barman_host=barmansrv
    barman_server=somedb
    restore_command=/usr/bin/barman-wal-restore barmansrv somedb %f %p</programlisting>
   </para>
   <note>
    <simpara>
      <command>barman-wal-restore</command> supports command line switches to
      control parallelism (<literal>--parallel=N</literal>) and compression (
      <literal>--bzip2</literal>, <literal>--gzip</literal>).
    </simpara>
   </note>
   <note>
    <para>
     To use a non-default Barman configuration file on the Barman server,
     specify this in <filename>repmgr.conf</filename> with <filename>barman_config</filename>:
     <programlisting>
      barman_config=/path/to/barman.conf</programlisting>
    </para>
   </note>
  </sect2>
 </sect1>
 <sect1 id="cloning-replication-slots" xreflabel="Cloning and replication slots">
   <indexterm>
     <primary>cloning</primary>
     <secondary>replication slots</secondary>
   </indexterm>
   <indexterm>
     <primary>replication slots</primary>
     <secondary>cloning</secondary>
   </indexterm>
   <title>Cloning and replication slots</title>
   <para>
    Replication slots were introduced with PostgreSQL 9.4 and are designed to ensure
    that any standby connected to the primary using a replication slot will always
    be able to retrieve the required WAL files. This removes the need to manually
    manage WAL file retention by estimating the number of WAL files that need to
    be maintained on the primary using <varname>wal_keep_segments</varname>.
    Do however be aware that if a standby is disconnected, WAL will continue to
    accumulate on the primary until either the standby reconnects or the replication
    slot is dropped.
   </para>
   <para>
     To enable &repmgr; to use replication slots, set the boolean parameter
     <varname>use_replication_slots</varname> in <filename>repmgr.conf</filename>:
     <programlisting>
       use_replication_slots=true</programlisting>
   </para>
   <para>
    Replication slots must be enabled in <filename>postgresql.conf</filename> by
    setting the parameter <varname>max_replication_slots</varname> to at least the
    number of expected standbys (changes to this parameter require a server restart).
   </para>
   <para>
    When cloning a standby, &repmgr; will automatically generate an appropriate
    slot name, which is stored in the <literal>repmgr.nodes</literal> table, and create the slot
    on the upstream node:
     <programlisting>
    repmgr=# SELECT node_id, upstream_node_id, active, node_name, type, priority, slot_name
               FROM repmgr.nodes ORDER BY node_id;
     node_id | upstream_node_id | active | node_name |  type   | priority |   slot_name
    ---------+------------------+--------+-----------+---------+----------+---------------
           1 |                  | t      | node1     | primary |      100 | repmgr_slot_1
           2 |                1 | t      | node2     | standby |      100 | repmgr_slot_2
           3 |                1 | t      | node3     | standby |      100 | repmgr_slot_3
     (3 rows)</programlisting>
    <programlisting>
    repmgr=# SELECT slot_name, slot_type, active, active_pid FROM pg_replication_slots ;
       slot_name   | slot_type | active | active_pid
    ---------------+-----------+--------+------------
     repmgr_slot_2 | physical  | t      |      23658
     repmgr_slot_3 | physical  | t      |      23687
    (2 rows)</programlisting>
   </para>
   <para>
    Note that a slot name will be created by default for the primary but not
    actually used unless the primary is converted to a standby using e.g.
    <command>repmgr standby switchover</command>.
   </para>
   <para>
    Further information on replication slots is available in the PostgreSQL documentation:
    <ulink url="https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS">https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS</ulink>
   </para>
   <tip>
    <simpara>
     While replication slots can be useful for streaming replication, it's
     recommended to monitor for inactive slots as these will cause WAL files to
     build up indefinitely, possibly leading to server failure.
    </simpara>
    <simpara>
     As an alternative we recommend using 2ndQuadrant's <ulink url="https://www.pgbarman.org/">Barman</ulink>,
     which offloads WAL management to a separate server, removing the requirement to use a replication
     slot for each individual standby to reserve WAL. See section <xref linkend="cloning-from-barman">
     for more details on using &repmgr; together with Barman.
    </simpara>
   </tip>
 </sect1>
 <sect1 id="cloning-cascading" xreflabel="Cloning and cascading replication">
   <indexterm>
     <primary>cloning</primary>
     <secondary>cascading replication</secondary>
   </indexterm>
   <title>Cloning and cascading replication</title>
   <para>
    Cascading replication, introduced with PostgreSQL 9.2, enables a standby server
    to replicate from another standby server rather than directly from the primary,
    meaning replication changes "cascade" down through a hierarchy of servers. This
    can be used to reduce load on the primary and minimize bandwidth usage between
    sites. For more details, see the
    <ulink url="https://www.postgresql.org/docs/current/static/warm-standby.html#CASCADING-REPLICATION">
    PostgreSQL cascading replication documentation</ulink>.
   </para>
   <para>
    &repmgr; supports cascading replication. When cloning a standby,
    set the command-line parameter <literal>--upstream-node-id</literal> to the
    <varname>node_id</varname> of the server the standby should connect to, and
    &repmgr; will create <filename>recovery.conf</filename> to point to it. Note
    that if <literal>--upstream-node-id</literal> is not explicitly provided,
    &repmgr; will set the standby's <filename>recovery.conf</filename> to
    point to the primary node.
   </para>
   <para>
    To demonstrate cascading replication, first ensure you have a primary and standby
    set up as shown in the <xref linkend="quickstart">.
    Then create an additional standby server with <filename>repmgr.conf</filename> looking
    like this:
    <programlisting>
    node_id=3
    node_name=node3
    conninfo='host=node3 user=repmgr dbname=repmgr'
    data_directory='/var/lib/postgresql/data'</programlisting>
   </para>
   <para>
    Clone this standby (using the connection parameters for the existing standby),
    ensuring <literal>--upstream-node-id</literal> is provided with the <varname>node_id</varname>
    of the previously created standby (if following the example, this will be <literal>2</literal>):
    <programlisting>
    $ repmgr -h node2 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --upstream-node-id=2
    NOTICE: using configuration file "/etc/repmgr.conf"
    NOTICE: destination directory "/var/lib/postgresql/data" provided
    INFO: connecting to upstream node
    INFO: connected to source node, checking its state
    NOTICE: checking for available walsenders on upstream node (2 required)
    INFO: sufficient walsenders available on upstream node (2 required)
    INFO: successfully connected to source node
    DETAIL: current installation size is 29 MB
    INFO: creating directory "/var/lib/postgresql/data"...
    NOTICE: starting backup (using pg_basebackup)...
    HINT: this may take some time; consider using the -c/--fast-checkpoint option
    INFO: executing: 'pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h node2 -U repmgr -X stream '
    NOTICE: standby clone (using pg_basebackup) complete
    NOTICE: you can now start your PostgreSQL server
    HINT: for example: pg_ctl -D /var/lib/postgresql/data start</programlisting>
    then register it (note that <literal>--upstream-node-id</literal> must be provided here
    too):
    <programlisting>
     $ repmgr -f /etc/repmgr.conf standby register --upstream-node-id=2
     NOTICE: standby node "node3" (ID: 3) successfully registered
    </programlisting>
   </para>
   <para>
    After starting the standby, the cluster will look like this, showing that <literal>node3</literal>
    is attached to <literal>node2</literal>, not the primary (<literal>node1</literal>).
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr
     3  | node3 | standby |   running | node2    | default  | host=node3 dbname=repmgr user=repmgr
    </programlisting>
   </para>
   <tip>
    <simpara>
     Under some circumstances when setting up a cascading replication
     cluster, you may wish to clone a downstream standby whose upstream node
     does not yet exist. In this case you can clone from the primary (or
     another upstream node); provide the parameter <literal>--upstream-conninfo</literal>
     to explicitly set the upstream's <varname>primary_conninfo</varname> string
     in <filename>recovery.conf</filename>.
    </simpara>
   </tip>
 </sect1>
 <sect1 id="cloning-advanced" xreflabel="Advanced cloning options">
   <indexterm>
     <primary>cloning</primary>
     <secondary>advanced options</secondary>
   </indexterm>
   <title>Advanced cloning options</title>
   <sect2 id="cloning-advanced-pg-basebackup-options" xreflabel="pg_basebackup options when cloning a standby">
    <title>pg_basebackup options when cloning a standby</title>
    <para>
      As &repmgr; uses <command>pg_basebackup</command> to clone a standby, it's possible to
      provide additional parameters for <command>pg_basebackup</command> to customise the
      cloning process.
    </para>
    <para>
     By default, <command>pg_basebackup</command> performs a checkpoint before beginning the backup
     process. However, a normal checkpoint may take some time to complete;
     a fast checkpoint can be forced with <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>'s
     <literal>-c/--fast-checkpoint</literal> option.
     Note that this may impact performance of the server being cloned from (typically the primary)
     so should be used with care.
    </para>
    <tip>
      <simpara>
        If <application>Barman</application> is set up for the cluster, it's possible to
        clone the standby directly from Barman, without any impact on the server the standby
        is being cloned from. For more details see <xref linkend="cloning-from-barman">.
      </simpara>
    </tip>
    <para>
      Other options can be passed to <command>pg_basebackup</command> by including them
      in the <filename>repmgr.conf</filename> setting <varname>pg_basebackup_options</varname>.
    </para>
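    <para>
      For example, to limit the transfer rate of the base backup, a setting
      along the following lines could be used. This is an illustrative sketch:
      <option>--max-rate</option> is a <command>pg_basebackup</command> option,
      but the value shown is an arbitrary assumption and should be chosen to suit
      the local environment:
      <programlisting>
    pg_basebackup_options='--max-rate=10M'</programlisting>
    </para>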
    <para>
      Note that by default, &repmgr; executes <command>pg_basebackup</command> with <option>-X/--wal-method</option>
      (PostgreSQL 9.6 and earlier: <option>-X/--xlog-method</option>) set to <literal>stream</literal>.
      From PostgreSQL 9.6, if replication slots are in use, it will also create a replication slot before
      running the base backup, and execute <command>pg_basebackup</command> with the
      <option>-S/--slot</option> option set to the name of the previously created replication slot.
    </para>
    <para>
      These parameters can be set by the user in <varname>pg_basebackup_options</varname>, in which case they
      will override the &repmgr; default values. However normally there's no reason to do this.
    </para>
    <para>
      If using a separate directory to store WAL files, provide the option <literal>--waldir</literal>
      (<literal>--xlogdir</literal> in PostgreSQL 9.6 and earlier) with the absolute path to the
      WAL directory. Any WALs generated during the cloning process will be copied here, and
      a symlink will automatically be created from the main data directory.
    </para>
    <para>
     See the <ulink url="https://www.postgresql.org/docs/current/static/app-pgbasebackup.html">PostgreSQL pg_basebackup documentation</ulink>
     for more details of available options.
    </para>
   </sect2>
   <sect2 id="cloning-advanced-managing-passwords" xreflabel="Managing passwords">
    <title>Managing passwords</title>
    <indexterm>
      <primary>cloning</primary>
      <secondary>using passwords</secondary>
    </indexterm>
    <para>
     If replication connections to a standby's upstream server are password-protected,
     the standby must be able to provide the password so it can begin streaming replication.
    </para>
    <para>
     The recommended way to do this is to store the password in the <literal>postgres</literal> system
     user's <filename>~/.pgpass</filename> file. It's also possible to store the password in the
     environment variable <varname>PGPASSWORD</varname>, however this is not recommended for
     security reasons. For more details see the
     <ulink url="https://www.postgresql.org/docs/current/static/libpq-pgpass.html">PostgreSQL password file documentation</ulink>.
    </para>
    <note>
      <para>
        If using a <filename>pgpass</filename> file, an entry for the replication user (by default the
        user who connects to the <literal>repmgr</literal> database) <emphasis>must</emphasis>
        be provided, with database name set to <literal>replication</literal>, e.g.:
        <programlisting>
          node1:5432:replication:repmgr:12345</programlisting>
      </para>
    </note>
    <para>
     If, for whatever reason, you wish to include the password in <filename>recovery.conf</filename>,
     set <varname>use_primary_conninfo_password</varname> to <literal>true</literal> in
     <filename>repmgr.conf</filename>. This will read a password set in <varname>PGPASSWORD</varname>
     (but not <filename>~/.pgpass</filename>) and place it into the <varname>primary_conninfo</varname>
     string in <filename>recovery.conf</filename>. Note that <varname>PGPASSWORD</varname>
     will need to be set during any action which causes <filename>recovery.conf</filename> to be
     rewritten, e.g. <xref linkend="repmgr-standby-follow">.
    </para>
    <para>
     It is of course also possible to include the password value in the <varname>conninfo</varname>
     string for each node, but this is obviously a security risk and should be avoided.
    </para>
    <para>
      From PostgreSQL 9.6, <application>libpq</application> supports the <varname>passfile</varname>
      parameter in connection strings, which can be used to specify a password file other than
      the default <filename>~/.pgpass</filename>.
    </para>
    <para>
      To have &repmgr; write a custom password file in <varname>primary_conninfo</varname>,
      specify its location in <varname>passfile</varname> in <filename>repmgr.conf</filename>.
    </para>
   </sect2>
   <sect2 id="cloning-advanced-replication-user" xreflabel="Separate replication user">
    <title>Separate replication user</title>
    <para>
     In some circumstances it might be desirable to create a dedicated replication-only
     user (in addition to the user who manages the &repmgr; metadata). In this case,
     the replication user should be set in <filename>repmgr.conf</filename> via the parameter
     <varname>replication_user</varname>; &repmgr; will use this value when making
     replication connections and generating <filename>recovery.conf</filename>. This
     value will also be stored in the <literal>repmgr.nodes</literal>
     table for each node; it no longer needs to be explicitly specified when
     cloning a node or executing <xref linkend="repmgr-standby-follow">.
    </para>
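    <para>
     A minimal sketch of such a setup, assuming a hypothetical replication-only user
     named <literal>repl_user</literal> which has already been created in PostgreSQL
     with the <literal>REPLICATION</literal> attribute:
     <programlisting>
 replication_user='repl_user'</programlisting>
    </para>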
   </sect2>
 </sect1>
 </chapter>
--- a/doc/configuration-file-log-settings.sgml
+++ b/doc/configuration-file-log-settings.sgml
@@ -0,0 +1,107 @@
 <sect1 id="configuration-file-log-settings" xreflabel="log settings">
  <indexterm>
    <primary>repmgr.conf</primary>
    <secondary>log settings</secondary>
  </indexterm>
  <indexterm>
    <primary>log settings</primary>
    <secondary>configuration in repmgr.conf</secondary>
  </indexterm>
  <title>Log settings</title>
  <para>
    By default, &repmgr; and <application>repmgrd</application> write log output to
    <literal>STDERR</literal>. An alternative log destination can be specified
    (either a file or <literal>syslog</literal>).
  </para>
  <note>
    <para>
      The &repmgr; application itself will continue to write log output to <literal>STDERR</literal>
      even if another log destination is configured, as otherwise any output resulting from a command
      line operation will "disappear" into the log.
    </para>
    <para>
      This behaviour can be overridden with the command line option <option>--log-to-file</option>,
      which will redirect all logging output to the configured log destination. This is recommended
      when &repmgr; is executed by another application, particularly <application>repmgrd</application>,
      to enable log output generated by the &repmgr; application to be stored for later reference.
    </para>
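    <para>
      As a sketch of how this might be used when <application>repmgrd</application> executes
      &repmgr;, the option can be appended to the command configured in
      <filename>repmgr.conf</filename>; the configuration file path
      <filename>/etc/repmgr.conf</filename> shown here is an assumption:
      <programlisting>
  promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'</programlisting>
    </para>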
  </note>
  <variablelist>
   <varlistentry id="repmgr-conf-log-level" xreflabel="log_level">
    <term><varname>log_level</varname> (<type>string</type>)
     <indexterm>
      <primary><varname>log_level</varname> configuration file parameter</primary>
     </indexterm>
    </term>
    <listitem>
     <para>
       One of <option>DEBUG</option>, <option>INFO</option>, <option>NOTICE</option>,
       <option>WARNING</option>, <option>ERROR</option>, <option>ALERT</option>, <option>CRIT</option>
       or <option>EMERG</option>.
     </para>
     <para>
       Default is <option>INFO</option>.
     </para>
     <para>
       Note that <option>DEBUG</option> will produce a substantial amount of log output
       and should not be enabled in normal use.
     </para>
    </listitem>
   </varlistentry>
   <varlistentry id="repmgr-conf-log-facility" xreflabel="log_facility">
    <term><varname>log_facility</varname> (<type>string</type>)
     <indexterm>
      <primary><varname>log_facility</varname> configuration file parameter</primary>
     </indexterm>
    </term>
    <listitem>
     <para>
       Logging facility: possible values are <option>STDERR</option> (default), or for
       syslog integration, one of <option>LOCAL0</option>, <option>LOCAL1</option>, <option>...</option>,
       <option>LOCAL7</option>, <option>USER</option>.
     </para>
    </listitem>
   </varlistentry>
   <varlistentry id="repmgr-conf-log-file" xreflabel="log_file">
    <term><varname>log_file</varname> (<type>string</type>)
     <indexterm>
      <primary><varname>log_file</varname> configuration file parameter</primary>
     </indexterm>
    </term>
    <listitem>
     <para>
       If <xref linkend="repmgr-conf-log-facility"> is set to <option>STDERR</option>, log output
       can be redirected to the specified file.
     </para>
     <para>
       See <xref linkend="repmgrd-log-rotation"> for information on configuring log rotation.
     </para>
    </listitem>
   </varlistentry>
   <varlistentry id="repmgr-conf-log-status-interval" xreflabel="log_status_interval">
    <term><varname>log_status_interval</varname> (<type>integer</type>)
     <indexterm>
      <primary><varname>log_status_interval</varname> configuration file parameter</primary>
     </indexterm>
    </term>
    <listitem>
     <para>
       This setting causes <application>repmgrd</application> to emit a status log
       line at the specified interval (in seconds, default <literal>300</literal>)
       describing <application>repmgrd</application>'s current state, e.g.:
     </para>
     <programlisting>
      [2018-07-12 00:47:32] [INFO] monitoring connection to upstream node "node1" (node ID: 1)</programlisting>
    </listitem>
   </varlistentry>
  </variablelist>
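  <para>
    The settings described above could be combined as in the following sketch, which
    sends log output to a file; the log file path is a hypothetical example and the
    other values are the documented defaults:
    <programlisting>
 log_level=INFO
 log_facility=STDERR
 log_file='/var/log/repmgr/repmgr.log'
 log_status_interval=300</programlisting>
  </para>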
 </sect1>
--- a/doc/configuration-file-required-settings.sgml
+++ b/doc/configuration-file-required-settings.sgml
@@ -0,0 +1,122 @@
 <sect1 id="configuration-file-settings" xreflabel="required configuration file settings">
  <indexterm>
    <primary>repmgr.conf</primary>
    <secondary>required settings</secondary>
  </indexterm>
 <title>Required configuration file settings</title>
 <para>
   Each <filename>repmgr.conf</filename> file must contain the following parameters:
 </para>
 <para>
  <variablelist>
   <varlistentry id="repmgr-conf-node-id" xreflabel="node_id">
    <term><varname>node_id</varname> (<type>int</type>)
     <indexterm>
      <primary><varname>node_id</varname> configuration file parameter</primary>
     </indexterm>
    </term>
    <listitem>
     <para>
      A unique integer greater than zero which identifies the node.
     </para>
    </listitem>
   </varlistentry>
   <varlistentry id="repmgr-conf-node-name" xreflabel="node_name">
    <term><varname>node_name</varname> (<type>string</type>)
     <indexterm>
      <primary><varname>node_name</varname> configuration file parameter</primary>
     </indexterm>
    </term>
    <listitem>
     <para>
       An arbitrary (but unique) string; we recommend using the server's hostname
       or another identifier unambiguously associated with the server to avoid
       confusion. Avoid choosing names which reflect the node's current role,
       e.g. <varname>primary</varname> or <varname>standby1</varname>
       as roles can change, and if you end up in a situation where the current primary is
       called <varname>standby1</varname> (for example), things will be confusing
       to say the least.
     </para>
    </listitem>
   </varlistentry>
   <varlistentry id="repmgr-conf-conninfo" xreflabel="conninfo">
    <term><varname>conninfo</varname> (<type>string</type>)
     <indexterm>
      <primary><varname>conninfo</varname> configuration file parameter</primary>
     </indexterm>
    </term>
    <listitem>
     <para>
      Database connection information as a conninfo string.
      All servers in the cluster must be able to connect to
      the local node using this string.
     </para>
     <para>
       For details on conninfo strings, see section <ulink
       url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING">Connection Strings</>
        in the PostgreSQL documentation.
     </para>
     <para>
        If repmgrd is in use, consider explicitly setting
        <varname>connect_timeout</varname> in the <varname>conninfo</varname>
        string to determine the length of time which elapses before a network
        connection attempt is abandoned; for details see <ulink
        url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNECT-CONNECT-TIMEOUT">
        the PostgreSQL documentation</>.
     </para>
    </listitem>
   </varlistentry>
   <varlistentry id="repmgr-conf-data-directory" xreflabel="data_directory">
    <term><varname>data_directory</varname> (<type>string</type>)
     <indexterm>
      <primary><varname>data_directory</varname> configuration file parameter</primary>
     </indexterm>
    </term>
    <listitem>
     <para>
       The node's data directory. This is needed by repmgr
       when performing operations when the PostgreSQL instance
       is not running and there's no other way of determining
       the data directory.
     </para>
    </listitem>
   </varlistentry>
  </variablelist>
 </para>
  <para>
    For a full list of annotated configuration items, see the file
    <ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>.
  </para>
  <para>
    For <application>repmgrd</application>-specific settings, see <xref linkend="repmgrd-configuration">.
  </para>
  <note>
    <para>
    The following parameters in the configuration file can be overridden with
    command line options:
    <itemizedlist>
     <listitem>
       <simpara>
         <literal>-L/--log-level</literal> overrides <literal>log_level</literal> in
         <filename>repmgr.conf</filename>
       </simpara>
     </listitem>
     <listitem>
       <simpara>
         <literal>-b/--pg_bindir</literal> overrides <literal>pg_bindir</literal> in
         <filename>repmgr.conf</filename>
       </simpara>
     </listitem>
    </itemizedlist>
    </para>
  </note>
 </sect1>
--- a/doc/configuration-file-service-commands.sgml
+++ b/doc/configuration-file-service-commands.sgml
@@ -0,0 +1,130 @@
 <sect1 id="configuration-file-service-commands" xreflabel="service command settings">
  <indexterm>
    <primary>repmgr.conf</primary>
    <secondary>service command settings</secondary>
  </indexterm>
  <indexterm>
    <primary>service command settings</primary>
    <secondary>configuration in repmgr.conf</secondary>
  </indexterm>
  <title>Service command settings</title>
  <para>
    In some circumstances, &repmgr; (and <application>repmgrd</application>) need to
    be able to stop, start or restart PostgreSQL. &repmgr; commands which need to do this
    include <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>,
    <link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link> and
    <link linkend="repmgr-node-rejoin"><command>repmgr node rejoin</command></link>.
  </para>
  <para>
    By default, &repmgr; will use PostgreSQL's <command>pg_ctl</command> utility to control the PostgreSQL
    server. However this can lead to various problems, particularly when PostgreSQL has been
    installed from packages, and especially so if <application>systemd</application> is in use.
  </para>
  <note>
    <para>
      If using <application>systemd</application>, ensure you have <varname>RemoveIPC</varname> set to <literal>off</literal>.
      See the <ulink url="https://wiki.postgresql.org/wiki/Systemd">systemd</ulink>
      entry in the <ulink url="https://wiki.postgresql.org/wiki/Main_Page">PostgreSQL wiki</ulink> for details.
    </para>
  </note>
  <para>
    With this in mind, we recommend <emphasis>always</emphasis> configuring &repmgr; to use the
    available system service commands.
  </para>
  <para>
    To do this, specify the appropriate command for each action
    in <filename>repmgr.conf</filename> using the following configuration
    parameters:
    <programlisting>
    service_start_command
    service_stop_command
    service_restart_command
    service_reload_command</programlisting>
  </para>
  <note>
    <para>
      &repmgr; will not apply <option>pg_bindir</option> when executing any of these commands;
      these can be user-defined scripts so must always be specified with the full path.
    </para>
  </note>
  <note>
    <para>
      It's also possible to specify a <varname>service_promote_command</varname>.
      This is intended for systems which provide a package-level promote command,
      such as Debian's <application>pg_ctlcluster</application>, to promote the
      PostgreSQL instance from standby to primary.
    </para>
    <para>
      If your packaging system does not provide such a command, it can be left empty,
      and &repmgr; will generate the appropriate <command>pg_ctl ... promote</command> command.
    </para>
    <para>
      Do not confuse this with <varname>promote_command</varname>, which is used
      by <application>repmgrd</application> to execute <xref linkend="repmgr-standby-promote">.
    </para>
  </note>
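  <para>
    As an illustrative sketch for a Debian-style installation, assuming a cluster with
    version <literal>9.6</literal> and name <literal>main</literal> (both are assumptions;
    substitute the values for your own cluster), the promote command might be set as follows:
    <programlisting>
      service_promote_command = 'sudo pg_ctlcluster 9.6 main promote'</programlisting>
  </para>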
  <para>
    To confirm which command &repmgr; will execute for each action, use
    <command><link linkend="repmgr-node-service">repmgr node service --list-actions --action=...</link></command>, e.g.:
    <programlisting>
      repmgr -f /etc/repmgr.conf node service --list-actions --action=stop
      repmgr -f /etc/repmgr.conf node service --list-actions --action=start
      repmgr -f /etc/repmgr.conf node service --list-actions --action=restart
      repmgr -f /etc/repmgr.conf node service --list-actions --action=reload</programlisting>
  </para>
  <para>
     These commands will be executed by the system user which &repmgr; runs as (usually <literal>postgres</literal>)
     and will probably require passwordless sudo access to be able to execute the command.
  </para>
  <para>
    For example, using <application>systemd</application> on CentOS 7, the service commands can be
    set as follows:
    <programlisting>
      service_start_command   = 'sudo systemctl start postgresql-9.6'
      service_stop_command    = 'sudo systemctl stop postgresql-9.6'
      service_restart_command = 'sudo systemctl restart postgresql-9.6'
      service_reload_command  = 'sudo systemctl reload postgresql-9.6'</programlisting>
    and <filename>/etc/sudoers</filename> should be set as follows:
    <programlisting>
      Defaults:postgres !requiretty
      postgres ALL = NOPASSWD: /usr/bin/systemctl stop postgresql-9.6, \
        /usr/bin/systemctl start postgresql-9.6, \
        /usr/bin/systemctl restart postgresql-9.6, \
        /usr/bin/systemctl reload postgresql-9.6</programlisting>
  </para>
  <important>
    <indexterm>
      <primary>pg_ctlcluster</primary>
      <secondary>service command settings</secondary>
    </indexterm>
    <para>
      Debian/Ubuntu users: instead of calling <command>sudo systemctl</command> directly, use
      <command>sudo pg_ctlcluster</command>, e.g.:
    <programlisting>
      service_start_command   = 'sudo pg_ctlcluster 9.6 main start'
      service_stop_command    = 'sudo pg_ctlcluster 9.6 main stop'
      service_restart_command = 'sudo pg_ctlcluster 9.6 main restart'
      service_reload_command  = 'sudo pg_ctlcluster 9.6 main reload'</programlisting>
      and set <filename>/etc/sudoers</filename> accordingly.
    </para>
    <para>
      While <command>pg_ctlcluster</command> will work when executed as user <literal>postgres</literal>,
      it's strongly recommended to use <command>sudo pg_ctlcluster</command> on <application>systemd</application>
      systems, to ensure <application>systemd</application> has a correct picture of
      the PostgreSQL application state.
    </para>
  </important>
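  <para>
    A possible <filename>/etc/sudoers</filename> entry for the <command>pg_ctlcluster</command>
    commands shown above is sketched below. It assumes <command>pg_ctlcluster</command> is
    installed as <filename>/usr/bin/pg_ctlcluster</filename> and that the cluster is
    <literal>9.6 main</literal>; check both on your system before using it.
    <programlisting>
      Defaults:postgres !requiretty
      postgres ALL = NOPASSWD: /usr/bin/pg_ctlcluster 9.6 main start, \
        /usr/bin/pg_ctlcluster 9.6 main stop, \
        /usr/bin/pg_ctlcluster 9.6 main restart, \
        /usr/bin/pg_ctlcluster 9.6 main reload</programlisting>
  </para>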
 </sect1>
--- a/doc/configuration-file.sgml
+++ b/doc/configuration-file.sgml
@@ -0,0 +1,120 @@
 <sect1 id="configuration-file" xreflabel="configuration file">
  <indexterm>
    <primary>repmgr.conf</primary>
  </indexterm>
  <indexterm>
    <primary>configuration</primary>
    <secondary>repmgr.conf</secondary>
  </indexterm>
  <title>Configuration file</title>
  <para>
    <application>repmgr</application> and <application>repmgrd</application>
    use a common configuration file, by default called
    <filename>repmgr.conf</filename> (although any name can be used if explicitly specified).
    <filename>repmgr.conf</filename> must contain a number of required parameters, including
    the database connection string for the local node and the location
    of its data directory; other values will be inferred from defaults if
    not explicitly supplied. See section <xref linkend="configuration-file-settings">
    for more details.
  </para>
  <sect2 id="configuration-file-format" xreflabel="configuration file format">
    <indexterm>
      <primary>repmgr.conf</primary>
      <secondary>format</secondary>
    </indexterm>
    <title>Configuration file format</title>
    <para>
      <filename>repmgr.conf</filename> is a plain text file with one parameter/value
      combination per line.
    </para>
    <para>
      Whitespace is insignificant (except within a quoted parameter value) and blank lines are ignored.
      Hash marks (#) designate the remainder of the line as a comment. Parameter values that are not simple
      identifiers or numbers should be single-quoted. Note that a single quote cannot be embedded
      in a parameter value.
    </para>
    <important>
      <para>
        &repmgr; will interpret double-quotes as being part of a string value; only use single quotes
        to quote parameter values.
      </para>
    </important>
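    <para>
      The quoting rule can be illustrated with the following sketch (host, database and
      user names are examples only); the second, commented-out line shows the form to avoid:
      <programlisting>
 # single quotes delimit the value
 conninfo='host=node1 dbname=repmgr user=repmgr'
 # double quotes would be treated as part of the value itself
 #conninfo="host=node1 dbname=repmgr user=repmgr"</programlisting>
    </para>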
    <para>
      Example of a valid <filename>repmgr.conf</filename> file:
      <programlisting>
 # repmgr.conf
 node_id=1
 node_name= node1
 conninfo ='host=node1 dbname=repmgr user=repmgr connect_timeout=2'
 data_directory = /var/lib/pgsql/11/data</programlisting>
    </para>
  </sect2>
  <sect2 id="configuration-file-location" xreflabel="configuration file location">
  <indexterm>
    <primary>repmgr.conf</primary>
    <secondary>location</secondary>
  </indexterm>
  <title>Configuration file location</title>
  <para>
   The configuration file will be searched for in the following locations:
   <itemizedlist spacing="compact" mark="bullet">
    <listitem>
     <para>a configuration file specified by the <literal>-f/--config-file</literal> command line option</para>
    </listitem>
    <listitem>
     <para>
      a location specified by the package maintainer (if <application>repmgr</application>
      was installed from a package and the package maintainer has specified the configuration
      file location)
     </para>
    </listitem>
    <listitem>
     <para><filename>repmgr.conf</filename> in the local directory</para>
    </listitem>
    <listitem>
      <para><filename>/etc/repmgr.conf</filename></para>
    </listitem>
    <listitem>
     <para>the directory reported by <application>pg_config --sysconfdir</application></para>
    </listitem>
   </itemizedlist>
  </para>
  <para>
   Note that if a file is explicitly specified with <literal>-f/--config-file</literal>,
   an error will be raised if it is not found or not readable, and no attempt will be made to
   check default locations; this is to prevent <application>repmgr</application> unexpectedly
   reading the wrong configuration file.
  </para>
  <note>
    <para>
      If providing the configuration file location with <literal>-f/--config-file</literal>,
      avoid using a relative path, particularly when executing <xref linkend="repmgr-primary-register">
      and <xref linkend="repmgr-standby-register">, as &repmgr; stores the configuration file location
      in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
      <xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert
      a relative path into an absolute one, but this may not be the same as the path you
      would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
      to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
      <filename>/path/to/repmgr.conf</filename>).
    </para>
   </note>
   </sect2>
 </sect1>
--- a/doc/configuration.sgml
+++ b/doc/configuration.sgml
@@ -0,0 +1,312 @@
 <chapter id="configuration" xreflabel="Configuration">
  <title>repmgr configuration</title>
  <sect1 id="configuration-prerequisites" xreflabel="Prerequisites for configuration">
    <indexterm>
      <primary>configuration</primary>
      <secondary>prerequisites</secondary>
    </indexterm>
    <indexterm>
      <primary>configuration</primary>
      <secondary>ssh</secondary>
    </indexterm>
    <title>Prerequisites for configuration</title>
    <para>
     The following software must be installed on both servers:
     <itemizedlist spacing="compact" mark="bullet">
      <listitem>
       <simpara><application>PostgreSQL</application></simpara>
      </listitem>
      <listitem>
       <simpara>
        <application>repmgr</application>
       </simpara>
      </listitem>
     </itemizedlist>
    </para>
    <para>
      At network level, connections to the PostgreSQL port (default: <literal>5432</literal>)
      must be possible between all nodes.
    </para>
    <para>
      Passwordless <command>SSH</command> connectivity between all servers in the replication cluster
      is not required, but is necessary in the following cases:
      <itemizedlist>
        <listitem>
          <simpara>if you need &repmgr; to copy configuration files from outside the PostgreSQL
            data directory (as is the case with e.g. <link linkend="packages-debian-ubuntu">Debian packages</link>);
            in this case <command>rsync</command> must also be installed on all servers.
          </simpara>
        </listitem>
        <listitem>
          <simpara>to perform <link linkend="performing-switchover">switchover operations</link></simpara>
        </listitem>
        <listitem>
          <simpara>
            when executing <command><link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link></command>
            and <command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command>
          </simpara>
        </listitem>
      </itemizedlist>
    </para>
    <tip>
      <simpara>
        Consider setting <varname>ConnectTimeout</varname> to a low value in your SSH configuration.
        This will make it faster to detect any SSH connection errors.
      </simpara>
    </tip>
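    <para>
      For example, a sketch of an entry in the SSH client configuration
      (<filename>~/.ssh/config</filename>) of the user which runs &repmgr;, assuming
      hypothetical node hostnames <literal>node1</literal> and <literal>node2</literal>
      and an arbitrarily chosen timeout of five seconds:
      <programlisting>
  Host node1 node2
      ConnectTimeout 5</programlisting>
    </para>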
  <sect2 id="configuration-postgresql" xreflabel="PostgreSQL configuration">
    <indexterm>
      <primary>configuration</primary>
      <secondary>PostgreSQL</secondary>
    </indexterm>
    <indexterm>
      <primary>PostgreSQL configuration</primary>
    </indexterm>
    <title>PostgreSQL configuration for &repmgr;</title>
    <para>
      The following PostgreSQL configuration parameters may need to be changed in order
      for &repmgr; (and replication itself) to function correctly.
    </para>
    <variablelist>
      <varlistentry>
        <indexterm>
          <primary>hot_standby</primary>
          <secondary>PostgreSQL configuration</secondary>
        </indexterm>
        <term><option>hot_standby</option></term>
        <listitem>
          <para>
            <option>hot_standby</option> must always be set to <literal>on</literal>, as &repmgr; needs
            to be able to connect to each server it manages.
          </para>
          <para>
            Note that <option>hot_standby</option> defaults to <literal>on</literal> in PostgreSQL 10
            and later; in PostgreSQL 9.6 and earlier, the default was <literal>off</literal>.
          </para>
          <para>
            PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-HOT-STANDBY">hot_standby</ulink>.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <indexterm>
          <primary>wal_level</primary>
          <secondary>PostgreSQL configuration</secondary>
        </indexterm>
        <term><option>wal_level</option></term>
        <listitem>
          <para>
            <option>wal_level</option> must be one of <option>replica</option> or <option>logical</option>
            (PostgreSQL 9.5 and earlier: one of <option>hot_standby</option> or <option>logical</option>).
          </para>
          <para>
            PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-LEVEL">wal_level</ulink>.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <indexterm>
          <primary>max_wal_senders</primary>
          <secondary>PostgreSQL configuration</secondary>
        </indexterm>
        <term><option>max_wal_senders</option></term>
        <listitem>
          <para>
            <option>max_wal_senders</option> must be set to a value of <literal>2</literal> or greater.
            In general you will need one WAL sender for each standby which will attach to the PostgreSQL
            instance; additionally &repmgr; will require two free WAL senders in order to clone further
            standbys.
          </para>
          <para>
            <option>max_wal_senders</option> should be set to an appropriate value on all PostgreSQL
            instances in the replication cluster which may potentially become a primary server or
            (in cascading replication) the upstream server of a standby.
          </para>
          <para>
            PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-WAL-SENDERS">max_wal_senders</ulink>.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <indexterm>
          <primary>max_replication_slots</primary>
          <secondary>PostgreSQL configuration</secondary>
        </indexterm>
        <term><option>max_replication_slots</option></term>
        <listitem>
          <para>
            If you are intending to use replication slots, <option>max_replication_slots</option>
            must be set to a non-zero value.
          </para>
          <para>
            <option>max_replication_slots</option> should be set to an appropriate value on all PostgreSQL
            instances in the replication cluster which may potentially become a primary server or
            (in cascading replication) the upstream server of a standby.
          </para>
          <para>
            PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-REPLICATION-SLOTS">max_replication_slots</ulink>.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <indexterm>
          <primary>wal_log_hints</primary>
          <secondary>PostgreSQL configuration</secondary>
        </indexterm>
        <term><option>wal_log_hints</option></term>
        <listitem>
          <para>If you are intending to use <application>pg_rewind</application>,
            and the cluster was not initialised using data checksums, you may want to consider enabling
            <option>wal_log_hints</option>.
          </para>
          <para>
            For more details see <xref linkend="repmgr-node-rejoin-pg-rewind">.
          </para>
          <para>
            PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-LOG-HINTS">wal_log_hints</ulink>.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <indexterm>
          <primary>archive_mode</primary>
          <secondary>PostgreSQL configuration</secondary>
        </indexterm>
        <term><option>archive_mode</option></term>
        <listitem>
          <para>
            We suggest setting <option>archive_mode</option> to <literal>on</literal> (and
            <option>archive_command</option> to <literal>/bin/true</literal>; see below)
            even if you are currently not planning to use WAL file archiving.
          </para>
          <para>
            This will make it simpler to set up WAL file archiving if it is ever required,
            as changes to <option>archive_mode</option> require a full PostgreSQL server
            restart, while <option>archive_command</option> changes can be applied via a normal
            configuration reload.
          </para>
          <para>
            However, &repmgr; itself does not require WAL file archiving.
          </para>
          <para>
            PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-MODE">archive_mode</ulink>.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <indexterm>
          <primary>archive_command</primary>
          <secondary>PostgreSQL configuration</secondary>
        </indexterm>
        <term><option>archive_command</option></term>
        <listitem>
          <para>
            If you have set  <option>archive_mode</option> to <literal>on</literal> but are not currently planning
            to use WAL file archiving, set <option>archive_command</option> to a command which does nothing but returns
            <literal>true</literal>, such as <command>/bin/true</command>. See above for details.
          </para>
          <para>
            PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-COMMAND">archive_command</ulink>.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <indexterm>
          <primary>wal_keep_segments</primary>
          <secondary>PostgreSQL configuration</secondary>
        </indexterm>
        <term><option>wal_keep_segments</option></term>
        <listitem>
          <para>
            Normally there is no need to set <option>wal_keep_segments</option> (default: <literal>0</literal>), as it
            is <emphasis>not</emphasis> a reliable way of ensuring that all required WAL segments are available to standbys.
            Replication slots and/or an archiving solution such as Barman are recommended to ensure standbys have a reliable
            source of WAL segments at all times.
          </para>
          <para>
            The only reason ever to set <option>wal_keep_segments</option> is if
            you have configured <option>pg_basebackup_options</option>
            in <filename>repmgr.conf</filename> to include the setting <literal>--wal-method=fetch</literal>
            (PostgreSQL 9.6 and earlier: <literal>--xlog-method=fetch</literal>)
            <emphasis>and</emphasis> you have <emphasis>not</emphasis> set <option>restore_command</option>
            in <filename>repmgr.conf</filename> to fetch WAL files from a reliable source such as Barman,
            in which case you'll need to set <option>wal_keep_segments</option>
            to a sufficiently high number to ensure that all WAL files required by the standby
            are retained. However we do not recommend managing replication in this way.
          </para>
          <para>
            PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-WAL-KEEP-SEGMENTS">wal_keep_segments</ulink>.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
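    <para>
      Taken together, the parameters above might appear in <filename>postgresql.conf</filename>
      as in the following sketch. The numeric values are illustrative assumptions and should
      be sized for the number of standbys in your cluster; <option>wal_level</option> is
      shown for PostgreSQL 9.6 and later.
      <programlisting>
    hot_standby = on
    wal_level = replica
    max_wal_senders = 10
    max_replication_slots = 10
    wal_log_hints = on
    archive_mode = on
    archive_command = '/bin/true'</programlisting>
    </para>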
    <para>
      See also the <link linkend="quickstart-postgresql-configuration">PostgreSQL configuration</link> section in the
      <link linkend="quickstart">Quick-start guide</link>.
    </para>
  </sect2>
  </sect1>
  &configuration-file;
  &configuration-file-required-settings;
  &configuration-file-log-settings;
  &configuration-file-service-commands;
  <sect1 id="configuration-permissions" xreflabel="Database user permissions">
    <indexterm>
      <primary>configuration</primary>
      <secondary>database user permissions</secondary>
    </indexterm>
    <title>repmgr database user permissions</title>
    <para>
      &repmgr; will create an extension database containing objects
      for administering &repmgr; metadata. The user defined in the <varname>conninfo</varname>
      setting must be able to access all objects. Additionally, superuser permissions
      are required to install the &repmgr; extension. The easiest way to do this
      is to create the &repmgr; user as a superuser; however, if this is not
      desirable, the &repmgr; user can be created as a normal user and a
      superuser specified with <literal>--superuser</literal> when registering a &repmgr; node.
    </para>
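    <para>
      As a sketch of the second approach, assuming the &repmgr; user is a normal user,
      an existing superuser named <literal>postgres</literal> is available, and the
      configuration file is located at <filename>/etc/repmgr.conf</filename> (all three
      are assumptions for illustration):
      <programlisting>
  repmgr -f /etc/repmgr.conf primary register --superuser=postgres</programlisting>
    </para>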
  </sect1>
 </chapter>
--- a/doc/configuring-witness-server.sgml
+++ b/doc/configuring-witness-server.sgml
@@ -0,0 +1,93 @@
 <chapter id="using-witness-server">
 <indexterm>
  <primary>witness server</primary>
  <seealso>Using a witness server with repmgrd</seealso>
 </indexterm>
 <title>Using a witness server</title>
 <para>
   A <xref linkend="witness-server"> is a normal PostgreSQL instance which
   is not part of the streaming replication cluster; its purpose is, if a
   failover situation occurs, to provide proof that the primary server
   itself is unavailable.
 </para>
 <para>
   A typical use case for a witness server is a two-node streaming replication
   setup, where the primary and standby are in different locations (data centres).
   By creating a witness server in the same location (data centre) as the primary,
   if the primary becomes unavailable it's possible for the standby to decide whether
   it can promote itself without risking a "split brain" scenario: if it can't see either the
   witness or the primary server, it's likely there's a network-level interruption
    and it should not promote itself. If it can see the witness but not the primary,
   this proves there is no network interruption and the primary itself is unavailable,
   and it can therefore promote itself (and ideally take action to fence the
   former primary).
 </para>
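  <para>
    The decision described above can be summarised as a small decision table. The
    following shell function is an illustrative sketch only and is not the logic
    <application>repmgrd</application> actually implements; the function name and
    its two arguments (<literal>1</literal> if the standby can reach the node,
    <literal>0</literal> otherwise) are assumptions made for this example:
    <programlisting>
 # Illustrative only: not repmgrd's actual implementation.
 # $1: 1 if the primary is reachable from the standby, 0 otherwise
 # $2: 1 if the witness is reachable from the standby, 0 otherwise
 decide_promotion() {
     primary_visible=$1
     witness_visible=$2

     if [ "$primary_visible" = "1" ]; then
         echo "no action: primary is reachable"
     elif [ "$witness_visible" = "1" ]; then
         # Witness visible, primary not: the network is intact.
         echo "promote: primary is down, network is intact"
     else
         # Neither node visible: the standby itself is probably cut off.
         echo "do not promote: probable network interruption"
     fi
 }</programlisting>
  </para>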
 <note>
   <para>
     <emphasis>Never</emphasis> install a witness server on the same physical host
     as another node in the replication cluster managed by &repmgr; - it's essential
     the witness is not affected in any way by failure of another node.
   </para>
 </note>
 <para>
    For more complex replication scenarios, e.g. with multiple datacentres, it may
   be preferable to use location-based failover, which ensures that only nodes
   in the same location as the primary will ever be promotion candidates;
   see <xref linkend="repmgrd-network-split"> for more details.
 </para>
 <note>
   <simpara>
     A witness server will only be useful if <application>repmgrd</application>
     is in use.
   </simpara>
 </note>
 <sect1 id="creating-witness-server">
   <title>Creating a witness server</title>
 <para>
   To create a witness server, set up a normal PostgreSQL instance on a server
   in the same physical location as the cluster's primary server.
 </para>
 <para>
    This instance should <emphasis>not</emphasis> be on the same physical host as the primary server,
   as otherwise if the primary server fails due to hardware issues, the witness
   server will be lost too.
 </para>
 <note>
   <simpara>
     &repmgr; 3.3 and earlier provided a <command>repmgr create witness</command>
     command, which would automatically create a PostgreSQL instance. However
     this often resulted in an unsatisfactory, hard-to-customise instance.
   </simpara>
 </note>
 <para>
   The witness server should be configured in the same way as a normal
   &repmgr; node; see section <xref linkend="configuration">.
 </para>
 <para>
   Register the witness server with <xref linkend="repmgr-witness-register">.
   This will create the &repmgr; extension on the witness server, and make
   a copy of the &repmgr; metadata.
 </para>
 <note>
   <simpara>
    As the witness server is not part of the replication cluster, further
    changes to the &repmgr; metadata will be synchronised by
    <application>repmgrd</application>.
   </simpara>
 </note>
 <para>
   Once the witness server has been configured, <application>repmgrd</application>
   should be started; for more details see <xref linkend="repmgrd-witness-server">.
 </para>
 <para>
  To unregister a witness server, use <xref linkend="repmgr-witness-unregister">.
 </para>
 </sect1>
 </chapter>
--- a/doc/event-notifications.sgml
+++ b/doc/event-notifications.sgml
@@ -0,0 +1,292 @@
 <chapter id="event-notifications" xreflabel="event notifications">
 <indexterm>
   <primary>event notifications</primary>
 </indexterm>
 <title>Event Notifications</title>
 <para>
   Each time &repmgr; or <application>repmgrd</application> performs a significant action, a record
   of that event is written into the <literal>repmgr.events</literal> table together with
   a timestamp, an indication of failure or success, and further details
   if appropriate. This is useful for gaining an overview of events
   affecting the replication cluster. Note, however, that this table is
   advisory in nature and should be used in combination with the &repmgr;
   and PostgreSQL logs to obtain details of any events.
 </para>
 <para>
  Example output after a primary was registered and a standby cloned
  and registered:
  <programlisting>
    repmgr=# SELECT * from repmgr.events ;
     node_id |      event       | successful |        event_timestamp        |                                       details
    ---------+------------------+------------+-------------------------------+-------------------------------------------------------------------------------------
           1 | primary_register | t          | 2016-01-08 15:04:39.781733+09 |
           2 | standby_clone    | t          | 2016-01-08 15:04:49.530001+09 | Cloned from host 'repmgr_node1', port 5432; backup method: pg_basebackup; --force: N
           2 | standby_register | t          | 2016-01-08 15:04:50.621292+09 |
    (3 rows)</programlisting>
 </para>
 <para>
  Alternatively, use <xref linkend="repmgr-cluster-event"> to output a
  formatted list of events.
 </para>
 <para>
  Additionally, event notifications can be passed to a user-defined program
  or script which can take further action, e.g. send email notifications.
  This is done by setting the <literal>event_notification_command</literal> parameter in
  <filename>repmgr.conf</filename>.
 </para>
 <para>
  The following format placeholders are provided for all event notifications:
 </para>
 <variablelist>
  <varlistentry>
   <term><option>%n</option></term>
   <listitem>
    <para>
      node ID
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><option>%e</option></term>
   <listitem>
    <para>
     event type
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><option>%s</option></term>
   <listitem>
    <para>
     success (1) or failure (0)
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><option>%t</option></term>
   <listitem>
    <para>
     timestamp
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><option>%d</option></term>
   <listitem>
    <para>
     details
    </para>
   </listitem>
  </varlistentry>
 </variablelist>
 <para>
  The values provided for <literal>%t</literal> and <literal>%d</literal>
  will probably contain spaces, so should be quoted in the provided command
  configuration, e.g.:
  <programlisting>
    event_notification_command='/path/to/some/script %n %e %s "%t" "%d"'
  </programlisting>
 </para>
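  <para>
   As a minimal sketch of what such a script might look like, the following POSIX
   shell script accepts the five placeholders in the order shown above and forwards
   a one-line summary to syslog. The function name, the message format and the
   <literal>repmgr-event</literal> syslog tag are assumptions made for this example
   and are not provided by &repmgr;:
   <programlisting>
 #!/bin/sh
 # Hypothetical handler for event_notification_command.
 # Expected arguments, in order: %n %e %s "%t" "%d"

 # Build a one-line summary from the five standard placeholders.
 format_event() {
     node_id=$1
     event=$2
     success=$3
     timestamp=$4
     details=$5

     # %s is passed as 1 for success and 0 for failure.
     if [ "$success" = "1" ]; then
         status="succeeded"
     else
         status="FAILED"
     fi

     printf '%s node %s: %s %s' "$timestamp" "$node_id" "$event" "$status"
     # Details may be empty, so append them only when present.
     if [ -n "$details" ]; then
         printf ' (%s)' "$details"
     fi
     printf '\n'
 }

 # Forward the summary to syslog only when all five arguments were supplied.
 if [ "$#" -eq 5 ]; then
     format_event "$@" | logger -t repmgr-event
 fi</programlisting>
  </para>
  <para>
   Because <literal>%t</literal> and <literal>%d</literal> are quoted in the command
   definition, each arrives as a single argument even when it contains spaces; if the
   quotes were omitted, the argument count would not be five and this sketch would
   silently do nothing.
  </para>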
 <para>
   The following parameters are provided for a subset of event notifications:
 </para>
 <variablelist>
  <varlistentry>
   <term><option>%p</option></term>
   <listitem>
    <para>
     node ID of the current primary (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
    </para>
    <para>
     node ID of the demoted primary (<xref linkend="repmgr-standby-switchover"> only)
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><option>%c</option></term>
   <listitem>
    <para>
     <literal>conninfo</literal> string of the primary node
     (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
    </para>
    <para>
      <literal>conninfo</literal> string of the next available node
      (<varname>bdr_failover</varname> and  <varname>bdr_recovery</varname>)
    </para>
   </listitem>
  </varlistentry>
  <varlistentry>
   <term><option>%a</option></term>
   <listitem>
    <para>
     name of the current primary node (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
    </para>
    <para>
     name of the next available node (<varname>bdr_failover</varname> and  <varname>bdr_recovery</varname>)
    </para>
   </listitem>
  </varlistentry>
 </variablelist>
 <para>
  The values provided for <literal>%c</literal> and <literal>%a</literal>
  will probably contain spaces, so should always be quoted.
 </para>
 <para>
  By default, all notification types will be passed to the designated script;
  the notification types can be filtered to explicitly named ones using the
  <varname>event_notifications</varname> parameter.
 </para>
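  <para>
   For example, to pass only promotion-related events to the script, a setting
   along the following lines could be used in <filename>repmgr.conf</filename>.
   This is a sketch which assumes the parameter takes a comma-separated list of
   event names; the particular selection of events is purely illustrative:
   <programlisting>
     event_notifications='standby_promote,repmgrd_failover_promote'</programlisting>
  </para>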
 <para>
   Events generated by the &repmgr; command:
  <itemizedlist spacing="compact" mark="bullet">
   <listitem>
     <simpara><literal><link linkend="repmgr-primary-register-events">cluster_created</link></literal></simpara>
   </listitem>
   <listitem>
     <simpara><literal><link linkend="repmgr-primary-register-events">primary_register</link></literal></simpara>
   </listitem>
   <listitem>
     <simpara><literal><link linkend="repmgr-primary-unregister-events">primary_unregister</link></literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal><link linkend="repmgr-standby-clone-events">standby_clone</link></literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal><link linkend="repmgr-standby-register-events">standby_register</link></literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal><link linkend="repmgr-standby-register-events">standby_register_sync</link></literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal><link linkend="repmgr-standby-unregister-events">standby_unregister</link></literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal><link linkend="repmgr-standby-promote-events">standby_promote</link></literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal><link linkend="repmgr-standby-follow-events">standby_follow</link></literal></simpara>
   </listitem>
   <listitem>
     <simpara><literal><link linkend="repmgr-standby-switchover-events">standby_switchover</link></literal></simpara>
   </listitem>
   <listitem>
     <simpara><literal><link linkend="repmgr-witness-register-events">witness_register</link></literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal><link linkend="repmgr-witness-unregister-events">witness_unregister</link></literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal><link linkend="repmgr-node-rejoin-events">node_rejoin</link></literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal><link linkend="repmgr-cluster-cleanup-events">cluster_cleanup</link></literal></simpara>
   </listitem>
  </itemizedlist>
 </para>
 <para>
   Events generated by <application>repmgrd</application> (streaming replication mode):
   <itemizedlist spacing="compact" mark="bullet">
   <listitem>
    <simpara><literal>repmgrd_start</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_shutdown</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_reload</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_failover_promote</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_failover_follow</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_failover_aborted</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_standby_reconnect</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_promote_error</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_local_disconnect</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_local_reconnect</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_upstream_disconnect</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>repmgrd_upstream_reconnect</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>standby_disconnect_manual</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>standby_failure</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>standby_recovery</literal></simpara>
   </listitem>
   </itemizedlist>
 </para>
  <para>
   Events generated by <application>repmgrd</application> (BDR mode):
   <itemizedlist spacing="compact" mark="bullet">
   <listitem>
    <simpara><literal>bdr_failover</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>bdr_reconnect</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>bdr_recovery</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>bdr_register</literal></simpara>
   </listitem>
   <listitem>
    <simpara><literal>bdr_unregister</literal></simpara>
   </listitem>
  </itemizedlist>
 </para>
 <para>
  Note that under some circumstances (e.g. when no replication cluster primary
  could be located), it will not be possible to write an entry into the
  <literal>repmgr.events</literal>
  table, in which case executing a script via <varname>event_notification_command</varname>
  can serve as a fallback by generating some form of notification.
 </para>
 </chapter>
--- a/doc/filelist.sgml
+++ b/doc/filelist.sgml
@@ -0,0 +1,93 @@
 <!-- doc/filelist.sgml -->
 <!ENTITY legal      SYSTEM "legal.sgml">
 <!ENTITY bookindex  SYSTEM "bookindex.sgml">
 <!--
 Some parts of the documentation are also source for some plain-text
 files used during installation.  To selectively ignore or include
 some parts (e.g., external xref's) when generating these files we use
 these parameter entities.  See also standalone-install.sgml.
 -->
 <!ENTITY % standalone-ignore  "INCLUDE">
 <!ENTITY % standalone-include "IGNORE">
 <!-- doc/filelist.sgml -->
 <!--
 By default, no index is included.  Use -i include-index on the command line
 to include it.
 -->
 <!ENTITY % include-index "IGNORE">
 <!--
 Create empty index element for processing by XSLT stylesheet.
 -->
 <!ENTITY % include-xslt-index "IGNORE">
 <!--
 Include external documentation sections
 -->
 <!ENTITY overview      SYSTEM "overview.sgml">
 <!ENTITY install SYSTEM "install.sgml">
 <!ENTITY install-requirements      SYSTEM "install-requirements.sgml">
 <!ENTITY install-packages      SYSTEM "install-packages.sgml">
 <!ENTITY install-source      SYSTEM "install-source.sgml">
 <!ENTITY quickstart      SYSTEM "quickstart.sgml">
 <!ENTITY configuration      SYSTEM "configuration.sgml">
 <!ENTITY configuration-file      SYSTEM "configuration-file.sgml">
 <!ENTITY configuration-file-required-settings      SYSTEM "configuration-file-required-settings.sgml">
 <!ENTITY configuration-file-log-settings      SYSTEM "configuration-file-log-settings.sgml">
 <!ENTITY configuration-file-service-commands   SYSTEM "configuration-file-service-commands.sgml">
 <!ENTITY cloning-standbys  SYSTEM "cloning-standbys.sgml">
 <!ENTITY promoting-standby  SYSTEM "promoting-standby.sgml">
 <!ENTITY follow-new-primary  SYSTEM "follow-new-primary.sgml">
 <!ENTITY switchover  SYSTEM "switchover.sgml">
 <!ENTITY configuring-witness-server SYSTEM "configuring-witness-server.sgml">
 <!ENTITY event-notifications  SYSTEM "event-notifications.sgml">
 <!ENTITY upgrading-repmgr  SYSTEM "upgrading-repmgr.sgml">
 <!ENTITY repmgrd-automatic-failover SYSTEM "repmgrd-automatic-failover.sgml">
 <!ENTITY repmgrd-configuration SYSTEM "repmgrd-configuration.sgml">
 <!ENTITY repmgrd-demonstration SYSTEM "repmgrd-demonstration.sgml">
 <!ENTITY repmgrd-monitoring SYSTEM "repmgrd-monitoring.sgml">
 <!ENTITY repmgrd-degraded-monitoring SYSTEM "repmgrd-degraded-monitoring.sgml">
 <!ENTITY repmgrd-cascading-replication SYSTEM "repmgrd-cascading-replication.sgml">
 <!ENTITY repmgrd-network-split SYSTEM "repmgrd-network-split.sgml">
 <!ENTITY repmgrd-witness-server SYSTEM "repmgrd-witness-server.sgml">
 <!ENTITY repmgrd-pausing SYSTEM "repmgrd-pausing.sgml">
 <!ENTITY repmgrd-bdr SYSTEM "repmgrd-bdr.sgml">
 <!ENTITY repmgr-primary-register SYSTEM "repmgr-primary-register.sgml">
 <!ENTITY repmgr-primary-unregister SYSTEM "repmgr-primary-unregister.sgml">
 <!ENTITY repmgr-standby-clone SYSTEM "repmgr-standby-clone.sgml">
 <!ENTITY repmgr-standby-register SYSTEM "repmgr-standby-register.sgml">
 <!ENTITY repmgr-standby-unregister SYSTEM "repmgr-standby-unregister.sgml">
 <!ENTITY repmgr-standby-promote SYSTEM "repmgr-standby-promote.sgml">
 <!ENTITY repmgr-standby-follow SYSTEM "repmgr-standby-follow.sgml">
 <!ENTITY repmgr-standby-switchover SYSTEM "repmgr-standby-switchover.sgml">
 <!ENTITY repmgr-witness-register SYSTEM "repmgr-witness-register.sgml">
 <!ENTITY repmgr-witness-unregister SYSTEM "repmgr-witness-unregister.sgml">
 <!ENTITY repmgr-node-status SYSTEM "repmgr-node-status.sgml">
 <!ENTITY repmgr-node-check SYSTEM "repmgr-node-check.sgml">
 <!ENTITY repmgr-node-rejoin SYSTEM "repmgr-node-rejoin.sgml">
 <!ENTITY repmgr-node-service SYSTEM "repmgr-node-service.sgml">
 <!ENTITY repmgr-cluster-show SYSTEM "repmgr-cluster-show.sgml">
 <!ENTITY repmgr-cluster-matrix SYSTEM "repmgr-cluster-matrix.sgml">
 <!ENTITY repmgr-cluster-crosscheck SYSTEM "repmgr-cluster-crosscheck.sgml">
 <!ENTITY repmgr-cluster-event SYSTEM "repmgr-cluster-event.sgml">
 <!ENTITY repmgr-cluster-cleanup SYSTEM "repmgr-cluster-cleanup.sgml">
 <!ENTITY repmgr-daemon-status SYSTEM "repmgr-daemon-status.sgml">
 <!ENTITY repmgr-daemon-pause SYSTEM "repmgr-daemon-pause.sgml">
 <!ENTITY repmgr-daemon-unpause SYSTEM "repmgr-daemon-unpause.sgml">
 <!ENTITY appendix-release-notes  SYSTEM "appendix-release-notes.sgml">
 <!ENTITY appendix-faq      SYSTEM "appendix-faq.sgml">
 <!ENTITY appendix-signatures      SYSTEM "appendix-signatures.sgml">
 <!ENTITY appendix-packages      SYSTEM "appendix-packages.sgml">
 <!ENTITY bookindex  SYSTEM "bookindex.sgml">
--- a/doc/follow-new-primary.sgml
+++ b/doc/follow-new-primary.sgml
@@ -0,0 +1,48 @@
 <chapter id="follow-new-primary">
 <indexterm>
  <primary>Following a new primary</primary>
  <seealso>repmgr standby follow</seealso>
 </indexterm>
 <title>Following a new primary</title>
 <para>
   Following the failure or removal of the replication cluster's existing primary
   server, <xref linkend="repmgr-standby-follow"> can be used to make 'orphaned' standbys
   follow the new primary and catch up to its current state.
 </para>
 <para>
  To demonstrate this, assuming a replication cluster in the same state as the
  end of the preceding section (<xref linkend="promoting-standby">),
  execute this:
  <programlisting>
     $ repmgr -f /etc/repmgr.conf standby follow
    INFO: changing node 3's primary to node 2
    NOTICE: restarting server using "pg_ctl -l /var/log/postgresql/startup.log -w -D '/var/lib/postgresql/data' restart"
    waiting for server to shut down......... done
    server stopped
    waiting for server to start.... done
    server started
    NOTICE: STANDBY FOLLOW successful
    DETAIL: node 3 is now attached to node 2
  </programlisting>
 </para>
 <para>
   The standby is now replicating from the new primary and
   <command><link linkend="repmgr-cluster-show">repmgr cluster show</link></command>
   output reflects this:
   <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | - failed  |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | primary | * running |          | default  | host=node2 dbname=repmgr user=repmgr
     3  | node3 | standby |   running | node2    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
 </para>
 <para>
  Note that with cascading replication, <command>repmgr standby follow</command> can also be
  used to detach a standby from its current upstream server and follow the
  primary. However it's currently not possible to have it follow another standby;
  we hope to improve this in a future release.
 </para>
 </chapter>
--- a/doc/install-packages.sgml
+++ b/doc/install-packages.sgml
@@ -0,0 +1,250 @@
 <sect1 id="installation-packages" xreflabel="Installing from packages">
 <title>Installing &repmgr; from packages</title>
 <para>
  We recommend installing &repmgr; using the available packages for your
  system.
 </para>
 <sect2 id="installation-packages-redhat" xreflabel="Installing from packages on RHEL, CentOS and Fedora">
  <indexterm>
   <primary>installation</primary>
   <secondary>on Red Hat/CentOS/Fedora etc.</secondary>
  </indexterm>
  <title>RedHat/CentOS/Fedora</title>
  <para>
 	&repmgr; RPM packages for RedHat/CentOS variants and Fedora are available from the
 	<ulink url="https://2ndquadrant.com">2ndQuadrant</ulink>
 	<ulink url="https://dl.2ndquadrant.com/">public repository</ulink>; see the following
 	section for details.
  </para>
  <para>
   RPM packages for &repmgr; are also available via Yum through
   the PostgreSQL Global Development Group RPM repository
    (<ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink>).
    Follow the instructions for your distribution (RedHat, CentOS,
    Fedora, etc.) and architecture as detailed there. Note that it can take some days
    for new &repmgr; packages to become available via this repository.
  </para>
  <note>
    <para>
      &repmgr; RPM packages are designed to be compatible with the community-provided PostgreSQL packages
      and 2ndQuadrant's <ulink url="https://www.2ndquadrant.com/en/resources/2ndqpostgres/">2ndQPostgres</ulink>.
      They may not work with vendor-specific packages such as those provided by RedHat for RHEL
      customers, as the PostgreSQL filesystem layout may be different to the community RPMs.
      Please contact your support vendor for assistance.
    </para>
  </note>
  <para>
    For more information on the package contents, including details of installation
    paths and relevant <link linkend="configuration-file-service-commands">service commands</link>,
    see the appendix section <xref linkend="packages-centos">.
  </para>
  <sect3 id="installation-packages-redhat-2ndq">
    <title>2ndQuadrant public RPM yum repository</title>
    <para>
      <ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides a dedicated <literal>yum</literal>
      <ulink url="https://dl.2ndquadrant.com/">public repository</ulink> for 2ndQuadrant software,
      including &repmgr;. We recommend using this for all future &repmgr; releases.
    </para>
    <para>
      General instructions for using this repository can be found on its
      <ulink url="https://dl.2ndquadrant.com/">homepage</ulink>. Specific instructions
      for installing &repmgr; follow below.
    </para>
    <para>
      <emphasis>Installation</emphasis>
      <itemizedlist>
 	<listitem>
 	  <para>
 	    Locate the repository RPM for your PostgreSQL version from the list at:
 	    <ulink url="https://dl.2ndquadrant.com/">https://dl.2ndquadrant.com/</ulink>
 	  </para>
 	</listitem>
        <listitem>
          <para>
            Install the repository definition for your distribution and PostgreSQL version
 	    (this enables the 2ndQuadrant repository as a source of &repmgr; packages).
 	  </para>
 	  <para>
 	    For example, for PostgreSQL 10 on CentOS, execute:
 	    <programlisting>
 curl https://dl.2ndquadrant.com/default/release/get/10/rpm | sudo bash</programlisting>
 	  </para>
 	  <para>
 	    For PostgreSQL 9.6 on CentOS, execute:
 	    <programlisting>
 curl https://dl.2ndquadrant.com/default/release/get/9.6/rpm | sudo bash</programlisting>
 	  </para>
 	  <para>
 	    Verify that the repository is installed with:
 	    <programlisting>
 sudo yum repolist</programlisting>
 	    The output should contain two entries like this:
 	    <programlisting>
 2ndquadrant-dl-default-release-pg10/7/x86_64        2ndQuadrant packages (PG10) for 7 - x86_64          4
 2ndquadrant-dl-default-release-pg10-debug/7/x86_64  2ndQuadrant packages (PG10) for 7 - x86_64 - Debug  3</programlisting>
 	  </para>
 	</listitem>
        <listitem>
          <para>
             Install the &repmgr; version appropriate for your PostgreSQL version (e.g. <literal>repmgr10</literal>):
            <programlisting>
 sudo yum install repmgr10</programlisting>
          </para>
          <note>
            <para>
              For packages for PostgreSQL 9.6 and earlier, the package name does not contain
              a period between major and minor version numbers, e.g.
              <literal>repmgr96</literal>.
            </para>
          </note>
          <tip>
            <para>
              To determine the names of available packages, execute:
              <programlisting>
 yum search repmgr</programlisting>
            </para>
          </tip>
        </listitem>
      </itemizedlist>
    </para>
    <para>
      <emphasis>Compatibility with PGDG Repositories</emphasis>
    </para>
    <para>
      The 2ndQuadrant &repmgr; yum repository packages use the same definitions and file system layout as the
      main PGDG repository.
    </para>
    <para>
      Normally <application>yum</application> will prioritize the repository with the most recent &repmgr; version.
      Once the PGDG repository has been updated, it doesn't matter which repository
      the packages are installed from.
    </para>
    <para>
      To ensure the 2ndQuadrant repository is always prioritised, install <literal>yum-plugin-priorities</literal>
      and set the repository priorities accordingly.
    </para>
    <para>
      <emphasis>Installing a specific package version</emphasis>
    </para>
    <para>
      To install a specific package version, execute <command>yum --showduplicates list</command>
      for the package in question:
      <programlisting>
        [root@localhost ~]# yum --showduplicates list repmgr10
        Loaded plugins: fastestmirror
        Loading mirror speeds from cached hostfile
         * base: ftp.iij.ad.jp
         * extras: ftp.iij.ad.jp
         * updates: ftp.iij.ad.jp
        Available Packages
         repmgr10.x86_64                       4.0.3-1.rhel7                        pgdg10
         repmgr10.x86_64                       4.0.4-1.rhel7                        pgdg10
         repmgr10.x86_64                       4.0.5-1.el7                          2ndquadrant-repo-10</programlisting>
      then append the appropriate version number to the package name with a hyphen, e.g.:
      <programlisting>
        [root@localhost ~]# yum install repmgr10-4.0.3-1.rhel7</programlisting>
    </para>
  </sect3>
 </sect2>
 <sect2 id="installation-packages-debian" xreflabel="Installing from packages on Debian or Ubuntu">
  <indexterm>
   <primary>installation</primary>
   <secondary>on Debian/Ubuntu etc.</secondary>
  </indexterm>
  <title>Debian/Ubuntu</title>
  <para>.deb packages for &repmgr; are available from the
  PostgreSQL Community APT repository (<ulink url="http://apt.postgresql.org/">http://apt.postgresql.org/</ulink>).
  Instructions can be found in the APT section of the PostgreSQL Wiki
  (<ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink>).
  </para>
  <para>
    For more information on the package contents, including details of installation
    paths and relevant <link linkend="configuration-file-service-commands">service commands</link>,
    see the appendix section <xref linkend="packages-debian-ubuntu">.
  </para>
  <sect3 id="installation-packages-debian-ubuntu-2ndq">
    <title>2ndQuadrant public apt repository for Debian/Ubuntu</title>
    <para>
      <ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides a
      <ulink url="https://dl.2ndquadrant.com/">public apt repository</ulink> for 2ndQuadrant software,
      including &repmgr;.
    </para>
    <para>
      General instructions for using this repository can be found on its
      <ulink url="https://dl.2ndquadrant.com/">homepage</ulink>. Specific instructions
      for installing &repmgr; follow below.
    </para>
    <para>
      <emphasis>Installation</emphasis>
      <itemizedlist>
 	<listitem>
 	  <para>
            Install the repository definition for your distribution and PostgreSQL version
 	    (this enables the 2ndQuadrant repository as a source of &repmgr; packages) by executing:
            <programlisting>
 curl https://dl.2ndquadrant.com/default/release/get/deb | sudo bash</programlisting>
 	  </para>
          <note>
            <para>
              This will automatically install the following additional packages, if not already present:
              <itemizedlist spacing="compact" mark="bullet">
                <listitem>
                  <simpara><literal>lsb-release</literal></simpara>
                </listitem>
                <listitem>
                  <simpara><literal>apt-transport-https</literal></simpara>
                </listitem>
              </itemizedlist>
            </para>
          </note>
        </listitem>
 	<listitem>
 	  <para>
             Install the &repmgr; version appropriate for your PostgreSQL version (e.g. <literal>postgresql-10-repmgr</literal>):
            <programlisting>
 sudo apt-get install postgresql-10-repmgr</programlisting>
 	  </para>
          <note>
            <para>
            For packages for PostgreSQL 9.6 and earlier, the package name includes
            a period between major and minor version numbers, e.g.
            <literal>postgresql-9.6-repmgr</literal>.
            </para>
          </note>
 	</listitem>
      </itemizedlist>
    </para>
  </sect3>
 </sect2>
 </sect1>
--- a/doc/install-requirements.sgml
+++ b/doc/install-requirements.sgml
@@ -0,0 +1,179 @@
 <sect1 id="install-requirements" xreflabel="installation requirements">
  <indexterm>
   <primary>installation</primary>
   <secondary>requirements</secondary>
  </indexterm>
  <title>Requirements for installing repmgr</title>
  <para>
    repmgr is developed and tested on Linux and OS X, but should work on any
    UNIX-like system supported by PostgreSQL itself. There is no support for
    Microsoft Windows.
  </para>
  <para>
   &repmgr; 4.x is compatible with all PostgreSQL versions from 9.3. See
   section <link linkend="install-compatibility-matrix">&repmgr; compatibility matrix</link>
   for an overview of version compatibility.
  </para>
  <note>
   <simpara>
    If upgrading from &repmgr; 3.x, please see the section <xref linkend="upgrading-from-repmgr-3">.
   </simpara>
  </note>
  <para>
   All servers in the replication cluster must be running the same major version of
   PostgreSQL, and we recommend that they also run the same minor version.
  </para>
  <para>
   &repmgr; must be installed on each server in the replication cluster.
   If installing repmgr from packages, the package version must match the PostgreSQL
   version. If installing from source, &repmgr; must be compiled against the same
   major version.
  </para>
  <note>
   <simpara>
     The same &quot;major&quot; &repmgr; version (e.g. <literal>4.2.x</literal>) <emphasis>must</emphasis>
      be installed on all nodes in the replication cluster. We strongly recommend keeping all
     nodes on the same (preferably latest) &quot;minor&quot; &repmgr; version to minimize the risk
     of incompatibilities.
   </simpara>
   <simpara>
     If different &quot;major&quot; &repmgr; versions (e.g. 3.3.x and 4.1.x)
     are installed on different nodes, in the best case &repmgr; (in particular <application>repmgrd</application>)
     will not run. In the worst case, you will end up with a broken cluster.
   </simpara>
  </note>
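  <para>
    In this context the &quot;major&quot; version is made up of the first two
    components of the version number. The following shell helpers are a hedged
    sketch of how two version strings could be compared on that basis; the function
    names are hypothetical and not part of &repmgr;, and the version strings are
    assumed to have been obtained separately for each node:
    <programlisting>
 # Illustrative helpers, not part of repmgr.
 # Reduce a version such as 4.2.1 (or 4.2) to its first two components.
 repmgr_major() {
     echo "$1" | cut -d. -f1,2
 }

 # Succeed only if both version strings share the same major version.
 same_repmgr_major() {
     [ "$(repmgr_major "$1")" = "$(repmgr_major "$2")" ]
 }

 # Example: 4.2.0 and 4.2.1 are compatible, 3.3.2 and 4.1.0 are not.
 if same_repmgr_major 4.2.0 4.2.1; then
     echo "same major version"
 fi</programlisting>
  </para>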
  <para>
   A dedicated system user for &repmgr; is <emphasis>not</emphasis> required; as many &repmgr; and
   <application>repmgrd</application> actions require direct access to the PostgreSQL data directory,
   these commands should be executed by the <literal>postgres</literal> user.
  </para>
  <para>
    See also <link linkend="configuration-prerequisites">Prerequisites for configuration</link>
    for information on networking requirements.
  </para>
  <tip>
   <simpara>
    We recommend using a session multiplexer utility such as <command>screen</command> or
    <command>tmux</command> when performing long-running actions (such as cloning a database)
    on a remote server - this will ensure the &repmgr; action won't be prematurely
    terminated if your <command>ssh</command> session to the server is interrupted or closed.
    </simpara>
  </tip>
  <sect2 id="install-compatibility-matrix">
    <indexterm>
      <primary>repmgr</primary>
      <secondary>compatibility matrix</secondary>
    </indexterm>
    <indexterm>
      <primary>compatibility matrix</primary>
    </indexterm>
    <title>&repmgr; compatibility matrix</title>
    <para>
      The following table provides an overview of which &repmgr; version supports
      which PostgreSQL version.
    </para>
    <table id="repmgr-compatibility-matrix">
      <title>&repmgr; compatibility matrix</title>
      <tgroup cols="3">
        <thead>
          <row>
            <entry>
              &repmgr; version
            </entry>
            <entry>
              Latest release
            </entry>
            <entry>
              Supported PostgreSQL versions
            </entry>
          </row>
        </thead>
        <tbody>
          <row>
            <entry>
              &repmgr; 4.x
            </entry>
            <entry>
              <link linkend="release-4.2">4.2</link> (2018-10-24)
            </entry>
            <entry>
              9.3, 9.4, 9.5, 9.6, 10, 11
            </entry>
          </row>
          <row>
            <entry>
              &repmgr; 3.x
            </entry>
            <entry>
              <ulink url="https://repmgr.org/release-notes-3.3.2.html">3.3.2</ulink> (2017-05-30)
            </entry>
            <entry>
              9.3, 9.4, 9.5, 9.6
            </entry>
          </row>
          <row>
            <entry>
              &repmgr; 2.x
            </entry>
            <entry>
              <ulink url="https://repmgr.org/release-notes-2.0.3.html">2.0.3</ulink> (2015-04-16)
            </entry>
            <entry>
              9.0, 9.1, 9.2, 9.3, 9.4
            </entry>
          </row>
        </tbody>
      </tgroup>
    </table>
    <important>
      <para>
        The &repmgr; 2.x and 3.x series are no longer maintained or supported.
        We strongly recommend upgrading to the latest &repmgr; version.
      </para>
    </important>
    <para>
      Note that some &repmgr; functionality is not available in PostgreSQL 9.3 and PostgreSQL 9.4.
    </para>
    <itemizedlist spacing="compact" mark="bullet">
      <listitem>
        <para>
          PostgreSQL 9.3 does not support replication slots, so corresponding &repmgr; functionality
          is not available.
        </para>
      </listitem>
      <listitem>
        <para>
          In PostgreSQL 9.3 and PostgreSQL 9.4, <command>pg_rewind</command> is not part of the core
          distribution. <command>pg_rewind</command> will need to be compiled separately to be able
          to use any &repmgr; functionality which takes advantage of it.
        </para>
      </listitem>
    </itemizedlist>
  </sect2>
 </sect1>
--- a/doc/install-source.sgml
+++ b/doc/install-source.sgml
@@ -0,0 +1,261 @@
 <sect1 id="installation-source" xreflabel="Installing from source code">
  <indexterm>
   <primary>installation</primary>
   <secondary>from source</secondary>
  </indexterm>
 <title>Installing &repmgr; from source</title>
 <sect2 id="installation-source-prereqs">
  <title>Prerequisites for installing from source</title>
  <para>
   To install &repmgr; the prerequisites for compiling
   &postgres; must be installed. These are described in &postgres;'s
   documentation
   on <ulink url="https://www.postgresql.org/docs/current/static/install-requirements.html">build requirements</ulink>
   and <ulink url="https://www.postgresql.org/docs/current/static/docguide-toolsets.html">build requirements for documentation</ulink>.
  </para>
  <para>
   Most mainstream Linux distributions and other UNIX variants provide simple
   ways to install the prerequisites from packages.
   <itemizedlist spacing="compact" mark="bullet">
    <listitem>
     <para>
      <literal>Debian</literal> and <literal>Ubuntu</literal>: First
      add the <ulink
      url="http://apt.postgresql.org/">apt.postgresql.org</ulink>
      repository to your <filename>sources.list</filename> if you
      have not already done so, and ensure the source repository is enabled.
     </para>
     <tip>
       <para>
         If not configured, the source repository can be added by including
         a <literal>deb-src</literal> line as a copy of the existing <literal>deb</literal>
         line in the repository file, which is usually
         <filename>/etc/apt/sources.list.d/pgdg.list</filename>, e.g.:
         <programlisting>
 deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main
 deb-src http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main</programlisting>
       </para>
     </tip>
     <para>
      Then install the prerequisites for
      building PostgreSQL with e.g.:
      <programlisting>
       sudo apt-get update
       sudo apt-get build-dep postgresql-9.6</programlisting>
      </para>
     <important>
       <simpara>
         Select the appropriate PostgreSQL version for your target repmgr version.
       </simpara>
     </important>
     <note>
       <para>
       If using <command>apt-get build-dep</command> is not possible, the
       following packages may need to be installed manually:
         <itemizedlist spacing="compact" mark="bullet">
           <listitem>
             <simpara><literal>libedit-dev</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>libkrb5-dev</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>libpam0g-dev</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>libreadline-dev</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>libselinux1-dev</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>libssl-dev</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>libxml2-dev</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>libxslt1-dev</literal></simpara>
           </listitem>
         </itemizedlist>
       </para>
     </note>
    </listitem>
    <listitem>
     <para>
      <literal>RHEL or CentOS 6.x or 7.x</literal>: install the appropriate repository RPM
      for your system from <ulink url="https://yum.postgresql.org/repopackages.php">
      yum.postgresql.org</ulink>. Then install the prerequisites for building
      PostgreSQL with:
      <programlisting>
       sudo yum check-update
       sudo yum groupinstall "Development Tools"
       sudo yum install yum-utils openjade docbook-dtds docbook-style-dsssl docbook-style-xsl
       sudo yum-builddep postgresql96</programlisting>
     </para>
     <important>
       <simpara>
         Select the appropriate PostgreSQL version for your target repmgr version.
       </simpara>
     </important>
     <note>
       <para>
         If using <command>yum-builddep</command> is not possible, the
         following packages may need to be installed manually:
         <itemizedlist spacing="compact" mark="bullet">
           <listitem>
             <simpara><literal>libselinux-devel</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>libxml2-devel</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>libxslt-devel</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>openssl-devel</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>pam-devel</literal></simpara>
           </listitem>
           <listitem>
             <simpara><literal>readline-devel</literal></simpara>
           </listitem>
         </itemizedlist>
       </para>
     </note>
    </listitem>
   </itemizedlist>
  </para>
 </sect2>
 <sect2 id="installation-get-source">
  <title>Getting &repmgr; source code</title>
  <para>
   There are two ways to get the &repmgr; source code: with git, or by downloading tarballs of released versions.
  </para>
  <sect3>
   <title>Using <application>git</application> to get the &repmgr; sources</title>
   <para>
    Use <application><ulink url="https://git-scm.com">git</ulink></application> if you expect
    to update often, you want to keep track of development or if you want to contribute
    changes to &repmgr;. There is no reason <emphasis>not</emphasis> to use <application>git</application>
    if you're familiar with it.
   </para>
   <para>
    The source for &repmgr; is maintained at
    <ulink url="https://github.com/2ndQuadrant/repmgr">https://github.com/2ndQuadrant/repmgr</ulink>.
   </para>
   <para>
    There are also tags for each &repmgr; release, e.g. <literal>v4.2.0</literal>.
   </para>
   <para>
    Clone the source code using <application>git</application>:
    <programlisting>
     git clone https://github.com/2ndQuadrant/repmgr</programlisting>
   </para>
   <para>
    For more information on using <application>git</application> see
    <ulink url="https://git-scm.com/">git-scm.com</ulink>.
   </para>
  </sect3>
  <sect3>
   <title>Downloading release source tarballs</title>
   <para>
    Official release source code is uploaded as tarballs to the
    &repmgr; website along with a tarball checksum and a matching GnuPG
    signature. See
    <ulink url="http://repmgr.org/">http://repmgr.org/</ulink>
    for the download information. See <xref linkend="appendix-signatures">
    for information on verifying digital signatures.
   </para>
   <para>
    You will need to download the repmgr source, e.g. <filename>repmgr-4.0.tar.gz</filename>.
    You may optionally verify the package checksums from the
    <literal>.md5</literal> files and/or verify the GnuPG signatures
    per <xref linkend="appendix-signatures">.
   </para>
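   <para>
    Checksum verification can be scripted along the following lines. This is a sketch under
    assumptions: it presumes the <literal>.md5</literal> file is in the format understood by GNU
    <command>md5sum</command> and sits in the same directory as the tarball; the function name
    and the directory in the usage comment are hypothetical.
   </para>

```shell
# Verify downloaded files against a checksum file in md5sum format.
# $1: directory containing the download, $2: name of the .md5 file.
# Returns non-zero if any listed file is missing or does not match.
verify_md5() {
    # Run inside the download directory so that the file names listed
    # in the checksum file resolve correctly.
    ( cd "$1" && md5sum --check --quiet "$2" )
}

# Usage (paths are examples only):
#   verify_md5 "$HOME/Downloads" repmgr-4.0.tar.gz.md5 && echo "checksum OK"
```

   <para>
    A matching checksum only guards against a corrupted download; verifying the GnuPG
    signature remains the stronger check.
   </para>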
   <para>
    After you unpack the source code archives using <literal>tar xf</literal>
    the installation process is the same as if you were installing from a git
    clone.
   </para>
  </sect3>
 </sect2>
 <sect2 id="installation-repmgr-source">
  <title>Installation of &repmgr; from source</title>
  <para>
   To install &repmgr; from source, simply execute:
   <programlisting>
    ./configure && make install</programlisting>
   Ensure <command>pg_config</command> for the target PostgreSQL version is in
   <varname>$PATH</varname>.
  </para>
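  <para>
   Because the build picks up whichever <command>pg_config</command> comes first in
   <varname>$PATH</varname>, it can be worth confirming which one that is before running
   <command>./configure</command>. The helper below is a hypothetical sketch; the installation
   path in the usage comment is an assumption and differs between distributions.
  </para>

```shell
# Report which pg_config would be picked up from PATH, or fail if none is found.
check_pg_config() {
    if ! command -v pg_config >/dev/null 2>&1; then
        echo "pg_config not found in PATH" >&2
        return 1
    fi
    echo "using $(command -v pg_config): $(pg_config --version)"
}

# Usage (the directory is an example only):
#   PATH=/usr/lib/postgresql/9.6/bin:$PATH
#   check_pg_config && ./configure && make install
```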
 </sect2>
 <sect2 id="installation-build-repmgr-docs">
   <title>Building &repmgr; documentation</title>
   <para>
    The &repmgr; documentation is (like the main PostgreSQL project)
    written in DocBook format. To build it locally as HTML, you'll need to
    install the required packages as described in the
    <ulink url="https://www.postgresql.org/docs/9.6/static/docguide-toolsets.html">
      PostgreSQL documentation</ulink> then execute:
   <programlisting>
    ./configure && make install-doc</programlisting>
   </para>
   <para>
     The generated HTML files will be placed in the <filename>doc/html</filename>
     subdirectory of your source tree.
   </para>
   <para>
     To build the documentation as a single HTML file, execute:
   <programlisting>
    cd doc/ && make repmgr.html</programlisting>
   </para>
   <note>
     <simpara>
       Due to changes in PostgreSQL's documentation build system from PostgreSQL 10,
       the documentation can currently only be built against PostgreSQL 9.6 or earlier.
       This limitation will be fixed when time and resources permit.
     </simpara>
   </note>
 </sect2>
 </sect1>
--- a/doc/install.sgml
+++ b/doc/install.sgml
@@ -0,0 +1,28 @@
 <chapter id="installation" xreflabel="Installation">
 <indexterm>
  <primary>installation</primary>
 </indexterm>
 <title>Installation</title>
 <para>
  &repmgr; can be installed from binary packages provided by your operating
  system's packaging system, or from source.
 </para>
 <para>
  In general we recommend using binary packages, unless unavailable for your operating system.
 </para>
 <para>
  Source installs are mainly useful if you want to keep track of the very
  latest repmgr development and contribute to development.  They're also the
  only option if there are no packages for your operating system yet.
 </para>
 <para>
  Before installing &repmgr; make sure you satisfy the <xref linkend="install-requirements">.
 </para>
 &install-requirements;
 &install-packages;
 &install-source;
 </chapter>
--- a/doc/legal.sgml
+++ b/doc/legal.sgml
@@ -0,0 +1,37 @@
 <!-- doc/legal.sgml -->
 <date>2017</date>
 <copyright>
 <year>2010-2018</year>
 <holder>2ndQuadrant, Ltd.</holder>
 </copyright>
 <legalnotice id="legalnotice">
 <title>Legal Notice</title>
 <para>
  <productname>repmgr</productname> is Copyright &copy; 2010-2018
  by 2ndQuadrant, Ltd. All rights reserved.
 </para>
 <para>
   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.
 </para>
 <para>
   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.
 </para>
 <para>
   You should have received a copy of the GNU General Public License
   along with this program.  If not, see
   <ulink url="https://www.gnu.org/licenses/">https://www.gnu.org/licenses/</ulink>
   to obtain one.
 </para>
 </legalnotice>
--- a/doc/overview.sgml
+++ b/doc/overview.sgml
@@ -0,0 +1,241 @@
 <chapter id="overview" xreflabel="Overview">
 <title>repmgr overview</title>
 <para>
  This chapter provides a high-level overview of &repmgr;'s components and
  functionality.
 </para>
 <sect1 id="repmgr-concepts" xreflabel="Concepts">
  <indexterm>
    <primary>concepts</primary>
  </indexterm>
  <title>Concepts</title>
  <para>
   This guide assumes that you are familiar with PostgreSQL administration and
   streaming replication concepts. For further details on streaming
   replication, see the PostgreSQL documentation section on <ulink
   url="https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION">
   streaming replication</>.
  </para>
  <para>
   The following terms are used throughout the &repmgr; documentation.
   <variablelist>
    <varlistentry>
     <term>replication cluster</term>
     <listitem>
      <simpara>
       In the &repmgr; documentation, "replication cluster" refers to the network
       of PostgreSQL servers connected by streaming replication.
      </simpara>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>node</term>
     <listitem>
      <simpara>
       A node is a single PostgreSQL server within a replication cluster.
      </simpara>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>upstream node</term>
     <listitem>
      <simpara>
       The node a standby server connects to, in order to receive streaming replication.
       This is either the primary server, or in the case of cascading replication, another
       standby.
      </simpara>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>failover</term>
     <listitem>
      <simpara>
       This is the action which occurs if a primary server fails and a suitable standby
       is promoted as the new primary. The <application>repmgrd</application> daemon supports automatic failover
       to minimise downtime.
      </simpara>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>switchover</term>
     <listitem>
      <simpara>
       In certain circumstances, such as hardware or operating system maintenance,
       it's necessary to take a primary server offline; in this case a controlled
       switchover is necessary, whereby a suitable standby is promoted and the
       existing primary removed from the replication cluster in a controlled manner.
       The &repmgr; command line client provides this functionality.
      </simpara>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>fencing</term>
     <listitem>
      <simpara>
       In a failover situation, following the promotion of a standby to primary, it's
       essential that the previous primary does not unexpectedly come back on
       line, which would result in a split-brain situation. To prevent this,
       the failed primary should be isolated from applications, i.e. "fenced off".
      </simpara>
     </listitem>
    </varlistentry>
   <varlistentry id="witness-server">
     <term>witness server</term>
     <listitem>
      <para>
        &repmgr; provides functionality to set up a so-called "witness server" to
        assist in determining a new primary server in a failover situation with more
        than one standby. The witness server itself is not part of the replication
        cluster, although it does contain a copy of the repmgr metadata schema.
      </para>
      <para>
        The purpose of a witness server is to provide a "casting vote" where servers
        in the replication cluster are split over more than one location. In the event
        of a loss of connectivity between locations, the presence or absence of
        the witness server will decide whether a server at that location is promoted
        to primary; this is to prevent a "split-brain" situation where an isolated
        location interprets a network outage as a failure of the (remote) primary and
        promotes a (local) standby.
      </para>
      <para>
        A witness server only needs to be created if <application>repmgrd</application>
        is in use.
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>
 </sect1>
 <sect1 id="repmgr-components" xreflabel="Components">
  <title>Components</title>
  <para>
  &repmgr; is a suite of open-source tools to manage replication and failover
  within a cluster of PostgreSQL servers. It supports and enhances PostgreSQL's
  built-in streaming replication, which provides a single read/write primary server
  and one or more read-only standbys containing near-real time copies of the primary
  server's database. It provides two main tools:
   <variablelist>
    <varlistentry>
     <term>repmgr</term>
     <listitem>
      <para>
       A command-line tool used to perform administrative tasks such as:
       <itemizedlist>
        <listitem>
          <simpara>setting up standby servers</simpara>
        </listitem>
        <listitem>
          <simpara>promoting a standby server to primary</simpara>
        </listitem>
        <listitem>
          <simpara>switching over primary and standby servers</simpara>
        </listitem>
        <listitem>
          <simpara>displaying the status of servers in the replication cluster</simpara>
        </listitem>
       </itemizedlist>
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>repmgrd</term>
     <listitem>
      <para>
       A daemon which actively monitors servers in a replication cluster
       and performs the following tasks:
       <itemizedlist>
        <listitem>
          <simpara>monitoring and recording replication performance</simpara>
        </listitem>
        <listitem>
          <simpara>performing failover by detecting failure of the primary and
            promoting the most suitable standby server
          </simpara>
        </listitem>
        <listitem>
          <simpara>providing notifications about events in the cluster to a user-defined
            script which can perform tasks such as sending alerts by email</simpara>
        </listitem>
       </itemizedlist>
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>
 </sect1>
 <sect1 id="repmgr-user-metadata" xreflabel="Repmgr user and metadata">
  <title>Repmgr user and metadata</title>
  <para>
   In order to effectively manage a replication cluster, &repmgr; needs to store
   information about the servers in the cluster in a dedicated database schema.
   This schema is automatically created by the &repmgr; extension, which is installed
   during the first step in initializing a &repmgr;-administered cluster
   (<command><link linkend="repmgr-primary-register">repmgr primary register</link></command>)
   and contains the following objects:
   <variablelist>
    <varlistentry>
     <term>Tables</term>
     <listitem>
      <para>
       <itemizedlist>
        <listitem>
          <simpara><literal>repmgr.events</literal>: records events of interest</simpara>
        </listitem>
        <listitem>
          <simpara><literal>repmgr.nodes</literal>: connection and status information for each server in the
    replication cluster</simpara>
        </listitem>
        <listitem>
          <simpara><literal>repmgr.monitoring_history</literal>: historical standby monitoring information
            written by <application>repmgrd</application></simpara>
        </listitem>
       </itemizedlist>
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>Views</term>
     <listitem>
      <para>
       <itemizedlist>
        <listitem>
          <simpara><literal>repmgr.show_nodes</literal>: based on the table <literal>repmgr.nodes</literal>, additionally showing the
           name of the server's upstream node</simpara>
        </listitem>
        <listitem>
          <simpara><literal>repmgr.replication_status</literal>: when <application>repmgrd</application>'s monitoring is enabled, shows
            current monitoring status for each standby.</simpara>
        </listitem>
       </itemizedlist>
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>
  <para>
   The &repmgr; metadata schema can be stored in an existing database or in its own
   dedicated database. Note that the &repmgr; metadata schema cannot reside on a database
   server which is not part of the replication cluster managed by &repmgr;.
  </para>
  <para>
   A database user must be available for &repmgr; to access this database and perform
   necessary changes. This user does not need to be a superuser, however some operations
   such as initial installation of the &repmgr; extension will require a superuser
   connection (this can be specified where required with the command line option
   <literal>--superuser</literal>).
  </para>
 </sect1>
 </chapter>
--- a/doc/promoting-standby.sgml
+++ b/doc/promoting-standby.sgml
@@ -0,0 +1,79 @@
 <chapter id="promoting-standby" xreflabel="Promoting a standby">
 <indexterm>
   <primary>promoting a standby</primary>
   <seealso>repmgr standby promote</seealso>
 </indexterm>
 <title>Promoting a standby server with repmgr</title>
 <para>
   If a primary server fails or needs to be removed from the replication cluster,
   a new primary server must be designated, to ensure the cluster continues
   to function correctly. This can be done with <xref linkend="repmgr-standby-promote">,
   which promotes the standby on the current server to primary.
 </para>
 <para>
  To demonstrate this, set up a replication cluster with a primary and two attached
  standby servers so that the cluster looks like this:
  <programlisting>
     $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr
     3  | node3 | standby |   running | node1    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
 </para>
 <para>
  Stop the current primary with e.g.:
  <programlisting>
   $ pg_ctl -D /var/lib/postgresql/data -m fast stop</programlisting>
 </para>
 <para>
  At this point the replication cluster will be in a partially disabled state, with
  both standbys accepting read-only connections while attempting to connect to the
  stopped primary. Note that the &repmgr; metadata table will not yet have been updated;
  executing <xref linkend="repmgr-cluster-show"> will note the discrepancy:
  <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status        | Upstream | Location | Connection string
    ----+-------+---------+---------------+----------+----------+--------------------------------------
     1  | node1 | primary | ? unreachable |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | standby |   running     | node1    | default  | host=node2 dbname=repmgr user=repmgr
     3  | node3 | standby |   running     | node1    | default  | host=node3 dbname=repmgr user=repmgr
    WARNING: following issues were detected
    node "node1" (ID: 1) is registered as an active primary but is unreachable</programlisting>
 </para>
 <para>
  Now promote the first standby with:
  <programlisting>
   $ repmgr -f /etc/repmgr.conf standby promote</programlisting>
 </para>
 <para>
  This will produce output similar to the following:
  <programlisting>
    INFO: connecting to standby database
    NOTICE: promoting standby
    DETAIL: promoting server using "pg_ctl -l /var/log/postgresql/startup.log -w -D '/var/lib/postgresql/data' promote"
    server promoting
    INFO: reconnecting to promoted server
    NOTICE: STANDBY PROMOTE successful
    DETAIL: node 2 was successfully promoted to primary</programlisting>
 </para>
 <para>
  Executing <xref linkend="repmgr-cluster-show"> will show the current state; as there is now an
  active primary, the previous warning will not be displayed:
  <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | - failed  |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | primary | * running |          | default  | host=node2 dbname=repmgr user=repmgr
     3  | node3 | standby |   running | node1    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
 </para>
 <para>
  However the sole remaining standby (<literal>node3</literal>) is still trying to replicate from the failed
  primary; <xref linkend="repmgr-standby-follow"> must now be executed to rectify this situation
  (see <xref linkend="follow-new-primary"> for example).
 </para>
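 <para>
  In a larger cluster it can help to list every standby still pointing at the failed primary.
  The sketch below filters the table printed by <command>repmgr cluster show</command>. It is an
  illustration only: the function name is hypothetical, and it assumes the column layout shown
  in the examples above, which may differ in other &repmgr; versions.
 </para>

```shell
# Read "repmgr cluster show" table output on stdin and print the names of
# all standbys whose upstream is the node named in $1.
standbys_following() {
    awk -F'|' -v up="$1" '
        NF >= 5 {
            # Trim surrounding whitespace from the first five columns:
            # ID, Name, Role, Status, Upstream.
            for (i = 1; i <= 5; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            if ($3 == "standby" && $5 == up) print $2
        }'
}

# Usage:
#   repmgr -f /etc/repmgr.conf cluster show | standbys_following node1
```

 <para>
  Each node reported this way is a candidate for <command>repmgr standby follow</command>.
  Parsing human-readable output is inherently fragile, so treat the result as a hint rather
  than as authoritative.
 </para>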
 </chapter>
--- a/doc/quickstart.sgml
+++ b/doc/quickstart.sgml
@@ -0,0 +1,497 @@
 <chapter id="quickstart" xreflabel="Quick-start guide">
 <title>Quick-start guide</title>
 <indexterm>
   <primary>quickstart</primary>
 </indexterm>
 <para>
  This section gives a quick introduction to &repmgr;, including setting up a
  sample &repmgr; installation and a basic replication cluster.
 </para>
 <para>
  These instructions are for demonstration purposes only and are not suitable for a production
  install, as issues such as account security and system administration best practices
  are omitted.
 </para>
 <note>
   <simpara>
     To upgrade an existing &repmgr; 3.x installation, see section
     <xref linkend="upgrading-from-repmgr-3">.
   </simpara>
 </note>
 <sect1 id="quickstart-prerequisites">
   <title>Prerequisites for setting up a basic replication cluster with &repmgr;</title>
    <para>
     The following section will describe how to set up a basic replication cluster
     with a primary and a standby server using the <application>repmgr</application>
     command line tool.
    </para>
    <para>
      We'll assume the primary is called <literal>node1</literal> with IP address
      <literal>192.168.1.11</literal>, and the standby is called <literal>node2</literal>
      with IP address <literal>192.168.1.12</literal>.
    </para>
    <para>
     The following software must be installed on both servers:
     <itemizedlist spacing="compact" mark="bullet">
      <listitem>
       <simpara><application>PostgreSQL</application></simpara>
      </listitem>
      <listitem>
       <simpara>
        <application>repmgr</application> (matching the installed
        <application>PostgreSQL</application> major version)
       </simpara>
      </listitem>
     </itemizedlist>
    </para>
    <para>
      At network level, connections to the PostgreSQL port (default: <literal>5432</literal>)
      must be possible between the servers in both directions.
    </para>
    <para>
      If you want <application>repmgr</application> to copy configuration files which are
      located outside the PostgreSQL data directory, and/or to test
      <command><link linkend="repmgr-standby-switchover">switchover</link></command>
      functionality, you will also need passwordless SSH connections between both servers, and
      <application>rsync</application> should be installed.
    </para>
    <tip>
     <simpara>
      For testing <application>repmgr</application>, it's possible to use multiple PostgreSQL
      instances running on different ports on the same computer, with
      passwordless SSH access to <filename>localhost</filename> enabled.
     </simpara>
    </tip>
 </sect1>
 <sect1 id="quickstart-postgresql-configuration" xreflabel="PostgreSQL configuration">
   <title>PostgreSQL configuration</title>
   <para>
    On the primary server, a PostgreSQL instance must be initialised and running.
    The following replication settings may need to be adjusted:
   </para>
   <programlisting>
    # Enable replication connections; set this figure to at least one more
    # than the number of standbys which will connect to this server
    # (note that repmgr will execute `pg_basebackup` in WAL streaming mode,
    # which requires two free WAL senders)
    max_wal_senders = 10
    # Enable replication slots; set this figure to at least one more
    # than the number of standbys which will connect to this server.
    # Note that repmgr will only make use of replication slots if
    # "use_replication_slots" is set to "true" in repmgr.conf
    # (a value of 0, as here, disables replication slots altogether)
    max_replication_slots = 0
    # Ensure WAL files contain enough information to enable read-only queries
    # on the standby.
    #
    #  PostgreSQL 9.5 and earlier: one of 'hot_standby' or 'logical'
    #  PostgreSQL 9.6 and later: one of 'replica' or 'logical'
    #    ('hot_standby' will still be accepted as an alias for 'replica')
    #
    # See: https://www.postgresql.org/docs/current/static/runtime-config-wal.html#GUC-WAL-LEVEL
    wal_level = 'hot_standby'
    # Enable read-only queries on a standby
    # (Note: this will be ignored on a primary but we recommend including
    # it anyway)
    hot_standby = on
    # Enable WAL file archiving
    archive_mode = on
    # Set archive command to a script or application that will safely store
    # your WALs in a secure place. /bin/true is an example of a command that
    # ignores archiving. Use something more sensible.
    archive_command = '/bin/true'
   </programlisting>
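   <para>
    The version-dependent choice of <varname>wal_level</varname> described in the comments above
    can be expressed as a small helper. This is a sketch with a hypothetical function name; it
    expects a PostgreSQL version string such as <literal>9.5</literal>, <literal>9.6.3</literal>
    or <literal>10</literal>.
   </para>

```shell
# Print a wal_level value that enables read-only queries on a standby
# for the given PostgreSQL version:
#   9.5 and earlier -> hot_standby
#   9.6 and later   -> replica
wal_level_for() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot (whole string if no dot)
    minor=${rest%%.*}
    if [ "$major" -ge 10 ] || { [ "$major" -eq 9 ] && [ "$minor" -ge 6 ]; }; then
        echo replica
    else
        echo hot_standby
    fi
}

# Usage:
#   echo "wal_level = '$(wal_level_for 9.6)'"
```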
   <tip>
    <simpara>
      Rather than editing these settings in the default <filename>postgresql.conf</filename>
     file, create a separate file such as <filename>postgresql.replication.conf</filename> and
      include it from the end of the main configuration file with:
     <command>include 'postgresql.replication.conf'</command>.
    </simpara>
   </tip>
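   <para>
    Adding the include directive can itself be scripted. The helper below is a minimal sketch
    with a hypothetical name; it appends the directive only if an identical line is not already
    present, so it is safe to run repeatedly. The path in the usage comment is an example only.
   </para>

```shell
# Append "include '<file>'" to a PostgreSQL configuration file unless an
# identical line is already present.
# $1: path to postgresql.conf, $2: name of the file to include.
add_include() {
    line="include '$2'"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$1" || printf '%s\n' "$line" >> "$1"
}

# Usage:
#   add_include /var/lib/postgresql/data/postgresql.conf postgresql.replication.conf
```

   <para>
    PostgreSQL must be reloaded or restarted, depending on the settings changed, before the
    included file takes effect.
   </para>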
   <para>
     Additionally, if you are intending to use <application>pg_rewind</application>,
     and the cluster was not initialised using data checksums, you may want to consider enabling
     <varname>wal_log_hints</varname>; for more details see <xref linkend="repmgr-node-rejoin-pg-rewind">.
   </para>
    <para>
      See also the <link linkend="configuration-postgresql">PostgreSQL configuration</link> section in the <link linkend="configuration">repmgr configuration guide</link>.
    </para>
 </sect1>
 <sect1 id="quickstart-repmgr-user-database">
  <title>Create the repmgr user and database</title>
  <para>
   Create a dedicated PostgreSQL superuser account and a database for
   the &repmgr; metadata, e.g.
  </para>
  <programlisting>
   createuser -s repmgr
   createdb repmgr -O repmgr
  </programlisting>
  <para>
   For the examples in this document, the name <literal>repmgr</literal> will be
   used for both user and database, but any names can be used.
  </para>
  <note>
   <para>
    For the sake of simplicity, the <literal>repmgr</literal> user is created
    as a superuser. If desired, it's possible to create the <literal>repmgr</literal>
    user as a normal user. However, for certain operations superuser permissions
    are required; in this case the command line option <command>--superuser</command>
    can be provided to specify a superuser.
   </para>
   <para>
    It's also assumed that the <literal>repmgr</literal> user will be used to make the
    replication connection from the standby to the primary; again this can be
    overridden by specifying a separate replication user when registering each node.
   </para>
  </note>
  <tip>
    <para>
     &repmgr; will install the <literal>repmgr</literal> extension, which creates a
     <literal>repmgr</literal> schema containing &repmgr;'s metadata tables as
     well as other functions and views. We also recommend that you set the
     <literal>repmgr</literal> user's search path to include this schema name, e.g.
     <programlisting>
       ALTER USER repmgr SET search_path TO repmgr, "$user", public;</programlisting>
    </para>
  </tip>
 </sect1>
 <sect1 id="quickstart-authentication">
  <title>Configuring authentication in pg_hba.conf</title>
  <para>
   Ensure the <literal>repmgr</literal> user has appropriate permissions in <filename>pg_hba.conf</filename> and
   can connect in replication mode; <filename>pg_hba.conf</filename> should contain entries
   similar to the following:
  </para>
  <programlisting>
    local   replication   repmgr                              trust
    host    replication   repmgr      127.0.0.1/32            trust
    host    replication   repmgr      192.168.1.0/24          trust
    local   repmgr        repmgr                              trust
    host    repmgr        repmgr      127.0.0.1/32            trust
    host    repmgr        repmgr      192.168.1.0/24          trust
  </programlisting>
  <para>
   Note that these are simple settings for testing purposes.
   Adjust according to your network environment and authentication requirements.
  </para>
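  <para>
   As a sketch of a stricter setup, assuming password authentication with the
   <literal>md5</literal> method and the same example subnet, the remote entries
   might instead look like this:
   <programlisting>
    host    replication   repmgr      192.168.1.0/24          md5
    host    repmgr        repmgr      192.168.1.0/24          md5</programlisting>
   The password would then need to be available to &repmgr; non-interactively, for
   example via a <filename>~/.pgpass</filename> file belonging to the system user
   which runs &repmgr;.
  </para>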
 </sect1>
 <sect1 id="quickstart-standby-preparation">
  <title>Preparing the standby</title>
  <para>
   On the standby, do <emphasis>not</emphasis> create a PostgreSQL instance (i.e.
   do not execute <application>initdb</application> or any database creation
   scripts provided by packages), but do ensure the destination
   data directory (and any other directories which you want PostgreSQL to use)
   exist and are owned by the <literal>postgres</literal> system user. Permissions
   must be set to <literal>0700</literal> (<literal>drwx------</literal>).
  </para>
  <tip>
    <simpara>
      &repmgr; will place a copy of the primary's database files in this directory.
      It will however refuse to run if a PostgreSQL instance has already been
      created there.
    </simpara>
  </tip>
  <para>
   Check the primary database is reachable from the standby using <application>psql</application>:
  </para>
  <programlisting>
    psql 'host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlisting>
  <note>
   <para>
    &repmgr; stores connection information as <ulink
    url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING">libpq
    connection strings</ulink> throughout. This documentation refers to them as <literal>conninfo</literal>
    strings; an alternative name is <literal>DSN</literal> (<literal>data source name</literal>).
    We'll use these in place of the <command>-h hostname -d databasename -U username</command> syntax.
   </para>
  </note>
 </sect1>
 <sect1 id="quickstart-repmgr-conf">
  <title>repmgr configuration file</title>
  <para>
   Create a <filename>repmgr.conf</filename> file on the primary server. The file must
   contain at least the following parameters:
  </para>
  <programlisting>
    node_id=1
    node_name=node1
    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/data'
  </programlisting>
  <para>
   <filename>repmgr.conf</filename> should not be stored inside the PostgreSQL data directory,
   as it could be overwritten when setting up or reinitialising the PostgreSQL
   server. See sections <xref linkend="configuration"> and <xref linkend="configuration-file">
   for further details about <filename>repmgr.conf</filename>.
  </para>
  <note>
    <para>
      &repmgr; only uses <option>pg_bindir</option> when it executes
      PostgreSQL binaries directly.
    </para>
    <para>
      For user-defined scripts such as <option>promote_command</option> and the
      various <option>service_*_command</option>s, you <emphasis>must</emphasis>
      always explicitly provide the full path to the binary or script being
      executed, even if it is &repmgr; itself.
    </para>
    <para>
      This is because these options can contain user-defined scripts in arbitrary
      locations, so prepending <option>pg_bindir</option> may break them.
    </para>
  </note>
  <tip>
   <simpara>
    For Debian-based distributions we recommend explicitly setting
    <option>pg_bindir</option> to the directory where <command>pg_ctl</command> and other binaries
    not in the standard path are located. For PostgreSQL 9.6 this would be <filename>/usr/lib/postgresql/9.6/bin/</filename>.
   </simpara>
  </tip>
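  <para>
   For the PostgreSQL 9.6 case mentioned above, the corresponding
   <filename>repmgr.conf</filename> entry would look something like this (a sketch;
   adjust the version directory to match the installed PostgreSQL release):
  </para>
  <programlisting>
    # Debian-based systems, PostgreSQL 9.6 (path depends on the installed version)
    pg_bindir='/usr/lib/postgresql/9.6/bin/'</programlisting>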
  <tip>
    <simpara>
      If your distribution places the &repmgr; binaries in a location other than the
      PostgreSQL installation directory, specify this with <option>repmgr_bindir</option>
      to enable &repmgr; to perform operations (e.g.
      <command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command>)
      on other nodes.
    </simpara>
  </tip>
  <para>
   See the file
   <ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</>
    for details of all available configuration parameters.
  </para>
 </sect1>
 <sect1 id="quickstart-primary-register">
  <title>Register the primary server</title>
  <para>
   To enable &repmgr; to support a replication cluster, the primary node must
   be registered with &repmgr;. This installs the <literal>repmgr</literal>
   extension and metadata objects, and adds a metadata record for the primary server:
  </para>
  <programlisting>
    $ repmgr -f /etc/repmgr.conf primary register
    INFO: connecting to primary database...
    NOTICE: attempting to install extension "repmgr"
    NOTICE: "repmgr" extension successfully installed
    NOTICE: primary node record (id: 1) registered</programlisting>
  <para>
    Verify status of the cluster like this:
  </para>
  <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Connection string
    ----+-------+---------+-----------+----------+--------------------------------------------------------
     1  | node1 | primary | * running |          | host=node1 dbname=repmgr user=repmgr connect_timeout=2
  </programlisting>
  <para>
    The record in the <literal>repmgr</literal> metadata table will look like this:
  </para>
  <programlisting>
    repmgr=# SELECT * FROM repmgr.nodes;
    -[ RECORD 1 ]----+-------------------------------------------------------
    node_id          | 1
    upstream_node_id |
    active           | t
    node_name        | node1
    type             | primary
    location         | default
    priority         | 100
    conninfo         | host=node1 dbname=repmgr user=repmgr connect_timeout=2
    repluser         | repmgr
    slot_name        |
    config_file      | /etc/repmgr.conf</programlisting>
  <para>
    Each server in the replication cluster will have its own record. If <application>repmgrd</application>
    is in use, the fields <literal>upstream_node_id</literal>, <literal>active</literal> and
    <literal>type</literal> will be updated when the node's status or role changes.
  </para>
 </sect1>
 <sect1 id="quickstart-standby-clone">
  <title>Clone the standby server</title>
  <para>
   Create a <filename>repmgr.conf</filename> file on the standby server. It must contain at
   least the same parameters as the primary's <filename>repmgr.conf</filename>, but with
   the mandatory values <literal>node_id</literal>, <literal>node_name</literal>, <literal>conninfo</literal>
   (and possibly <literal>data_directory</literal>) adjusted accordingly, e.g.:
  </para>
  <programlisting>
    node_id=2
    node_name=node2
    conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/data'</programlisting>
  <para>
   Use the <command>--dry-run</command> option to check the standby can be cloned:
  </para>
  <programlisting>
    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run
    NOTICE: using provided configuration file "/etc/repmgr.conf"
    NOTICE: destination directory "/var/lib/postgresql/data" provided
    INFO: connecting to source node
    NOTICE: checking for available walsenders on source node (2 required)
    INFO: sufficient walsenders available on source node (2 required)
    NOTICE: standby will attach to upstream node 1
    HINT: consider using the -c/--fast-checkpoint option
    INFO: all prerequisites for "standby clone" are met</programlisting>
  <para>
    If no problems are reported, the standby can then be cloned with:
  </para>
  <programlisting>
    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone
    NOTICE: using configuration file "/etc/repmgr.conf"
    NOTICE: destination directory "/var/lib/postgresql/data" provided
    INFO: connecting to source node
    NOTICE: checking for available walsenders on source node (2 required)
    INFO: sufficient walsenders available on source node (2 required)
    INFO: creating directory "/var/lib/postgresql/data"...
    NOTICE: starting backup (using pg_basebackup)...
    HINT: this may take some time; consider using the -c/--fast-checkpoint option
    INFO: executing:
      pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h node1 -U repmgr -X stream
    NOTICE: standby clone (using pg_basebackup) complete
    NOTICE: you can now start your PostgreSQL server
    HINT: for example: pg_ctl -D /var/lib/postgresql/data start
  </programlisting>
  <para>
   This has cloned the PostgreSQL data directory files from the primary <literal>node1</literal>
   using PostgreSQL's <command>pg_basebackup</command> utility. A <filename>recovery.conf</filename>
   file containing the correct parameters to start streaming from this primary server will be created
   automatically.
  </para>
  <note>
   <simpara>
    By default, any configuration files in the primary's data directory will be
    copied to the standby. Typically these will be <filename>postgresql.conf</filename>,
    <filename>postgresql.auto.conf</filename>, <filename>pg_hba.conf</filename> and
    <filename>pg_ident.conf</filename>. These may require modification before the standby
    is started.
   </simpara>
  </note>
  <para>
   Make any adjustments to the standby's PostgreSQL configuration files now,
   then start the server.
  </para>
  <para>
   For more details on <command>repmgr standby clone</command>, see the
   <link linkend="repmgr-standby-clone">command reference</link>.
   A more detailed overview of cloning options is available in the
   <link linkend="cloning-standbys">administration manual</link>.
  </para>
 </sect1>
 <sect1 id="quickstart-verify-replication">
  <title>Verify replication is functioning</title>
  <para>
   Connect to the primary server and execute:
   <programlisting>
    repmgr=# SELECT * FROM pg_stat_replication;
    -[ RECORD 1 ]----+------------------------------
    pid              | 19111
    usesysid         | 16384
    usename          | repmgr
    application_name | node2
    client_addr      | 192.168.1.12
    client_hostname  |
    client_port      | 50378
    backend_start    | 2017-08-28 15:14:19.851581+09
    backend_xmin     |
    state            | streaming
    sent_location    | 0/7000318
    write_location   | 0/7000318
    flush_location   | 0/7000318
    replay_location  | 0/7000318
    sync_priority    | 0
    sync_state       | async</programlisting>
   This shows that the previously cloned standby (<literal>node2</literal> shown in the field
   <literal>application_name</literal>) has connected to the primary from IP address
   <literal>192.168.1.12</literal>.
  </para>
  <para>
    From PostgreSQL 9.6 you can also use the view
    <ulink url="https://www.postgresql.org/docs/current/static/monitoring-stats.html#PG-STAT-WAL-RECEIVER-VIEW">
    <literal>pg_stat_wal_receiver</literal></ulink> to check the replication status from the standby.
   <programlisting>
    repmgr=# SELECT * FROM pg_stat_wal_receiver;
    Expanded display is on.
    -[ RECORD 1 ]---------+--------------------------------------------------------------------------------
    pid                   | 18236
    status                | streaming
    receive_start_lsn     | 0/3000000
    receive_start_tli     | 1
    received_lsn          | 0/7000538
    received_tli          | 1
    last_msg_send_time    | 2017-08-28 15:21:26.465728+09
    last_msg_receipt_time | 2017-08-28 15:21:26.465774+09
    latest_end_lsn        | 0/7000538
    latest_end_time       | 2017-08-28 15:20:56.418735+09
    slot_name             |
    conninfo              | user=repmgr dbname=replication host=node1 application_name=node2
   </programlisting>
   Note that the <varname>conninfo</varname> value is that generated in <filename>recovery.conf</filename>
   and will differ slightly from the primary's <varname>conninfo</varname> as set in <filename>repmgr.conf</filename> -
   among other things, it will contain the connecting node's name as <varname>application_name</varname>.
  </para>
 </sect1>
 <sect1 id="quickstart-register-standby">
  <title>Register the standby</title>
  <para>
    Register the standby server with:
    <programlisting>
    $ repmgr -f /etc/repmgr.conf standby register
    NOTICE: standby node "node2" (ID: 2) successfully registered</programlisting>
  </para>
  <para>
    Check the node is registered by executing <command>repmgr cluster show</command> on the standby:
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr</programlisting>
  </para>
  <para>
   Both nodes are now registered with &repmgr; and the records have been copied to the standby server.
  </para>
 </sect1>
 </chapter>
--- a/doc/repmgr-cluster-cleanup.sgml
+++ b/doc/repmgr-cluster-cleanup.sgml
@@ -0,0 +1,77 @@
 <refentry id="repmgr-cluster-cleanup">
  <indexterm>
    <primary>repmgr cluster cleanup</primary>
  </indexterm>
 <refmeta>
    <refentrytitle>repmgr cluster cleanup</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr cluster cleanup</refname>
    <refpurpose>purge monitoring history</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Purges monitoring history from the <literal>repmgr.monitoring_history</literal> table to
      prevent excessive table growth.
    </para>
    <para>
      By default <emphasis>all</emphasis> data will be removed; use the <option>-k/--keep-history</option>
      option to specify the number of days of monitoring history to retain.
    </para>
    <para>
      This command can be executed manually or as a cronjob.
    </para>
  </refsect1>
  <refsect1>
    <title>Usage</title>
    <para>
      This command requires a valid <filename>repmgr.conf</filename> file for the node on which it is
      executed; no additional arguments are required.
    </para>
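    <para>
      As an illustration only, a crontab entry along the following lines would run the
      cleanup once a day and retain seven days of history. The schedule, the binary path
      <filename>/usr/bin/repmgr</filename> and the retention period are assumptions to be
      adapted to the local installation:
      <programlisting>
    # m  h  dom mon dow  command   (binary path and 7-day retention are assumptions)
    30   3  *   *   *    /usr/bin/repmgr -f /etc/repmgr.conf cluster cleanup --keep-history=7</programlisting>
    </para>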
  </refsect1>
  <refsect1>
    <title>Notes</title>
    <para>
      Monitoring history will only be written if <application>repmgrd</application> is active, and
      <varname>monitoring_history</varname> is set to <literal>true</literal> in
      <filename>repmgr.conf</filename>.
    </para>
  </refsect1>
  <refsect1 id="repmgr-cluster-cleanup-events">
    <title>Event notifications</title>
    <para>
      A <literal>cluster_cleanup</literal> <link linkend="event-notifications">event notification</link> will be generated.
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--node-id</option></term>
        <listitem>
          <para>
            Only delete monitoring records for the specified node.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
      For more details see the sections <xref linkend="repmgrd-monitoring"> and
      <xref linkend="repmgrd-monitoring-configuration">.
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-cluster-crosscheck.sgml
+++ b/doc/repmgr-cluster-crosscheck.sgml
@@ -0,0 +1,96 @@
 <refentry id="repmgr-cluster-crosscheck">
  <indexterm>
    <primary>repmgr cluster crosscheck</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr cluster crosscheck</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr cluster crosscheck</refname>
    <refpurpose>cross-checks connections between each combination of nodes</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      <command>repmgr cluster crosscheck</command> is similar to <xref linkend="repmgr-cluster-matrix">,
        but cross-checks connections between each combination of nodes. In "Example 3" in
        <xref linkend="repmgr-cluster-matrix"> we have no information about the state of <literal>node3</literal>.
        However by running <command>repmgr cluster crosscheck</command> it's possible to get a better
        overview of the cluster situation:
          <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster crosscheck
    Name   | Id |  1 |  2 |  3
    -------+----+----+----+----
     node1 |  1 |  * |  * |  x
     node2 |  2 |  * |  * |  *
     node3 |  3 |  * |  * |  *</programlisting>
    </para>
    <para>
      What happened is that <command>repmgr cluster crosscheck</command> merged its own
      <command><link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link></command> with the
      <command>repmgr cluster matrix</command> output from <literal>node2</literal>; the latter is
      able to connect to <literal>node3</literal>
      and therefore determine the state of outbound connections from that node.
    </para>
  </refsect1>
  <refsect1>
    <title>Exit codes</title>
    <para>
      The following exit codes can be emitted by <command>repmgr cluster crosscheck</command>:
    </para>
    <variablelist>
      <varlistentry>
        <term><option>SUCCESS (0)</option></term>
        <listitem>
          <para>
            The check completed successfully and all nodes are reachable.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_BAD_SSH (12)</option></term>
        <listitem>
          <para>
            One or more nodes could not be accessed via SSH.
          </para>
          <note>
            <simpara>
              This only applies to nodes unreachable from the node where
              this command is executed.
            </simpara>
            <simpara>
              It's also possible that the crosscheck establishes that
              connections between PostgreSQL on all nodes are functioning,
              even if SSH access between some nodes is not possible.
            </simpara>
          </note>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_NODE_STATUS (25)</option></term>
        <listitem>
          <para>
            PostgreSQL on one or more nodes could not be reached.
          </para>
          <note>
            <simpara>
              This error code overrides <option>ERR_BAD_SSH</option>.
            </simpara>
          </note>
        </listitem>
      </varlistentry>
   </variablelist>
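   <para>
     A monitoring script might branch on these exit codes. The following is a minimal
     sketch only: the function name <literal>describe_crosscheck_status</literal> and
     the message texts are hypothetical, while the numeric values are those listed above.
     <programlisting>
    describe_crosscheck_status() {
        # Map an exit code from "repmgr cluster crosscheck" to a short message.
        case "$1" in
            0)  echo "ok: all nodes reachable" ;;
            12) echo "warning: one or more nodes not accessible via SSH" ;;
            25) echo "critical: PostgreSQL unreachable on one or more nodes" ;;
            *)  echo "unknown: unexpected exit code $1" ;;
        esac
    }

    # Example usage (assumes repmgr is installed and /etc/repmgr.conf exists):
    #   repmgr -f /etc/repmgr.conf cluster crosscheck
    #   describe_crosscheck_status "$?"</programlisting>
   </para>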
  </refsect1>
 </refentry>
--- a/doc/repmgr-cluster-event.sgml
+++ b/doc/repmgr-cluster-event.sgml
@@ -0,0 +1,79 @@
 <refentry id="repmgr-cluster-event">
  <indexterm>
    <primary>repmgr cluster event</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr cluster event</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr cluster event</refname>
    <refpurpose>output a formatted list of cluster events</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Outputs a formatted list of cluster events, as stored in the <literal>repmgr.events</literal> table.
    </para>
  </refsect1>
  <refsect1>
    <title>Usage</title>
    <para>
      Output is in reverse chronological order, and
      can be filtered with the following options:
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara><literal>--all</literal>: outputs all entries</simpara>
        </listitem>
        <listitem>
          <simpara><literal>--limit</literal>: set the maximum number of entries to output (default: 20)</simpara>
        </listitem>
        <listitem>
          <simpara><literal>--node-id</literal>: restrict entries to node with this ID</simpara>
        </listitem>
        <listitem>
          <simpara><literal>--node-name</literal>: restrict entries to node with this name</simpara>
        </listitem>
        <listitem>
          <simpara><literal>--event</literal>: filter specific event (see <xref linkend="event-notifications"> for a full list)</simpara>
        </listitem>
      </itemizedlist>
    </para>
    <para>
      The "Details" column can be omitted by providing <literal>--terse</literal>.
    </para>
  </refsect1>
  <refsect1>
    <title>Output format</title>
    <para>
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara>
            <literal>--csv</literal>: generate output in CSV format. Note that the <literal>Details</literal>
            column will currently not be emitted in CSV format.
          </simpara>
        </listitem>
      </itemizedlist>
    </para>
  </refsect1>
  <refsect1>
    <title>Example</title>
    <para>
      <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster event --event=standby_register
     Node ID | Name  | Event            | OK | Timestamp           | Details
    ---------+-------+------------------+----+---------------------+--------------------------------
     3       | node3 | standby_register | t  | 2017-08-17 10:28:55 | standby registration succeeded
     2       | node2 | standby_register | t  | 2017-08-17 10:28:53 | standby registration succeeded</programlisting>
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-cluster-matrix.sgml
+++ b/doc/repmgr-cluster-matrix.sgml
@@ -0,0 +1,145 @@
 <refentry id="repmgr-cluster-matrix">
  <indexterm>
    <primary>repmgr cluster matrix</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr cluster matrix</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr cluster matrix</refname>
    <refpurpose>
      runs repmgr cluster show on each node and summarizes output
    </refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      <command>repmgr cluster matrix</command> runs <command><link linkend="repmgr-cluster-show">repmgr cluster show</link></command> on each
      node and arranges the results in a matrix, recording success or failure.
    </para>
    <para>
      <command>repmgr cluster matrix</command> requires a valid <filename>repmgr.conf</filename>
      file on each node. Additionally, passwordless <command>ssh</command> connections are required between
      all nodes.
    </para>
  </refsect1>
  <refsect1>
    <title>Example</title>
    <para>
    Example 1 (all nodes up):
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster matrix
    Name   | Id |  1 |  2 |  3
    -------+----+----+----+----
     node1 |  1 |  * |  * |  *
     node2 |  2 |  * |  * |  *
     node3 |  3 |  * |  * |  *</programlisting>
  </para>
  <para>
    Example 2 (<literal>node1</literal> and <literal>node2</literal> up, <literal>node3</literal> down):
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster matrix
    Name   | Id |  1 |  2 |  3
    -------+----+----+----+----
     node1 |  1 |  * |  * |  x
     node2 |  2 |  * |  * |  x
     node3 |  3 |  ? |  ? |  ?
    </programlisting>
  </para>
  <para>
   Each row corresponds to one server, and indicates the result of
   testing an outbound connection from that server.
  </para>
  <para>
    Since <literal>node3</literal> is down, all the entries in its row are filled with
    <literal>?</literal>, meaning that we cannot test outbound connections from it.
  </para>
  <para>
    The other two nodes are up; the corresponding rows have <literal>x</literal> in the
    column corresponding to <literal>node3</literal>, meaning that inbound connections to
    that node have failed, and <literal>*</literal> in the columns corresponding to
    <literal>node1</literal> and <literal>node2</literal>, meaning that inbound connections
    to these nodes have succeeded.
  </para>
  <para>
    Example 3 (all nodes up, firewall dropping packets originating
    from <literal>node1</literal> and directed to port 5432 on <literal>node3</literal>) -
    running <command>repmgr cluster matrix</command> from <literal>node1</literal> gives the following output:
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster matrix
    Name   | Id |  1 |  2 |  3
    -------+----+----+----+----
     node1 |  1 |  * |  * |  x
     node2 |  2 |  * |  * |  *
     node3 |  3 |  ? |  ? |  ?</programlisting>
  </para>
  <para>
    Note this may take some time depending on the <varname>connect_timeout</varname>
    setting in the node <varname>conninfo</varname> strings. The default is
    <literal>1 minute</literal>, which means that without modification the above
    command would take around 2 minutes to run (see comment elsewhere about setting
    <varname>connect_timeout</varname>).
  </para>
  <para>
   The matrix tells us that we cannot connect from <literal>node1</literal> to <literal>node3</literal>,
   and that (therefore) we don't know the state of any outbound
   connection from <literal>node3</literal>.
  </para>
  <para>
    In this case, the <xref linkend="repmgr-cluster-crosscheck"> command will produce a more
    useful result.
  </para>
  </refsect1>
  <refsect1>
    <title>Exit codes</title>
    <para>
      The following exit codes can be emitted by <command>repmgr cluster matrix</command>:
    </para>
    <variablelist>
      <varlistentry>
        <term><option>SUCCESS (0)</option></term>
        <listitem>
          <para>
            The check completed successfully and all nodes are reachable.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_BAD_SSH (12)</option></term>
        <listitem>
          <para>
            One or more nodes could not be accessed via SSH.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_NODE_STATUS (25)</option></term>
        <listitem>
          <para>
            PostgreSQL on one or more nodes could not be reached.
          </para>
          <note>
            <simpara>
              This error code overrides <option>ERR_BAD_SSH</option>.
            </simpara>
          </note>
        </listitem>
      </varlistentry>
   </variablelist>
  </refsect1>
 </refentry>
--- a/doc/repmgr-cluster-show.sgml
+++ b/doc/repmgr-cluster-show.sgml
@@ -0,0 +1,192 @@
 <refentry id="repmgr-cluster-show">
  <indexterm>
    <primary>repmgr cluster show</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr cluster show</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr cluster show</refname>
    <refpurpose>display information about each registered node in the replication cluster</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Displays information about each registered node in the replication cluster. This
      command polls each registered server and shows its role (<literal>primary</literal> /
      <literal>standby</literal> / <literal>bdr</literal>) and status. It polls each server
      directly and can be run on any node in the cluster; this is also useful when analyzing
      connectivity from a particular node.
    </para>
  </refsect1>
  <refsect1>
    <title>Execution</title>
    <para>
      This command requires either a valid <filename>repmgr.conf</filename> file or a database
      connection string to one of the registered nodes; no additional arguments are needed.
    </para>
    <para>
      To show database connection errors when polling nodes, run the command in
      <literal>--verbose</literal> mode.
    </para>
  </refsect1>
  <refsect1>
    <title>Example</title>
    <para>
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+-----------------------------------------
     1  | node1 | primary | * running |          | default  | host=db_node1 dbname=repmgr user=repmgr
     2  | node2 | standby |   running | node1    | default  | host=db_node2 dbname=repmgr user=repmgr
     3  | node3 | standby |   running | node1    | default  | host=db_node3 dbname=repmgr user=repmgr</programlisting>
  </para>
  </refsect1>
  <refsect1>
    <title>Notes</title>
    <para>
      The column <literal>Role</literal> shows the expected server role according to the
      &repmgr; metadata. <literal>Status</literal> shows whether the server is running or unreachable.
      If the node has an unexpected role not reflected in the &repmgr; metadata, e.g. a node was manually
      promoted to primary, this will be highlighted with an exclamation mark, e.g.:
      <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status               | Upstream | Location | Connection string
    ----+-------+---------+----------------------+----------+----------+-----------------------------------------
     1  | node1 | primary | ? unreachable        |          | default  | host=db_node1 dbname=repmgr user=repmgr
     2  | node2 | standby | ! running as primary | node1    | default  | host=db_node2 dbname=repmgr user=repmgr
     3  | node3 | standby |   running            | node1    | default  | host=db_node3 dbname=repmgr user=repmgr
    WARNING: following issues were detected
      node "node1" (ID: 1) is registered as an active primary but is unreachable
      node "node2" (ID: 2) is registered as standby but running as primary</programlisting>
    </para>
    <para>
      Node availability is tested by connecting from the node where
      <command>repmgr cluster show</command> is executed, and does not necessarily imply the node
      is down. See <xref linkend="repmgr-cluster-matrix"> and <xref linkend="repmgr-cluster-crosscheck"> to get
       a better overview of connections between nodes.
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--csv</option></term>
        <listitem>
 		  <para>
 			<command>repmgr cluster show</command> accepts an optional parameter <literal>--csv</literal>, which
 			outputs the replication cluster's status in a simple CSV format, suitable for
 			parsing by scripts, e.g.:
 			<programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show --csv
    1,-1,-1
    2,0,0
    3,0,1</programlisting>
 		  </para>
 		  <para>
 			The columns have the following meanings:
 			<itemizedlist spacing="compact" mark="bullet">
 			  <listitem>
 				<simpara>
 				  node ID
 				</simpara>
 			  </listitem>
 			  <listitem>
 				<simpara>
            availability (0 = available, -1 = unavailable)
 				</simpara>
 			  </listitem>
 			  <listitem>
 				<simpara>
 				  recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown)
 				</simpara>
 			  </listitem>
 			</itemizedlist>
 		  </para>
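 		  <para>
 			As an illustrative sketch only, the CSV output could be interpreted in a shell
 			script along the following lines. The function name <literal>report_cluster_csv</literal>
 			is hypothetical and not part of &repmgr;; the column meanings are those listed above.
 			<programlisting>
    # Hypothetical helper: read "cluster show --csv" output on stdin
    # and print one human-readable line per node.
    report_cluster_csv() {
        while IFS=, read -r node_id available recovery; do
            case "$available" in
                0) avail_text="available" ;;
                *) avail_text="unavailable" ;;
            esac
            case "$recovery" in
                0) rec_text="not in recovery" ;;
                1) rec_text="in recovery" ;;
                *) rec_text="recovery state unknown" ;;
            esac
            echo "node $node_id: $avail_text, $rec_text"
        done
    }

    # Usage:
    #   repmgr -f /etc/repmgr.conf cluster show --csv | report_cluster_csv</programlisting>
 		  </para>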
 		</listitem>
 	  </varlistentry>
      <varlistentry>
        <term><option>--verbose</option></term>
        <listitem>
          <para>
 			Display the full text of any database connection error messages
          </para>
        </listitem>
      </varlistentry>
 	</variablelist>
  </refsect1>
  <refsect1>
    <title>Exit codes</title>
    <para>
      The following exit codes can be emitted by <command>repmgr cluster show</command>:
    </para>
    <variablelist>
      <varlistentry>
        <term><option>SUCCESS (0)</option></term>
        <listitem>
          <para>
            No issues were detected.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_BAD_CONFIG (1)</option></term>
        <listitem>
          <para>
            An issue was encountered while attempting to retrieve
            &repmgr; metadata.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_DB_CONN (6)</option></term>
        <listitem>
          <para>
            &repmgr; was unable to connect to the local PostgreSQL instance.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_NODE_STATUS (25)</option></term>
        <listitem>
          <para>
            One or more issues were detected with the replication configuration,
            e.g. a node was not in its expected state.
          </para>
        </listitem>
      </varlistentry>
   </variablelist>
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
     <xref linkend="repmgr-node-status">, <xref linkend="repmgr-node-check">, <xref linkend="repmgr-daemon-status">
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-daemon-pause.sgml
+++ b/doc/repmgr-daemon-pause.sgml
@@ -0,0 +1,109 @@
 <refentry id="repmgr-daemon-pause">
  <indexterm>
    <primary>repmgr daemon pause</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr daemon pause</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr daemon pause</refname>
    <refpurpose>Instruct all <application>repmgrd</application> instances in the replication cluster to pause failover operations</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      This command can be run on any active node in the replication cluster to instruct all
      running <application>repmgrd</application> instances to &quot;pause&quot; themselves, i.e. take no
      action (such as promoting themselves or following a new primary) if a failover event is detected.
    </para>
    <para>
      This functionality is useful for performing maintenance operations, such as switchovers
      or upgrades, which might otherwise trigger a failover if <application>repmgrd</application>
      is running normally.
    </para>
    <note>
      <para>
        It's important to wait a few seconds after restarting PostgreSQL on any node before running
        <command>repmgr daemon pause</command>, as the <application>repmgrd</application> instance
        on the restarted node will take a second or two before it has updated its status.
      </para>
    </note>
    <para>
      <xref linkend="repmgr-daemon-unpause"> will instruct all previously paused <application>repmgrd</application>
      instances to resume normal failover operation.
    </para>
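    <para>
      A maintenance script might wrap an operation between a pause and an unpause.
      The following is a minimal sketch under stated assumptions: the function name
      <literal>with_repmgrd_paused</literal> and the <literal>REPMGR</literal> override
      variable are hypothetical and not part of &repmgr;.
      <programlisting>
    # Hypothetical helper: pause repmgrd on all nodes, run the given
    # maintenance command, then unpause, preserving the command's exit code.
    # REPMGR may be set to override the repmgr binary (an assumption made
    # here for illustration and testing only).
    with_repmgrd_paused() {
        conf="$1"; shift
        # Abort if repmgrd could not be paused on one or more nodes.
        "${REPMGR:-repmgr}" -f "$conf" daemon pause || return 1
        "$@"
        status=$?
        # Always attempt to unpause, even if the maintenance command failed.
        "${REPMGR:-repmgr}" -f "$conf" daemon unpause || return 1
        return $status
    }

    # Usage ("my_maintenance_command" is a placeholder):
    #   with_repmgrd_paused /etc/repmgr.conf my_maintenance_command</programlisting>
      If the maintenance operation restarts PostgreSQL, allow a few seconds before the
      unpause step, as described in the note above.
    </para>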
  </refsect1>
  <refsect1>
    <title>Execution</title>
    <para>
      <command>repmgr daemon pause</command> can be executed on any active node in the
      replication cluster. A valid <filename>repmgr.conf</filename> file is required.
      It will have no effect on previously paused nodes.
    </para>
  </refsect1>
  <refsect1>
    <title>Example</title>
    <para>
    <programlisting>
 $ repmgr -f /etc/repmgr.conf daemon pause
 NOTICE: node 1 (node1) paused
 NOTICE: node 2 (node2) paused
 NOTICE: node 3 (node3) paused</programlisting>
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Check if nodes are reachable but don't pause <application>repmgrd</application>.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1>
    <title>Exit codes</title>
    <para>
      The following exit codes can be emitted by <command>repmgr daemon pause</command>:
    </para>
    <variablelist>
      <varlistentry>
        <term><option>SUCCESS (0)</option></term>
        <listitem>
          <para>
            <application>repmgrd</application> could be paused on all nodes.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_REPMGRD_PAUSE (26)</option></term>
        <listitem>
          <para>
           <application>repmgrd</application> could not be paused on one or more nodes.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
      <xref linkend="repmgr-daemon-unpause">, <xref linkend="repmgr-daemon-status">
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-daemon-status.sgml
+++ b/doc/repmgr-daemon-status.sgml
@@ -0,0 +1,165 @@
 <refentry id="repmgr-daemon-status">
  <indexterm>
    <primary>repmgr daemon status</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr daemon status</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr daemon status</refname>
    <refpurpose>display information about the status of <application>repmgrd</application> on each node in the cluster</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      This command provides an overview of all active nodes in the cluster and the state
      of each node's <application>repmgrd</application> instance. It can be used to check
      the result of <xref linkend="repmgr-daemon-pause"> and <xref linkend="repmgr-daemon-unpause">
      operations.
    </para>
  </refsect1>
  <refsect1>
    <title>Execution</title>
    <para>
      <command>repmgr daemon status</command> can be executed on any active node in the
      replication cluster. A valid <filename>repmgr.conf</filename> file is required.
    </para>
    <note>
      <para>
        After restarting PostgreSQL on any node, the <application>repmgrd</application> instance
        will take a second or two before it is able to update its status. Until then,
        <application>repmgrd</application> will be shown as not running.
      </para>
    </note>
  </refsect1>
  <refsect1>
    <title>Examples</title>
    <para>
      <application>repmgrd</application> running normally on all nodes:
    <programlisting>$ repmgr -f /etc/repmgr.conf daemon status
 ID | Name  | Role    | Status  | repmgrd | PID  | Paused?
 ----+-------+---------+---------+---------+------+---------
 1  | node1 | primary | running | running | 7851 | no
 2  | node2 | standby | running | running | 7889 | no
 3  | node3 | standby | running | running | 7918 | no</programlisting>
    </para>
    <para>
      <application>repmgrd</application> paused on all nodes (using <xref linkend="repmgr-daemon-pause">):
    <programlisting>$ repmgr -f /etc/repmgr.conf daemon status
 ID | Name  | Role    | Status  | repmgrd | PID  | Paused?
 ----+-------+---------+---------+---------+------+---------
 1  | node1 | primary | running | running | 7851 | yes
 2  | node2 | standby | running | running | 7889 | yes
 3  | node3 | standby | running | running | 7918 | yes</programlisting>
    </para>
    <para>
      <application>repmgrd</application> not running on one node:
    <programlisting>$ repmgr -f /etc/repmgr.conf daemon status
 ID | Name  | Role    | Status  | repmgrd     | PID  | Paused?
 ----+-------+---------+---------+-------------+------+---------
 1  | node1 | primary | running | running     | 7851 | yes
 2  | node2 | standby | running | not running | n/a  | n/a
 3  | node3 | standby | running | running     | 7918 | yes</programlisting>
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--csv</option></term>
        <listitem>
 		  <para>
 			<command>repmgr daemon status</command> accepts an optional parameter <literal>--csv</literal>, which
 			outputs the replication cluster's status in a simple CSV format, suitable for
 			parsing by scripts, e.g.:
 			<programlisting>
    $ repmgr -f /etc/repmgr.conf daemon status --csv
    1,node1,primary,1,1,10204,1
    2,node2,standby,1,0,-1,1
    3,node3,standby,1,1,10225,1</programlisting>
 		  </para>
 		  <para>
 			The columns have the following meanings:
 			<itemizedlist spacing="compact" mark="bullet">
 			  <listitem>
 				<simpara>
 				  node ID
 				</simpara>
 			  </listitem>
 			  <listitem>
 				<simpara>
                  node name
 				</simpara>
 			  </listitem>
 			  <listitem>
 				<simpara>
                  node type (primary or standby)
 				</simpara>
 			  </listitem>
 			  <listitem>
 				<simpara>
                  PostgreSQL server running
 				</simpara>
 			  </listitem>
 			  <listitem>
 				<simpara>
                  <application>repmgrd</application> running (1 = running, 0 = not running)
 				</simpara>
 			  </listitem>
 			  <listitem>
 				<simpara>
                  <application>repmgrd</application> PID (-1 if not running)
 				</simpara>
 			  </listitem>
 			  <listitem>
 				<simpara>
                  <application>repmgrd</application> paused (1 = paused, 0 = not paused)
 				</simpara>
 			  </listitem>
 			</itemizedlist>
 		  </para>
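 		  <para>
 			As an illustrative sketch only, a script could use this output to find nodes
 			where <application>repmgrd</application> is not running. The function name
 			<literal>nodes_without_repmgrd</literal> is hypothetical and not part of &repmgr;.
 			<programlisting>
    # Hypothetical helper: read "daemon status --csv" output on stdin and
    # print the name of each node whose repmgrd is not running.
    nodes_without_repmgrd() {
        while IFS=, read -r id name type pg_running repmgrd_running pid paused; do
            if [ "$repmgrd_running" != "1" ]; then
                echo "$name"
            fi
        done
    }

    # Usage:
    #   repmgr -f /etc/repmgr.conf daemon status --csv | nodes_without_repmgrd</programlisting>
 		  </para>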
 		</listitem>
 	  </varlistentry>
      <varlistentry>
        <term><option>--verbose</option></term>
        <listitem>
          <para>
 			Display the full text of any database connection error messages
          </para>
        </listitem>
      </varlistentry>
 	</variablelist>
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
      <xref linkend="repmgr-daemon-pause">, <xref linkend="repmgr-daemon-unpause">, <xref linkend="repmgr-cluster-show">
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-daemon-unpause.sgml
+++ b/doc/repmgr-daemon-unpause.sgml
@@ -0,0 +1,103 @@
 <refentry id="repmgr-daemon-unpause">
  <indexterm>
    <primary>repmgr daemon unpause</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr daemon unpause</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr daemon unpause</refname>
    <refpurpose>Instruct all <application>repmgrd</application> instances in the replication cluster to resume failover operations</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      This command can be run on any active node in the replication cluster to instruct all
      running <application>repmgrd</application> instances to &quot;unpause&quot;
      (following a previous execution of <xref linkend="repmgr-daemon-pause">)
      and resume normal failover/monitoring operation.
    </para>
    <note>
      <para>
        It's important to wait a few seconds after restarting PostgreSQL on any node before running
        <command>repmgr daemon unpause</command>, as the <application>repmgrd</application> instance
        on the restarted node will take a second or two before it has updated its status.
      </para>
    </note>
  </refsect1>
  <refsect1>
    <title>Execution</title>
    <para>
     <command>repmgr daemon unpause</command> can be executed on any active node in the
      replication cluster. A valid <filename>repmgr.conf</filename> file is required.
      It will have no effect on nodes which are not already paused.
    </para>
  </refsect1>
  <refsect1>
    <title>Example</title>
    <para>
    <programlisting>
 $ repmgr -f /etc/repmgr.conf daemon unpause
 NOTICE: node 1 (node1) unpaused
 NOTICE: node 2 (node2) unpaused
 NOTICE: node 3 (node3) unpaused</programlisting>
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Check if nodes are reachable but don't unpause <application>repmgrd</application>.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1>
    <title>Exit codes</title>
    <para>
      The following exit codes can be emitted by <command>repmgr daemon unpause</command>:
    </para>
    <variablelist>
      <varlistentry>
        <term><option>SUCCESS (0)</option></term>
        <listitem>
          <para>
            <application>repmgrd</application> could be unpaused on all nodes.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_REPMGRD_PAUSE (26)</option></term>
        <listitem>
          <para>
           <application>repmgrd</application> could not be unpaused on one or more nodes.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
      <xref linkend="repmgr-daemon-pause">, <xref linkend="repmgr-daemon-status">
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-node-check.sgml
+++ b/doc/repmgr-node-check.sgml
@@ -0,0 +1,189 @@
 <refentry id="repmgr-node-check">
  <indexterm>
    <primary>repmgr node check</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr node check</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr node check</refname>
    <refpurpose>performs some health checks on a node from a replication perspective</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Performs some health checks on a node from a replication perspective.
      This command must be run on the local node.
    </para>
  </refsect1>
  <refsect1>
    <title>Example</title>
    <para>
      <programlisting>
       $ repmgr -f /etc/repmgr.conf node check
       Node "node1":
            Server role: OK (node is primary)
            Replication lag: OK (N/A - node is primary)
            WAL archiving: OK (0 pending files)
            Downstream servers: OK (2 of 2 downstream nodes attached)
            Replication slots: OK (node has no replication slots)
            Missing replication slots: OK (node has no missing replication slots)</programlisting>
    </para>
  </refsect1>
  <refsect1>
    <title>Individual checks</title>
    <para>
      Each check can be performed individually by supplying
      an additional command line parameter, e.g.:
      <programlisting>
        $ repmgr node check --role
        OK (node is primary)</programlisting>
    </para>
    <para>
   Parameters for individual checks are as follows:
    <itemizedlist spacing="compact" mark="bullet">
     <listitem>
      <simpara>
        <literal>--role</literal>: checks if the node has the expected role
      </simpara>
     </listitem>
     <listitem>
      <simpara>
        <literal>--replication-lag</literal>: checks if the node is lagging by more than
        <varname>replication_lag_warning</varname> or <varname>replication_lag_critical</varname>
      </simpara>
     </listitem>
     <listitem>
      <simpara>
        <literal>--archive-ready</literal>: checks for WAL files which have not yet been archived,
        and returns <literal>WARNING</literal> or <literal>CRITICAL</literal> if the number
        exceeds <varname>archive_ready_warning</varname> or <varname>archive_ready_critical</varname> respectively.
      </simpara>
     </listitem>
     <listitem>
      <simpara>
        <literal>--downstream</literal>: checks that the expected downstream nodes are attached
      </simpara>
     </listitem>
     <listitem>
      <simpara>
        <literal>--slots</literal>: checks there are no inactive replication slots
      </simpara>
     </listitem>
     <listitem>
      <simpara>
        <literal>--missing-slots</literal>: checks there are no missing replication slots
      </simpara>
     </listitem>
    </itemizedlist>
  </para>
  </refsect1>
  <refsect1>
    <title>Output format</title>
    <para>
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara>
            <literal>--csv</literal>: generate output in CSV format (not available
            for individual checks)
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <literal>--nagios</literal>: generate output in a Nagios-compatible format
          </simpara>
        </listitem>
      </itemizedlist>
    </para>
  </refsect1>
  <refsect1>
    <title>Exit codes</title>
    <para>
      When executing <command>repmgr node check</command> with one of the individual
      checks listed above, &repmgr; will emit one of the following Nagios-style exit codes
      (even if <literal>--nagios</literal> is not supplied):
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara>
            <literal>0</literal>: OK
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <literal>1</literal>: WARNING
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <literal>2</literal>: ERROR
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <literal>3</literal>: UNKNOWN
          </simpara>
        </listitem>
      </itemizedlist>
    </para>
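    <para>
      A monitoring script might translate these exit codes back into status names.
      The following is a minimal sketch; the function name <literal>check_status_name</literal>
      is hypothetical and not part of &repmgr;, and the mapping simply mirrors the list above.
      <programlisting>
    # Hypothetical helper: translate the exit code of an individual
    # "repmgr node check" into the corresponding status name.
    check_status_name() {
        case "$1" in
            0) echo "OK" ;;
            1) echo "WARNING" ;;
            2) echo "ERROR" ;;
            3) echo "UNKNOWN" ;;
            *) echo "unexpected exit code $1" ;;
        esac
    }

    # Usage, running each individual check in turn:
    #   for check in role replication-lag archive-ready downstream slots missing-slots; do
    #       repmgr -f /etc/repmgr.conf node check "--$check"
    #       echo "$check: $(check_status_name $?)"
    #   done</programlisting>
    </para>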
    <para>
      The following exit codes can be emitted by <command>repmgr node check</command>
      if no individual check was specified:
    </para>
    <variablelist>
      <varlistentry>
        <term><option>SUCCESS (0)</option></term>
        <listitem>
          <para>
            No issues were detected.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_NODE_STATUS (25)</option></term>
        <listitem>
          <para>
            One or more issues were detected.
          </para>
        </listitem>
      </varlistentry>
   </variablelist>
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
     <xref linkend="repmgr-node-status">, <xref linkend="repmgr-cluster-show">
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-node-rejoin.sgml
+++ b/doc/repmgr-node-rejoin.sgml
@@ -0,0 +1,264 @@
 <refentry id="repmgr-node-rejoin">
  <indexterm>
    <primary>repmgr node rejoin</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr node rejoin</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr node rejoin</refname>
    <refpurpose>rejoin a dormant (stopped) node to the replication cluster</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Enables a dormant (stopped) node to be rejoined to the replication cluster.
    </para>
    <para>
      This can optionally use <application>pg_rewind</application> to re-integrate
      a node which has diverged from the rest of the cluster, typically a failed primary.
    </para>
    <tip>
      <para>
        If the node is running and needs to be attached to the current primary, use
        <xref linkend="repmgr-standby-follow">.
      </para>
      <para>
        Note that <xref linkend="repmgr-standby-follow"> can only be used for standbys which have not diverged
        from the rest of the cluster.
      </para>
    </tip>
  </refsect1>
  <refsect1>
    <title>Usage</title>
    <para>
      <programlisting>
      repmgr node rejoin -d '$conninfo'</programlisting>
      where <literal>$conninfo</literal> is the conninfo string of any reachable node in the cluster.
      <filename>repmgr.conf</filename> for the stopped node <emphasis>must</emphasis> be supplied explicitly if not
      otherwise available.
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Check prerequisites but don't actually execute the rejoin.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--force-rewind[=/path/to/pg_rewind]</option></term>
        <listitem>
          <para>
            Execute <application>pg_rewind</application>.
          </para>
          <para>
            It is only necessary to provide the <application>pg_rewind</application> path
            if using PostgreSQL 9.3 or 9.4, and <application>pg_rewind</application>
            is not installed in the PostgreSQL <filename>bin</filename> directory.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--config-files</option></term>
        <listitem>
          <para>
            Comma-separated list of configuration files to retain after
            executing <application>pg_rewind</application>.
          </para>
          <para>
            Currently <application>pg_rewind</application> will overwrite
            the local node's configuration files with the files from the source node,
            so it's advisable to use this option to ensure they are kept.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--config-archive-dir</option></term>
        <listitem>
          <para>
            Directory to temporarily store configuration files specified with
            <option>--config-files</option>; default: <filename>/tmp</filename>.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>-W/--no-wait</option></term>
        <listitem>
          <para>
            Don't wait for the node to rejoin the cluster.
          </para>
          <para>
            If this option is supplied, &repmgr; will restart the node but
            not wait for it to connect to the primary.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1>
    <title>Configuration file settings</title>
    <para>
      <itemizedlist spacing="compact" mark="bullet">
       <listitem>
         <simpara>
           <literal>node_rejoin_timeout</literal>:
 		   the maximum length of time (in seconds) to wait for
 		   the node to reconnect to the replication cluster (defaults to
 		   the value set in <literal>standby_reconnect_timeout</literal>,
 		   60 seconds).
 		 </simpara>
 	   </listitem>
 	  </itemizedlist>
 	</para>
  </refsect1>
  <refsect1 id="repmgr-node-rejoin-events">
    <title>Event notifications</title>
    <para>
      A <literal>node_rejoin</literal> <link linkend="event-notifications">event notification</link> will be generated.
    </para>
  </refsect1>
  <refsect1>
    <title>Notes</title>
    <para>
      Currently <command>repmgr node rejoin</command> can only be used to attach
      a standby to the current primary, not another standby.
    </para>
    <para>
      The node must have been shut down cleanly; if this was not the case, it will
      need to be manually started (remove any existing <filename>recovery.conf</filename> file first)
      until it has reached a consistent recovery point, then shut down cleanly.
    </para>
    <tip>
      <para>
        If <application>PostgreSQL</application> is started in single-user mode and
        input is directed from <filename>/dev/null</filename>, it will perform recovery
        then immediately quit, and will then be in a state suitable for use by
        <application>pg_rewind</application>.
        <programlisting>
          rm -f /var/lib/pgsql/data/recovery.conf
          postgres --single -D /var/lib/pgsql/data/ &lt; /dev/null</programlisting>
      </para>
    </tip>
  </refsect1>
  <refsect1 id="repmgr-node-rejoin-pg-rewind" xreflabel="Using pg_rewind">
   <indexterm>
      <primary>pg_rewind</primary>
      <secondary>using with "repmgr node rejoin"</secondary>
    </indexterm>
    <title>Using <command>pg_rewind</command></title>
    <para>
      <command>repmgr node rejoin</command> can optionally use <command>pg_rewind</command> to re-integrate a
      node which has diverged from the rest of the cluster, typically a failed primary.
      <command>pg_rewind</command> is available in PostgreSQL 9.5 and later as part of the core distribution,
      and can be installed from external sources for PostgreSQL 9.3 and 9.4.
    </para>
    <note>
      <para>
        <command>pg_rewind</command> <emphasis>requires</emphasis> that either
        <varname>wal_log_hints</varname> is enabled, or that
        data checksums were enabled when the cluster was initialized. See the
        <ulink url="https://www.postgresql.org/docs/current/static/app-pgrewind.html"><command>pg_rewind</command> documentation</ulink> for details.
      </para>
    </note>
    <para>
      To have <command>repmgr node rejoin</command> use <command>pg_rewind</command>,
      pass the command line option <literal>--force-rewind</literal>, which will tell &repmgr;
      to execute <command>pg_rewind</command> to ensure the node can be rejoined successfully.
    </para>
    <para>
      Be aware that if <command>pg_rewind</command> is executed and actually performs a
      rewind operation, any configuration files in the PostgreSQL data directory will be
      overwritten with those from the source server.
    </para>
    <para>
      To prevent this happening, provide a comma-separated list of files to retain
      using the <literal>--config-files</literal> command line option; the specified files
      will be archived in a temporary directory (whose parent directory can be specified with
      <literal>--config-archive-dir</literal>) and restored once the rewind operation is
      complete.
    </para>
    <para>
      Example, first using <literal>--dry-run</literal>, then actually executing the
      <literal>node rejoin</literal> command:
    <programlisting>
    $ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node1 dbname=repmgr user=repmgr' \
         --force-rewind --config-files=postgresql.local.conf,postgresql.conf --verbose --dry-run
    NOTICE: using provided configuration file "/etc/repmgr.conf"
    INFO: prerequisites for using pg_rewind are met
    INFO: file "postgresql.local.conf" would be copied to "/tmp/repmgr-config-archive-node1/postgresql.local.conf"
    INFO: file "postgresql.conf" would be copied to "/tmp/repmgr-config-archive-node1/postgresql.conf"
    INFO: 2 files would have been copied to "/tmp/repmgr-config-archive-node1"
    INFO: directory "/tmp/repmgr-config-archive-node1" deleted
    INFO: pg_rewind would now be executed
    DETAIL: pg_rewind command is:
      pg_rewind -D '/var/lib/postgresql/data' --source-server='host=node1 dbname=repmgr user=repmgr'</programlisting>
    <note>
      <para>
        If <option>--force-rewind</option> is used with the <option>--dry-run</option> option,
        this checks the prerequisites for using <application>pg_rewind</application>, but cannot
        predict the outcome of actually executing <application>pg_rewind</application>.
      </para>
    </note>
    <programlisting>
    $ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node1 dbname=repmgr user=repmgr' \
         --force-rewind --config-files=postgresql.local.conf,postgresql.conf --verbose
    NOTICE: using provided configuration file "/etc/repmgr.conf"
    INFO: prerequisites for using pg_rewind are met
    INFO: 2 files copied to "/tmp/repmgr-config-archive-node1"
    NOTICE: executing pg_rewind
    NOTICE: 2 files copied to /var/lib/pgsql/data
    INFO: directory "/tmp/repmgr-config-archive-node1" deleted
    INFO: deleting "recovery.done"
    INFO: setting node 1's primary to node 2
    NOTICE: starting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' start"
    waiting for server to start.... done
    server started
    NOTICE: NODE REJOIN successful
    DETAIL: node 1 is now attached to node 2</programlisting>
    </para>
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
     <xref linkend="repmgr-standby-follow">
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-node-service.sgml
+++ b/doc/repmgr-node-service.sgml
@@ -0,0 +1,151 @@
 <refentry id="repmgr-node-service">
  <indexterm>
    <primary>repmgr node service</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr node service</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr node service</refname>
    <refpurpose>show or execute the system service command to stop/start/restart/reload/promote a node</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Shows or executes the system service command to stop/start/restart/reload/promote a node.
    </para>
    <para>
      This command is mainly meant for internal &repmgr; usage, but is useful for
      confirming the command configuration.
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Log the steps which would be taken, including displaying the command which would be executed.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--action</option></term>
        <listitem>
          <para>
            The action to perform. One of <literal>start</literal>, <literal>stop</literal>,
            <literal>restart</literal>, <literal>reload</literal> or <literal>promote</literal>.
          </para>
          <para>
            If the parameter <option>--list-actions</option> is provided together with
            <option>--action</option>, the command which would be executed will be printed.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--list-actions</option></term>
        <listitem>
          <para>
            List all configured commands.
          </para>
          <para>
            If the parameter <option>--action</option> is provided together with
            <option>--list-actions</option>, the command which would be executed for that
            particular action will be printed.
          </para>
        </listitem>
      </varlistentry>
     <varlistentry>
        <term><option>--checkpoint</option></term>
        <listitem>
          <para>
            Issue a <command>CHECKPOINT</command> before stopping or restarting the node.
          </para>
        </listitem>
     </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1>
    <title>Exit codes</title>
    <para>
      The following exit codes can be emitted by <command>repmgr node service</command>:
    </para>
    <variablelist>
      <varlistentry>
        <term><option>SUCCESS (0)</option></term>
        <listitem>
          <para>
            No issues were detected.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_LOCAL_COMMAND (5)</option></term>
        <listitem>
          <para>
            Execution of the system service command failed.
          </para>
        </listitem>
      </varlistentry>
   </variablelist>
  </refsect1>
  <refsect1>
    <title>Examples</title>
    <para>
      See what action would be taken for a restart:
      <programlisting>
 [postgres@node1 ~]$ repmgr -f /etc/repmgr/11/repmgr.conf node service --action=restart --checkpoint --dry-run
 INFO: a CHECKPOINT would be issued here
 INFO: would execute server command "sudo service postgresql-11 restart"</programlisting>
    </para>
    <para>
      Restart the PostgreSQL instance:
      <programlisting>
 [postgres@node1 ~]$ repmgr -f /etc/repmgr/11/repmgr.conf node service --action=restart --checkpoint
 NOTICE: issuing CHECKPOINT
 DETAIL: executing server command "sudo service postgresql-11 restart"
 Redirecting to /bin/systemctl restart postgresql-11.service</programlisting>
    </para>
    <para>
      List all commands:
      <programlisting>
 [postgres@node1 ~]$ repmgr -f /etc/repmgr/11/repmgr.conf node service --list-actions
 Following commands would be executed for each action:
    start: "sudo service postgresql-11 start"
     stop: "sudo service postgresql-11 stop"
  restart: "sudo service postgresql-11 restart"
   reload: "sudo service postgresql-11 reload"
  promote: "/usr/pgsql-11/bin/pg_ctl  -w -D '/var/lib/pgsql/11/data' promote"</programlisting>
    </para>
    <para>
      List a single command:
      <programlisting>
 [postgres@node1 ~]$ repmgr -f /etc/repmgr/11/repmgr.conf node service --list-actions --action=promote
 /usr/pgsql-11/bin/pg_ctl  -w -D '/var/lib/pgsql/11/data' promote      </programlisting>
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-node-status.sgml
+++ b/doc/repmgr-node-status.sgml
@@ -0,0 +1,91 @@
 <refentry id="repmgr-node-status">
  <indexterm>
    <primary>repmgr node status</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr node status</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr node status</refname>
    <refpurpose>show overview of a node's basic information and replication status</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Displays an overview of a node's basic information and replication
      status. This command must be run on the local node.
    </para>
  </refsect1>
  <refsect1>
    <title>Example</title>
    <para>
    <programlisting>
        $ repmgr -f /etc/repmgr.conf node status
        Node "node1":
            PostgreSQL version: 10beta1
            Total data size: 30 MB
            Conninfo: host=node1 dbname=repmgr user=repmgr connect_timeout=2
            Role: primary
            WAL archiving: off
            Archive command: (none)
            Replication connections: 2 (of maximal 10)
            Replication slots: 0 (of maximal 10)
            Replication lag: n/a</programlisting>
    </para>
  </refsect1>
  <refsect1>
    <title>Output format</title>
    <para>
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara>
            <literal>--csv</literal>: generate output in CSV format
          </simpara>
        </listitem>
      </itemizedlist>
    </para>
  </refsect1>
  <refsect1>
    <title>Exit codes</title>
    <para>
      The following exit codes can be emitted by <command>repmgr node status</command>:
    </para>
    <variablelist>
      <varlistentry>
        <term><option>SUCCESS (0)</option></term>
        <listitem>
          <para>
            No issues were detected.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_NODE_STATUS (25)</option></term>
        <listitem>
          <para>
            One or more issues were detected.
          </para>
        </listitem>
      </varlistentry>
   </variablelist>
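    <para>
      As a minimal sketch of how these exit codes might be used in a monitoring
      script (the configuration file path is an assumption; adjust it for
      your installation):
      <programlisting>
 #!/bin/sh
 # Run the status check and capture its exit code.
 repmgr -f /etc/repmgr.conf node status
 rc=$?
 # Map the documented exit codes to a one-line summary.
 case "$rc" in
     0)  echo "node status: no issues detected" ;;
     25) echo "node status: one or more issues detected" ;;
     *)  echo "node status: unexpected exit code $rc" ;;
 esac</programlisting>
    </para>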
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
      See <xref linkend="repmgr-node-check"> to diagnose issues and <xref linkend="repmgr-cluster-show">
      for an overview of all nodes in the cluster.
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-primary-register.sgml
+++ b/doc/repmgr-primary-register.sgml
@@ -0,0 +1,93 @@
 <refentry id="repmgr-primary-register">
  <indexterm>
    <primary>repmgr primary register</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr primary register</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr primary register</refname>
    <refpurpose>initialise a repmgr installation and register the primary node</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      <command>repmgr primary register</command> registers a primary node in a
      streaming replication cluster, and configures it for use with &repmgr;, including
      installing the &repmgr; extension. This command needs to be executed before any
      standby nodes are registered.
    </para>
  </refsect1>
  <refsect1>
    <title>Execution</title>
    <para>
      Execute with the <option>--dry-run</option> option to check what would happen without
      actually registering the primary.
    </para>
    <para>
      <command>repmgr master register</command> can be used as an alias for
      <command>repmgr primary register</command>.
    </para>
    <note>
    <para>
      If providing the configuration file location with <option>-f/--config-file</option>,
      avoid using a relative path, as &repmgr; stores the configuration file location
      in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
      <xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert
        a relative path into an absolute one, but this may not be the same as the path you
        would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
        to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
        <filename>/path/to/repmgr.conf</filename>).
    </para>
    </note>
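    <para>
      For example, a dry run using an absolute configuration file path might look
      like this (the path <filename>/etc/repmgr.conf</filename> is an assumption
      for illustration):
      <programlisting>
 $ repmgr -f /etc/repmgr.conf primary register --dry-run</programlisting>
    </para>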
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Check prerequisites but don't actually register the primary.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
       <term><option>-F</option>, <option>--force</option></term>
        <listitem>
          <para>
            Overwrite an existing node record.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1 id="repmgr-primary-register-events">
    <title>Event notifications</title>
    <para>
      The following <link linkend="event-notifications">event notifications</link> will be generated:
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara><literal>cluster_created</literal></simpara>
        </listitem>
        <listitem>
          <simpara><literal>primary_register</literal></simpara>
        </listitem>
      </itemizedlist>
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-primary-unregister.sgml
+++ b/doc/repmgr-primary-unregister.sgml
@@ -0,0 +1,74 @@
 <refentry id="repmgr-primary-unregister">
  <indexterm>
    <primary>repmgr primary unregister</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr primary unregister</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr primary unregister</refname>
    <refpurpose>unregister an inactive primary node</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      <command>repmgr primary unregister</command> unregisters an inactive primary node
      from the &repmgr; metadata. This is typically done when the primary has failed and is
      being removed from the cluster after a new primary has been promoted.
    </para>
  </refsect1>
  <refsect1>
    <title>Execution</title>
    <para>
      <command>repmgr primary unregister</command> can be run on any active &repmgr; node,
      with the ID of the node to unregister passed as <option>--node-id</option>.
    </para>
    <para>
      Execute with the <literal>--dry-run</literal> option to check what would happen without
      actually unregistering the node.
    </para>
    <para>
      <command>repmgr master unregister</command> can be used as an alias for
      <command>repmgr primary unregister</command>.
    </para>
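    <para>
      For example, to check whether a failed primary could be unregistered
      (the node ID <literal>1</literal> and the configuration file path are
      assumptions for illustration):
      <programlisting>
 $ repmgr -f /etc/repmgr.conf primary unregister --node-id=1 --dry-run</programlisting>
    </para>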
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Check prerequisites but don't actually unregister the primary.
          </para>
        </listitem>
      </varlistentry>
     <varlistentry>
        <term><option>--node-id</option></term>
        <listitem>
          <para>
            ID of the inactive primary to be unregistered.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1 id="repmgr-primary-unregister-events">
    <title>Event notifications</title>
    <para>
      A <literal>primary_unregister</literal> <link linkend="event-notifications">event notification</link> will be generated.
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-standby-clone.sgml
+++ b/doc/repmgr-standby-clone.sgml
@@ -0,0 +1,367 @@
 <refentry id="repmgr-standby-clone">
  <indexterm>
    <primary>repmgr standby clone</primary>
    <seealso>cloning</seealso>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr standby clone</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr standby clone</refname>
    <refpurpose>clone a PostgreSQL standby node from another PostgreSQL node</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      <command>repmgr standby clone</command> clones a PostgreSQL node from another
      PostgreSQL node, typically the primary, but optionally from any other node in
      the cluster or from Barman. It creates the <filename>recovery.conf</filename> file required
      to attach the cloned node to the primary node (or another standby, if cascading replication
      is in use).
    </para>
    <note>
      <simpara>
        <command>repmgr standby clone</command> does not start the standby, and after cloning
        a standby, the command <command>repmgr standby register</command> must be executed to
        notify &repmgr; of its existence.
      </simpara>
    </note>
  </refsect1>
  <refsect1 id="repmgr-standby-clone-config-file-copying" xreflabel="Copying configuration files">
   <title>Handling configuration files</title>
   <para>
    Note that by default, all configuration files in the source node's data
    directory will be copied to the cloned node.  Typically these will be
    <filename>postgresql.conf</filename>, <filename>postgresql.auto.conf</filename>,
    <filename>pg_hba.conf</filename> and <filename>pg_ident.conf</filename>.
    These may require modification before the standby is started.
   </para>
   <para>
    In some cases (e.g. on Debian or Ubuntu Linux installations), PostgreSQL's
    configuration files are located outside of the data directory and will
    not be copied by default. &repmgr; can copy these files, either to the same
    location on the standby server (provided appropriate directory and file permissions
    are available), or into the standby's data directory. This requires passwordless
    SSH access to the primary server. Add the option <option>--copy-external-config-files</option>
    to the <command>repmgr standby clone</command> command; by default files will be copied to
    the same path as on the upstream server. Note that the user executing <command>repmgr</command>
    must have write access to those directories.
   </para>
   <para>
    To have the configuration files placed in the standby's data directory, specify
    <literal>--copy-external-config-files=pgdata</literal>, but note that
    any include directives in the copied files may need to be updated.
   </para>
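   <para>
    As a sketch, a dry run of such a clone might be invoked as follows; the
    host name <literal>node1</literal>, the <literal>repmgr</literal> user and
    database, and the configuration file path are assumptions for illustration:
    <programlisting>
 $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --copy-external-config-files=pgdata --dry-run</programlisting>
   </para>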
   <note>
 	 <para>
 	   When executing <command>repmgr standby clone</command> with the
 	   <option>--copy-external-config-files</option> and <option>--dry-run</option>
 	   options, &repmgr; will check the SSH connection to the source node, but
 	   will not verify whether the files can actually be copied.
 	 </para>
 	 <para>
 	   During the actual clone operation, a check will be made before the database itself
 	   is cloned to determine whether the files can actually be copied; if any problems are
 	   encountered, the clone operation will be aborted, enabling the user to fix
 	   any issues before retrying the clone operation.
 	 </para>
   </note>
   <tip>
    <simpara>
     For reliable configuration file management we recommend using a
     configuration management tool such as Ansible, Chef, Puppet or Salt.
    </simpara>
   </tip>
  </refsect1>
  <refsect1 id="repmgr-standby-clone-recovery-conf">
   <indexterm>
     <primary>recovery.conf</primary>
     <secondary>customising with "repmgr standby clone"</secondary>
   </indexterm>
   <title>Customising recovery.conf</title>
   <para>
     By default, &repmgr; will create a minimal <filename>recovery.conf</filename>
     containing the following parameters:
   </para>
   <itemizedlist spacing="compact" mark="bullet">
     <listitem>
       <simpara><varname>standby_mode</varname> (always <literal>'on'</literal>)</simpara>
     </listitem>
     <listitem>
       <simpara><varname>recovery_target_timeline</varname> (always <literal>'latest'</literal>)</simpara>
     </listitem>
     <listitem>
       <simpara><varname>primary_conninfo</varname></simpara>
     </listitem>
     <listitem>
       <simpara><varname>primary_slot_name</varname> (if replication slots are in use)</simpara>
     </listitem>
   </itemizedlist>
   <para>
     The following additional parameters can be specified in <filename>repmgr.conf</filename>
     for inclusion in <filename>recovery.conf</filename>:
   </para>
   <itemizedlist spacing="compact" mark="bullet">
     <listitem>
       <simpara><varname>restore_command</varname></simpara>
     </listitem>
     <listitem>
       <simpara><varname>archive_cleanup_command</varname></simpara>
     </listitem>
     <listitem>
       <simpara><varname>recovery_min_apply_delay</varname></simpara>
     </listitem>
   </itemizedlist>
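   <para>
    For illustration, these parameters might be set in <filename>repmgr.conf</filename>
    as follows. The archive path and the delay are hypothetical values, not
    recommendations:
    <programlisting>
 restore_command='cp /path/to/archive/%f %p'
 archive_cleanup_command='pg_archivecleanup /path/to/archive %r'
 recovery_min_apply_delay='5min'</programlisting>
   </para>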
   <note>
     <para>
       We recommend using <ulink url="https://www.pgbarman.org/">Barman</ulink> to manage
       WAL file archiving. For more details on combining &repmgr; and <application>Barman</application>,
       in particular using <varname>restore_command</varname> to configure Barman as a backup source of
       WAL files, see <xref linkend="cloning-from-barman">.
     </para>
   </note>
  </refsect1>
  <refsect1 id="repmgr-standby-clone-wal-management">
   <title>Managing WAL during the cloning process</title>
   <para>
    When initially cloning a standby, you will need to ensure
    that all required WAL files remain available while the cloning is taking
    place. To ensure this happens when using the default <command>pg_basebackup</command> method,
    &repmgr; will set <command>pg_basebackup</command>'s <literal>--xlog-method</literal>
    parameter to <literal>stream</literal>,
    which will ensure all WAL files generated during the cloning process are
    streamed in parallel with the main backup. Note that this requires two
    replication connections to be available (&repmgr; will verify sufficient
    connections are available before attempting to clone, and this can be checked
    before performing the clone using the <literal>--dry-run</literal> option).
   </para>
   <para>
    To override this behaviour, in <filename>repmgr.conf</filename> set
    <command>pg_basebackup</command>'s <literal>--xlog-method</literal>
    parameter to <literal>fetch</literal>:
    <programlisting>
      pg_basebackup_options='--xlog-method=fetch'</programlisting>
    and ensure that <literal>wal_keep_segments</literal> is set to an appropriately high value.
    See the <ulink url="https://www.postgresql.org/docs/current/static/app-pgbasebackup.html">
    pg_basebackup</ulink> documentation for details.
   </para>
   <note>
    <simpara>
      From PostgreSQL 10, <command>pg_basebackup</command>'s
      <literal>--xlog-method</literal> parameter has been renamed to
      <literal>--wal-method</literal>.
    </simpara>
   </note>
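   <para>
    Given the renaming, on PostgreSQL 10 and later the equivalent
    <filename>repmgr.conf</filename> setting would presumably use the new parameter name:
    <programlisting>
      pg_basebackup_options='--wal-method=fetch'</programlisting>
   </para>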
  </refsect1>
  <refsect1 id="repmgr-standby-create-recovery-conf">
   <indexterm>
     <primary>recovery.conf</primary>
     <secondary>generating for a standby cloned by another method</secondary>
   </indexterm>
   <title>Using a standby cloned by another method</title>
   <para>
     &repmgr; supports standbys cloned by another method (e.g. using <application>barman</application>'s
     <command><ulink url="http://docs.pgbarman.org/release/2.4/#recover">barman recover</ulink></command> command).
   </para>
   <para>
     To integrate the standby as a &repmgr; node, ensure the <filename>repmgr.conf</filename>
     file is created for the node, and that it has been registered using
     <command><link linkend="repmgr-standby-register">repmgr standby register</link></command>.
     Then execute the command <command>repmgr standby clone --recovery-conf-only</command>.
     This will create the <filename>recovery.conf</filename> file needed to attach
     the node to its upstream, and will also create a replication slot on the
     upstream node if required.
   </para>
   <para>
     Note that the upstream node must be running. An existing
     <filename>recovery.conf</filename> will not be overwritten unless the
     <option>-F/--force</option> option is provided.
   </para>
   <para>
     Execute <command>repmgr standby clone --recovery-conf-only --dry-run</command>
     to check the prerequisites for creating the <filename>recovery.conf</filename> file,
     and display the contents of the file without actually creating it.
   </para>
   <note>
     <para>
       <option>--recovery-conf-only</option> was introduced in &repmgr; <link linkend="release-4.0.4">4.0.4</link>.
     </para>
   </note>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>-d, --dbname=CONNINFO</option></term>
        <listitem>
          <para>
            Connection string of the upstream node to use for cloning.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Check prerequisites but don't actually clone the standby.
          </para>
          <para>
            If <option>--recovery-conf-only</option> is specified, the contents of
            the generated <filename>recovery.conf</filename> file will be displayed
            but the file itself will not be written.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>-c, --fast-checkpoint</option></term>
        <listitem>
          <para>
            Force fast checkpoint (not effective when cloning from Barman).
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--copy-external-config-files[={samepath|pgdata}]</option></term>
        <listitem>
          <para>
            Copy configuration files located outside the data directory on the source
            node to the same path on the standby (default) or to the
            PostgreSQL data directory.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--no-upstream-connection</option></term>
        <listitem>
          <para>
            When using Barman, do not connect to upstream node.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>-R, --remote-user=USERNAME</option></term>
        <listitem>
          <para>
            Remote system username for SSH operations (default: current local system username).
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--recovery-conf-only</option></term>
        <listitem>
          <para>
            Create a <filename>recovery.conf</filename> file for a previously cloned instance. Available in &repmgr; 4.0.4 and later.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--replication-user</option></term>
        <listitem>
          <para>
            User to make replication connections with (optional, not usually required).
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--superuser</option></term>
        <listitem>
          <para>
            If the &repmgr; user is not a superuser, the name of a valid superuser must
            be provided with this option.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--upstream-conninfo</option></term>
        <listitem>
          <para>
            <literal>primary_conninfo</literal> value to write in recovery.conf
            when the intended upstream server does not yet exist.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--upstream-node-id</option></term>
        <listitem>
          <para>
            ID of the upstream node to replicate from (optional; defaults to the primary node).
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--without-barman</option></term>
        <listitem>
          <para>
            Do not use Barman even if configured.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1 id="repmgr-standby-clone-events">
    <title>Event notifications</title>
    <para>
      A <literal>standby_clone</literal> <link linkend="event-notifications">event notification</link> will be generated.
    </para>
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
      See <xref linkend="cloning-standbys"> for details about various aspects of cloning.
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-standby-follow.sgml
+++ b/doc/repmgr-standby-follow.sgml
@@ -0,0 +1,116 @@
 <refentry id="repmgr-standby-follow">
  <indexterm>
    <primary>repmgr standby follow</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr standby follow</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr standby follow</refname>
    <refpurpose>attach a standby to a new primary</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Attaches the standby to a new primary. This command requires a valid
      <filename>repmgr.conf</filename> file for the standby, either specified
      explicitly with <literal>-f/--config-file</literal> or located in a
      default location; no additional arguments are required.
    </para>
    <para>
      This command will force a restart of the standby server, which must be
      running. It can only be used to attach an active standby to the current primary node
   (and not to another standby).
    </para>
 	<tip>
      <para>
 		To re-add an inactive node to the replication cluster, use
 		<xref linkend="repmgr-node-rejoin">.
      </para>
 	</tip>
 	<para>
 	  <command>repmgr standby follow</command> will wait up to
 	  <varname>standby_follow_timeout</varname> seconds (default: <literal>30</literal>)
 	  to verify the standby has actually connected to the new primary.
 	</para>
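    <para>
      For example, to give the standby longer to connect to the new primary, the
      timeout could be raised in <filename>repmgr.conf</filename> (the value
      <literal>60</literal> is an arbitrary illustration):
      <programlisting>
 standby_follow_timeout=60</programlisting>
    </para>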
  </refsect1>
  <refsect1>
    <title>Example</title>
    <para>
      <programlisting>
      $ repmgr -f /etc/repmgr.conf standby follow
      INFO: setting node 3's primary to node 2
      NOTICE: restarting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/postgres/data' restart"
      waiting for server to shut down........ done
      server stopped
      waiting for server to start.... done
      server started
      NOTICE: STANDBY FOLLOW successful
      DETAIL: node 3 is now attached to node 2</programlisting>
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Check prerequisites but don't actually follow a new primary.
          </para>
          <important>
            <para>
              This does not guarantee the standby can follow the primary; in
              particular, whether the primary and standby timelines have diverged
              can currently only be determined by actually attempting to
              attach the standby to the primary.
            </para>
          </important>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>-w</option></term>
        <term><option>--wait</option></term>
        <listitem>
          <para>
            Wait for a primary to appear. &repmgr; will wait for up to
            <varname>primary_follow_timeout</varname> seconds
            (default: 60 seconds) to verify that the standby is following the new primary.
            This value can be defined in <filename>repmgr.conf</filename>.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1 id="repmgr-standby-follow-events">
    <title>Event notifications</title>
    <para>
      A <literal>standby_follow</literal> <link linkend="event-notifications">event notification</link> will be generated.
    </para>
    <para>
      If provided, &repmgr; will substitute the placeholders <literal>%p</literal> with the node ID of the primary
      being followed, <literal>%c</literal> with its <literal>conninfo</literal> string, and
      <literal>%a</literal> with its node name.
    </para>
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
     <xref linkend="repmgr-node-rejoin">
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-standby-promote.sgml
+++ b/doc/repmgr-standby-promote.sgml
@@ -0,0 +1,60 @@
 <refentry id="repmgr-standby-promote">
  <indexterm>
    <primary>repmgr standby promote</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr standby promote</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr standby promote</refname>
    <refpurpose>promote a standby to a primary</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Promotes a standby to a primary if the current primary has failed. This
      command requires a valid <filename>repmgr.conf</filename> file for the standby, either
      specified explicitly with <literal>-f/--config-file</literal> or located in a
      default location; no additional arguments are required.
    </para>
    <para>
      If the standby promotion succeeds, the server will not need to be
      restarted. However, any other standbys will need to follow the new primary,
      by using <xref linkend="repmgr-standby-follow">; if <application>repmgrd</application>
        is active, it will handle this automatically.
    </para>
    <para>
      Note that &repmgr; will wait for up to <varname>promote_check_timeout</varname> seconds
      (default: 60 seconds) to verify that the standby has been promoted, and will
      check the promotion every <varname>promote_check_interval</varname> seconds (default: 1 second).
      Both values can be defined in <filename>repmgr.conf</filename>.
    </para>
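    <para>
      For illustration, the defaults described above correspond to the following
      <filename>repmgr.conf</filename> settings:
      <programlisting>
 promote_check_timeout=60
 promote_check_interval=1</programlisting>
    </para>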
  </refsect1>
  <refsect1>
    <title>Example</title>
    <para>
      <programlisting>
      $ repmgr -f /etc/repmgr.conf standby promote
      NOTICE: promoting standby to primary
      DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/postgres/data' promote"
      server promoting
      DEBUG: setting node 2 as primary and marking existing primary as failed
      NOTICE: STANDBY PROMOTE successful
      DETAIL: server "node2" (ID: 2) was successfully promoted to primary</programlisting>
    </para>
  </refsect1>
  <refsect1 id="repmgr-standby-promote-events">
    <title>Event notifications</title>
    <para>
      A <literal>standby_promote</literal> <link linkend="event-notifications">event notification</link> will be generated.
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-standby-register.sgml
+++ b/doc/repmgr-standby-register.sgml
@@ -0,0 +1,183 @@
 <refentry id="repmgr-standby-register" xreflabel="repmgr standby register">
  <indexterm>
    <primary>repmgr standby register</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr standby register</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr standby register</refname>
    <refpurpose>add a standby's information to the &repmgr; metadata</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      <command>repmgr standby register</command> adds a standby's information to
      the &repmgr; metadata. This command needs to be executed to enable
      promote/follow operations and to allow <application>repmgrd</application> to work with the node.
      An existing standby can be registered using this command. Execute with the
      <literal>--dry-run</literal> option to check what would happen without actually registering the
      standby.
    </para>
    <note>
      <para>
        If providing the configuration file location with <literal>-f/--config-file</literal>,
        avoid using a relative path, as &repmgr; stores the configuration file location
        in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
        <xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert
          a relative path into an absolute one, but this may not be the same as the path you
          would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
          to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
          <filename>/path/to/repmgr.conf</filename>).
      </para>
    </note>
  </refsect1>
  <refsect1 id="repmgr-standby-register-wait-start" xreflabel="repmgr standby register --wait-start">
   <title>Waiting for the standby to start</title>
   <para>
     By default, &repmgr; will wait 30 seconds for the standby to become available before
     aborting with a connection error. This is useful when setting up a standby from a script,
     as the standby may not have fully started up by the time <command>repmgr standby register</command>
     is executed.
   </para>
   <para>
     To change the timeout, pass the desired value with the <literal>--wait-start</literal> option.
     A value of <literal>0</literal> will disable the timeout.
   </para>
   <para>
     The timeout will be ignored if <literal>-F/--force</literal> was provided.
   </para>
  </refsect1>
  <refsect1 id="repmgr-standby-register-wait-sync" xreflabel="repmgr standby register --wait-sync">
   <title>Waiting for the registration to propagate to the standby</title>
   <para>
     Depending on your environment and workload, it may take some time for the standby's node record
     to propagate from the primary to the standby. Some actions (such as starting
     <application>repmgrd</application>) require that the standby's node record
     is present and up-to-date to function correctly.
   </para>
   <para>
    By providing the option <option>--wait-sync</option> to the
    <command>repmgr standby register</command> command, &repmgr; will wait
    until the record is synchronised before exiting. An optional timeout (in
    seconds) can be added to this option (e.g. <option>--wait-sync=60</option>).
   </para>
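   <para>
    In a provisioning script, for example, the registration could be made to wait
    for synchronisation as follows (the configuration file path and the 60-second
    timeout are assumptions for illustration):
    <programlisting>
 $ repmgr -f /etc/repmgr.conf standby register --wait-sync=60</programlisting>
   </para>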
  </refsect1>
  <refsect1 id="repmgr-standby-register-inactive-node" xreflabel="Registering an inactive node">
   <title>Registering an inactive node</title>
   <para>
    Under some circumstances you may wish to register a standby which is not
    yet running; this can be the case when using provisioning tools to create
    a complex replication cluster. In this case, by using the <option>-F/--force</option>
    option and providing the connection parameters to the primary server,
    the standby can be registered.
   </para>
   <para>
    Similarly, with cascading replication it may be necessary to register
    a standby whose upstream node has not yet been registered - in this case,
    using <option>-F/--force</option> will result in the creation of an inactive placeholder
    record for the upstream node, which will however later need to be registered
    with the <option>-F/--force</option> option too.
   </para>
   <para>
    When used with <command>repmgr standby register</command>, care should be taken that use of the
    <option>-F/--force</option> option does not result in an incorrectly configured cluster.
   </para>
  </refsect1>
  <refsect1 id="repmgr-standby-register-node-cloned-other-source">
    <title>Registering a node not cloned by repmgr</title>
    <para>
      If you've cloned a standby using another method (e.g. <application>barman</application>'s
     <command>barman recover</command> command), first execute
     <link linkend="repmgr-standby-create-recovery-conf">repmgr standby clone --recovery-conf-only</link>
     to add the <filename>recovery.conf</filename> file, then register the standby as usual.
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Check prerequisites but don't actually register the standby.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>-F</option></term>
        <term><option>--force</option></term>
        <listitem>
          <para>
            Overwrite an existing node record.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--upstream-node-id</option></term>
        <listitem>
          <para>
            ID of the upstream node to replicate from (optional)
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--wait-start</option></term>
        <listitem>
          <para>
            Wait for the standby to start (timeout in seconds, default 30 seconds).
          </para>
        </listitem>
      </varlistentry>
     <varlistentry>
        <term><option>--wait-sync</option></term>
        <listitem>
          <para>
            Wait for the node record to synchronise to the standby (optional timeout in seconds).
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1 id="repmgr-standby-register-events">
    <title>Event notifications</title>
    <para>
      A <literal>standby_register</literal> <link linkend="event-notifications">event notification</link>
      will be generated immediately after the node record is updated on the primary.
    </para>
    <para>
      If the <option>--wait-sync</option> option is provided, a <literal>standby_register_sync</literal>
      event notification will be generated immediately after the node record has synchronised to the
      standby.
    </para>
    <para>
      If an event notification command is provided, &repmgr; will substitute the placeholder
      <literal>%p</literal> with the node ID of the primary node, <literal>%c</literal> with
      its <literal>conninfo</literal> string, and <literal>%a</literal> with its node name.
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-standby-switchover.sgml
+++ b/doc/repmgr-standby-switchover.sgml
@@ -0,0 +1,284 @@
 <refentry id="repmgr-standby-switchover">
  <indexterm>
    <primary>repmgr standby switchover</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr standby switchover</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr standby switchover</refname>
    <refpurpose>promote a standby to primary and demote the existing primary to a standby</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Promotes a standby to primary and demotes the existing primary to a standby.
      This command must be run on the standby to be promoted, and requires a
      passwordless SSH connection to the current primary.
    </para>
    <para>
      If other standbys are connected to the demotion candidate, &repmgr; can instruct
      these to follow the new primary if the option <literal>--siblings-follow</literal>
      is specified. This requires a passwordless SSH connection between the promotion
      candidate (new primary) and the standbys attached to the demotion candidate
      (existing primary).
    </para>
    <note>
      <para>
        Performing a switchover is a non-trivial operation. In particular it
        relies on the current primary being able to shut down cleanly and quickly.
        &repmgr; will attempt to check for potential issues but cannot guarantee
        a successful switchover.
      </para>
      <para>
        &repmgr; will refuse to perform the switchover if an exclusive backup is running on
        the current primary.
      </para>
    </note>
    <para>
      For more details on performing a switchover, including preparation and configuration,
      see section <xref linkend="performing-switchover">.
    </para>
    <note>
      <para>
        From <link linkend="release-4.2">repmgr 4.2</link>, &repmgr; will instruct any running
        <application>repmgrd</application> instances to pause operations while the switchover
        is being carried out, to prevent <application>repmgrd</application> from
        unintentionally promoting a node. For more details, see <xref linkend="repmgrd-pausing">.
      </para>
      <para>
        Users of &repmgr; versions prior to 4.2 should ensure that <application>repmgrd</application>
        is not running on any nodes while a switchover is being executed.
      </para>
    </note>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--always-promote</option></term>
        <listitem>
          <para>
            Promote standby to primary, even if it is behind or has diverged
            from the original primary. The original primary will be shut down in any case,
            and will need to be manually reintegrated into the replication cluster.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Check prerequisites but don't actually execute a switchover.
          </para>
          <important>
            <para>
              Success of <option>--dry-run</option> does not imply the switchover will
              complete successfully, only that
              the prerequisites for performing the operation are met.
            </para>
          </important>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>-F</option></term>
        <term><option>--force</option></term>
        <listitem>
          <para>
            Ignore warnings and continue anyway.
          </para>
          <para>
            Specifically, if a problem is encountered when shutting down the current primary,
            using <option>-F/--force</option> will cause &repmgr; to continue by promoting
            the standby to be the new primary, and if <option>--siblings-follow</option> is
            specified, attach any other standbys to the new primary.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--force-rewind[=/path/to/pg_rewind]</option></term>
        <listitem>
          <para>
            Use <application>pg_rewind</application> to reintegrate the old primary if necessary
            (and the prerequisites for using <application>pg_rewind</application> are met).
            If using PostgreSQL 9.3 or 9.4, and the <application>pg_rewind</application>
            binary is not installed in the PostgreSQL <filename>bin</filename> directory,
            provide its full path. For more details see also <xref linkend="switchover-pg-rewind">.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>-R</option></term>
        <term><option>--remote-user</option></term>
        <listitem>
          <para>
            System username for remote SSH operations (defaults to local system user).
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>--repmgrd-no-pause</option></term>
        <listitem>
          <para>
            Don't pause <application>repmgrd</application> while executing a switchover.
          </para>
          <para>
            This option should not be used unless you take steps by other means
            to ensure <application>repmgrd</application> is paused or not
            running on all nodes.
          </para>
        </listitem>
      </varlistentry>
     <varlistentry>
        <term><option>--siblings-follow</option></term>
        <listitem>
          <para>
            Have standbys attached to the old primary follow the new primary.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1>
    <title>Configuration file settings</title>
    <para>
     Note that the following parameters in <filename>repmgr.conf</filename> are relevant to the
     switchover operation:
     <itemizedlist spacing="compact" mark="bullet">
       <listitem>
         <simpara>
           <literal>replication_lag_critical</literal>:
           if replication lag (in seconds) on the standby exceeds this value, the
           switchover will be aborted (unless the <literal>-F/--force</literal> option
           is provided)
         </simpara>
       </listitem>
       <listitem>
         <simpara>
           <literal>shutdown_check_timeout</literal>: maximum number of seconds to wait for the
           demotion candidate (current primary) to shut down, before aborting the switchover.
         </simpara>
         <simpara>
           Note that this parameter is set on the node where <command>repmgr standby switchover</command>
           is executed (promotion candidate); setting it on the demotion candidate (former primary) will
           have no effect.
         </simpara>
         <note>
           <para>
             In versions prior to <link linkend="release-4.2">&repmgr; 4.2</link>, <command>repmgr standby switchover</command> would
             use the values defined in <literal>reconnect_attempts</literal> and <literal>reconnect_interval</literal>
             to determine the timeout for demotion candidate shutdown.
           </para>
         </note>
       </listitem>
       <listitem>
         <simpara>
           <literal>standby_reconnect_timeout</literal>:
           maximum number of seconds to attempt to wait for the demotion candidate (former primary)
           to reconnect to the promoted primary (default: 60 seconds)
         </simpara>
       </listitem>
     </itemizedlist>
    </para>
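    <para>
      As an illustration, the corresponding entries in <filename>repmgr.conf</filename> on the
      promotion candidate might look like the following. The values shown are examples only,
      not recommendations; apart from the documented 60-second default for
      <literal>standby_reconnect_timeout</literal>, they are assumptions chosen for this sketch.
      <programlisting>
        # Abort the switchover if the standby is more than 600 seconds behind
        # (illustrative value; can be overridden with -F/--force)
        replication_lag_critical=600

        # Wait up to 60 seconds for the current primary to shut down
        # (illustrative value; read on the promotion candidate only)
        shutdown_check_timeout=60

        # Wait up to 60 seconds for the former primary to reconnect as a standby
        standby_reconnect_timeout=60</programlisting>
    </para>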
  </refsect1>
  <refsect1>
    <title>Execution</title>
    <para>
      Execute with the <literal>--dry-run</literal> option to test the switchover as far as
      possible without actually changing the status of either node.
    </para>
    <para>
      External database connections, e.g. from an application, should not be permitted while
      the switchover is taking place. In particular, active transactions on the primary
      can potentially disrupt the shutdown process.
    </para>
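    <para>
      One way to combine the dry run with the actual switchover is sketched below. The wrapper
      name <literal>switchover_after_dry_run</literal> is hypothetical, and the use of
      <option>--siblings-follow</option> is an assumption which only applies if other standbys
      are attached to the current primary. Note that a successful dry run only confirms that the
      prerequisites are met; it does not guarantee that the switchover itself will succeed.
      <programlisting>
    # Hypothetical wrapper: attempt the switchover only if --dry-run succeeds.
    #   $1: path to repmgr.conf on the promotion candidate
    switchover_after_dry_run() {
        conf="$1"
        if repmgr -f "$conf" standby switchover --siblings-follow --dry-run
        then
            repmgr -f "$conf" standby switchover --siblings-follow
        else
            echo "dry run failed; switchover not attempted"
            return 1
        fi
    }

    # Example call: switchover_after_dry_run /etc/repmgr.conf</programlisting>
    </para>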
  </refsect1>
  <refsect1 id="repmgr-standby-switchover-events">
    <title>Event notifications</title>
    <para>
      <literal>standby_switchover</literal> and <literal>standby_promote</literal>
      <link linkend="event-notifications">event notifications</link> will be generated for the new primary,
      and a <literal>node_rejoin</literal> event notification for the former primary (new standby).
    </para>
    <para>
      If using an event notification script, <literal>standby_switchover</literal>
      will populate the placeholder parameter <literal>%p</literal> with the node ID of
      the former primary.
    </para>
  </refsect1>
  <refsect1>
    <title>Exit codes</title>
    <para>
      The following exit codes can be emitted by <command>repmgr standby switchover</command>:
    </para>
    <variablelist>
      <varlistentry>
        <term><option>SUCCESS (0)</option></term>
        <listitem>
          <para>
            The switchover completed successfully.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_SWITCHOVER_FAIL (18)</option></term>
        <listitem>
          <para>
            The switchover could not be executed.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>ERR_SWITCHOVER_INCOMPLETE (22)</option></term>
        <listitem>
          <para>
            The switchover was executed but a problem was encountered.
            Typically this means the former primary could not be reattached
            as a standby. Check preceding log messages for more information.
          </para>
        </listitem>
      </varlistentry>
   </variablelist>
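    <para>
      A calling script can branch on these exit codes. The following is a minimal sketch; the
      helper name <literal>describe_switchover_exit</literal> and its messages are assumptions
      made for illustration, and only the three codes listed above are handled explicitly.
      <programlisting>
    # Hypothetical helper: translate an exit code from "repmgr standby switchover"
    # into a short message.
    describe_switchover_exit() {
        case "$1" in
            0)  echo "switchover completed successfully" ;;
            18) echo "switchover could not be executed" ;;
            22) echo "switchover executed, but a problem was encountered" ;;
            *)  echo "unexpected exit code: $1" ;;
        esac
    }

    # Example use on the promotion candidate:
    #   repmgr -f /etc/repmgr.conf standby switchover
    #   describe_switchover_exit $?</programlisting>
    </para>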
  </refsect1>
  <refsect1>
    <title>See also</title>
    <para>
      For more details see the section <xref linkend="performing-switchover">.
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-standby-unregister.sgml
+++ b/doc/repmgr-standby-unregister.sgml
@@ -0,0 +1,70 @@
 <refentry id="repmgr-standby-unregister">
  <indexterm>
    <primary>repmgr standby unregister</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr standby unregister</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr standby unregister</refname>
    <refpurpose>remove a standby's information from the &repmgr; metadata</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      Unregisters a standby with &repmgr;. This command does not affect the actual
      replication, just removes the standby's entry from the &repmgr; metadata.
    </para>
  </refsect1>
  <refsect1>
    <title>Execution</title>
    <para>
      To unregister a running standby, execute:
      <programlisting>
        repmgr standby unregister -f /etc/repmgr.conf</programlisting>
    </para>
    <para>
      This will remove the standby record from &repmgr;'s internal metadata
      table (<literal>repmgr.nodes</literal>). A <literal>standby_unregister</literal>
      event notification will be recorded in the <literal>repmgr.events</literal> table.
    </para>
    <para>
      If the standby is not running, the command can be executed on another
      node by providing the id of the node to be unregistered using
      the command line parameter <literal>--node-id</literal>, e.g. executing the following
      command on the primary server will unregister the standby with
      id <literal>3</literal>:
      <programlisting>
        repmgr standby unregister -f /etc/repmgr.conf --node-id=3</programlisting>
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--node-id</option></term>
        <listitem>
          <para>
            <varname>node_id</varname> of the node to unregister (optional)
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1 id="repmgr-standby-unregister-events">
    <title>Event notifications</title>
    <para>
      A <literal>standby_unregister</literal> <link linkend="event-notifications">event notification</link> will be generated.
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-witness-register.sgml
+++ b/doc/repmgr-witness-register.sgml
@@ -0,0 +1,65 @@
 <refentry id="repmgr-witness-register">
  <indexterm>
    <primary>repmgr witness register</primary>
    <seealso>witness server</seealso>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr witness register</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr witness register</refname>
    <refpurpose>add a witness node's information to the &repmgr; metadata</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      <command>repmgr witness register</command> adds a witness server's node
      record to the &repmgr; metadata, and if necessary initialises the witness
      node by installing the &repmgr; extension and copying the &repmgr; metadata
      to the witness server. This command needs to be executed to enable
      use of the witness server with <application>repmgrd</application>.
    </para>
    <para>
      When executing <command>repmgr witness register</command>, database connection
      information for the cluster primary server must also be provided.
    </para>
    <para>
      In most cases it's only necessary to provide the primary's hostname with
      the <option>-h</option>/<option>--host</option> option; &repmgr; will
      automatically use the <varname>user</varname> and <varname>dbname</varname>
      values defined in the <varname>conninfo</varname> string defined in the
      witness node's <filename>repmgr.conf</filename>, unless these are explicitly
      provided as command line options.
    </para>
    <para>
      Execute with the <option>--dry-run</option> option to check what would happen
      without actually registering the witness server.
    </para>
  </refsect1>
  <refsect1>
    <title>Example</title>
    <para>
      <programlisting>
    $ repmgr -f /etc/repmgr.conf witness register -h node1
    INFO: connecting to witness node "node3" (ID: 3)
    INFO: connecting to primary node
    NOTICE: attempting to install extension "repmgr"
    NOTICE: "repmgr" extension successfully installed
    INFO: witness registration complete
    NOTICE: witness node "node3" (ID: 3) successfully registered
      </programlisting>
    </para>
  </refsect1>
  <refsect1 id="repmgr-witness-register-events">
    <title>Event notifications</title>
    <para>
      A <literal>witness_register</literal> <link linkend="event-notifications">event notification</link> will be generated.
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr-witness-unregister.sgml
+++ b/doc/repmgr-witness-unregister.sgml
@@ -0,0 +1,102 @@
 <refentry id="repmgr-witness-unregister" xreflabel="repmgr witness unregister">
  <indexterm>
    <primary>repmgr witness unregister</primary>
  </indexterm>
  <refmeta>
    <refentrytitle>repmgr witness unregister</refentrytitle>
  </refmeta>
  <refnamediv>
    <refname>repmgr witness unregister</refname>
    <refpurpose>remove a witness node's information from the &repmgr; metadata</refpurpose>
  </refnamediv>
  <refsect1>
    <title>Description</title>
    <para>
      <command>repmgr witness unregister</command> removes a witness server's node
      record from the &repmgr; metadata.
    </para>
    <para>
      The node does not have to be running to be unregistered; if it is not running,
      either provide connection information for the primary server, or
      execute <command>repmgr witness unregister</command> on a running node and
      provide the parameter <option>--node-id</option> with the node ID of the
      witness server.
    </para>
    <para>
      Execute with the <literal>--dry-run</literal> option to check what would happen
      without actually unregistering the witness server.
    </para>
  </refsect1>
  <refsect1>
    <title>Examples</title>
    <para>
      Unregistering a running witness node:
      <programlisting>
    $ repmgr -f /etc/repmgr.conf witness unregister
    INFO: connecting to witness node "node3" (ID: 3)
    INFO: unregistering witness node 3
    INFO: witness unregistration complete
    DETAIL: witness node with UD 3 successfully unregistered</programlisting>
    </para>
    <para>
      Unregistering a non-running witness node:
      <programlisting>
        $ repmgr -f /etc/repmgr.conf witness unregister -h node1 -p 5501  -F
        INFO: connecting to node "node3" (ID: 3)
        NOTICE: unable to connect to node "node3" (ID: 3), removing node record on cluster primary only
        INFO: unregistering witness node 3
        INFO: witness unregistration complete
        DETAIL: witness node with id ID 3 successfully unregistered</programlisting>
    </para>
  </refsect1>
  <refsect1>
    <title>Notes</title>
    <para>
      This command will not make any changes to the witness node itself and will neither
      remove any data from the witness database nor stop the PostgreSQL instance.
    </para>
    <para>
      A witness node which has been unregistered can be re-registered with
      <link linkend="repmgr-witness-register">repmgr witness register --force</link>.
    </para>
  </refsect1>
  <refsect1>
    <title>Options</title>
    <variablelist>
      <varlistentry>
        <term><option>--dry-run</option></term>
        <listitem>
          <para>
            Check prerequisites but don't actually unregister the witness.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
       <term><option>--node-id</option></term>
        <listitem>
          <para>
            Unregister witness server with the specified node ID.
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
  </refsect1>
  <refsect1 id="repmgr-witness-unregister-events">
    <title>Event notifications</title>
    <para>
      A <literal>witness_unregister</literal> <link linkend="event-notifications">event notification</link> will be generated.
    </para>
  </refsect1>
 </refentry>
--- a/doc/repmgr.sgml
+++ b/doc/repmgr.sgml
@@ -0,0 +1,136 @@
 <!-- doc/src/sgml/postgres.sgml -->
 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
          <!ENTITY % version SYSTEM "version.sgml">
          %version;
          <!ENTITY % filelist SYSTEM "filelist.sgml">
          %filelist;
          <!ENTITY repmgr "<productname>repmgr</productname>">
          <!ENTITY postgres "<productname>PostgreSQL</productname>">
 ]>
 <book id="repmgr">
 <title>repmgr &repmgrversion; Documentation</title>
 <bookinfo>
  <corpauthor>2ndQuadrant Ltd</corpauthor>
  <productname>repmgr</productname>
  <productnumber>&repmgrversion;</productnumber>
  &legal;
  <abstract>
   <para>
   This is the official documentation of &repmgr; &repmgrversion; for
   use with PostgreSQL 9.3 - PostgreSQL 11.
   </para>
   <para>
     &repmgr; is being continually developed and we strongly recommend using the
     latest version. Please check the
     <ulink url="https://repmgr.org/">repmgr website</ulink> for details
     about the current &repmgr; version as well as the
     <ulink url="https://repmgr.org/docs/current/index.html">current documentation</ulink>.
   </para>
   <para>
    &repmgr; was developed by
    <ulink url="https://2ndquadrant.com">2ndQuadrant</ulink>
    along with contributions from other individuals and companies.
    Contributions from the community are appreciated and welcome - get
    in touch via <ulink url="https://github.com/2ndQuadrant/repmgr">GitHub</>
    or <ulink url="https://groups.google.com/group/repmgr">the mailing list/forum</>.
    Multiple 2ndQuadrant customers contribute funding
    to make repmgr development possible.
   </para>
   <para>
    2ndQuadrant, a Platinum sponsor of the PostgreSQL project,
    continues to develop repmgr to meet internal needs and those of customers.
     Other companies as well as individual developers
    are welcome to participate in the efforts.
   </para>
  </abstract>
  <keywordset>
   <keyword>repmgr</keyword>
   <keyword>PostgreSQL</keyword>
   <keyword>replication</keyword>
   <keyword>asynchronous</keyword>
   <keyword>HA</keyword>
   <keyword>high-availability</keyword>
  </keywordset>
 </bookinfo>
 <part id="getting-started">
  <title>Getting started</title>
  &overview;
  &install;
  &quickstart;
 </part>
 <part id="repmgr-administration-manual">
  <title>repmgr administration manual</title>
  &configuration;
  &cloning-standbys;
  &promoting-standby;
  &follow-new-primary;
  &switchover;
  &configuring-witness-server;
  &event-notifications;
  &upgrading-repmgr;
 </part>
 <part id="using-repmgrd">
  <title>Using repmgrd</title>
  &repmgrd-automatic-failover;
  &repmgrd-configuration;
  &repmgrd-demonstration;
  &repmgrd-cascading-replication;
  &repmgrd-network-split;
  &repmgrd-witness-server;
  &repmgrd-pausing;
  &repmgrd-degraded-monitoring;
  &repmgrd-monitoring;
  &repmgrd-bdr;
 </part>
 <part id="repmgr-command-reference">
  <title>repmgr command reference</title>
  &repmgr-primary-register;
  &repmgr-primary-unregister;
  &repmgr-standby-clone;
  &repmgr-standby-register;
  &repmgr-standby-unregister;
  &repmgr-standby-promote;
  &repmgr-standby-follow;
  &repmgr-standby-switchover;
  &repmgr-witness-register;
  &repmgr-witness-unregister;
  &repmgr-node-status;
  &repmgr-node-check;
  &repmgr-node-rejoin;
  &repmgr-node-service;
  &repmgr-cluster-show;
  &repmgr-cluster-matrix;
  &repmgr-cluster-crosscheck;
  &repmgr-cluster-event;
  &repmgr-cluster-cleanup;
  &repmgr-daemon-status;
  &repmgr-daemon-pause;
  &repmgr-daemon-unpause;
 </part>
 &appendix-release-notes;
 &appendix-signatures;
 &appendix-faq;
 &appendix-packages;
 <![%include-index;[&bookindex;]]>
 <![%include-xslt-index;[<index id="bookindex"></index>]]>
 </book>
--- a/doc/repmgrd-automatic-failover.sgml
+++ b/doc/repmgrd-automatic-failover.sgml
@@ -0,0 +1,17 @@
 <chapter id="repmgrd-automatic-failover" xreflabel="Automatic failover with repmgrd">
 <indexterm>
   <primary>repmgrd</primary>
   <secondary>automatic failover</secondary>
 </indexterm>
 <title>Automatic failover with repmgrd</title>
 <para>
  <application>repmgrd</application> is a management and monitoring daemon which runs
  on each node in a replication cluster. It can automate actions such as
  failover and updating standbys to follow the new primary, as well as
  providing monitoring information about the state of each standby.
 </para>
 </chapter>
--- a/doc/repmgrd-bdr.sgml
+++ b/doc/repmgrd-bdr.sgml
@@ -0,0 +1,428 @@
 <chapter id="repmgrd-bdr">
  <indexterm>
    <primary>repmgrd</primary>
    <secondary>BDR</secondary>
  </indexterm>
  <indexterm>
    <primary>BDR</primary>
  </indexterm>
  <title>BDR failover with repmgrd</title>
  <para>
    &repmgr; 4.x provides support for monitoring a pair of BDR 2.x nodes and taking action in
    case one of the nodes fails.
  </para>
  <note>
    <simpara>
      Due to the nature of BDR 1.x/2.x, it's only safe to use this solution for
      a two-node scenario. Introducing additional nodes will create an inherent
      risk of node desynchronisation if a node goes down without being cleanly
      removed from the cluster.
    </simpara>
  </note>
  <para>
    In contrast to streaming replication, there's no concept of "promoting" a new
    primary node with BDR. Instead, "failover" involves monitoring both nodes
    with <application>repmgrd</application> and redirecting queries from the failed node to the remaining
    active node. This can be done by using an
    <link linkend="event-notifications">event notification</link> script
    which is called by <application>repmgrd</application> to dynamically
    reconfigure a proxy server/connection pooler such as <application>PgBouncer</application>.
  </para>
  <note>
    <simpara>
      This &repmgr; functionality is for BDR 2.x only running on PostgreSQL 9.4/9.6.
      It is <emphasis>not</emphasis> required for later BDR versions.
    </simpara>
  </note>
  <sect1 id="bdr-prerequisites" xreflabel="BDR prequisites">
    <title>Prerequisites</title>
    <important>
      <para>
        This &repmgr; functionality is for BDR 2.x only running on PostgreSQL 9.4/9.6.
        It is <emphasis>not</emphasis> required for later BDR versions.
      </para>
    </important>
    <para>
      &repmgr; 4 requires PostgreSQL 9.4 or 9.6 with the BDR 2 extension
      enabled and configured for a two-node BDR network. &repmgr; 4 packages
      must be installed on each node before attempting to configure
      <application>repmgr</application>.
    </para>
    <note>
      <simpara>
        &repmgr; 4 will refuse to install if it detects more than two BDR nodes.
      </simpara>
    </note>
    <para>
      Application database connections <emphasis>must</emphasis> be passed through a proxy
      server/connection pooler such as <application>PgBouncer</application>, and it must be possible
      to dynamically reconfigure that from <application>repmgrd</application>. The example
      demonstrated in this document uses <application>PgBouncer</application>.
    </para>
    <para>
      The proxy server / connection poolers must <emphasis>not</emphasis>
      be installed on the database servers.
    </para>
    <para>
      For this example, it's assumed password-less SSH connections are available
      from the PostgreSQL servers to the servers where <application>PgBouncer</application>
      runs, and that the user on those servers has permission to alter the
      <application>PgBouncer</application> configuration files.
    </para>
    <para>
      PostgreSQL connections must be possible between each node, and each node
      must be able to connect to each PgBouncer instance.
    </para>
  </sect1>
  <sect1 id="bdr-configuration" xreflabel="BDR configuration">
    <title>Configuration</title>
    <para>
      A sample configuration for <filename>repmgr.conf</filename> on each
      BDR node would look like this:
      <programlisting>
        # Node information
        node_id=1
        node_name='node1'
        conninfo='host=node1 dbname=bdrtest user=repmgr connect_timeout=2'
        data_directory='/var/lib/postgresql/data'
        replication_type='bdr'
        # Event notification configuration
        event_notifications=bdr_failover
        event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a" >> /tmp/bdr-failover.log 2>&1'
        # repmgrd options
        monitor_interval_secs=5
        reconnect_attempts=6
        reconnect_interval=5</programlisting>
    </para>
    <para>
      Adjust settings as appropriate; copy and adjust for the second node (particularly
      the values <varname>node_id</varname>, <varname>node_name</varname>
      and <varname>conninfo</varname>).
    </para>
    <para>
      Note that the values provided for the <varname>conninfo</varname> string
      must be valid for connections from <emphasis>both</emphasis> nodes in the
      replication cluster. The database must be the BDR-enabled database.
    </para>
    <para>
      If defined, the <varname>event_notifications</varname> parameter will restrict
      execution of the script defined in <varname>event_notification_command</varname>
      to the specified event(s).
    </para>
    <note>
      <simpara>
        <varname>event_notification_command</varname> is the script which does the actual "heavy lifting"
        of reconfiguring the proxy server/connection pooler. It is fully
        user-definable; see section <xref linkend="bdr-event-notification-command"> for a reference
        implementation.
      </simpara>
    </note>
  </sect1>
  <sect1 id="bdr-repmgr-setup" xreflabel="repmgr setup with BDR">
    <title>repmgr setup</title>
    <para>
      Register both nodes; example on <literal>node1</literal>:
      <programlisting>
        $ repmgr -f /etc/repmgr.conf bdr register
        NOTICE: attempting to install extension "repmgr"
        NOTICE: "repmgr" extension successfully installed
        NOTICE: node record created for node 'node1' (ID: 1)
        NOTICE: BDR node 1 registered (conninfo: host=node1 dbname=bdrtest user=repmgr)</programlisting>
    </para>
    <para>
      and on <literal>node2</literal>:
      <programlisting>
        $ repmgr -f /etc/repmgr.conf bdr register
        NOTICE: node record created for node 'node2' (ID: 2)
        NOTICE: BDR node 2 registered (conninfo: host=node2 dbname=bdrtest user=repmgr)</programlisting>
    </para>
    <para>
      The <literal>repmgr</literal> extension will be automatically created
      when the first node is registered, and will be propagated to the second
      node.
    </para>
    <important>
      <simpara>
        Ensure the &repmgr; package is available on both nodes before
        attempting to register the first node.
      </simpara>
    </important>
    <para>
      At this point the metadata for both nodes has been created; executing
      <xref linkend="repmgr-cluster-show"> (on either node) should produce output like this:
      <programlisting>
        $ repmgr -f /etc/repmgr.conf cluster show
        ID | Name  | Role | Status    | Upstream | Location | Connection string
       ----+-------+------+-----------+----------+----------+--------------------------------------------------------
        1  | node1 | bdr  | * running |          | default  | host=node1 dbname=bdrtest user=repmgr connect_timeout=2
        2  | node2 | bdr  | * running |          | default  | host=node2 dbname=bdrtest user=repmgr connect_timeout=2</programlisting>
    </para>
    <para>
      Additionally it's possible to display a log of significant events; executing
      <xref linkend="repmgr-cluster-event"> (on either node) should produce output like this:
      <programlisting>
        $ repmgr -f /etc/repmgr.conf cluster event
        Node ID | Event        | OK | Timestamp           | Details
       ---------+--------------+----+---------------------+----------------------------------------------
        2       | bdr_register | t  | 2017-07-27 17:51:48 | node record created for node 'node2' (ID: 2)
        1       | bdr_register | t  | 2017-07-27 17:51:00 | node record created for node 'node1' (ID: 1)
      </programlisting>
    </para>
    <para>
      At this point there will only be records for the two node registrations (displayed here
      in reverse chronological order).
    </para>
  </sect1>
  <sect1 id="bdr-event-notification-command" xreflabel="Defining the BDR failover &quot;event_notification_command&quot;">
    <title>Defining the BDR failover "event_notification_command"</title>
    <para>
      Key to "failover" execution is the <literal>event_notification_command</literal>,
      which is a user-definable script specified in <filename>repmgr.conf</filename>
      and which can use a &repmgr; <link linkend="event-notifications">event notification</link>
      to reconfigure the proxy server / connection pooler so it points to the other, still-active node.
      Details of the event will be passed as parameters to the script.
    </para>
    <para>
      The following parameter placeholders are available for the script definition in <filename>repmgr.conf</filename>;
      these will be replaced with the appropriate value when the script is executed:
    </para>
    <variablelist>
      <varlistentry>
        <term><option>%n</option></term>
        <listitem>
          <para>
            node ID
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>%e</option></term>
        <listitem>
          <para>
            event type
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>%s</option></term>
        <listitem>
          <para>
            success (1 or 0)
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>%t</option></term>
        <listitem>
          <para>
            timestamp
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>%d</option></term>
        <listitem>
          <para>
            details
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>%c</option></term>
        <listitem>
          <para>
            conninfo string of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term><option>%a</option></term>
        <listitem>
          <para>
            name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
          </para>
        </listitem>
      </varlistentry>
    </variablelist>
    <para>
      Note that <literal>%c</literal> and <literal>%a</literal> are only provided with
      particular failover events, in this case <varname>bdr_failover</varname>.
    </para>
    <para>
      The provided sample script
     (<literal><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/scripts/bdr-pgbouncer.sh">scripts/bdr-pgbouncer.sh</ulink></literal>)
      is configured as follows:
      <programlisting>
        event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a"'</programlisting>
    </para>
    <para>
      and parses the placeholder parameters like this:
      <programlisting>
        NODE_ID=$1
        EVENT_TYPE=$2
        SUCCESS=$3
        NEXT_CONNINFO=$4
        NEXT_NODE_NAME=$5</programlisting>
    </para>
    <note>
      <para>
        The sample script also contains some hard-coded values for the <application>PgBouncer</application>
        configuration for both nodes; these will need to be adjusted for your local environment
        (ideally the scripts would be maintained as templates and generated by some
        kind of provisioning system).
      </para>
    </note>
    <para>
      The script performs the following steps:
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara>pauses <application>PgBouncer</application> on all nodes</simpara>
        </listitem>
        <listitem>
          <simpara>recreates the <application>PgBouncer</application> configuration file on each
            node using the information provided by <application>repmgrd</application>
            (primarily the <varname>conninfo</varname> string) to configure
            <application>PgBouncer</application></simpara>
        </listitem>
        <listitem>
          <simpara>reloads the <application>PgBouncer</application> configuration</simpara>
        </listitem>
        <listitem>
          <simpara>executes the <command>RESUME</command> command (in <application>PgBouncer</application>)</simpara>
        </listitem>
      </itemizedlist>
    </para>
    <para>
      Following successful script execution, any connections to PgBouncer on the failed BDR node
      will be redirected to the active node.
    </para>
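    <para>
      As an illustration of how such a script can dispatch on its arguments, the
      following minimal sketch (not the bundled <filename>bdr-pgbouncer.sh</filename>)
      accepts the parameters in the order shown above. The function name
      <literal>handle_event</literal> and the messages it prints are hypothetical,
      and the <application>PgBouncer</application> reconfiguration itself is left
      as a placeholder.
    </para>

```shell
#!/bin/sh
# Hypothetical event handler sketch; NOT the bundled bdr-pgbouncer.sh.
# Arguments follow the order used in event_notification_command:
#   %n %e %s "%c" "%a"

handle_event() {
    node_id=$1
    event_type=$2
    success=$3
    next_conninfo=$4
    next_node_name=$5

    case "$event_type" in
        bdr_failover)
            if [ "$success" != "1" ]; then
                echo "node $node_id: bdr_failover reported failure; taking no action"
                return 1
            fi
            # Placeholder: a real script would pause PgBouncer, rewrite its
            # configuration, reload it and resume (see the steps listed above).
            echo "node $node_id failed; repointing pooler to $next_node_name ($next_conninfo)"
            ;;
        bdr_recovery)
            # Deliberately conservative: do not re-add the node automatically.
            echo "node $node_id recovered; leaving pooler unchanged pending manual checks"
            ;;
        *)
            echo "ignoring event $event_type for node $node_id"
            ;;
    esac
}
```

    <para>
      A script built around this function would typically end with
      <command>handle_event "$@"</command>, so that the values substituted by
      <application>repmgrd</application> are passed straight through.
    </para>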
  </sect1>
  <sect1 id="bdr-monitoring-failover" xreflabel="Node monitoring and failover">
    <title>Node monitoring and failover</title>
    <para>
      At the intervals specified by <varname>monitor_interval_secs</varname>
      in <filename>repmgr.conf</filename>, <application>repmgrd</application>
      will ping each node to check if it's available. If a node isn't available,
      <application>repmgrd</application> will enter failover mode and check <varname>reconnect_attempts</varname>
      times at intervals of <varname>reconnect_interval</varname> to confirm the node is definitely unreachable.
      This buffer period is necessary to avoid false positives caused by transient
      network outages.
    </para>
    <para>
      If the node is still unavailable, <application>repmgrd</application> will enter failover mode and execute
      the script defined in <varname>event_notification_command</varname>; an entry will be logged
      in the <literal>repmgr.events</literal> table and <application>repmgrd</application> will
      (unless otherwise configured) resume monitoring of the node in "degraded" mode until it reappears.
    </para>
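    <para>
      The retry behaviour described above can be approximated with a short shell
      function. This is only a sketch of the general technique, not
      <application>repmgrd</application>'s implementation; the function name
      <literal>check_with_retries</literal> is hypothetical, and the command used
      to test availability is supplied by the caller.
    </para>

```shell
#!/bin/sh
# Sketch of a bounded retry loop (hypothetical helper, not part of repmgr).
# Usage: check_with_retries ATTEMPTS INTERVAL COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, or 1 once ATTEMPTS checks have failed.

check_with_retries() {
    attempts=$1
    interval=$2
    shift 2

    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Run the caller-supplied availability check
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Wait before the next check (or before giving up)
        sleep "$interval"
    done
    return 1
}
```

    <para>
      With the example settings <literal>reconnect_attempts=6</literal> and
      <literal>reconnect_interval=5</literal>, a call such as
      <command>check_with_retries 6 5 pg_isready -h node2 -d bdrtest</command>
      would take roughly 30 seconds to conclude that the node is unreachable,
      not counting the time each individual check takes to fail.
    </para>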
    <para>
      <application>repmgrd</application> logfile output during a failover event will look something like this
      on one node (usually the node which has failed, here <literal>node2</literal>):
      <programlisting>
            ...
    [2017-07-27 21:08:39] [INFO] starting continuous BDR node monitoring
    [2017-07-27 21:08:39] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
    [2017-07-27 21:08:55] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
    [2017-07-27 21:09:11] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
    [2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
    [2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
    [2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
    [2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
    [2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
    [2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
    [2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
    [2017-07-27 21:09:28] [NOTICE] setting node record for node 2 to inactive
    [2017-07-27 21:09:28] [INFO] executing notification command for event "bdr_failover"
    [2017-07-27 21:09:28] [DETAIL] command is:
      /path/to/bdr-pgbouncer.sh 2 bdr_failover 1 "host=node1 dbname=bdrtest user=repmgr connect_timeout=2" "node1"
    [2017-07-27 21:09:28] [INFO] node 'node2' (ID: 2) detected as failed; next available node is 'node1' (ID: 1)
    [2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
    [2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
    ...</programlisting>
    </para>
    <para>
      Output on the other node (<literal>node1</literal>) during the same event will look like this:
      <programlisting>
    ...
    [2017-07-27 21:08:35] [INFO] starting continuous BDR node monitoring
    [2017-07-27 21:08:35] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
    [2017-07-27 21:08:51] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
    [2017-07-27 21:09:07] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
    [2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
    [2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
    [2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
    [2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
    [2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
    [2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
    [2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
    [2017-07-27 21:09:28] [NOTICE] other node's repmgrd is handling failover
    [2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
    [2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
    ...</programlisting>
    </para>
    <para>
      This assumes only the PostgreSQL instance on <literal>node2</literal> has failed. In this case the
      <application>repmgrd</application> instance running on <literal>node2</literal> has performed the failover. However if
      the entire server becomes unavailable, <application>repmgrd</application> on <literal>node1</literal> will perform
      the failover.
    </para>
  </sect1>
  <sect1 id="bdr-node-recovery" xreflabel="Node recovery">
    <title>Node recovery</title>
    <para>
      Following failure of a BDR node, if the node subsequently becomes available again,
      a <varname>bdr_recovery</varname> event will be generated. This could potentially be used to
      reconfigure PgBouncer automatically to bring the node back into the available pool;
      however, it would be prudent to manually verify the node's status before
      exposing it to the application.
    </para>
    <para>
      If the failed node comes back up and connects correctly, output similar to this
      will be visible in the <application>repmgrd</application> log:
      <programlisting>
        [2017-07-27 21:25:30] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
        [2017-07-27 21:25:46] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
        [2017-07-27 21:25:46] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
        [2017-07-27 21:25:55] [INFO] active replication slot for node "node1" found after 1 seconds
        [2017-07-27 21:25:55] [NOTICE] node "node2" (ID: 2) has recovered after 986 seconds</programlisting>
    </para>
  </sect1>
  <sect1 id="bdr-complete-shutdown" xreflabel="Shutdown of both nodes">
    <title>Shutdown of both nodes</title>
    <para>
      If both PostgreSQL instances are shut down, <application>repmgrd</application> will try and handle the
      situation as gracefully as possible, though with no failover candidates available
      there's not much it can do. Should this case ever occur, we recommend shutting
      down <application>repmgrd</application> on both nodes and restarting it once the PostgreSQL instances
      are running properly.
    </para>
  </sect1>
 </chapter>
--- a/doc/repmgrd-cascading-replication.sgml
+++ b/doc/repmgrd-cascading-replication.sgml
@@ -0,0 +1,22 @@
 <chapter id="repmgrd-cascading-replication">
 <indexterm>
   <primary>repmgrd</primary>
   <secondary>cascading replication</secondary>
 </indexterm>
 <title>repmgrd and cascading replication</title>
 <para>
  Cascading replication - where a standby can connect to an upstream node and not
  the primary server itself - was introduced in PostgreSQL 9.2. &repmgr; and
  <application>repmgrd</application> support cascading replication by keeping track of the relationship
  between standby servers - each node record is stored with the node id of its
  upstream ("parent") server (except of course the primary server).
 </para>
 <para>
  In a failover situation where the primary node fails and a top-level standby
  is promoted, a standby connected to another standby will not be affected
  and continue working as normal (even if the upstream standby it's connected
  to becomes the primary node). If however the node's direct upstream fails,
  the "cascaded standby" will attempt to reconnect to that node's parent.
 </para>
 </chapter>
--- a/doc/repmgrd-configuration.sgml
+++ b/doc/repmgrd-configuration.sgml
@@ -0,0 +1,554 @@
 <chapter id="repmgrd-configuration">
  <indexterm>
    <primary>repmgrd</primary>
    <secondary>configuration</secondary>
  </indexterm>
  <title>repmgrd configuration</title>
  <para>
    <application>repmgrd</application> is a daemon which runs on each PostgreSQL node,
    monitoring the local node, and (unless it's the primary node) the upstream server
    (the primary server or, with cascading replication, another standby) which it's
    connected to.
  </para>
  <para>
    <application>repmgrd</application> can be configured to provide failover
    capability in case the primary upstream node becomes unreachable, and/or
    provide monitoring data to the &repmgr; metadatabase.
  </para>
  <sect1 id="repmgrd-basic-configuration">
    <title>repmgrd basic configuration</title>
    <para>
      To use <application>repmgrd</application>, its associated function library <emphasis>must</emphasis> be
      included via <filename>postgresql.conf</filename> with:
      <programlisting>
        shared_preload_libraries = 'repmgr'</programlisting>
    </para>
    <para>
      Changing this setting requires a restart of PostgreSQL; for more details see
      the <ulink url="https://www.postgresql.org/docs/current/static/runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES">PostgreSQL documentation</ulink>.
    </para>
    <sect2 id="repmgrd-automatic-failover-configuration">
      <title>automatic failover configuration</title>
      <para>
        If using automatic failover, the following <application>repmgrd</application> options <emphasis>must</emphasis> be set in
        <filename>repmgr.conf</filename>:
        <programlisting>
          failover=automatic
          promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
          follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'</programlisting>
      </para>
      <para>
        Adjust file paths as appropriate; always specify the full path to the &repmgr; binary.
      </para>
      <note>
        <para>
          &repmgr; will not apply <option>pg_bindir</option> when executing <option>promote_command</option>
          or <option>follow_command</option>; these can be user-defined scripts so must always be
          specified with the full path.
        </para>
      </note>
      <para>
        Note that the <literal>--log-to-file</literal> option will cause
        output generated by the &repmgr; command, when executed by <application>repmgrd</application>,
        to be logged to the same destination configured to receive log output for <application>repmgrd</application>.
        See <filename><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink></filename>
        for further <application>repmgrd</application>-specific settings.
      </para>
      <para>
        When <varname>failover</varname> is set to <literal>automatic</literal>, upon detecting failure
        of the current  primary, <application>repmgrd</application> will execute one of:
      </para>
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara>
            <varname>promote_command</varname> (if the current server is to become the new primary)
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>follow_command</varname> (if the current server needs to follow another server which has
            become the new primary)
          </simpara>
        </listitem>
      </itemizedlist>
      <note>
        <para>
          These commands can be any valid shell script which results in one of these
          two actions happening, but if &repmgr;'s <command>standby follow</command> or
          <command>standby promote</command>
          commands are not executed (either directly as shown here, or from a script which
          performs other actions), the &repmgr; metadata will not be updated and
          &repmgr; will no longer function reliably.
        </para>
      </note>
      <para>
        The <varname>follow_command</varname> should provide the <literal>--upstream-node-id=%n</literal>
        option to <command>repmgr standby follow</command>; the <literal>%n</literal> will be replaced by
        <application>repmgrd</application> with the ID of the new primary node. If this is not provided, &repmgr;
        will attempt to determine the new primary by itself, but if the
        original primary comes back online after the new primary is promoted, there is a risk that
        <command>repmgr standby follow</command> will result in the node continuing to follow
        the original primary.
      </para>
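      <para>
        Where <varname>follow_command</varname> points to a user-defined script,
        that script still needs to hand over to <command>repmgr standby follow</command>.
        The following is a minimal sketch of such a wrapper; the function name
        <literal>follow_new_primary</literal>, the variables <literal>REPMGR_BIN</literal>
        and <literal>REPMGR_CONF</literal>, and the default paths are assumptions for
        illustration only.
      </para>

```shell
#!/bin/sh
# Hypothetical wrapper for follow_command; adjust paths for your system.
REPMGR_BIN=${REPMGR_BIN:-/usr/bin/repmgr}
REPMGR_CONF=${REPMGR_CONF:-/etc/repmgr.conf}

follow_new_primary() {
    upstream_node_id=$1

    # Refuse to run without the ID of the new primary, so repmgr never
    # has to guess which node to follow.
    if [ -z "$upstream_node_id" ]; then
        echo "usage: follow_new_primary UPSTREAM_NODE_ID" >&2
        return 2
    fi

    # Site-specific actions (alerting, logging, etc.) would go here.

    # Always finish by calling repmgr itself so its metadata is updated.
    "$REPMGR_BIN" standby follow -f "$REPMGR_CONF" --log-to-file \
        --upstream-node-id="$upstream_node_id"
}
```

      <para>
        A script ending with <command>follow_new_primary "$1"</command> could then
        be configured as <literal>follow_command='/path/to/wrapper.sh %n'</literal>,
        where the script path is a placeholder.
      </para>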
    </sect2>
    <sect2 id="repmgrd-service-configuration">
      <indexterm>
        <primary>repmgrd</primary>
        <secondary>PostgreSQL service configuration</secondary>
      </indexterm>
      <title>PostgreSQL service configuration</title>
      <para>
        If using automatic failover, currently <application>repmgrd</application> will need to execute
        <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>
        to restart PostgreSQL on standbys to have them follow a new primary.
      </para>
      <para>
        To ensure this happens smoothly, it's essential to provide the system/service restart
        command appropriate to your operating system via <varname>service_restart_command</varname>
        in <filename>repmgr.conf</filename>. If you don't do this, <application>repmgrd</application>
        will default to using <command>pg_ctl</command>, which can result in unexpected problems,
        particularly on <application>systemd</application>-based systems.
      </para>
      <para>
        For more details, see <xref linkend="configuration-file-service-commands">.
      </para>
    </sect2>
    <sect2 id="repmgrd-monitoring-configuration" xreflabel="repmgrd monitoring configuration">
      <indexterm>
        <primary>repmgrd</primary>
        <secondary>monitoring configuration</secondary>
      </indexterm>
      <title>Monitoring configuration</title>
      <para>
        To enable monitoring, set:
        <programlisting>
          monitoring_history=yes</programlisting>
        in <filename>repmgr.conf</filename>.
      </para>
      <para>
        The default monitoring interval is 2 seconds; this value can be explicitly set using:
        <programlisting>
          monitor_interval_secs=&lt;seconds&gt;</programlisting>
        in <filename>repmgr.conf</filename>.
      </para>
      <para>
        For more details on monitoring, see <xref linkend="repmgrd-monitoring">.
      </para>
    </sect2>
    <sect2 id="repmgrd-reloading-configuration" xreflabel="reloading repmgrd configuration">
      <indexterm>
        <primary>repmgrd</primary>
        <secondary>applying configuration changes</secondary>
      </indexterm>
      <title>Applying configuration changes to repmgrd</title>
      <para>
        To apply configuration file changes to a running <application>repmgrd</application>
        daemon, execute the operating system's <application>repmgrd</application> service reload command
        (see <xref linkend="appendix-packages"> for examples),
          or, for instances which were manually started, execute <command>kill -HUP</command>, e.g.
          <command>kill -HUP `cat /tmp/repmgrd.pid`</command>.
      </para>
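      <para>
        For manually started instances, a slightly more defensive variant of the
        <command>kill -HUP</command> invocation can be sketched as follows. The
        function name <literal>reload_repmgrd</literal> is hypothetical; the sketch
        only checks that the PID file is readable and that the recorded process
        exists before sending the signal.
      </para>

```shell
#!/bin/sh
# Hypothetical helper: send SIGHUP to the process recorded in a PID file.
# Usage: reload_repmgrd /path/to/repmgrd.pid

reload_repmgrd() {
    pidfile=$1

    if [ ! -r "$pidfile" ]; then
        echo "PID file $pidfile not found or unreadable" >&2
        return 1
    fi

    pid=$(cat "$pidfile")

    # kill -0 delivers no signal; it only tests whether the process
    # exists and can be signalled by the current user.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no signallable process with PID '$pid' (stale PID file?)" >&2
        return 1
    fi

    kill -HUP "$pid"
}
```

      <para>
        For example, <command>reload_repmgrd /tmp/repmgrd.pid</command> matches the
        PID file location used in the manual start example in this chapter.
      </para>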
      <tip>
        <para>
          Check the <application>repmgrd</application> log to see what changes were
          applied, or if any issues were encountered when reloading the configuration.
        </para>
      </tip>
      <para>
        Note that only the following subset of configuration file parameters can be changed on a
        running <application>repmgrd</application> daemon:
      </para>
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara>
            <varname>async_query_timeout</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>bdr_local_monitoring_only</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>bdr_recovery_timeout</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>conninfo</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>degraded_monitoring_timeout</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>event_notification_command</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>event_notifications</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>failover</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>follow_command</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>log_facility</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>log_file</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>log_level</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>log_status_interval</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>monitor_interval_secs</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>monitoring_history</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>primary_notification_timeout</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>promote_command</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>reconnect_attempts</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>reconnect_interval</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>repmgrd_standby_startup_timeout</varname>
          </simpara>
        </listitem>
      </itemizedlist>
      <para>
        The following set of configuration file parameters must be updated via
        <command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>,
        as they require changes to the <literal>repmgr.nodes</literal> table so they are visible to
        all nodes in the replication cluster:
      </para>
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara>
            <varname>node_id</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>node_name</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>data_directory</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>location</varname>
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            <varname>priority</varname>
          </simpara>
        </listitem>
      </itemizedlist>
      <note>
        <para>
          After executing <command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>,
          <application>repmgrd</application> <emphasis>must</emphasis> be restarted for the changes to take effect.
        </para>
      </note>
    </sect2>
  </sect1>
  <sect1 id="repmgrd-daemon">
    <indexterm>
      <primary>repmgrd</primary>
      <secondary>starting and stopping</secondary>
    </indexterm>
    <title>repmgrd daemon</title>
    <para>
      If installed from a package, <application>repmgrd</application> can be started
      via the operating system's service command, e.g. in <application>systemd</application>
      using <command>systemctl</command>.
    </para>
    <para>
      See appendix <xref linkend="appendix-packages"> for details of service commands
      for different distributions.
    </para>
    <para>
      <application>repmgrd</application> can be started manually like this:
      <programlisting>
        repmgrd -f /etc/repmgr.conf --pid-file /tmp/repmgrd.pid</programlisting>
      and stopped with <command>kill `cat /tmp/repmgrd.pid`</command>. Adjust paths as appropriate.
    </para>
    <sect2 id="repmgrd-pid-file" xreflabel="repmgrd's PID file">
      <indexterm>
        <primary>repmgrd</primary>
        <secondary>PID file</secondary>
      </indexterm>
      <indexterm>
        <primary>PID file</primary>
        <secondary>repmgrd</secondary>
      </indexterm>
      <title>repmgrd's PID file</title>
      <para>
        <application>repmgrd</application> will generate a PID file by default.
      </para>
      <note>
        <simpara>
          This is a behaviour change from previous versions (earlier than 4.1), where
          the PID file had to be explicitly specified with the command line
          parameter <option>--pid-file</option>.
        </simpara>
      </note>
      <para>
        The PID file can be specified in <filename>repmgr.conf</filename> with the configuration
        parameter <varname>repmgrd_pid_file</varname>.
      </para>
      <para>
        It can also be specified on the command line (as in previous versions) with
        the command line parameter <option>--pid-file</option>. Note this will override
        any value set in <filename>repmgr.conf</filename> with <varname>repmgrd_pid_file</varname>.
        <option>--pid-file</option> may be deprecated in future releases.
      </para>
      <para>
        If a PID file location was specified by the package maintainer, <application>repmgrd</application>
        will use that. This only applies if &repmgr; was installed from a package and the package
        maintainer has specified the PID file location.
      </para>
      <para>
        If none of the above apply, <application>repmgrd</application> will create a PID file
        in the operating system's temporary directory (as determined by the environment variable
        <varname>TMPDIR</varname>, or if that is not set, will use <filename>/tmp</filename>).
      </para>
      <para>
        To prevent a PID file being generated at all, provide the command line option
        <option>--no-pid-file</option>.
      </para>
      <para>
        To see which PID file <application>repmgrd</application> would use, execute <application>repmgrd</application>
        with the option <option>--show-pid-file</option>. <application>repmgrd</application>
        will not start if this option is provided. Note that the value shown is the
        file  <application>repmgrd</application> would use next time it starts, and is
        not necessarily the PID file currently in use.
      </para>
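      <para>
        The order of precedence described in this section can be summarised with
        the following sketch. It is not <application>repmgrd</application>'s own
        code: the function name <literal>resolve_pid_file</literal> is hypothetical,
        and the file name <filename>repmgrd.pid</filename> used for the temporary
        directory fallback is an assumption. Use <option>--show-pid-file</option>
        for the authoritative answer on a given system.
      </para>

```shell
#!/bin/sh
# Illustrative only: mirrors the documented precedence for the PID file.
# Usage: resolve_pid_file CLI_VALUE CONF_VALUE PACKAGE_DEFAULT
# Pass an empty string for any source which is not set.

resolve_pid_file() {
    cli_value=$1        # value of --pid-file, if given
    conf_value=$2       # repmgrd_pid_file from repmgr.conf, if set
    package_default=$3  # location chosen by the package maintainer, if any

    if [ -n "$cli_value" ]; then
        echo "$cli_value"
    elif [ -n "$conf_value" ]; then
        echo "$conf_value"
    elif [ -n "$package_default" ]; then
        echo "$package_default"
    else
        # Fall back to TMPDIR, or /tmp if TMPDIR is unset or empty.
        # The file name "repmgrd.pid" is an assumption for illustration.
        echo "${TMPDIR:-/tmp}/repmgrd.pid"
    fi
}
```

      <para>
        For instance, <command>resolve_pid_file "" /run/repmgrd.pid ""</command>
        prints the value taken from <filename>repmgr.conf</filename>, since no
        command line value was supplied.
      </para>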
    </sect2>
    <sect2 id="repmgrd-configuration-debian-ubuntu">
      <indexterm>
        <primary>repmgrd</primary>
        <secondary>Debian/Ubuntu and daemon configuration</secondary>
      </indexterm>
      <indexterm>
        <primary>Debian/Ubuntu</primary>
        <secondary>repmgrd daemon configuration</secondary>
      </indexterm>
      <title>repmgrd daemon configuration on Debian/Ubuntu</title>
      <para>
        If &repmgr; was installed from Debian/Ubuntu packages, additional configuration
        is required before <application>repmgrd</application> is started as a daemon.
      </para>
      <para>
        This is done via the file <filename>/etc/default/repmgrd</filename>, which by default
        looks like this:
        <programlisting>
 # default settings for repmgrd. This file is source by /bin/sh from
 # /etc/init.d/repmgrd
 # disable repmgrd by default so it won't get started upon installation
 # valid values: yes/no
 REPMGRD_ENABLED=no
 # configuration file (required)
 #REPMGRD_CONF="/path/to/repmgr.conf"
 # additional options
 REPMGRD_OPTS="--daemonize=false"
 # user to run repmgrd as
 #REPMGRD_USER=postgres
 # repmgrd binary
 #REPMGRD_BIN=/usr/bin/repmgrd
 # pid file
 #REPMGRD_PIDFILE=/var/run/repmgrd.pid</programlisting>
      </para>
      <para>
        Set <varname>REPMGRD_ENABLED</varname> to <literal>yes</literal>, and <varname>REPMGRD_CONF</varname>
        to the <filename>repmgr.conf</filename> file you are using.
      </para>
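 <para>
  As a minimal sketch, an edited <filename>/etc/default/repmgrd</filename> might contain
  the following. The path <filename>/etc/repmgr.conf</filename> is an assumption; replace
  it with the location of your own configuration file.
 </para>

```shell
# /etc/default/repmgrd (excerpt); this file is sourced by /bin/sh
# enable repmgrd so the init script will start it
REPMGRD_ENABLED=yes
# assumed location of repmgr.conf; adjust to match your installation
REPMGRD_CONF="/etc/repmgr.conf"
# daemonization is handled by the service command (repmgrd 4.1 and later)
REPMGRD_OPTS="--daemonize=false"
```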
      <tip>
        <para>
          See <xref linkend="packages-debian-ubuntu"> for details of the Debian/Ubuntu packages and
          typical file locations (including <filename>repmgr.conf</filename>).
        </para>
      </tip>
      <para>
        From <application>repmgrd</application> 4.1, ensure <varname>REPMGRD_OPTS</varname> includes
        <option>--daemonize=false</option>, as daemonization is handled by the service command.
      </para>
      <para>
        If using <application>systemd</application>, you may need to execute <command>systemctl daemon-reload</command>.
        Also, if you attempted to start <application>repmgrd</application> using <command>systemctl start repmgrd</command>,
        you'll need to execute <command>systemctl stop repmgrd</command>. Because that's how <application>systemd</application>
        rolls.
      </para>
    </sect2>
  </sect1>
  <sect1 id="repmgrd-connection-settings">
    <title>repmgrd connection settings</title>
 <para>
  In addition to the &repmgr; configuration settings, parameters in the
  <varname>conninfo</varname> string influence how &repmgr; makes a network connection to
  PostgreSQL. In particular, if another server in the replication cluster
  is unreachable at network level, system network settings will influence
  the length of time it takes to determine that the connection is not possible.
 </para>
 <para>
  Consider explicitly setting <literal>connect_timeout</literal>;
  the effective minimum value of <literal>2</literal> (seconds) ensures
  that a connection failure at network level is reported as soon as possible.
  Otherwise, depending on the system settings (e.g.
  <varname>tcp_syn_retries</varname> in Linux), a delay of a minute or more
  is possible.
 </para>
 <para>
  For further details on <varname>conninfo</varname> network connection
  parameters, see the
  <ulink url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS">PostgreSQL documentation</ulink>.
 </para>
 </sect1>
 <sect1 id="repmgrd-log-rotation">
   <indexterm>
     <primary>log rotation</primary>
     <secondary>repmgrd</secondary>
   </indexterm>
   <indexterm>
     <primary>repmgrd</primary>
     <secondary>log rotation</secondary>
   </indexterm>
  <title>repmgrd log rotation</title>
  <para>
   To ensure the current <application>repmgrd</application> logfile
   (specified in <filename>repmgr.conf</filename> with the parameter
   <option>log_file</option>) does not grow indefinitely, configure your
   system's <command>logrotate</command> to regularly rotate it.
  </para>
  <para>
   Sample configuration to rotate logfiles weekly, with retention for
   up to 52 weeks and rotation forced if a file grows beyond 100MB:
   <programlisting>
    /var/log/repmgr/repmgrd.log {
        missingok
        compress
        rotate 52
        maxsize 100M
        weekly
        create 0600 postgres postgres
        postrotate
            /usr/bin/killall -HUP repmgrd
        endscript
    }</programlisting>
  </para>
 </sect1>
 </chapter>
--- a/doc/repmgrd-degraded-monitoring.sgml
+++ b/doc/repmgrd-degraded-monitoring.sgml
@@ -0,0 +1,83 @@
 <chapter id="repmgrd-degraded-monitoring" xreflabel="repmgrd degraded monitoring">
 <indexterm>
   <primary>repmgrd</primary>
   <secondary>degraded monitoring</secondary>
 </indexterm>
 <title>"degraded monitoring" mode</title>
 <para>
  In certain circumstances, <application>repmgrd</application> is not able to fulfill its primary mission
  of monitoring the node's upstream server. In these cases it enters &quot;degraded monitoring&quot;
  mode, where <application>repmgrd</application> remains active but is waiting for the situation
  to be resolved.
 </para>
 <para>
  Situations where this happens are:
  <itemizedlist spacing="compact" mark="bullet">
   <listitem>
    <simpara>a failover situation has occurred, but no nodes in the primary node's location are visible</simpara>
   </listitem>
   <listitem>
    <simpara>a failover situation has occurred, but no promotion candidate is available</simpara>
   </listitem>
   <listitem>
    <simpara>a failover situation has occurred, but the promotion candidate could not be promoted</simpara>
   </listitem>
   <listitem>
    <simpara>a failover situation has occurred, but the node was unable to follow the new primary</simpara>
   </listitem>
   <listitem>
    <simpara>a failover situation has occurred, but no primary has become available</simpara>
   </listitem>
   <listitem>
    <simpara>a failover situation has occurred, but automatic failover is not enabled for the node</simpara>
   </listitem>
   <listitem>
    <simpara>repmgrd is monitoring the primary node, but it is not available (and no other node has been promoted as primary)</simpara>
   </listitem>
  </itemizedlist>
 </para>
 <para>
  Example output in a situation where there is only one standby with <literal>failover=manual</literal>,
  and the primary node is unavailable (but is later restarted):
  <programlisting>
    [2017-08-29 10:59:19] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in normal state (automatic failover disabled)
    [2017-08-29 10:59:33] [WARNING] unable to connect to upstream node "node1" (node ID: 1)
    [2017-08-29 10:59:33] [INFO] checking state of node 1, 1 of 5 attempts
    [2017-08-29 10:59:33] [INFO] sleeping 1 seconds until next reconnection attempt
    (...)
    [2017-08-29 10:59:37] [INFO] checking state of node 1, 5 of 5 attempts
    [2017-08-29 10:59:37] [WARNING] unable to reconnect to node 1 after 5 attempts
    [2017-08-29 10:59:37] [NOTICE] this node is not configured for automatic failover so will not be considered as promotion candidate
    [2017-08-29 10:59:37] [NOTICE] no other nodes are available as promotion candidate
    [2017-08-29 10:59:37] [HINT] use "repmgr standby promote" to manually promote this node
    [2017-08-29 10:59:37] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in degraded state (automatic failover disabled)
    [2017-08-29 10:59:53] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in degraded state (automatic failover disabled)
    [2017-08-29 11:00:45] [NOTICE] reconnected to upstream node 1 after 68 seconds, resuming monitoring
    [2017-08-29 11:00:57] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in normal state (automatic failover disabled)</programlisting>
 </para>
 <para>
  By default, <application>repmgrd</application> will continue in degraded monitoring mode indefinitely.
  However, a timeout (in seconds) can be set with <varname>degraded_monitoring_timeout</varname>,
  after which <application>repmgrd</application> will terminate.
 </para>
 <note>
   <para>
     If <application>repmgrd</application> is monitoring a primary node which has been stopped
     and manually restarted as a standby attached to a new primary, it will automatically detect
     the status change and update the node record to reflect the node's new status
     as an active standby. It will then resume monitoring the node as a standby.
   </para>
 </note>
 </chapter>
--- a/doc/repmgrd-demonstration.sgml
+++ b/doc/repmgrd-demonstration.sgml
@@ -0,0 +1,96 @@
 <chapter id="repmgrd-demonstration">
 <title>repmgrd demonstration</title>
 <para>
  To demonstrate automatic failover, set up a 3-node replication cluster (one primary
  and two standbys streaming directly from the primary) so that the cluster looks
  something like this:
  <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr
     3  | node3 | standby |   running | node1    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
 </para>
 <para>
  Start <application>repmgrd</application> on each standby and verify that it's running by examining the
  log output, which at log level <literal>INFO</literal> will look like this:
  <programlisting>
    [2017-08-24 17:31:00] [NOTICE] using configuration file "/etc/repmgr.conf"
    [2017-08-24 17:31:00] [INFO] connecting to database "host=node2 dbname=repmgr user=repmgr"
    [2017-08-24 17:31:00] [NOTICE] starting monitoring of node "node2" (ID: 2)
    [2017-08-24 17:31:00] [INFO] monitoring connection to upstream node "node1" (node ID: 1)</programlisting>
 </para>
 <para>
  Each <application>repmgrd</application> should also have recorded its successful startup as an event:
  <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster event --event=repmgrd_start
     Node ID | Name  | Event         | OK | Timestamp           | Details
    ---------+-------+---------------+----+---------------------+-------------------------------------------------------------
     3       | node3 | repmgrd_start | t  | 2017-08-24 17:35:54 | monitoring connection to upstream node "node1" (node ID: 1)
     2       | node2 | repmgrd_start | t  | 2017-08-24 17:35:50 | monitoring connection to upstream node "node1" (node ID: 1)
     1       | node1 | repmgrd_start | t  | 2017-08-24 17:35:46 | monitoring cluster primary "node1" (node ID: 1)</programlisting>
 </para>
 <para>
  Now stop the current primary server with e.g.:
  <programlisting>
    pg_ctl -D /var/lib/postgresql/data -m immediate stop</programlisting>
 </para>
 <para>
  This will force the primary to shut down straight away, aborting all processes
  and transactions.  This will cause a flurry of activity in the <application>repmgrd</application> log
  files as each <application>repmgrd</application> detects the failure of the primary and a failover
  decision is made. This is an extract from the log of a standby server (<literal>node2</literal>)
  which has been promoted to new primary after the failure of the original primary (<literal>node1</literal>).
  <programlisting>
    [2017-08-24 23:32:01] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in normal state
    [2017-08-24 23:32:08] [WARNING] unable to connect to upstream node "node1" (node ID: 1)
    [2017-08-24 23:32:08] [INFO] checking state of node 1, 1 of 5 attempts
    [2017-08-24 23:32:08] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-08-24 23:32:09] [INFO] checking state of node 1, 2 of 5 attempts
    [2017-08-24 23:32:09] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-08-24 23:32:10] [INFO] checking state of node 1, 3 of 5 attempts
    [2017-08-24 23:32:10] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-08-24 23:32:11] [INFO] checking state of node 1, 4 of 5 attempts
    [2017-08-24 23:32:11] [INFO] sleeping 1 seconds until next reconnection attempt
    [2017-08-24 23:32:12] [INFO] checking state of node 1, 5 of 5 attempts
    [2017-08-24 23:32:12] [WARNING] unable to reconnect to node 1 after 5 attempts
    INFO:  setting voting term to 1
    INFO:  node 2 is candidate
    INFO:  node 3 has received request from node 2 for electoral term 1 (our term: 0)
    [2017-08-24 23:32:12] [NOTICE] this node is the winner, will now promote self and inform other nodes
    INFO: connecting to standby database
    NOTICE: promoting standby
    DETAIL: promoting server using 'pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' promote'
    INFO: reconnecting to promoted server
    NOTICE: STANDBY PROMOTE successful
    DETAIL: node 2 was successfully promoted to primary
    INFO:  node 3 received notification to follow node 2
    [2017-08-24 23:32:13] [INFO] switching to primary monitoring mode</programlisting>
 </para>
 <para>
  The cluster status will now look like this, with the original primary (<literal>node1</literal>)
  marked as inactive, and standby <literal>node3</literal> now following the new primary
  (<literal>node2</literal>):
  <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | - failed  |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | primary | * running |          | default  | host=node2 dbname=repmgr user=repmgr
     3  | node3 | standby |   running | node2    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
 </para>
 <para>
  <command>repmgr cluster event</command> will display a summary of what happened to each server
  during the failover:
  <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster event
     Node ID | Name  | Event                    | OK | Timestamp           | Details
    ---------+-------+--------------------------+----+---------------------+-----------------------------------------------------------------------------------
     3       | node3 | repmgrd_failover_follow  | t  | 2017-08-24 23:32:16 | node 3 now following new upstream node 2
     3       | node3 | standby_follow           | t  | 2017-08-24 23:32:16 | node 3 is now attached to node 2
     2       | node2 | repmgrd_failover_promote | t  | 2017-08-24 23:32:13 | node 2 promoted to primary; old primary 1 marked as failed
     2       | node2 | standby_promote          | t  | 2017-08-24 23:32:13 | node 2 was successfully promoted to primary</programlisting>
 </para>
 </chapter>
--- a/doc/repmgrd-monitoring.sgml
+++ b/doc/repmgrd-monitoring.sgml
@@ -0,0 +1,80 @@
 <chapter id="repmgrd-monitoring" xreflabel="Monitoring with repmgrd">
 <indexterm>
   <primary>repmgrd</primary>
   <secondary>monitoring</secondary>
 </indexterm>
 <indexterm>
   <primary>monitoring</primary>
   <secondary>with repmgrd</secondary>
 </indexterm>
 <title>Monitoring with repmgrd</title>
 <para>
   When <application>repmgrd</application> is running with the option <literal>monitoring_history=true</literal>,
  it will constantly write standby node status information to the
  <varname>monitoring_history</varname> table, providing a near-real time
  overview of replication status on all nodes
  in the cluster.
 </para>
 <para>
   The view <literal>replication_status</literal> shows the most recent state
   for each node, e.g.:
  <programlisting>
    repmgr=# select * from repmgr.replication_status;
    -[ RECORD 1 ]-------------+------------------------------
    primary_node_id           | 1
    standby_node_id           | 2
    standby_name              | node2
    node_type                 | standby
    active                    | t
    last_monitor_time         | 2017-08-24 16:28:41.260478+09
    last_wal_primary_location | 0/6D57A00
    last_wal_standby_location | 0/5000000
    replication_lag           | 29 MB
    replication_time_lag      | 00:00:11.736163
    apply_lag                 | 15 MB
    communication_time_lag    | 00:00:01.365643</programlisting>
 </para>
 <para>
  The interval at which monitoring history is written is controlled by the
  configuration parameter <varname>monitor_interval_secs</varname>;
  the default is 2 seconds.
 </para>
 <para>
  As this can generate a large amount of monitoring data in the table
  <literal>repmgr.monitoring_history</literal>, it's advisable to regularly
  purge historical data using the <xref linkend="repmgr-cluster-cleanup">
  command; use the <literal>-k/--keep-history</literal> option to
  specify how many days' worth of data should be retained.
 </para>
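 <para>
  The purge could be wrapped in a small shell helper suitable for a scheduled job, as
  sketched below. The function name and the <literal>REPMGR_BIN</literal> override are
  hypothetical and not part of &repmgr;; only the <literal>cluster cleanup</literal>
  command and its <literal>--keep-history</literal> option come from &repmgr; itself.
 </para>

```shell
# Hypothetical helper: purge monitoring history older than a given number of days.
# REPMGR_BIN is an illustrative override (not a repmgr setting) naming the binary to run.
purge_monitoring_history() {
    conf_file="$1"
    keep_days="$2"
    "${REPMGR_BIN:-repmgr}" -f "$conf_file" cluster cleanup --keep-history="$keep_days"
}

# Example: keep the most recent 7 days of monitoring history.
# purge_monitoring_history /etc/repmgr.conf 7
```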
 <para>
  It's possible to run <application>repmgrd</application> in monitoring
  mode only (without automatic failover capability) for some or all
  nodes by setting <literal>failover=manual</literal> in the node's
  <filename>repmgr.conf</filename> file. In the event of the node's upstream failing,
  no failover action will be taken and the node will require manual intervention to
  be reattached to replication. If this occurs, an
  <link linkend="event-notifications">event notification</link>
  <varname>standby_disconnect_manual</varname> will be created.
 </para>
 <para>
  Note that when a standby node is not streaming directly from its upstream
  node, e.g. recovering WAL from an archive, <varname>apply_lag</varname> will always appear as
  <literal>0 bytes</literal>.
 </para>
 <tip>
  <para>
   If monitoring history is enabled, the contents of the <literal>repmgr.monitoring_history</literal>
   table will be replicated to attached standbys. This means there will be a small but
   constant stream of replication activity which may not be desirable. To prevent
   this, convert the table to an <literal>UNLOGGED</literal> one with:
   <programlisting>
     ALTER TABLE repmgr.monitoring_history SET UNLOGGED;</programlisting>
  </para>
  <para>
   This will however mean that monitoring history will not be available on
   another node following a failover, and the view <literal>repmgr.replication_status</literal>
   will not work on standbys.
  </para>
 </tip>
 </chapter>
--- a/doc/repmgrd-network-split.sgml
+++ b/doc/repmgrd-network-split.sgml
@@ -0,0 +1,48 @@
 <chapter id="repmgrd-network-split" xreflabel="Handling network splits with repmgrd">
 <indexterm>
   <primary>repmgrd</primary>
   <secondary>network splits</secondary>
 </indexterm>
 <title>Handling network splits with repmgrd</title>
 <para>
  A common pattern for replication cluster setups is to spread servers over
  more than one data centre. This can provide benefits such as
  geographically-distributed read replicas and DR (disaster recovery) capability.
  However, this also means there is a risk of disconnection at network level between
  data centre locations, which would result in a split-brain scenario if
  servers in a secondary data centre were no longer able to see the primary
  in the main data centre and promoted a standby among themselves.
 </para>
 <para>
  &repmgr; enables provision of &quot;<xref linkend="witness-server">&quot; to
  artificially create a quorum of servers in a particular location, ensuring
  that nodes in another location will not elect a new primary if they
  are unable to see the majority of nodes. However this approach does not
  scale well, particularly with more complex replication setups, e.g.
  where the majority of nodes are located outside of the primary data centre.
  It also means the <literal>witness</literal> node needs to be managed as an
  extra PostgreSQL instance outside of the main replication cluster, which
  adds administrative and programming complexity.
 </para>
 <para>
  &repmgr; 4 introduces the concept of <literal>location</literal>:
  each node is associated with an arbitrary location string (default is
  <literal>default</literal>); this is set in <filename>repmgr.conf</filename>, e.g.:
  <programlisting>
    node_id=1
    node_name=node1
    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/data'
    location='dc1'</programlisting>
 </para>
 <para>
  In a failover situation, <application>repmgrd</application> will check if any servers in the
  same location as the current primary node are visible.  If not, <application>repmgrd</application>
  will assume a network interruption and not promote any node in any
  other location (it will however enter <link linkend="repmgrd-degraded-monitoring">degraded monitoring</link>
  mode until a primary becomes visible).
 </para>
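 <para>
  The location rule described above can be sketched as a small shell function. This is
  illustrative only and is not how <application>repmgrd</application> is implemented; the
  function name and the location names <literal>dc1</literal> and <literal>dc2</literal>
  are assumptions.
 </para>

```shell
# Illustrative only: mirrors the rule described above, not repmgrd's actual implementation.
# Usage: primary_location_visible PRIMARY_LOCATION [VISIBLE_NODE_LOCATION ...]
# Succeeds (exit status 0) if any visible node shares the primary's location.
primary_location_visible() {
    primary_location="$1"
    shift
    for node_location in "$@"; do
        if [ "$node_location" = "$primary_location" ]; then
            return 0
        fi
    done
    return 1
}

# Example: the primary was in "dc1" but only nodes in "dc2" are visible,
# so a network interruption is assumed and no node is promoted.
if primary_location_visible dc1 dc2 dc2; then
    echo "failover may proceed"
else
    echo "assume network split: do not promote"
fi
```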
 </chapter>
--- a/doc/repmgrd-pausing.sgml
+++ b/doc/repmgrd-pausing.sgml
@@ -0,0 +1,178 @@
 <chapter id="repmgrd-pausing" xreflabel="Pausing repmgrd">
  <indexterm>
    <primary>repmgrd</primary>
    <secondary>pausing</secondary>
  </indexterm>
  <indexterm>
    <primary>pausing repmgrd</primary>
  </indexterm>
  <title>Pausing repmgrd</title>
  <para>
    In normal operation, <application>repmgrd</application> monitors the state of the
    PostgreSQL node it is running on, and will take appropriate action if problems
    are detected, e.g. (if so configured) promote the node to primary, if the existing
    primary has been determined as failed.
  </para>
  <para>
    However, <application>repmgrd</application> is unable to distinguish between
    planned outages (such as performing a <link linkend="performing-switchover">switchover</link>
    or installing PostgreSQL maintenance releases), and an actual server outage. In versions prior to
    &repmgr; 4.2 it was necessary to stop <application>repmgrd</application> on all nodes (or at least
    on all nodes where <application>repmgrd</application> is
    <link linkend="repmgrd-automatic-failover">configured for automatic failover</link>)
    to prevent <application>repmgrd</application> from making unintentional changes to the
    replication cluster.
  </para>
  <para>
    From <link linkend="release-4.2">&repmgr; 4.2</link>, <application>repmgrd</application>
    can now be &quot;paused&quot;, i.e. instructed not to take any action such as performing a failover.
    This can be done from any node in the cluster, removing the need to stop/restart
    each <application>repmgrd</application> individually.
  </para>
  <note>
    <para>
      For major PostgreSQL upgrades, e.g. from PostgreSQL 10 to PostgreSQL 11,
      <application>repmgrd</application> should be shut down completely and only started up
      once the &repmgr; packages for the new PostgreSQL major version have been installed.
    </para>
  </note>
  <sect1 id="repmgrd-pausing-prerequisites">
    <title>Prerequisites for pausing <application>repmgrd</application></title>
    <para>
      In order to be able to pause/unpause <application>repmgrd</application>, the following
      prerequisites must be met:
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara><link linkend="release-4.2">&repmgr; 4.2</link> or later must be installed on all nodes.</simpara>
        </listitem>
        <listitem>
          <simpara>The same major &repmgr; version (e.g. 4.2) must be installed on all nodes (and preferably the same minor version).</simpara>
        </listitem>
        <listitem>
          <simpara>
            PostgreSQL on all nodes must be accessible from the node where the
            <literal>pause</literal>/<literal>unpause</literal> operation is executed, using the
            <varname>conninfo</varname> string shown by <link linkend="repmgr-cluster-show"><command>repmgr cluster show</command></link>.
          </simpara>
        </listitem>
      </itemizedlist>
    </para>
    <note>
      <para>
        These conditions are required for normal &repmgr; operation in any case.
      </para>
    </note>
  </sect1>
  <sect1 id="repmgrd-pausing-execution">
    <title>Pausing/unpausing <application>repmgrd</application></title>
    <para>
      To pause <application>repmgrd</application>, execute <link linkend="repmgr-daemon-pause"><command>repmgr daemon pause</command></link>, e.g.:
   <programlisting>
 $ repmgr -f /etc/repmgr.conf daemon pause
 NOTICE: node 1 (node1) paused
 NOTICE: node 2 (node2) paused
 NOTICE: node 3 (node3) paused</programlisting>
    </para>
    <para>
      The state of <application>repmgrd</application> on each node can be checked with
      <link linkend="repmgr-daemon-status"><command>repmgr daemon status</command></link>, e.g.:
    <programlisting>$ repmgr -f /etc/repmgr.conf daemon status
 ID | Name  | Role    | Status  | repmgrd | PID  | Paused?
 ----+-------+---------+---------+---------+------+---------
 1  | node1 | primary | running | running | 7851 | yes
 2  | node2 | standby | running | running | 7889 | yes
 3  | node3 | standby | running | running | 7918 | yes</programlisting>
    </para>
    <note>
      <para>
        If executing a switchover with <link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>,
        &repmgr; will automatically pause/unpause <application>repmgrd</application> as part of the switchover process.
      </para>
    </note>
    <para>
      If the primary (in this example, <literal>node1</literal>) is stopped, <application>repmgrd</application>
      running on one of the standbys (here: <literal>node2</literal>) will react like this:
      <programlisting>
 [2018-09-20 12:22:21] [WARNING] unable to connect to upstream node "node1" (node ID: 1)
 [2018-09-20 12:22:21] [INFO] checking state of node 1, 1 of 5 attempts
 [2018-09-20 12:22:21] [INFO] sleeping 1 seconds until next reconnection attempt
 ...
 [2018-09-20 12:22:24] [INFO] sleeping 1 seconds until next reconnection attempt
 [2018-09-20 12:22:25] [INFO] checking state of node 1, 5 of 5 attempts
 [2018-09-20 12:22:25] [WARNING] unable to reconnect to node 1 after 5 attempts
 [2018-09-20 12:22:25] [NOTICE] node is paused
 [2018-09-20 12:22:33] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in degraded state
 [2018-09-20 12:22:33] [DETAIL] repmgrd paused by administrator
 [2018-09-20 12:22:33] [HINT] execute "repmgr daemon unpause" to resume normal failover mode</programlisting>
    </para>
    <para>
      If the primary becomes available again (e.g. following a software upgrade), <application>repmgrd</application>
      will automatically reconnect, e.g.:
      <programlisting>
 [2018-09-20 13:12:41] [NOTICE] reconnected to upstream node 1 after 8 seconds, resuming monitoring</programlisting>
    </para>
    <para>
      To unpause <application>repmgrd</application>, execute <link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link>, e.g.:
   <programlisting>
 $ repmgr -f /etc/repmgr.conf daemon unpause
 NOTICE: node 1 (node1) unpaused
 NOTICE: node 2 (node2) unpaused
 NOTICE: node 3 (node3) unpaused</programlisting>
    </para>
    <note>
      <para>
        If the previous primary is no longer accessible when <application>repmgrd</application>
        is unpaused, no failover action will be taken. Instead, a new primary must be manually promoted using
        <link linkend="repmgr-standby-promote"><command>repmgr standby promote</command></link>,
 		and any standbys attached to the new primary with
 		<link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>.
      </para>
      <para>
        This is to prevent <link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link>
        resulting in the automatic promotion of a new primary, which may be a problem particularly
        in larger clusters, where <application>repmgrd</application> could select a different promotion
        candidate to the one intended by the administrator.
      </para>
    </note>
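 <para>
  A planned maintenance task can be bracketed by pause and unpause in a single wrapper,
  as sketched below. The function name, the <literal>REPMGR_BIN</literal> override and the
  example maintenance command are hypothetical; only <command>repmgr daemon pause</command>
  and <command>repmgr daemon unpause</command> are &repmgr; commands.
 </para>

```shell
# Hypothetical wrapper: pause repmgrd cluster-wide, run a maintenance command, then unpause.
# REPMGR_BIN is an illustrative override (not a repmgr setting) naming the binary to run.
with_repmgrd_paused() {
    conf_file="$1"
    shift
    "${REPMGR_BIN:-repmgr}" -f "$conf_file" daemon pause || return 1
    # run the maintenance command passed as the remaining arguments
    "$@"
    maintenance_status=$?
    # attempt to unpause even if the maintenance command failed
    "${REPMGR_BIN:-repmgr}" -f "$conf_file" daemon unpause || return 1
    return "$maintenance_status"
}

# Example (the maintenance command shown is an assumption):
# with_repmgrd_paused /etc/repmgr.conf systemctl restart postgresql
```

 <para>
  The wrapper returns the exit status of the maintenance command, so a calling script can
  tell whether the maintenance itself succeeded.
 </para>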
  <sect2 id="repmgrd-pausing-details">
    <title>Details on the <application>repmgrd</application> pausing mechanism</title>
    <para>
      The pause state of each node is preserved across a PostgreSQL restart.
    </para>
 	<para>
 	  <link linkend="repmgr-daemon-pause"><command>repmgr daemon pause</command></link> and
 	  <link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link> can be
 	  executed even if <application>repmgrd</application> is not running; in this case,
 	  <application>repmgrd</application> will start up in whichever pause state has been set.
 	</para>
    <note>
      <para>
 		<link linkend="repmgr-daemon-pause"><command>repmgr daemon pause</command></link> and
 		<link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link>
 		<emphasis>do not</emphasis> stop/start <application>repmgrd</application>.
      </para>
    </note>
  </sect2>
  </sect1>
 </chapter>
--- a/doc/repmgrd-witness-server.sgml
+++ b/doc/repmgrd-witness-server.sgml
@@ -0,0 +1,31 @@
 <chapter id="repmgrd-witness-server" xreflabel="Using a witness server with repmgrd">
 <indexterm>
   <primary>repmgrd</primary>
   <secondary>witness server</secondary>
 </indexterm>
 <title>Using a witness server with repmgrd</title>
 <para>
   In a situation caused e.g. by a network interruption between two
   data centres, it's important to avoid a "split-brain" situation where
   both sides of the network assume they are the active segment and the
   side without an active primary unilaterally promotes one of its standbys.
 </para>
 <para>
   To prevent this situation happening, it's essential to ensure that one
   network segment has a "voting majority", so other segments will know
   they're in the minority and not attempt to promote a new primary. Where
   an odd number of servers exists, this is not an issue. However, if each
   network has an even number of nodes, it's necessary to provide some way
   of ensuring a majority, which is where the witness server becomes useful.
 </para>
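 <para>
  The arithmetic behind a "voting majority" can be sketched as follows. This is a plain
  strict-majority test written for illustration, under the assumption that a segment needs
  to see more than half of all nodes; it is not <application>repmgrd</application>'s actual
  election logic, and the function name is hypothetical.
 </para>

```shell
# Illustrative only: a strict-majority test, not repmgrd's actual election logic.
# Usage: has_voting_majority VISIBLE_NODES TOTAL_NODES
has_voting_majority() {
    visible_nodes="$1"
    total_nodes="$2"
    # a majority means seeing strictly more than half of all nodes
    [ $((visible_nodes * 2)) -gt "$total_nodes" ]
}

# Two network segments with two nodes each: neither side sees more than half of 4 nodes.
has_voting_majority 2 4 || echo "2 of 4: no majority"
# Adding a witness to one segment gives that segment 3 of 5 nodes.
has_voting_majority 3 5 && echo "3 of 5: majority"
```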
 <para>
   This is not a fully-fledged standby node and is not integrated into
   replication, but it effectively represents the "casting vote" when
   deciding which network segment has a majority. A witness server can
   be set up using <xref linkend="repmgr-witness-register">. Note that it only
   makes sense to create a witness server in conjunction with running
   <application>repmgrd</application>; the witness server will require its own
   <application>repmgrd</application> instance.
 </para>
 </chapter>
--- a/doc/stylesheet.css
+++ b/doc/stylesheet.css
@@ -0,0 +1,96 @@
 /* doc/src/sgml/stylesheet.css */
 /* color scheme similar to www.postgresql.org */
 BODY {
 	color: #000000;
 	background: #FFFFFF;
 	font-family: verdana, sans-serif;
 }
 A:link		{ color:#0066A2; }
 A:visited	{ color:#004E66; }
 A:active	{ color:#0066A2; }
 A:hover		{ color:#000000; }
 H1 {
 	font-size: 1.4em;
 	font-weight: bold;
 	margin-top: 0em;
 	margin-bottom: 0em;
 	color: #EC5800;
 }
 H2 {
 	font-size: 1.2em;
 	margin: 1.2em 0em 1.2em 0em;
 	font-weight: bold;
 	color: #666;
 }
 H3 {
 	font-size: 1.1em;
 	margin: 1.2em 0em 1.2em 0em;
 	font-weight: bold;
 	color: #666;
 }
 H4 {
 	font-size: 0.95em;
 	margin: 1.2em 0em 1.2em 0em;
 	font-weight: normal;
 	color: #666;
 }
 H5 {
 	font-size: 0.9em;
 	margin: 1.2em 0em 1.2em 0em;
 	font-weight: normal;
 }
 H6 {
 	font-size: 0.85em;
 	margin: 1.2em 0em 1.2em 0em;
 	font-weight: normal;
 }
 /* center some titles */
 .BOOK .TITLE, .BOOK .CORPAUTHOR, .BOOK .COPYRIGHT {
 	text-align: center;
 }
 /* decoration for formal examples */
 DIV.EXAMPLE {
 	padding-left: 15px;
 	border-style: solid;
 	border-width: 0px;
 	border-left-width: 2px;
 	border-color: black;
 	margin: 0.5ex;
 }
 /* less dense spacing of TOC */
 .BOOK .TOC DL DT {
 	padding-top: 1.5ex;
 	padding-bottom: 1.5ex;
 }
 .BOOK .TOC DL DL DT {
 	padding-top: 0ex;
 	padding-bottom: 0ex;
 }
 /* miscellaneous */
 PRE.LITERALLAYOUT, .SCREEN, .SYNOPSIS, .PROGRAMLISTING {
 	margin-left: 4ex;
 }
 .COMMENT	{ color: red; }
 VAR		{ font-family: monospace; font-style: italic; }
 /* Konqueror's standard style for ACRONYM is italic. */
 ACRONYM		{ font-style: inherit; }
--- a/doc/stylesheet.dsl
+++ b/doc/stylesheet.dsl
@@ -0,0 +1,851 @@
 <!-- doc/src/sgml/stylesheet.dsl -->
 <!DOCTYPE style-sheet PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN" [
 <!-- must turn on one of these with -i on the jade command line -->
 <!ENTITY % output-html          "IGNORE">
 <!ENTITY % output-print         "IGNORE">
 <!ENTITY % output-text          "IGNORE">
 <![ %output-html; [
 <!ENTITY dbstyle PUBLIC "-//Norman Walsh//DOCUMENT DocBook HTML Stylesheet//EN" CDATA DSSSL>
 ]]>
 <![ %output-print; [
 <!ENTITY dbstyle PUBLIC "-//Norman Walsh//DOCUMENT DocBook Print Stylesheet//EN" CDATA DSSSL>
 ]]>
 <![ %output-text; [
 <!ENTITY dbstyle PUBLIC "-//Norman Walsh//DOCUMENT DocBook HTML Stylesheet//EN" CDATA DSSSL>
 ]]>
 ]>
 <style-sheet>
 <style-specification use="docbook">
  <style-specification-body>
 <!-- general customization ......................................... -->
 <!-- (applicable to all output formats) -->
 (define draft-mode              #f)
 ;; Don't show manpage volume numbers
 (define %refentry-xref-manvolnum% #f)
 ;; Don't use graphics for callouts.  (We could probably do that, but
 ;; it needs extra work.)
 (define %callout-graphics%      #f)
 ;; Show comments during the development stage.
 (define %show-comments%         draft-mode)
 ;; Force a chapter TOC even if it includes only a single entry
 (define %force-chapter-toc% #t)
 ;; Don't append period if run-in title ends with any of these
 ;; characters.  We had to add the colon here.  This is fixed in
 ;; stylesheets version 1.71, so it can be removed sometime.
 (define %content-title-end-punct%
  '(#\. #\! #\? #\:))
 ;; No automatic punctuation after honorific name parts
 (define %honorific-punctuation% "")
 ;; Change display of some elements
 (element command ($mono-seq$))
 (element envar ($mono-seq$))
 (element lineannotation ($italic-seq$))
 (element literal ($mono-seq$))
 (element option ($mono-seq$))
 (element parameter ($mono-seq$))
 (element structfield ($mono-seq$))
 (element structname ($mono-seq$))
 (element symbol ($mono-seq$))
 (element token ($mono-seq$))
 (element type ($mono-seq$))
 (element varname ($mono-seq$))
 (element (programlisting emphasis) ($bold-seq$)) ;; to highlight sections of code
 ;; Special support for Tcl synopses
 (element optional
  (if (equal? (attribute-string (normalize "role")) "tcl")
      (make sequence
        (literal "?")
        ($charseq$)
        (literal "?"))
      (make sequence
        (literal %arg-choice-opt-open-str%)
        ($charseq$)
        (literal %arg-choice-opt-close-str%))))
 ;; Avoid excessive cross-reference labels
 (define (auto-xref-indirect? target ancestor)
  (cond
 ;   ;; Always add indirect references to another book
 ;   ((member (gi ancestor) (book-element-list))
 ;    #t)
   ;; Add indirect references to the section or component a block
   ;; is in iff chapters aren't autolabelled.  (Otherwise "Figure 1-3"
   ;; is sufficient)
   ((and (member (gi target) (block-element-list))
         (not %chapter-autolabel%))
    #t)
   ;; Add indirect references to the component a section is in if
   ;; the sections are not autolabelled
   ((and (member (gi target) (section-element-list))
         (member (gi ancestor) (component-element-list))
         (not %section-autolabel%))
    #t)
   (else #f)))
 ;; Bibliography things
 ;; Use the titles of bibliography entries in cross-references
 (define biblio-xref-title       #t)
 ;; Process bibliography entry components in the order shown below, not
 ;; in the order they appear in the document.  (I suppose this should
 ;; be made to fit some publishing standard.)
 (define %biblioentry-in-entry-order% #f)
 (define (biblioentry-inline-elements)
  (list
   (normalize "author")
   (normalize "authorgroup")
   (normalize "title")
   (normalize "subtitle")
   (normalize "volumenum")
   (normalize "edition")
   (normalize "othercredit")
   (normalize "contrib")
   (normalize "editor")
   (normalize "publishername")
   (normalize "confgroup")
   (normalize "publisher")
   (normalize "isbn")
   (normalize "issn")
   (normalize "pubsnumber")
   (normalize "date")
   (normalize "pubdate")
   (normalize "pagenums")
   (normalize "bibliomisc")))
 (mode biblioentry-inline-mode
  (element confgroup
    (make sequence
      (literal "Proc. ")
      (next-match)))
  (element isbn
    (make sequence
      (literal "ISBN ")
      (process-children)))
  (element issn
    (make sequence
      (literal "ISSN ")
      (process-children))))
 ;; The rules in the default stylesheet for productname format it as a
 ;; paragraph.  This may be suitable for productname directly within
 ;; *info, but it's nonsense when productname is used inline, as we do.
 (mode book-titlepage-recto-mode
  (element (para productname) ($charseq$)))
 (mode book-titlepage-verso-mode
  (element (para productname) ($charseq$)))
 ;; Add more here if needed...
 ;; Replace a sequence of whitespace in a string by a single space
 (define (normalize-whitespace str #!optional (whitespace '(#\space #\U-000D)))
  (let loop ((characters (string->list str))
             (result '())
             (prev-was-space #f))
    (if (null? characters)
        (list->string (reverse result))
        (let ((c (car characters))
              (rest (cdr characters)))
          (if (member c whitespace)
              (if prev-was-space
                  (loop rest result #t)
                  (loop rest (cons #\space result) #t))
              (loop rest (cons c result) #f))))))
 <!-- HTML output customization ..................................... -->
 <![ %output-html; [
 (define %section-autolabel%     #t)
 (define %label-preface-sections% #f)
 (define %generate-legalnotice-link% #t)
 (define %html-ext%              ".html")
 (define %root-filename%         "index")
 (define %link-mailto-url%       "mailto:repmgr-list@2ndquadrant.com")
 (define %use-id-as-filename%    #t)
 (define website-build           #f)
 (define %stylesheet%            (if website-build "/resources/docs.css" "website-docs.css"))
 (define %graphic-default-extension% "gif")
 (define %body-attr%             '())
 (define ($generate-book-lot-list$) '())
 (define use-output-dir          #t)
 (define %output-dir%            "html")
 (define html-index-filename     "../HTML.index")
 ;; Only build HTML.index or the actual HTML output, not both.  Saves a
 ;; *lot* of time.  (overrides docbook.dsl)
 (root
   (if (not html-index)
       (make sequence
         (process-children)
         (with-mode manifest
           (process-children)))
       (with-mode htmlindex
         (process-children))))
 ;; Do not combine first section into chapter chunk.
 (define (chunk-skip-first-element-list) '())
 ;; Returns the depth of auto TOC that should be made at the nd-level
 (define (toc-depth nd)
  (cond ((string=? (gi nd) (normalize "book")) 2)
        ((string=? (gi nd) (normalize "part")) 2)
        ((string=? (gi nd) (normalize "chapter")) 2)
        (else 1)))
 ;; Add character encoding and time of creation into HTML header
 (define %html-header-tags%
  (list (list "META" '("HTTP-EQUIV" "Content-Type") '("CONTENT" "text/html; charset=ISO-8859-1"))
        (list "META" '("NAME" "creation") (list "CONTENT" (time->string (time) #t)))))
 ;; Block elements are allowed in PARA in DocBook, but not in P in
 ;; HTML.  With %fix-para-wrappers% turned on, the stylesheets attempt
 ;; to avoid putting block elements in HTML P tags by outputting
 ;; additional end/begin P pairs around them.
 (define %fix-para-wrappers% #t)
 ;; ...but we need to do some extra work to make the above apply to PRE
 ;; as well.  (mostly pasted from dbverb.dsl)
 (define ($verbatim-display$ indent line-numbers?)
  (let ((content (make element gi: "PRE"
                       attributes: (list
                                    (list "CLASS" (gi)))
                       (if (or indent line-numbers?)
                           ($verbatim-line-by-line$ indent line-numbers?)
                           (process-children)))))
    (if %shade-verbatim%
        (make element gi: "TABLE"
              attributes: ($shade-verbatim-attr$)
              (make element gi: "TR"
                    (make element gi: "TD"
                          content)))
        (make sequence
          (para-check)
          content
          (para-check 'restart)))))
 ;; ...and for notes.
 (element note
  (make sequence
    (para-check)
    ($admonition$)
    (para-check 'restart)))
 ;;; XXX The above is very ugly.  It might be better to run 'tidy' on
 ;;; the resulting *.html files.
 ;; Format multiple terms in varlistentry vertically, instead
 ;; of comma-separated.
 (element (varlistentry term)
  (make sequence
    (process-children-trim)
    (if (not (last-sibling?))
        (make empty-element gi: "BR")
        (empty-sosofo))))
 ;; Customization of header
 ;; - make title a link to the home page
 ;; - add tool tips to Prev/Next links
 ;; - add Up link
 ;; (overrides dbnavig.dsl)
 (define (default-header-nav-tbl-noff elemnode prev next prevsib nextsib)
  (let* ((r1? (nav-banner? elemnode))
         (r1-sosofo (make element gi: "TR"
                          (make element gi: "TH"
                                attributes: (list
                                             (list "COLSPAN" "4")
                                             (list "ALIGN" "center")
                                             (list "VALIGN" "bottom"))
                                (make element gi: "A"
                                      attributes: (list
                                                   (list "HREF" (href-to (nav-home elemnode))))
                                      (nav-banner elemnode)))))
         (r2? (or (not (node-list-empty? prev))
                  (not (node-list-empty? next))
                  (nav-context? elemnode)))
         (r2-sosofo (make element gi: "TR"
                          (make element gi: "TD"
                                attributes: (list
                                             (list "WIDTH" "10%")
                                             (list "ALIGN" "left")
                                             (list "VALIGN" "top"))
                                (if (node-list-empty? prev)
                                    (make entity-ref name: "nbsp")
                                    (make element gi: "A"
                                          attributes: (list
                                                       (list "TITLE" (element-title-string prev))
                                                       (list "HREF"
                                                             (href-to
                                                              prev))
                                                       (list "ACCESSKEY"
                                                             "P"))
                                          (gentext-nav-prev prev))))
                          (make element gi: "TD"
                                attributes: (list
                                             (list "WIDTH" "10%")
                                             (list "ALIGN" "left")
                                             (list "VALIGN" "top"))
                                (if (nav-up? elemnode)
                                    (nav-up elemnode)
                                    (nav-home-link elemnode)))
                          (make element gi: "TD"
                                attributes: (list
                                             (list "WIDTH" "60%")
                                             (list "ALIGN" "center")
                                             (list "VALIGN" "bottom"))
                                (nav-context elemnode))
                          (make element gi: "TD"
                                attributes: (list
                                             (list "WIDTH" "20%")
                                             (list "ALIGN" "right")
                                             (list "VALIGN" "top"))
                                (if (node-list-empty? next)
                                    (make entity-ref name: "nbsp")
                                    (make element gi: "A"
                                          attributes: (list
                                                       (list "TITLE" (element-title-string next))
                                                       (list "HREF"
                                                             (href-to
                                                              next))
                                                       (list "ACCESSKEY"
                                                             "N"))
                                          (gentext-nav-next next)))))))
    (if (or r1? r2?)
        (make element gi: "DIV"
              attributes: '(("CLASS" "NAVHEADER"))
          (make element gi: "TABLE"
                attributes: (list
                             (list "SUMMARY" "Header navigation table")
                             (list "WIDTH" %gentext-nav-tblwidth%)
                             (list "BORDER" "0")
                             (list "CELLPADDING" "0")
                             (list "CELLSPACING" "0"))
                (if r1? r1-sosofo (empty-sosofo))
                (if r2? r2-sosofo (empty-sosofo)))
          (make empty-element gi: "HR"
                attributes: (list
                             (list "ALIGN" "LEFT")
                             (list "WIDTH" %gentext-nav-tblwidth%))))
        (empty-sosofo))))
 ;; Put index "quicklinks" (A | B | C | ...) at the top of the bookindex page.
 (element index
  (let ((preamble (node-list-filter-by-not-gi
                   (children (current-node))
                   (list (normalize "indexentry"))))
        (indexdivs  (node-list-filter-by-gi
                     (children (current-node))
                     (list (normalize "indexdiv"))))
        (entries  (node-list-filter-by-gi
                   (children (current-node))
                   (list (normalize "indexentry")))))
    (html-document
     (with-mode head-title-mode
       (literal (element-title-string (current-node))))
     (make element gi: "DIV"
           attributes: (list (list "CLASS" (gi)))
           ($component-separator$)
           ($component-title$)
           (if (node-list-empty? indexdivs)
               (empty-sosofo)
               (make element gi: "P"
                     attributes: (list (list "CLASS" "INDEXDIV-QUICKLINKS"))
                     (with-mode indexdiv-quicklinks-mode
                       (process-node-list indexdivs))))
           (process-node-list preamble)
           (if (node-list-empty? entries)
               (empty-sosofo)
               (make element gi: "DL"
                     (process-node-list entries)))))))
 (mode indexdiv-quicklinks-mode
  (element indexdiv
    (make sequence
      (make element gi: "A"
            attributes: (list (list "HREF" (href-to (current-node))))
            (element-title-sosofo))
      (if (not (last-sibling?))
          (literal " | ")
          (literal "")))))
 ;; Changed to strip and normalize index term content (overrides
 ;; dbindex.dsl)
 (define (htmlindexterm)
  (let* ((attr    (gi (current-node)))
         (content (data (current-node)))
         (string  (strip (normalize-whitespace content))) ;; changed
         (sortas  (attribute-string (normalize "sortas"))))
    (make sequence
      (make formatting-instruction data: attr)
      (if sortas
          (make sequence
            (make formatting-instruction data: "[")
            (make formatting-instruction data: sortas)
            (make formatting-instruction data: "]"))
          (empty-sosofo))
      (make formatting-instruction data: " ")
      (make formatting-instruction data: string)
      (htmlnewline))))
 (define ($html-body-start$)
        (if website-build
            (make empty-element gi: "!--#include virtual=\"/resources/docs-header.html\"--")
            (empty-sosofo)))
 (define ($html-body-end$)
        (if website-build
            (make empty-element gi: "!--#include virtual=\"/resources/docs-footer.html\"--")
            (empty-sosofo)))
 ]]> <!-- %output-html -->
 <!-- Print output customization .................................... -->
 <![ %output-print; [
 (define %section-autolabel%     #t)
 (define %default-quadding%      'justify)
 ;; Don't know how well hyphenation works with other backends.  Might
 ;; turn this on if desired.
 (define %hyphenation%
  (if tex-backend #t #f))
 ;; Put footnotes at the bottom of the page (rather than end of
 ;; section), and put the URLs of links into footnotes.
 ;;
 ;; bop-footnotes only works with TeX, otherwise it's ignored.  But
 ;; when both of these are #t and TeX is used, you need at least
 ;; stylesheets 1.73 because otherwise you don't get any footnotes at
 ;; all for the links.
 (define bop-footnotes           #t)
 (define %footnote-ulinks%       #t)
 (define %refentry-new-page%     #t)
 (define %refentry-keep%         #f)
 ;; Disabled because of TeX problems
 ;; (http://archives.postgresql.org/pgsql-docs/2007-12/msg00056.php)
 (define ($generate-book-lot-list$) '())
 ;; Indentation of verbatim environments.  (This should really be done
 ;; with start-indent in DSSSL.)
 ;; Use of indentation in this area exposes a bug in openjade,
 ;; http://archives.postgresql.org/pgsql-docs/2006-12/msg00064.php
 ;; (define %indent-programlisting-lines% "    ")
 ;; (define %indent-screen-lines% "    ")
 ;; (define %indent-synopsis-lines% "    ")
 ;; Default graphic format: Jadetex wants eps, pdfjadetex wants pdf.
 ;; (Note that pdfjadetex will not accept eps, that's why we need to
 ;; create a different .tex file for each.)  What works with RTF?
 (define texpdf-output #f) ;; override from command line
 (define %graphic-default-extension%
  (cond (tex-backend (if texpdf-output "pdf" "eps"))
        (rtf-backend "gif")
        (else "XXX")))
 ;; Need to add pdf here so that the above works.  Default setup
 ;; doesn't know about PDF.
 (define preferred-mediaobject-extensions
  (list "eps" "ps" "jpg" "jpeg" "pdf" "png"))
 ;; Don't show links when citing a bibliography entry.  This fouls up
 ;; the footnumber counting.  To get the link, one can still look into
 ;; the bibliography itself.
 (mode xref-title-mode
  (element ulink
    (process-children)))
 ;; Format legalnotice justified and with space between paragraphs.
 (mode book-titlepage-verso-mode
  (element (legalnotice para)
    (make paragraph
      use: book-titlepage-verso-style   ;; alter this if ever it needs to appear elsewhere
      quadding: %default-quadding%
      line-spacing: (* 0.8 (inherited-line-spacing))
      font-size: (* 0.8 (inherited-font-size))
      space-before: (* 0.8 %para-sep%)
      space-after: (* 0.8 %para-sep%)
      first-line-start-indent: (if (is-first-para)
                                   (* 0.8 %para-indent-firstpara%)
                                   (* 0.8 %para-indent%))
      (process-children))))
 ;; Fix spacing problems in variablelists
 (element (varlistentry term)
  (make paragraph
    space-before: (if (first-sibling?)
                      %para-sep%
                      0pt)
    keep-with-next?: #t
    (process-children)))
 (define %varlistentry-indent% 2em)
 (element (varlistentry listitem)
  (make sequence
    start-indent: (+ (inherited-start-indent) %varlistentry-indent%)
    (process-children)))
 ;; Whitespace fixes for itemizedlists and orderedlists
 (define (process-listitem-content)
  (if (absolute-first-sibling?)
      (make sequence
        (process-children-trim))
      (next-match)))
 ;; Default stylesheets format simplelists as tables.  This spells
 ;; trouble for Jade.  So we just format them as plain lines.
 (define %simplelist-indent% 1em)
 (define (my-simplelist-vert members)
  (make display-group
    space-before: %para-sep%
    space-after: %para-sep%
    start-indent: (+ %simplelist-indent% (inherited-start-indent))
    (process-children)))
 (element simplelist
  (let ((type (attribute-string (normalize "type")))
        (cols (if (attribute-string (normalize "columns"))
                  (if (> (string->number (attribute-string (normalize "columns"))) 0)
                      (string->number (attribute-string (normalize "columns")))
                      1)
                  1))
        (members (select-elements (children (current-node)) (normalize "member"))))
    (cond
       ((equal? type (normalize "inline"))
        (if (equal? (gi (parent (current-node)))
                    (normalize "para"))
            (process-children)
            (make paragraph
              space-before: %para-sep%
              space-after: %para-sep%
              start-indent: (inherited-start-indent))))
       ((equal? type (normalize "vert"))
        (my-simplelist-vert members))
       ((equal? type (normalize "horiz"))
        (simplelist-table 'row    cols members)))))
 (element member
  (let ((type (inherited-attribute-string (normalize "type"))))
    (cond
     ((equal? type (normalize "inline"))
      (make sequence
        (process-children)
        (if (not (last-sibling?))
            (literal ", ")
            (literal ""))))
      ((equal? type (normalize "vert"))
       (make paragraph
         space-before: 0pt
         space-after: 0pt))
      ((equal? type (normalize "horiz"))
       (make paragraph
         quadding: 'start
         (process-children))))))
 ;; Jadetex doesn't handle links to the content of tables, so
 ;; indexterms that point to table entries will go nowhere.  We fix
 ;; this by pointing the index entry to the table itself instead, which
 ;; should be equally useful in practice.
 (define (find-parent-table nd)
  (let ((table (ancestor-member nd ($table-element-list$))))
    (if (node-list-empty? table)
        nd
        table)))
 ;; (The function below overrides the one in print/dbindex.dsl.)
 (define (indexentry-link nd)
  (let* ((id        (attribute-string (normalize "role") nd))
         (prelim-target (find-indexterm id))
         (target    (find-parent-table prelim-target))
         (preferred (not (node-list-empty?
                          (select-elements (children (current-node))
                                           (normalize "emphasis")))))
         (sosofo    (if (node-list-empty? target)
                        (literal "?")
                        (make link
                          destination: (node-list-address target)
                          (with-mode toc-page-number-mode
                            (process-node-list target))))))
    (if preferred
        (make sequence
          font-weight: 'bold
          sosofo)
        sosofo)))
 ;; By default, the part and reference title pages get wrong page
 ;; numbers: The first title page gets roman numerals carried over from
 ;; preface/toc -- we want Arabic numerals.  We also need to make sure
 ;; that page-number-restart is set to #f explicitly, because otherwise
 ;; it will carry over from the previous component, which is not good.
 ;;
 ;; (This looks worse than it is.  It's copied from print/dbttlpg.dsl
 ;; and common/dbcommon.dsl and modified in minor detail.)
 (define (first-part?)
  (let* ((book (ancestor (normalize "book")))
         (nd   (ancestor-member (current-node)
                                (append
                                 (component-element-list)
                                 (division-element-list))))
         (bookch (children book)))
    (let loop ((nl bookch))
      (if (node-list-empty? nl)
          #f
          (if (equal? (gi (node-list-first nl)) (normalize "part"))
              (if (node-list=? (node-list-first nl) nd)
                  #t
                  #f)
              (loop (node-list-rest nl)))))))
 (define (first-reference?)
  (let* ((book (ancestor (normalize "book")))
         (nd   (ancestor-member (current-node)
                                (append
                                 (component-element-list)
                                 (division-element-list))))
         (bookch (children book)))
    (let loop ((nl bookch))
      (if (node-list-empty? nl)
          #f
          (if (equal? (gi (node-list-first nl)) (normalize "reference"))
              (if (node-list=? (node-list-first nl) nd)
                  #t
                  #f)
              (loop (node-list-rest nl)))))))
 (define (part-titlepage elements #!optional (side 'recto))
  (let ((nodelist (titlepage-nodelist
                   (if (equal? side 'recto)
                       (reference-titlepage-recto-elements)
                       (reference-titlepage-verso-elements))
                   elements))
        ;; partintro is a special case...
        (partintro (node-list-first
                    (node-list-filter-by-gi elements (list (normalize "partintro"))))))
    (if (part-titlepage-content? elements side)
        (make simple-page-sequence
          page-n-columns: %titlepage-n-columns%
          ;; Make sure that page number format is correct.
          page-number-format: ($page-number-format$)
          ;; Make sure that the page number is set to 1 if this is the
          ;; first part in the book
          page-number-restart?: (first-part?)
          input-whitespace-treatment: 'collapse
          use: default-text-style
          ;; This hack is required for the RTF backend. If an external-graphic
          ;; is the first thing on the page, RTF doesn't seem to do the right
          ;; thing (the graphic winds up on the baseline of the first line
          ;; of the page, left justified).  This "one point rule" fixes
          ;; that problem.
          (make paragraph
            line-spacing: 1pt
            (literal ""))
          (let loop ((nl nodelist) (lastnode (empty-node-list)))
            (if (node-list-empty? nl)
                (empty-sosofo)
                (make sequence
                  (if (or (node-list-empty? lastnode)
                          (not (equal? (gi (node-list-first nl))
                                       (gi lastnode))))
                      (part-titlepage-before (node-list-first nl) side)
                      (empty-sosofo))
                  (cond
                   ((equal? (gi (node-list-first nl)) (normalize "subtitle"))
                    (part-titlepage-subtitle (node-list-first nl) side))
                   ((equal? (gi (node-list-first nl)) (normalize "title"))
                    (part-titlepage-title (node-list-first nl) side))
                   (else
                    (part-titlepage-default (node-list-first nl) side)))
                  (loop (node-list-rest nl) (node-list-first nl)))))
          (if (and %generate-part-toc%
                   %generate-part-toc-on-titlepage%
                   (equal? side 'recto))
              (make display-group
                (build-toc (current-node)
                           (toc-depth (current-node))))
              (empty-sosofo))
          ;; PartIntro is a special case
          (if (and (equal? side 'recto)
                   (not (node-list-empty? partintro))
                   %generate-partintro-on-titlepage%)
              ($process-partintro$ partintro #f)
              (empty-sosofo)))
        (empty-sosofo))))
 (define (reference-titlepage elements #!optional (side 'recto))
  (let ((nodelist (titlepage-nodelist
                   (if (equal? side 'recto)
                       (reference-titlepage-recto-elements)
                       (reference-titlepage-verso-elements))
                   elements))
        ;; partintro is a special case...
        (partintro (node-list-first
                    (node-list-filter-by-gi elements (list (normalize "partintro"))))))
    (if (reference-titlepage-content? elements side)
        (make simple-page-sequence
          page-n-columns: %titlepage-n-columns%
          ;; Make sure that page number format is correct.
          page-number-format: ($page-number-format$)
           ;; Make sure that the page number is set to 1 if this is the
           ;; first reference in the book
          page-number-restart?: (first-reference?)
          input-whitespace-treatment: 'collapse
          use: default-text-style
          ;; This hack is required for the RTF backend. If an external-graphic
          ;; is the first thing on the page, RTF doesn't seem to do the right
          ;; thing (the graphic winds up on the baseline of the first line
          ;; of the page, left justified).  This "one point rule" fixes
          ;; that problem.
          (make paragraph
            line-spacing: 1pt
            (literal ""))
          (let loop ((nl nodelist) (lastnode (empty-node-list)))
            (if (node-list-empty? nl)
                (empty-sosofo)
                (make sequence
                  (if (or (node-list-empty? lastnode)
                          (not (equal? (gi (node-list-first nl))
                                       (gi lastnode))))
                      (reference-titlepage-before (node-list-first nl) side)
                      (empty-sosofo))
                  (cond
                   ((equal? (gi (node-list-first nl)) (normalize "author"))
                    (reference-titlepage-author (node-list-first nl) side))
                   ((equal? (gi (node-list-first nl)) (normalize "authorgroup"))
                    (reference-titlepage-authorgroup (node-list-first nl) side))
                   ((equal? (gi (node-list-first nl)) (normalize "corpauthor"))
                    (reference-titlepage-corpauthor (node-list-first nl) side))
                   ((equal? (gi (node-list-first nl)) (normalize "editor"))
                    (reference-titlepage-editor (node-list-first nl) side))
                   ((equal? (gi (node-list-first nl)) (normalize "subtitle"))
                    (reference-titlepage-subtitle (node-list-first nl) side))
                   ((equal? (gi (node-list-first nl)) (normalize "title"))
                    (reference-titlepage-title (node-list-first nl) side))
                   (else
                    (reference-titlepage-default (node-list-first nl) side)))
                  (loop (node-list-rest nl) (node-list-first nl)))))
          (if (and %generate-reference-toc%
                   %generate-reference-toc-on-titlepage%
                   (equal? side 'recto))
              (make display-group
                (build-toc (current-node)
                           (toc-depth (current-node))))
              (empty-sosofo))
          ;; PartIntro is a special case
          (if (and (equal? side 'recto)
                   (not (node-list-empty? partintro))
                   %generate-partintro-on-titlepage%)
              ($process-partintro$ partintro #f)
              (empty-sosofo)))
        (empty-sosofo))))
 ]]> <!-- %output-print -->
 <!-- Plain text output customization ............................... -->
 <!--
 This is used for making the INSTALL file and others.  We customize the
 HTML stylesheets to be suitable for dumping plain text (via Netscape,
 Lynx, or similar).
 -->
 <![ %output-text; [
 (define %section-autolabel% #f)
 (define %chapter-autolabel% #f)
 (define $generate-chapter-toc$ (lambda () #f))
 ;; For text output, produce "ASCII markup" for emphasis and such.
 (define ($asterix-seq$ #!optional (sosofo (process-children)))
  (make sequence
    (literal "*")
    sosofo
    (literal "*")))
 (define ($dquote-seq$ #!optional (sosofo (process-children)))
  (make sequence
    (literal (gentext-start-quote))
    sosofo
    (literal (gentext-end-quote))))
 (element (para command) ($dquote-seq$))
 (element (para emphasis) ($asterix-seq$))
 (element (para filename) ($dquote-seq$))
 (element (para option) ($dquote-seq$))
 (element (para replaceable) ($dquote-seq$))
 (element (para userinput) ($dquote-seq$))
 ]]> <!-- %output-text -->
  </style-specification-body>
 </style-specification>
 <external-specification id="docbook" document="dbstyle">
 </style-sheet>
--- a/doc/switchover.sgml
+++ b/doc/switchover.sgml
@@ -0,0 +1,426 @@
 <chapter id="performing-switchover" xreflabel="Performing a switchover with repmgr">
 <indexterm>
  <primary>switchover</primary>
 </indexterm>
 <title>Performing a switchover with repmgr</title>
 <para>
  A typical use-case for replication is a combination of primary and standby
  server, with the standby serving as a backup which can easily be activated
  in case of a problem with the primary. Such an unplanned failover would
  normally be handled by promoting the standby, after which an appropriate
  action must be taken to restore the old primary.
 </para>
 <para>
  In some cases, however, it's desirable to promote the standby in a planned
  way, e.g. so maintenance can be performed on the primary; this kind of switchover
  is supported by the <xref linkend="repmgr-standby-switchover"> command.
 </para>
 <para>
  <command>repmgr standby switchover</command> differs from other &repmgr;
  actions in that it also performs actions on other servers (the demotion
  candidate, and optionally any other servers which are to follow the new primary),
  which means passwordless SSH access is required to those servers from the one where
  <command>repmgr standby switchover</command> is executed.
 </para>
 <note>
  <simpara>
   <command>repmgr standby switchover</command> performs a relatively complex
   series of operations on two servers, and should therefore be performed after
   careful preparation and with adequate attention. In particular you should
   be confident that your network environment is stable and reliable.
  </simpara>
  <simpara>
   Additionally you should be sure that the current primary can be shut down
   quickly and cleanly. In particular, access from applications should be
   minimized or preferably blocked completely. Also be aware that if there
   is a backlog of files waiting to be archived, PostgreSQL will not shut
   down until archiving completes.
  </simpara>
  <simpara>
    We recommend running <command>repmgr standby switchover</command> at the
    most verbose logging level (<literal>--log-level=DEBUG --verbose</literal>)
    and capturing all output to assist troubleshooting any problems.
  </simpara>
  <simpara>
   Please also read carefully the sections <xref linkend="preparing-for-switchover"> and
   <xref linkend="switchover-caveats"> below.
  </simpara>
 </note>
 <sect1 id="preparing-for-switchover" xreflabel="Preparing for switchover">
   <indexterm>
     <primary>switchover</primary>
     <secondary>preparation</secondary>
   </indexterm>
   <title>Preparing for switchover</title>
   <para>
    As mentioned in the previous section, success of the switchover operation depends on
    &repmgr; being able to shut down the current primary server quickly and cleanly.
   </para>
   <para>
     Ensure that the promotion candidate has sufficient free walsenders available
     (PostgreSQL configuration item <varname>max_wal_senders</varname>), and if replication
     slots are in use, at least one free slot is available for the demotion candidate
     (PostgreSQL configuration item <varname>max_replication_slots</varname>).
   </para>
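   <para>
     As a rough sketch, the walsenders and replication slots currently in use on the
     promotion candidate can be compared against the configured maximums with queries
     like the following. The <literal>repmgr</literal> user and database names are
     assumptions taken from the examples elsewhere in this chapter, and
     <structname>pg_replication_slots</structname> requires PostgreSQL 9.4 or later:
     <programlisting>
      psql -U repmgr -d repmgr -c "SELECT current_setting('max_wal_senders') AS max_wal_senders, (SELECT count(*) FROM pg_stat_replication) AS walsenders_in_use"
      psql -U repmgr -d repmgr -c "SELECT current_setting('max_replication_slots') AS max_replication_slots, (SELECT count(*) FROM pg_replication_slots) AS slots_in_use"</programlisting>
   </para>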
   <para>
     Ensure that a passwordless SSH connection is possible from the promotion candidate
     (standby) to the demotion candidate (current primary). If <literal>--siblings-follow</literal>
     will be used, ensure that passwordless SSH connections are possible from the
     promotion candidate to all standbys attached to the demotion candidate.
   </para>
   <note>
     <simpara>
       &repmgr; expects to find the &repmgr; binary in the same path on the remote
       server as on the local server.
     </simpara>
   </note>
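   <para>
     One possible way to verify both points by hand, run on the promotion candidate as
     the operating system user which executes &repmgr; (the host name
     <literal>node1</literal> is illustrative). <literal>BatchMode=yes</literal> makes
     <application>ssh</application> fail instead of prompting for a password, and the
     second command runs the local &repmgr; binary path on the remote server:
     <programlisting>
      ssh -o BatchMode=yes -o ConnectTimeout=5 node1 true
      ssh -o BatchMode=yes -o ConnectTimeout=5 node1 "$(command -v repmgr) --version"</programlisting>
   </para>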
   <para>
    Double-check which commands will be used to stop/start/restart the current
    primary; this can be done by e.g. executing <command><link linkend="repmgr-node-service">repmgr node service</link></command>
    on the current primary:
    <programlisting>
     repmgr -f /etc/repmgr.conf node service --list-actions --action=stop
     repmgr -f /etc/repmgr.conf node service --list-actions --action=start
     repmgr -f /etc/repmgr.conf node service --list-actions --action=restart</programlisting>
   </para>
   <para>
     These commands can be defined in <filename>repmgr.conf</filename> with
     <option>service_start_command</option>, <option>service_stop_command</option>
     and <option>service_restart_command</option>.
   </para>
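   <para>
     A minimal sketch of such settings for a <literal>systemd</literal>-based system
     follows. The unit name <literal>postgresql-10</literal> is a hypothetical example
     and must be replaced with the name used by your package/operating system:
     <programlisting>
      service_start_command   = 'sudo systemctl start postgresql-10'
      service_stop_command    = 'sudo systemctl stop postgresql-10'
      service_restart_command = 'sudo systemctl restart postgresql-10'</programlisting>
   </para>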
   <important>
     <para>
       If &repmgr; is installed from a package, you should set these commands
       to use the appropriate service commands defined by the package/operating
       system as these will ensure PostgreSQL is stopped/started properly
       taking into account configuration and log file locations etc.
     </para>
     <para>
       If the <option>service_*_command</option> options aren't defined, &repmgr; will
       fall back to using <application>pg_ctl</application> to stop/start/restart
       PostgreSQL, which may not work properly, particularly when executed on a remote
       server.
     </para>
     <para>
       For more details, see <xref linkend="configuration-file-service-commands">.
     </para>
   </important>
   <note>
    <simpara>
     On <literal>systemd</literal> systems we strongly recommend using the appropriate
     <command>systemctl</command> commands (typically run via <command>sudo</command>) to ensure
     <literal>systemd</literal> is informed about the status of the PostgreSQL service.
    </simpara>
    <simpara>
     If using <command>sudo</command> for the <command>systemctl</command> calls, make sure the
     <command>sudo</command> specification doesn't require a real tty for the user. If not set
     this way, <command>repmgr</command> will fail to stop the primary.
    </simpara>
   </note>
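   <para>
     As an illustration only, a <filename>sudoers</filename> fragment along the following
     lines would let the <literal>postgres</literal> user run the relevant
     <command>systemctl</command> commands without a tty or password. The user name,
     the path to <command>systemctl</command> and the unit name
     <literal>postgresql-10</literal> are assumptions to adapt to your system:
     <programlisting>
      Defaults:postgres !requiretty
      postgres ALL = NOPASSWD: /usr/bin/systemctl stop postgresql-10, \
        /usr/bin/systemctl start postgresql-10, \
        /usr/bin/systemctl restart postgresql-10</programlisting>
   </para>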
   <para>
     Check that access from applications is minimized or preferably blocked
     completely, so applications are not unexpectedly interrupted.
   </para>
   <note>
     <para>
       If an exclusive backup is running on the current primary, &repmgr; will not perform the
       switchover.
     </para>
   </note>
   <para>
     Check there is no significant replication lag on standbys attached to the
     current primary.
   </para>
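   <para>
     For example, lag can be checked on each standby with
     <command>repmgr node check</command>, or on the current primary by querying
     <structname>pg_stat_replication</structname>. The query below is a sketch which
     assumes PostgreSQL 10 or later (earlier versions use different function and
     column names) and the <literal>repmgr</literal> user and database:
     <programlisting>
      repmgr -f /etc/repmgr.conf node check --replication-lag
      psql -U repmgr -d repmgr -c "SELECT application_name, state, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes FROM pg_stat_replication"</programlisting>
   </para>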
   <para>
    If WAL file archiving is set up, check that there is no backlog of files waiting
    to be archived, as PostgreSQL will not finally shut down until all of these have been
    archived. If there is a backlog exceeding <varname>archive_ready_warning</varname> WAL files,
    &repmgr; will emit a warning before attempting to perform a switchover; you can also check
    manually with <command>repmgr node check --archive-ready</command>.
   </para>
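   <para>
     If archiving appears to be stalled, the <structname>pg_stat_archiver</structname>
     view (available from PostgreSQL 9.4) can help show whether archive attempts are
     failing. A sketch, again assuming the <literal>repmgr</literal> user and database,
     run on the current primary:
     <programlisting>
      psql -U repmgr -d repmgr -c "SELECT archived_count, failed_count, last_archived_wal, last_failed_wal FROM pg_stat_archiver"</programlisting>
   </para>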
    <note>
      <para>
        From <link linkend="release-4.2">repmgr 4.2</link>, &repmgr; will instruct any running
        <application>repmgrd</application> instances to pause operations while the switchover
        is being carried out, to prevent <application>repmgrd</application> from
        unintentionally promoting a node. For more details, see <xref linkend="repmgrd-pausing">.
      </para>
      <para>
        Users of &repmgr; versions prior to 4.2 should ensure that <application>repmgrd</application>
        is not running on any nodes while a switchover is being executed.
      </para>
    </note>
   <para>
    Finally, consider executing <command>repmgr standby switchover</command> with the
    <literal>--dry-run</literal> option; this will perform any necessary checks and inform you about
    success/failure, and stop before the first actual command is run (which would be the shutdown of the
    current primary). Example output:
    <programlisting>
      $ repmgr standby switchover -f /etc/repmgr.conf --siblings-follow --dry-run
      NOTICE: checking switchover on node "node2" (ID: 2) in --dry-run mode
      INFO: SSH connection to host "node1" succeeded
      INFO: archive mode is "off"
      INFO: replication lag on this standby is 0 seconds
      INFO: all sibling nodes are reachable via SSH
      NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
      INFO: following shutdown command would be run on node "node1":
        "pg_ctl -l /var/log/postgresql/startup.log -D '/var/lib/postgresql/data' -m fast -W stop"
    </programlisting>
   </para>
   <important>
     <para>
       Be aware that <option>--dry-run</option> checks the prerequisites
       for performing the switchover and some basic sanity checks on the
       state of the database which might affect the switchover operation
       (e.g. replication lag); it cannot however guarantee the switchover
       operation will succeed. In particular, if the current primary
       does not shut down cleanly, &repmgr; will not be able to reliably
       execute the switchover (as there would be a danger of divergence
       between the former and new primary nodes).
     </para>
   </important>
   <note>
     <simpara>
       See <xref linkend="repmgr-standby-switchover"> for a full list of available
       command line options and <filename>repmgr.conf</filename> settings relevant
       to performing a switchover.
     </simpara>
   </note>
  <sect2 id="switchover-pg-rewind" xreflabel="Switchover and pg_rewind">
    <indexterm>
      <primary>pg_rewind</primary>
      <secondary>using with "repmgr standby switchover"</secondary>
    </indexterm>
    <title>Switchover and pg_rewind</title>
    <para>
      If the demotion candidate does not shut down smoothly or cleanly, there's a risk it
      will have a slightly divergent timeline and will not be able to attach to the new
      primary. To fix this situation without needing to reclone the old primary, it's
      possible to use the <application>pg_rewind</application> utility, which will usually be
      able to resync the two servers.
    </para>
    <para>
      To have &repmgr; execute <application>pg_rewind</application> if it detects this
      situation after promoting the new primary, add the <option>--force-rewind</option>
      option.
    </para>
    <note>
      <simpara>
        If &repmgr; detects a situation where it needs to execute <application>pg_rewind</application>,
        it will execute a <literal>CHECKPOINT</literal> on the new primary before executing
        <application>pg_rewind</application>.
      </simpara>
    </note>
    <para>
      For more details on <application>pg_rewind</application>, see:
      <ulink url="https://www.postgresql.org/docs/current/static/app-pgrewind.html">https://www.postgresql.org/docs/current/static/app-pgrewind.html</ulink>.
    </para>
    <para>
      <application>pg_rewind</application> has been part of the core PostgreSQL distribution since
      version 9.5. Users of versions 9.3 and 9.4 will need to manually install it; the source code is available here:
      <ulink url="https://github.com/vmware/pg_rewind">https://github.com/vmware/pg_rewind</ulink>.
      If the <application>pg_rewind</application>
      binary is not installed in the PostgreSQL <filename>bin</filename> directory, provide
      its full path on the demotion candidate with <option>--force-rewind</option>.
    </para>
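    <para>
      The two forms might look like this; the path
      <filename>/usr/local/bin/pg_rewind</filename> is a hypothetical location used
      purely for illustration:
      <programlisting>
       repmgr -f /etc/repmgr.conf standby switchover --force-rewind
       repmgr -f /etc/repmgr.conf standby switchover --force-rewind=/usr/local/bin/pg_rewind</programlisting>
    </para>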
    <para>
      Note that building the 9.3/9.4 version of <application>pg_rewind</application> requires the PostgreSQL
      source code. Also, PostgreSQL 9.3 does not provide <varname>wal_log_hints</varname>,
      meaning data checksums must have been enabled when the database was initialized.
    </para>
  </sect2>
 </sect1>
 <sect1 id="switchover-execution" xreflabel="Executing the switchover command">
  <indexterm>
   <primary>switchover</primary>
    <secondary>execution</secondary>
  </indexterm>
  <title>Executing the switchover command</title>
  <para>
   To demonstrate switchover, we will assume a replication cluster with a
   primary (<literal>node1</literal>) and one standby (<literal>node2</literal>);
   after the switchover <literal>node2</literal> should become the primary with
   <literal>node1</literal> following it.
  </para>
  <para>
   The switchover command must be run from the standby which is to be promoted,
   and in its simplest form looks like this:
   <programlisting>
    $ repmgr -f /etc/repmgr.conf standby switchover
    NOTICE: executing switchover on node "node2" (ID: 2)
    INFO: searching for primary node
    INFO: checking if node 1 is primary
    INFO: current primary node is 1
    INFO: SSH connection to host "node1" succeeded
    INFO: archive mode is "off"
    INFO: replication lag on this standby is 0 seconds
    NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
    NOTICE: stopping current primary node "node1" (ID: 1)
    NOTICE: issuing CHECKPOINT
    DETAIL: executing server command "pg_ctl -l /var/log/postgres/startup.log -D '/var/lib/pgsql/data' -m fast -W stop"
    INFO: checking primary status; 1 of 6 attempts
    NOTICE: current primary has been cleanly shut down at location 0/3001460
    NOTICE: promoting standby to primary
    DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' promote"
    server promoting
    NOTICE: STANDBY PROMOTE successful
    DETAIL: server "node2" (ID: 2) was successfully promoted to primary
    INFO: setting node 1's primary to node 2
    NOTICE: starting server using  "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' restart"
    NOTICE: NODE REJOIN successful
    DETAIL: node 1 is now attached to node 2
    NOTICE: switchover was successful
    DETAIL: node "node2" is now primary
    NOTICE: STANDBY SWITCHOVER is complete
   </programlisting>
  </para>
  <para>
   The old primary is now replicating as a standby from the new primary, and the
   cluster status will now look like this:
   <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | standby |   running | node2    | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | primary | * running |          | default  | host=node2 dbname=repmgr user=repmgr
   </programlisting>
  </para>
  <para>
    If <application>repmgrd</application> is in use, it's worth double-checking that
    all nodes are unpaused by executing <command><link linkend="repmgr-daemon-status">repmgr daemon status</link></command>.
  </para>
   <note>
     <para>
       Users of &repmgr; versions prior to 4.2 will need to manually restart <application>repmgrd</application>
       on all nodes after the switchover is completed.
     </para>
    </note>
 </sect1>
 <sect1 id="switchover-caveats" xreflabel="Caveats">
  <indexterm>
   <primary>switchover</primary>
    <secondary>caveats</secondary>
  </indexterm>
  <title>Caveats</title>
  <para>
   <itemizedlist spacing="compact" mark="bullet">
    <listitem>
     <simpara>
      If using PostgreSQL 9.3 or 9.4, you should ensure that the shutdown command
      is configured to use PostgreSQL's <varname>fast</varname> shutdown mode (the default in 9.5
      and later). If relying on <command>pg_ctl</command> to perform database server operations,
      you should include <literal>-m fast</literal> in <varname>pg_ctl_options</varname>
      in <filename>repmgr.conf</filename>.
     </simpara>
    </listitem>
    <listitem>
     <simpara>
      <command>pg_rewind</command> <emphasis>requires</emphasis> that either <varname>wal_log_hints</varname> is enabled, or that
      data checksums were enabled when the cluster was initialized. See the
      <ulink url="https://www.postgresql.org/docs/current/static/app-pgrewind.html">pg_rewind documentation</ulink>
      for details.
     </simpara>
    </listitem>
   </itemizedlist>
  </para>
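  <para>
   Both caveats can be checked in advance. As a sketch (assuming the
   <literal>repmgr</literal> user and database; <varname>wal_log_hints</varname>
   only exists from PostgreSQL 9.4), the following shows whether either
   <application>pg_rewind</application> prerequisite is met on a node:
   <programlisting>
    psql -U repmgr -d repmgr -c "SHOW wal_log_hints"
    psql -U repmgr -d repmgr -c "SHOW data_checksums"</programlisting>
   and a <filename>repmgr.conf</filename> entry requesting fast shutdown via
   <command>pg_ctl</command> could look like this:
   <programlisting>
    pg_ctl_options='-m fast'</programlisting>
  </para>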
 </sect1>
 <sect1 id="switchover-troubleshooting" xreflabel="Troubleshooting">
   <indexterm>
     <primary>switchover</primary>
     <secondary>troubleshooting</secondary>
   </indexterm>
   <title>Troubleshooting switchover issues</title>
   <para>
     As <link linkend="performing-switchover">emphasised previously</link>, performing a switchover
     is a non-trivial operation and there are a number of potential issues which can occur.
     While &repmgr; attempts to perform sanity checks, there's no guaranteed way of determining the success of
     a switchover without actually carrying it out.
   </para>
   <sect2 id="switchover-troubleshooting-primary-shutdown">
     <title>Demotion candidate (old primary) does not shut down</title>
     <para>
       &repmgr; may abort a switchover with a message like:
       <programlisting>
 ERROR: shutdown of the primary server could not be confirmed
 HINT: check the primary server status before performing any further actions</programlisting>
     </para>
     <para>
       This means the shutdown of the old primary has taken longer than &repmgr; expected,
       and it has given up waiting.
     </para>
     <para>
       In this case, check the PostgreSQL log on the primary server to see what is going
       on. It's entirely possible the shutdown process is just taking longer than the
       timeout set by the configuration parameter <varname>shutdown_check_timeout</varname>
       (default: 60 seconds), in which case you may need to adjust this parameter.
     </para>
     <note>
       <para>
         Note that <varname>shutdown_check_timeout</varname> is set on the node where
         <command>repmgr standby switchover</command> is executed (promotion candidate); setting it on the
         demotion candidate (former primary) will have no effect.
       </para>
     </note>
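     <para>
       For instance, to allow up to two minutes for the shutdown to be confirmed, the
       promotion candidate's <filename>repmgr.conf</filename> could contain the following
       (the value <literal>120</literal> is an arbitrary example, not a recommendation):
       <programlisting>
        shutdown_check_timeout=120</programlisting>
     </para>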
     <para>
       If the primary server has shut down cleanly, and no other node has been promoted,
       it is safe to restart it, in which case the replication cluster will be restored
       to its original configuration.
     </para>
   </sect2>
   <sect2 id="switchover-troubleshooting-exclusive-backup">
     <title>Switchover aborts with an &quot;exclusive backup&quot; error</title>
     <para>
       &repmgr; may abort a switchover with a message like:
       <programlisting>
 ERROR: unable to perform a switchover while primary server is in exclusive backup mode
 HINT: stop backup before attempting the switchover</programlisting>
     </para>
     <para>
       This means an exclusive backup is running on the current primary; interrupting this
       will not only abort the backup, but potentially leave the primary with an ambiguous
       backup state.
     </para>
     <para>
       To proceed, either wait until the backup has finished, or cancel it with the command
       <command>SELECT pg_stop_backup()</command>. For more details see the PostgreSQL
       documentation section
       <ulink url="https://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-EXCLUSIVE">Making an exclusive low level backup</ulink>.
     </para>
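     <para>
       To confirm whether an exclusive backup is still in progress before retrying, a
       query such as the following can be run on the current primary. This is a sketch
       which assumes the <literal>repmgr</literal> user and database, and a PostgreSQL
       version which still provides <function>pg_is_in_backup()</function>:
       <programlisting>
        psql -U repmgr -d repmgr -c "SELECT pg_is_in_backup()"</programlisting>
     </para>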
   </sect2>
 </sect1>
 </chapter>
--- a/doc/upgrading-from-repmgr3.md
+++ b/doc/upgrading-from-repmgr3.md
@@ -1,121 +1,9 @@
 Upgrading from repmgr 3
 =======================
-The upgrade process consists of two steps:
+This document has been integrated into the main `repmgr` documentation
 and is now located here:
-    1) converting the repmgr.conf configuration files
+> [Upgrading from repmgr 3.x](https://repmgr.org/docs/current/upgrading-from-repmgr-3.html)
    2) upgrading the repmgr schema.
 Scripts are provided to assist both with converting repmgr.conf
 and upgrading the schema.
 Converting repmgr.conf configuration files
 ------------------------------------------
 With a completely new repmgr version, we've taken the opportunity
 to rename some configuration items for
 clarity and consistency, both between the configuration file and
 the column names in `repmgr.nodes` (e.g. `node` → `node_id`), and
 also for consistency with PostgreSQL naming conventions
 (e.g. `loglevel` → `log_level`).
 Other configuration items have been changed to command line options,
 and vice-versa, e.g. to avoid hard-coding items such as a node's
 upstream ID, which might change over time.
 `repmgr` will issue a warning about deprecated/altered options.
 ### Changed parameters
 Following parameters have been added:
    - `data_directory`: this is mandatory and must contain the path
        to the node's data directory
    - `monitoring_history`: this replaces the `repmgrd` command line
        option `--monitoring-history`
 Following parameters have been renamed:
    - `node` → `node_id`
    - `loglevel` → `log_level`
    - `logfacility` → `log_facility`
    - `logfile` → `log_file`
    - `master_response_timeout` → `async_query_timeout`
 Following parameters have been removed:
    - `cluster` is no longer required and will be ignored.
    - `upstream_node_id` is replaced by the command-line parameter
         `--upstream-node-id`
 ### Conversion script
 To assist with conversion of `repmgr.conf` files, a Perl script
 is provided in `contrib/convert-config.pl`. Use like this:
    $ ./convert-config.pl /etc/repmgr.conf
    node_id=2
    node_name=node2
    conninfo=host=localhost dbname=repmgr user=repmgr port=5602
    pg_ctl_options='-l /tmp/postgres.5602.log'
    pg_bindir=/home/barwick/devel/builds/HEAD/bin
    rsync_options=--exclude=postgresql.local.conf --archive
    log_level=DEBUG
    pg_basebackup_options=--no-slot
    data_directory=
 The converted file is printed to `STDOUT` and the original file is not
 changed.
 Please note that the parameter `data_directory` *must* be provided;
 if not already present, the conversion script will add an empty
 placeholder parameter.
 Upgrading the repmgr schema
 ---------------------------
 Ensure `repmgrd` is not running, and that no cron jobs which execute the
 `repmgr` binary are active.
 Install `repmgr4`; any `repmgr3` packages should be uninstalled
 (if not automatically uninstalled already).
 ### Manually create the repmgr extension
 In the database used by the existing `repmgr` configuration, execute:
    CREATE EXTENSION repmgr FROM unpackaged;
 This will move and convert all objects from the existing schema
 into the new, standard `repmgr` schema.
 > *NOTE* there must be only one schema matching 'repmgr_%' in the
 > database, otherwise this step may not work.
 ### Re-register each node
 This is necessary to update the `repmgr` metadata with some additional items.
 On the primary node, execute e.g.
    repmgr primary register -f /etc/repmgr.conf --force
 On each standby node, execute e.g.
    repmgr standby register -f /etc/repmgr.conf --force
 Check the data is updated as expected by examining the `repmgr.nodes` table;
 restart `repmgrd` if required.
 The original `repmgr_$cluster` schema can be dropped at any time.
 * * *
 > *TIP* If you don't care about any data from the existing `repmgr` installation,
 > (e.g. the contents of the `events` and `monitoring` tables), the manual
 > "CREATE EXTENSION" step can be skipped; just re-register each node, starting
 > with the primary node, and the `repmgr` extension will be automatically created.
 * * *
--- a/doc/upgrading-repmgr.sgml
+++ b/doc/upgrading-repmgr.sgml
@@ -0,0 +1,521 @@
 <chapter id="upgrading-repmgr" xreflabel="Upgrading repmgr">
 <indexterm>
  <primary>upgrading</primary>
 </indexterm>
 <title>Upgrading repmgr</title>
 <para>
  &repmgr; is updated regularly with minor releases (e.g. 4.0.1 to 4.0.2)
  containing bugfixes and other minor improvements. Any substantial new
  functionality will be included in a major release (e.g. 4.0 to 4.1).
 </para>
 <sect1 id="upgrading-repmgr-extension" xreflabel="Upgrading repmgr 4.x and later">
  <indexterm>
   <primary>upgrading</primary>
   <secondary>repmgr 4.x and later</secondary>
  </indexterm>
  <title>Upgrading repmgr 4.x and later</title>
  <para>
    From version 4, &repmgr; consists of three elements:
     <itemizedlist spacing="compact" mark="bullet">
       <listitem>
         <simpara>
           the <application>repmgr</application> and <application>repmgrd</application> executables
         </simpara>
       </listitem>
       <listitem>
         <simpara>
           the objects for the &repmgr; PostgreSQL extension (SQL files for creating/updating
           repmgr metadata, and the extension control file)
         </simpara>
       </listitem>
       <listitem>
         <simpara>
           the shared library module used by <application>repmgrd</application> which
           is resident in the PostgreSQL backend
         </simpara>
       </listitem>
     </itemizedlist>
  </para>
  <para>
    With <emphasis>minor releases</emphasis>, changes are usually made only to the <application>repmgr</application>
    and <application>repmgrd</application> executables. In this case, the upgrade is quite straightforward,
    and is simply a case of installing the new version, and restarting <application>repmgrd</application>
    (if running).
  </para>
  <para>
    For <emphasis>major releases</emphasis>, the &repmgr; PostgreSQL extension will need to be updated
    to the latest version. Additionally, if the shared library module has been updated (this is sometimes,
    but not always the case), PostgreSQL itself will need to be restarted on each node.
  </para>
  <important>
    <para>
      Always check the <link linkend="appendix-release-notes">release notes</link> for every
      release as they may contain upgrade instructions particular to individual versions.
    </para>
  </important>
  <sect2 id="upgrading-minor-version" xreflabel="Upgrading a minor version release">
 	<indexterm>
 	  <primary>upgrading</primary>
 	  <secondary>minor release</secondary>
 	</indexterm>
 	<title>Upgrading a minor version release</title>
    <para>
      The process for installing minor version upgrades is quite straightforward:
      <itemizedlist spacing="compact" mark="bullet">
        <listitem>
          <simpara>
            install the new &repmgr; version
          </simpara>
        </listitem>
        <listitem>
          <simpara>
            restart <application>repmgrd</application> on all nodes where it is running
          </simpara>
        </listitem>
      </itemizedlist>
    </para>
    <note>
 	  <para>
        Some packaging systems (e.g. <link linkend="packages-debian-ubuntu">Debian/Ubuntu</link>)
        may restart <application>repmgrd</application> as part of the package upgrade process.
      </para>
    </note>
 	<para>
 	  Minor version upgrades can be performed in any order on the nodes in the replication
 	  cluster.
 	</para>
 	<para>
 	  A PostgreSQL restart is <emphasis>not</emphasis> required for minor version upgrades.
 	</para>
    <note>
 	  <para>
 	    The same &repmgr; &quot;major version&quot; (e.g. <literal>4.2</literal>) must be
 	    installed on all nodes in the replication cluster. While it's possible to have differing
 	    &repmgr; &quot;minor versions&quot; (e.g. <literal>4.2.1</literal>)  on different nodes,
 	    we strongly recommend updating all nodes to the latest minor version.
 	  </para>
    </note>
  </sect2>
  <sect2 id="upgrading-major-version" xreflabel="Upgrading a major version release">
 	<indexterm>
 	  <primary>upgrading</primary>
 	  <secondary>major release</secondary>
 	</indexterm>
 	<title>Upgrading a major version release</title>
 	<para>
 	  &quot;Major version&quot; upgrades need to be planned more carefully, as they may include
 	  changes to the &repmgr; metadata (which need to be propagated from the primary to all
 	  standbys) and/or changes to the shared object file used by <application>repmgrd</application>
 	  (which require a PostgreSQL restart).
 	</para>
 	<para>
 	  With this in mind, we recommend the following upgrade procedure:
 	</para>
 	<para>
      <orderedlist>
 		<listitem>
          <simpara>
 			Stop <application>repmgrd</application> (if in use) on all nodes where it is running.
          </simpara>
 		</listitem>
 		<listitem>
          <simpara>
 			Disable the <application>repmgrd</application> service on all nodes where it is in use;
            this is to prevent packages from prematurely restarting <application>repmgrd</application>.
          </simpara>
 		</listitem>
 		<listitem>
          <simpara>
 			Install the updated package (or compile the updated source) on all nodes.
          </simpara>
 		</listitem>
        <listitem>
          <para>
            If running a <literal>systemd</literal>-based Linux distribution, execute (as <literal>root</literal>,
            or with appropriate <literal>sudo</literal> permissions):
            <programlisting>
 systemctl daemon-reload</programlisting>
          </para>
        </listitem>
 		<listitem>
          <simpara>
 			If the &repmgr; shared library module has been updated (check the <link linkend="appendix-release-notes">release notes</link>!),
            restart PostgreSQL, then <application>repmgrd</application> (if in use) on each node.
            The order in which this is applied to individual nodes is not critical,
 			and it's also fine to restart PostgreSQL on all nodes first before starting <application>repmgrd</application>.
 		  </simpara>
 		  <simpara>
 			Note that if the upgrade requires a PostgreSQL restart, <application>repmgrd</application>
 			will only function correctly once all nodes have been restarted.
          </simpara>
 		</listitem>
 		<listitem>
          <para>
 			On the primary node, execute
 			<programlisting>
 ALTER EXTENSION repmgr UPDATE</programlisting>
 			in the database where &repmgr; is installed.
          </para>
 		</listitem>
 		<listitem>
          <simpara>
 			Reenable the <application>repmgrd</application> service on all nodes where it is in use, and
            ensure it is running.
          </simpara>
 		</listitem>
 	  </orderedlist>
 	</para>
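 	<para>
 	  On a <literal>systemd</literal>-based system, the steps above might be sketched
 	  as follows. The unit names <literal>repmgrd</literal> and
 	  <literal>postgresql</literal> are hypothetical and differ between packages, the
 	  package installation command is deliberately left out, and the
 	  <literal>repmgr</literal> user and database are assumptions:
 	  <programlisting>
 # on every node
 sudo systemctl stop repmgrd
 sudo systemctl disable repmgrd
 # ... install the updated repmgr package with the system's package manager ...
 sudo systemctl daemon-reload
 # only if the shared library module has been updated
 sudo systemctl restart postgresql

 # on the primary node only
 psql -U repmgr -d repmgr -c "ALTER EXTENSION repmgr UPDATE"

 # on every node where repmgrd is in use
 sudo systemctl enable repmgrd
 sudo systemctl start repmgrd</programlisting>
 	</para>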
 	<tip>
 	  <para>
 		If the &repmgr; upgrade requires a PostgreSQL restart, combine the &repmgr; upgrade
 		with a PostgreSQL minor version upgrade, which will require a restart in any case.
 		New PostgreSQL minor versions are usually released every couple of months.
 	  </para>
 	</tip>
  </sect2>
  <sect2 id="upgrading-check-repmgrd" xreflabel="Checking repmgrd status after an upgrade">
 	<indexterm>
 	  <primary>upgrading</primary>
 	  <secondary>checking repmgrd status</secondary>
 	</indexterm>
 	<title>Checking repmgrd status after an upgrade</title>
 	<para>
      From <link linkend="release-4.2">repmgr 4.2</link>, once the upgrade is complete, execute the <command><link linkend="repmgr-daemon-status">repmgr daemon status</link></command>
      command (on any node) to show an overview of the status of <application>repmgrd</application> on all nodes.
    </para>
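    <para>
      For example (the second command is an optional extra check, assuming the
      <literal>repmgr</literal> user and database, which shows the installed version
      of the &repmgr; extension):
      <programlisting>
       repmgr -f /etc/repmgr.conf daemon status
       psql -U repmgr -d repmgr -c "SELECT extversion FROM pg_extension WHERE extname = 'repmgr'"</programlisting>
    </para>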
  </sect2>
 </sect1>
 <sect1 id="upgrading-and-pg-upgrade" xreflabel="pg_upgrade and repmgr">
  <indexterm>
   <primary>upgrading</primary>
   <secondary>pg_upgrade</secondary>
  </indexterm>
  <indexterm>
    <primary>pg_upgrade</primary>
  </indexterm>
  <title>pg_upgrade and repmgr</title>
  <para>
    <application>pg_upgrade</application> requires that if any functions are
    dependent on a shared library, this library must be present in both
    the old and new installations before <application>pg_upgrade</application>
    can be executed.
  </para>
  <para>
    To minimize the risk of any upgrade issues (particularly if an upgrade to
    a new major &repmgr; version is involved), we recommend upgrading
    &repmgr; on the old server <emphasis>before</emphasis> running
    <application>pg_upgrade</application> to ensure that old and new
    versions are the same.
  </para>
  <note>
    <simpara>
      This issue applies to any PostgreSQL extension which has
      dependencies on a shared library.
    </simpara>
  </note>
  <para>
    For further details please see the <ulink url="https://www.postgresql.org/docs/current/static/pgupgrade.html">pg_upgrade documentation</ulink>.
  </para>
  <para>
    If replication slots are in use, bear in mind these will <emphasis>not</emphasis>
    be recreated by <application>pg_upgrade</application>. These will need to
    be recreated manually.
  </para>
  <tip>
 	<para>
 	  Use <command><link linkend="repmgr-node-check">repmgr node check</link></command>
 	  to determine which replication slots need to be recreated.
 	</para>
  </tip>
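  <para>
    A missing physical replication slot can be recreated on the upstream node with
    <function>pg_create_physical_replication_slot()</function>. In the sketch below
    the slot name <literal>repmgr_slot_2</literal> is an assumption (a slot for a
    standby with node ID 2); use whatever slot name the standby is configured to
    connect with:
    <programlisting>
     psql -U repmgr -d repmgr -c "SELECT pg_create_physical_replication_slot('repmgr_slot_2')"</programlisting>
  </para>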
 </sect1>
 <sect1 id="upgrading-from-repmgr-3" xreflabel="Upgrading from repmgr 3.x">
  <indexterm>
   <primary>upgrading</primary>
   <secondary>from repmgr 3.x</secondary>
  </indexterm>
  <title>Upgrading from repmgr 3.x</title>
  <para>
   The upgrade process consists of two steps:
   <orderedlist>
    <listitem>
     <simpara>
       converting the repmgr.conf configuration files
     </simpara>
    </listitem>
    <listitem>
     <simpara>
       upgrading the repmgr schema using <command>CREATE EXTENSION</command>
     </simpara>
    </listitem>
   </orderedlist>
  </para>
  <para>
   A script is provided to assist with converting <filename>repmgr.conf</filename>.
  </para>
  <para>
   The schema upgrade (which converts the &repmgr; metadata into
   a packaged PostgreSQL extension) is normally carried out
   automatically when the &repmgr; extension is created.
  </para>
  <para>
   The shared library has been renamed from <literal>repmgr_funcs</literal> to
   <literal>repmgr</literal> - if it's set in <varname>shared_preload_libraries</varname>
   in <filename>postgresql.conf</filename> it will need to be updated to the new name:
   <programlisting>
    shared_preload_libraries = 'repmgr'</programlisting>
  </para>
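  <para>
   One way to confirm the change, offered as a minimal sketch, is to query the server
   once it has been restarted (a restart is needed for changes to
   <varname>shared_preload_libraries</varname> to take effect). The
   <literal>sourcefile</literal> column is only populated for superusers.
   <programlisting>
    SHOW shared_preload_libraries;

    -- also shows which configuration file the value was read from
    SELECT name, setting, sourcefile
      FROM pg_catalog.pg_settings
     WHERE name = 'shared_preload_libraries';</programlisting>
  </para>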
  <sect2 id="converting-repmgr-conf">
   <title>Converting repmgr.conf configuration files</title>
   <para>
    With a completely new repmgr version, we've taken the opportunity
    to rename some configuration items for
    clarity and consistency, both between the configuration file and
    the column names in <structname>repmgr.nodes</structname>
    (e.g. <varname>node</varname> to <varname>node_id</varname>), and
    also for consistency with PostgreSQL naming conventions
    (e.g. <varname>loglevel</varname> to <varname>log_level</varname>).
   </para>
   <para>
    Other configuration items have been changed to command line options,
    and vice-versa, e.g. to avoid hard-coding items such as a node's
    upstream ID, which might change over time.
   </para>
   <para>
    &repmgr; will issue a warning about deprecated/altered options.
   </para>
   <sect3>
    <title>Changed parameters in "repmgr.conf"</title>
    <para>
     The following parameters have been added:
     <itemizedlist spacing="compact" mark="bullet">
      <listitem>
        <simpara><varname>data_directory</varname>: this is mandatory and must
         contain the path to the node's data directory</simpara>
      </listitem>
      <listitem>
        <simpara><varname>monitoring_history</varname>: this replaces the
          <application>repmgrd</application> command line option
          <literal>--monitoring-history</literal></simpara>
      </listitem>
     </itemizedlist>
    </para>
    <para>
     The following parameters have been renamed:
    </para>
    <table tocentry="1" id="repmgr3-repmgr4-renamed-parameters">
     <title>Parameters renamed in repmgr4</title>
     <tgroup cols="2">
      <thead>
       <row>
        <entry>repmgr3</entry>
        <entry>repmgr4</entry>
       </row>
      </thead>
      <tbody>
       <row>
        <entry><varname>node</varname></entry>
        <entry><varname>node_id</varname></entry>
       </row>
       <row>
        <entry><varname>loglevel</varname></entry>
        <entry><varname>log_level</varname></entry>
       </row>
       <row>
        <entry><varname>logfacility</varname></entry>
        <entry><varname>log_facility</varname></entry>
       </row>
       <row>
        <entry><varname>logfile</varname></entry>
        <entry><varname>log_file</varname></entry>
       </row>
       <row>
        <entry><varname>barman_server</varname></entry>
        <entry><varname>barman_host</varname></entry>
       </row>
       <row>
        <entry><varname>master_response_timeout</varname></entry>
        <entry><varname>async_query_timeout</varname></entry>
       </row>
      </tbody>
     </tgroup>
    </table>
    <note>
      <para>
        From &repmgr; 4, <literal>barman_server</literal> refers
        to the server configured in Barman (in &repmgr; 3, the deprecated
        <literal>cluster</literal> parameter was used for this);
        the physical Barman hostname is configured with
        <literal>barman_host</literal> (see <xref linkend="cloning-from-barman-prerequisites">
          for details).
      </para>
    </note>
    <para>
     The following parameters have been removed:
     <itemizedlist spacing="compact" mark="bullet">
      <listitem>
        <simpara><varname>cluster</varname>: is no longer required and will
        be ignored.</simpara>
      </listitem>
      <listitem>
        <simpara><varname>upstream_node</varname>: is replaced by the
        command-line parameter <literal>--upstream-node-id</literal></simpara>
      </listitem>
     </itemizedlist>
    </para>
   </sect3>
   <sect3>
    <title>Conversion script</title>
    <para>
     To assist with conversion of <filename>repmgr.conf</filename> files, a Perl script
     is provided in <filename>contrib/convert-config.pl</filename>.
     Use it as follows:
     <programlisting>
    $ ./convert-config.pl /etc/repmgr.conf
    node_id=2
    node_name=node2
    conninfo=host=node2 dbname=repmgr user=repmgr connect_timeout=2
    pg_ctl_options='-l /var/log/postgres/startup.log'
    rsync_options=--exclude=postgresql.local.conf --archive
    log_level=INFO
    pg_basebackup_options=--no-slot
    data_directory=</programlisting>
    </para>
    <para>
      The converted file is printed to <literal>STDOUT</literal> and the original file is not
      changed.
    </para>
    <para>
      Note that the conversion script will add an empty
      placeholder parameter for <varname>data_directory</varname>, which
      is a required parameter in repmgr4 and which <emphasis>must</emphasis>
      be provided.
    </para>
   </sect3>
  </sect2>
  <sect2>
   <title>Upgrading the repmgr schema</title>
   <para>
    Ensure that <application>repmgrd</application> is not running, and that no cron jobs which
    execute the <command>repmgr</command> binary are active.
   </para>
   <para>
    Install <literal>repmgr 4</literal> packages; any <literal>repmgr 3.x</literal> packages
    should be uninstalled (if not automatically uninstalled already by your packaging system).
   </para>
   <sect3>
    <title>Upgrading from repmgr 3.1.1 or earlier</title>
    <para>
     If your repmgr version is 3.1.1 or earlier, you will need to update
     the schema to the latest version in the 3.x series (3.3.2) before
     converting the installation to repmgr 4.
    </para>
    <para>
      To do this, apply the following upgrade scripts as appropriate for
      your current version:
      <itemizedlist spacing="compact" mark="bullet">
      <listitem>
        <simpara>
          <ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/REL3_3_STABLE/sql/repmgr3.0_repmgr3.1.sql">repmgr3.0_repmgr3.1.sql</ulink></simpara>
      </listitem>
      <listitem>
        <simpara><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/REL3_3_STABLE/sql/repmgr3.1.1_repmgr3.1.2.sql">repmgr3.1.1_repmgr3.1.2.sql</ulink></simpara>
      </listitem>
      </itemizedlist>
    </para>
    <para>
      For more details see the
      <ulink url="https://repmgr.org/release-notes-3.3.2.html#upgrading">repmgr 3 upgrade notes</ulink>.
    </para>
   </sect3>
   <sect3>
    <title>Manually create the repmgr extension</title>
    <para>
     In the database used by the existing &repmgr; installation, execute:
     <programlisting>
      CREATE EXTENSION repmgr FROM unpackaged;</programlisting>
    </para>
    <para>
     This will move and convert all objects from the existing schema
     into the new, standard <literal>repmgr</literal> schema.
    </para>
    <note>
      <simpara>There must be only one schema matching <literal>repmgr_%</literal> in the
        database, otherwise this step may not work.
      </simpara>
    </note>
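    <para>
     A minimal sketch for checking this precondition and the outcome, assuming the
     queries are run in the database used by &repmgr; and that
     <varname>standard_conforming_strings</varname> is at its default setting (so the
     backslash escapes the underscore, which is otherwise a <literal>LIKE</literal>
     wildcard):
     <programlisting>
      -- before CREATE EXTENSION: list candidate repmgr 3 schemas (one row expected)
      SELECT nspname
        FROM pg_catalog.pg_namespace
       WHERE nspname LIKE 'repmgr\_%';

      -- after CREATE EXTENSION: confirm the extension is registered
      SELECT extname, extversion
        FROM pg_catalog.pg_extension
       WHERE extname = 'repmgr';</programlisting>
    </para>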
   </sect3>
   <sect3>
    <title>Re-register each node</title>
    <para>
     This is necessary to update the <literal>repmgr</literal> metadata with some additional items.
    </para>
    <para>
     On the primary node, execute e.g.
     <programlisting>
      repmgr primary register -f /etc/repmgr.conf --force</programlisting>
    </para>
    <para>
     On each standby node, execute e.g.
     <programlisting>
      repmgr standby register -f /etc/repmgr.conf --force</programlisting>
    </para>
    <para>
     Check the data is updated as expected by examining the <structname>repmgr.nodes</structname>
     table; restart <application>repmgrd</application> if required.
    </para>
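    <para>
     For instance, a query along these lines (a sketch only; the columns are those
     defined for <structname>repmgr.nodes</structname> in the repmgr 4 extension
     scripts) lists what was recorded for each node:
     <programlisting>
      SELECT node_id, node_name, type, upstream_node_id, active, location, priority
        FROM repmgr.nodes
       ORDER BY node_id;</programlisting>
    </para>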
    <para>
     The original <literal>repmgr_$cluster</literal> schema can be dropped at any time.
    </para>
    <tip>
     <simpara>
      If you don't need to keep any data from the existing &repmgr; installation
      (e.g. the contents of the <structname>events</structname> and <structname>monitoring</structname>
      tables), the manual <command>CREATE EXTENSION</command> step can be skipped; just re-register
      each node, starting with the primary node, and the <literal>repmgr</literal> extension will be
      automatically created.
     </simpara>
    </tip>
   </sect3>
  </sect2>
 </sect1>
 </chapter>
--- a/doc/version.sgml
+++ b/doc/version.sgml
@@ -0,0 +1 @@
 <!ENTITY repmgrversion "4.2">
--- a/doc/website-docs.css
+++ b/doc/website-docs.css
@@ -0,0 +1,469 @@
 /* PostgreSQL.org Documentation Style */
 /* requires global.css, table.css and text.css to be loaded before this file! */
 body {
  font-family: verdana, sans-serif;
  font-size: 76%;
  background: url("/resources/background.png") repeat-x scroll left top transparent;
  padding: 15px 4%;
  margin: 0;
 }
 /* monospace font size fix */
 pre, code, kbd, samp, tt {
  font-family: monospace,monospace;
  font-size: 1em;
 }
 div.NAVHEADER table {
  margin-left: 0;
 }
 /* Container Definitions */
 #docContainerWrap {
  text-align: center; /* Win IE5 */
 }
 #docContainer {
  margin: 0 auto;
  width: 90%;
  padding-bottom: 2em;
  display: block;
  text-align: left; /* Win IE5 */
 }
 #docHeader {
  background-image: url("/media/img/docs/bg_hdr.png");
  height: 83px;
  margin: 0px;
  padding: 0px;
  display: block;
 }
 #docHeaderLogo {
  position: relative;
  width: 206px;
  height: 83px;
  border: 0px;
  padding: 0px;
  margin: 0 0 0 20px;
 }
 #docHeaderLogo img {
  border: 0px;
 }
 #docNavSearchContainer {
  padding-bottom: 2px;
 }
 #docNav, #docVersions {
  position: relative;
  text-align: left;
  margin-left: 10px;
  margin-top: 5px;
  color: #666;
  font-size: 0.95em;
 }
 #docSearch {
  position: relative;
  text-align: right;
  padding: 0;
  margin: 0;
  color: #666;
 }
 #docTextSize {
  text-align: right;
  white-space: nowrap;
  margin-top: 7px;
  font-size: 0.95em;
 }
 #docSearch form {
  position: relative;
  top: 5px;
  right: 0;
  margin: 0; /* need for IE 5.5 OSX */
  text-align: right; /* need for IE 5.5 OSX */
  white-space: nowrap; /* for Opera */
 }
 #docSearch form label {
  color: #666;
  font-size: 0.95em;
 }
 #docSearch form input {
  font-size: 0.95em;
 }
 #docSearch form #submit {
  font-size: 0.95em;
  background: #7A7A7A;
  color: #fff;
  border: 1px solid #7A7A7A;
  padding: 1px 4px;
 }
 #docSearch form #q {
  width: 170px;
  font-size: 0.95em;
  border:  1px solid #7A7A7A;
  background: #E1E1E1;
  color: #000000;
  padding: 2px;
 }
 .frmDocSearch {
  padding: 0;
  margin: 0;
  display: inline;
 }
 .inpDocSearch {
  padding: 0;
  margin: 0;
  color: #000;
 }
 #docContent {
  position: relative;
  margin-left: 10px;
  margin-right: 10px;
  margin-top: 40px;
 }
 #docFooter {
  position: relative;
  font-size: 0.9em; 
  color: #666; 
  line-height: 1.3em; 
  margin-left: 10px;
  margin-right: 10px;
 }
 #docComments {
  margin-top: 10px;
 }
 #docClear {
  clear: both;
  margin: 0;
  padding: 0;
 }
 /* Heading Definitions */
 h1, h2, h3 {
  font-weight: bold;
  margin-top: 2ex;
  color: #444;
 }
 h1 {
  font-size: 1.4em;
 }
 h2 {
  font-size: 1.2em !important;
 }
 h3 {
  font-size: 1.1em;
 }
 h1 a:hover,
 h2 a:hover,
 h3 a:hover,
 h4 a:hover {
  color: #444;
  text-decoration: none;
 }
 /* Text Styles */
 div.SECT2 {
  margin-top: 4ex;
 }
 div.SECT3 {
  margin-top: 3ex;
  margin-left: 3ex;
 }
 .txtCurrentLocation {
  font-weight: bold;
 }
 p, ol, ul, li {
  line-height: 1.5em;
 }
 .txtCommentsWrap {
  border: 2px solid #F5F5F5; 
  width: 100%;
 }
 .txtCommentsContent {
  background: #F5F5F5;
  padding: 3px;
 }
 .txtCommentsPoster {
  float: left;
 }
 .txtCommentsDate {
  float: right;
 }
 .txtCommentsComment {
  padding: 3px;
 }
 #docContainer pre code,
 #docContainer pre tt,
 #docContainer pre pre,
 #docContainer tt tt,
 #docContainer tt code,
 #docContainer tt pre {
  font-size: 1em;
 }
 pre.LITERALLAYOUT,
 .SCREEN,
 .SYNOPSIS,
 .PROGRAMLISTING,
 .REFSYNOPSISDIV p,
 table.CAUTION,
 table.WARNING,
 blockquote.NOTE,
 blockquote.TIP,
 table.CALSTABLE {
  -moz-box-shadow: 3px 3px 5px #DFDFDF;
  -webkit-box-shadow: 3px 3px 5px #DFDFDF;
  -khtml-box-shadow: 3px 3px 5px #DFDFDF;
  -o-box-shadow: 3px 3px 5px #DFDFDF;
  box-shadow: 3px 3px 5px #DFDFDF;
 }
 pre.LITERALLAYOUT,
 .SCREEN,
 .SYNOPSIS,
 .PROGRAMLISTING,
 .REFSYNOPSISDIV p,
 table.CAUTION,
 table.WARNING,
 blockquote.NOTE,
 blockquote.TIP {
  color: black;
  border-width: 1px;
  border-style: solid;
  padding: 2ex;
  margin: 2ex 0 2ex 2ex;
  overflow: auto;
  -moz-border-radius: 8px;
  -webkit-border-radius: 8px;
  -khtml-border-radius: 8px;
  border-radius: 8px;
 }
 pre.LITERALLAYOUT,
 pre.SYNOPSIS,
 pre.PROGRAMLISTING,
 .REFSYNOPSISDIV p,
 .SCREEN {
  border-color: #CFCFCF;
  background-color: #F7F7F7;
 }
 blockquote.NOTE,
 blockquote.TIP {
  border-color: #DBDBCC;
  background-color: #EEEEDD;
  padding: 14px;
  width: 572px;
 }
 blockquote.NOTE,
 blockquote.TIP,
 table.CAUTION,
 table.WARNING {
  margin: 4ex auto;
 }
 blockquote.NOTE p,
 blockquote.TIP p {
  margin: 0;
 }
 blockquote.NOTE pre,
 blockquote.NOTE code,
 blockquote.TIP pre,
 blockquote.TIP code {
  margin-left: 0;
  margin-right: 0;
  -moz-box-shadow: none;
  -webkit-box-shadow: none;
  -khtml-box-shadow: none;
  -o-box-shadow: none;
  box-shadow: none;
 }
 .emphasis,
 .c2 {
  font-weight: bold;
 }
 .REPLACEABLE {
  font-style: italic;
 }
 /* Table Styles */
 table {
  margin-left: 2ex;
 }
 table.CALSTABLE td,
 table.CALSTABLE th,
 table.CAUTION td,
 table.CAUTION th,
 table.WARNING td,
 table.WARNING th {
  border-style: solid;
 }
 table.CALSTABLE,
 table.CAUTION,
 table.WARNING {
  border-spacing: 0;
  border-collapse: collapse;
 }
 table.CALSTABLE
 {
  margin: 2ex 0 2ex 2ex;
  background-color: #E0ECEF;
  border: 2px solid #A7C6DF;
 }
 table.CALSTABLE tr:hover td
 {
  background-color: #EFEFEF;
 }
 table.CALSTABLE td {
  background-color: #FFF;
 }
 table.CALSTABLE td,
 table.CALSTABLE th {
  border: 1px solid #A7C6DF;
  padding: 0.5ex 0.5ex;
 }
 table.CAUTION,
 table.WARNING {
  border-collapse: separate;
  display: block;
  padding: 0;
  max-width: 600px;
 }
 table.CAUTION {
  background-color: #F5F5DC;
  border-color: #DEDFA7;
 }
 table.WARNING {
  background-color: #FFD7D7;
  border-color: #DF421E;
 }
 table.CAUTION td,
 table.CAUTION th,
 table.WARNING td,
 table.WARNING th {
  border-width: 0;
  padding-left: 2ex;
  padding-right: 2ex;
 }
 table.CAUTION td,
 table.CAUTION th {
  border-color: #F3E4D5;
 }
 table.WARNING td,
 table.WARNING th {
  border-color: #FFD7D7;
 }
 td.c1,
 td.c2,
 td.c3,
 td.c4,
 td.c5,
 td.c6 {
  font-size: 1.1em;
  font-weight: bold;
  border-bottom: 0px solid #FFEFEF;
  padding: 1ex 2ex 0;
 }
 /* Link Styles */
 #docNav a {
  font-weight: bold;
 }
 a:link,
 a:visited,
 a:active,
 a:hover {
  text-decoration: underline;
 }
 a:link,
 a:active {
  color:#0066A2;
 }
 a:visited {
  color:#004E66;
 }
 a:hover {
  color:#000000;
 }
 #docFooter a:link,
 #docFooter a:visited,
 #docFooter a:active {
  color:#666;
 }
 #docContainer code.FUNCTION tt {
  font-size: 1em;
 }
 div.header {
    color: #444;
    margin-top: 5px;
 }
 div.footer {
    text-align: center;
    background-image: url("/resources/footerl.png"), url("/resources/footerr.png"), url("/resources/footerc.png");
    background-position: left top, right top, center top;
    background-repeat: no-repeat, no-repeat, repeat-x;
    padding-top: 45px;
 }
 img {
    border-style: none;
 }
--- a/errcode.h
+++ b/errcode.h
@@ -1,6 +1,6 @@
 /*
 * errcode.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -43,5 +43,10 @@
 #define ERR_BARMAN 19
 #define ERR_REGISTRATION_SYNC 20
 #define ERR_OUT_OF_MEMORY 21
 #define ERR_SWITCHOVER_INCOMPLETE 22
 #define ERR_FOLLOW_FAIL 23
 #define ERR_REJOIN_FAIL 24
 #define ERR_NODE_STATUS 25
 #define ERR_REPMGRD_PAUSE 26
 #endif							/* _ERRCODE_H_ */
--- a/expected/repmgr_extension.out
+++ b/expected/repmgr_extension.out
@@ -38,15 +38,15 @@ SELECT repmgr.am_bdr_failover_handler(-1);
 (1 row)
-SELECT repmgr.get_new_primary();
+SELECT repmgr.am_bdr_failover_handler(NULL);
- get_new_primary 
+ am_bdr_failover_handler 
-----------------
+-------------------------
 (1 row)
-SELECT repmgr.get_voting_status();
+SELECT repmgr.get_new_primary();
- get_voting_status 
+ get_new_primary 
-------------------
+-----------------
 (1 row)
@@ -56,15 +56,9 @@ SELECT repmgr.notify_follow_primary(-1);
 (1 row)
-SELECT repmgr.other_node_is_candidate(-1,-1);
+SELECT repmgr.notify_follow_primary(NULL);
- other_node_is_candidate 
+ notify_follow_primary 
-------------------------
+-----------------------
 (1 row)
 SELECT repmgr.request_vote(-1,-1);
 request_vote 
 --------------
 (1 row)
@@ -80,9 +74,9 @@ SELECT repmgr.set_local_node_id(-1);
 (1 row)
-SELECT repmgr.set_voting_status_initiated();
+SELECT repmgr.set_local_node_id(NULL);
- set_voting_status_initiated 
+ set_local_node_id 
-----------------------------
+-------------------
 (1 row)
--- a/log.c
+++ b/log.c
@@ -1,6 +1,6 @@
 /*
 * log.c - Logging methods
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -42,7 +42,7 @@ _stderr_log_with_level(const char *level_name, int level, const char *fmt, va_li
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0)));
 int			log_type = REPMGR_STDERR;
-int			log_level = LOG_NOTICE;
+int			log_level = LOG_INFO;
 int			last_log_level = LOG_INFO;
 int			verbose_logging = false;
 int			terse_logging = false;
@@ -70,7 +70,7 @@ _stderr_log_with_level(const char *level_name, int level, const char *fmt, va_li
 	/*
 	 * Store the requested level so that if there's a subsequent log_hint() or
-	 * log_detail(), we can suppress that if appropriate.
+	 * log_detail(), we can suppress that if --terse was specified,
 	 */
 	last_log_level = level;
@@ -329,6 +329,21 @@ logger_set_terse(void)
 }
 void
 logger_set_level(int new_log_level)
 {
 	log_level = new_log_level;
 }
 void
 logger_set_min_level(int min_log_level)
 {
 	if (min_log_level > log_level)
 		log_level = min_log_level;
 }
 int
 detect_log_level(const char *level)
 {
--- a/log.h
+++ b/log.h
@@ -1,6 +1,6 @@
 /*
 * log.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -128,6 +128,8 @@ bool		logger_shutdown(void);
 void		logger_set_verbose(void);
 void		logger_set_terse(void);
 void		logger_set_min_level(int min_log_level);
 void		logger_set_level(int new_log_level);
 void
 log_detail(const char *fmt,...)
--- a/repmgr--4.0--4.1.sql
+++ b/repmgr--4.0--4.1.sql
@@ -0,0 +1,2 @@
 -- complain if script is sourced in psql, rather than via CREATE EXTENSION
 \echo Use "CREATE EXTENSION repmgr" to load this file. \quit
--- a/repmgr--4.0.sql
+++ b/repmgr--4.0.sql
@@ -6,7 +6,7 @@ CREATE TABLE repmgr.nodes (
  upstream_node_id INTEGER     NULL REFERENCES nodes (node_id) DEFERRABLE,
  active           BOOLEAN     NOT NULL DEFAULT TRUE,
  node_name        TEXT        NOT NULL,
-  type             TEXT        NOT NULL CHECK (type IN('primary','standby','bdr')),
+  type             TEXT        NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
  location         TEXT        NOT NULL DEFAULT 'default',
  priority         INT         NOT NULL DEFAULT 100,
  conninfo         TEXT        NOT NULL,
@@ -79,6 +79,19 @@ LEFT JOIN repmgr.nodes un
       ON un.node_id = n.upstream_node_id;
 /* XXX update upgrade scripts! */
 CREATE TABLE repmgr.voting_term (
  term INT NOT NULL
 );
 CREATE UNIQUE INDEX voting_term_restrict
 ON repmgr.voting_term ((TRUE));
 CREATE RULE voting_term_delete AS
   ON DELETE TO repmgr.voting_term
   DO INSTEAD NOTHING;
 /* ================= */
 /* repmgrd functions */
 /* ================= */
@@ -90,6 +103,11 @@ CREATE FUNCTION set_local_node_id(INT)
  AS 'MODULE_PATHNAME', 'set_local_node_id'
  LANGUAGE C STRICT;
 CREATE FUNCTION get_local_node_id()
  RETURNS INT
  AS 'MODULE_PATHNAME', 'get_local_node_id'
  LANGUAGE C STRICT;
 CREATE FUNCTION standby_set_last_updated()
  RETURNS TIMESTAMP WITH TIME ZONE
  AS 'MODULE_PATHNAME', 'standby_set_last_updated'
@@ -102,49 +120,6 @@ CREATE FUNCTION standby_get_last_updated()
 /* failover functions */
 DO $repmgr$
 DECLARE
  DECLARE server_version_num INT;
 BEGIN
  SELECT setting
    FROM pg_catalog.pg_settings
   WHERE name = 'server_version_num'
    INTO server_version_num;
  IF server_version_num >= 90400 THEN
    EXECUTE $repmgr_func$
 CREATE FUNCTION request_vote(INT,INT)
  RETURNS pg_lsn
  AS 'MODULE_PATHNAME', 'request_vote'
  LANGUAGE C STRICT;
    $repmgr_func$;
  ELSE
    EXECUTE $repmgr_func$
 CREATE FUNCTION request_vote(INT,INT)
  RETURNS TEXT
  AS 'MODULE_PATHNAME', 'request_vote'
  LANGUAGE C STRICT;
    $repmgr_func$;
  END IF;
 END$repmgr$;
 CREATE FUNCTION get_voting_status()
  RETURNS INT
  AS 'MODULE_PATHNAME', 'get_voting_status'
  LANGUAGE C STRICT;
 CREATE FUNCTION set_voting_status_initiated()
  RETURNS INT
  AS 'MODULE_PATHNAME', 'set_voting_status_initiated'
  LANGUAGE C STRICT;
 CREATE FUNCTION other_node_is_candidate(INT, INT)
  RETURNS BOOL
  AS 'MODULE_PATHNAME', 'other_node_is_candidate'
  LANGUAGE C STRICT;
 CREATE FUNCTION notify_follow_primary(INT)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'notify_follow_primary'
@@ -160,13 +135,11 @@ CREATE FUNCTION reset_voting_status()
  AS 'MODULE_PATHNAME', 'reset_voting_status'
  LANGUAGE C STRICT;
 CREATE FUNCTION am_bdr_failover_handler(INT)
  RETURNS BOOL
  AS 'MODULE_PATHNAME', 'am_bdr_failover_handler'
  LANGUAGE C STRICT;
 CREATE FUNCTION unset_bdr_failover_handler()
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'unset_bdr_failover_handler'
--- a/repmgr--4.1--4.2.sql
+++ b/repmgr--4.1--4.2.sql
@@ -0,0 +1,32 @@
 -- complain if script is sourced in psql, rather than via CREATE EXTENSION
 \echo Use "CREATE EXTENSION repmgr" to load this file. \quit
 CREATE FUNCTION get_repmgrd_pid()
  RETURNS INT
  AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
  LANGUAGE C STRICT;
 CREATE FUNCTION get_repmgrd_pidfile()
  RETURNS TEXT
  AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
  LANGUAGE C STRICT;
 CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
  LANGUAGE C STRICT;
 CREATE FUNCTION repmgrd_is_running()
  RETURNS BOOL
  AS 'MODULE_PATHNAME', 'repmgrd_is_running'
  LANGUAGE C STRICT;
 CREATE FUNCTION repmgrd_pause(BOOL)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'repmgrd_pause'
  LANGUAGE C STRICT;
 CREATE FUNCTION repmgrd_is_paused()
  RETURNS BOOL
  AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
  LANGUAGE C STRICT;
--- a/repmgr--4.1.sql
+++ b/repmgr--4.1.sql
@@ -0,0 +1,166 @@
 -- complain if script is sourced in psql, rather than via CREATE EXTENSION
 \echo Use "CREATE EXTENSION repmgr" to load this file. \quit
 CREATE TABLE repmgr.nodes (
  node_id          INTEGER     PRIMARY KEY,
  upstream_node_id INTEGER     NULL REFERENCES nodes (node_id) DEFERRABLE,
  active           BOOLEAN     NOT NULL DEFAULT TRUE,
  node_name        TEXT        NOT NULL,
  type             TEXT        NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
  location         TEXT        NOT NULL DEFAULT 'default',
  priority         INT         NOT NULL DEFAULT 100,
  conninfo         TEXT        NOT NULL,
  repluser         VARCHAR(63) NOT NULL,
  slot_name        TEXT        NULL,
  config_file      TEXT        NOT NULL
 );
 CREATE TABLE repmgr.events (
  node_id          INTEGER NOT NULL,
  event            TEXT NOT NULL,
  successful       BOOLEAN NOT NULL DEFAULT TRUE,
  event_timestamp  TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
  details          TEXT NULL
 );
 DO $repmgr$
 DECLARE
  DECLARE server_version_num INT;
 BEGIN
  SELECT setting
    FROM pg_catalog.pg_settings
   WHERE name = 'server_version_num'
    INTO server_version_num;
  IF server_version_num >= 90400 THEN
    EXECUTE $repmgr_func$
 CREATE TABLE repmgr.monitoring_history (
  primary_node_id                INTEGER NOT NULL,
  standby_node_id                INTEGER NOT NULL,
  last_monitor_time              TIMESTAMP WITH TIME ZONE NOT NULL,
  last_apply_time                TIMESTAMP WITH TIME ZONE,
  last_wal_primary_location      PG_LSN NOT NULL,
  last_wal_standby_location      PG_LSN,
  replication_lag                BIGINT NOT NULL,
  apply_lag                      BIGINT NOT NULL
 )
    $repmgr_func$;
  ELSE
    EXECUTE $repmgr_func$
 CREATE TABLE repmgr.monitoring_history (
  primary_node_id                INTEGER NOT NULL,
  standby_node_id                INTEGER NOT NULL,
  last_monitor_time              TIMESTAMP WITH TIME ZONE NOT NULL,
  last_apply_time                TIMESTAMP WITH TIME ZONE,
  last_wal_primary_location      TEXT NOT NULL,
  last_wal_standby_location      TEXT,
  replication_lag                BIGINT NOT NULL,
  apply_lag                      BIGINT NOT NULL
 )
    $repmgr_func$;
  END IF;
 END$repmgr$;
 CREATE INDEX idx_monitoring_history_time
          ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
 CREATE VIEW repmgr.show_nodes AS
   SELECT n.node_id,
          n.node_name,
          n.active,
          n.upstream_node_id,
          un.node_name AS upstream_node_name,
          n.type,
          n.priority,
          n.conninfo
     FROM repmgr.nodes n
 LEFT JOIN repmgr.nodes un
       ON un.node_id = n.upstream_node_id;
 /* XXX update upgrade scripts! */
 CREATE TABLE repmgr.voting_term (
  term INT NOT NULL
 );
 CREATE UNIQUE INDEX voting_term_restrict
 ON repmgr.voting_term ((TRUE));
 CREATE RULE voting_term_delete AS
   ON DELETE TO repmgr.voting_term
   DO INSTEAD NOTHING;
 /* ================= */
 /* repmgrd functions */
 /* ================= */
 /* monitoring functions */
 CREATE FUNCTION set_local_node_id(INT)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'set_local_node_id'
  LANGUAGE C STRICT;
 CREATE FUNCTION get_local_node_id()
  RETURNS INT
  AS 'MODULE_PATHNAME', 'get_local_node_id'
  LANGUAGE C STRICT;
 CREATE FUNCTION standby_set_last_updated()
  RETURNS TIMESTAMP WITH TIME ZONE
  AS 'MODULE_PATHNAME', 'standby_set_last_updated'
  LANGUAGE C STRICT;
 CREATE FUNCTION standby_get_last_updated()
  RETURNS TIMESTAMP WITH TIME ZONE
  AS 'MODULE_PATHNAME', 'standby_get_last_updated'
  LANGUAGE C STRICT;
 /* failover functions */
 CREATE FUNCTION notify_follow_primary(INT)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'notify_follow_primary'
  LANGUAGE C STRICT;
 CREATE FUNCTION get_new_primary()
  RETURNS INT
  AS 'MODULE_PATHNAME', 'get_new_primary'
  LANGUAGE C STRICT;
 CREATE FUNCTION reset_voting_status()
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'reset_voting_status'
  LANGUAGE C STRICT;
 CREATE FUNCTION am_bdr_failover_handler(INT)
  RETURNS BOOL
  AS 'MODULE_PATHNAME', 'am_bdr_failover_handler'
  LANGUAGE C STRICT;
 CREATE FUNCTION unset_bdr_failover_handler()
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'unset_bdr_failover_handler'
  LANGUAGE C STRICT;
 CREATE VIEW repmgr.replication_status AS
  SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
 	     n.type AS node_type, n.active, last_monitor_time,
         CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
         m.last_wal_standby_location,
         CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
         CASE WHEN n.type='standby' THEN
           CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
           ELSE NULL
         END AS replication_time_lag,
         CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
         AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
    FROM repmgr.monitoring_history m
    JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
   WHERE (m.standby_node_id, m.last_monitor_time) IN (
 	          SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
 			    FROM repmgr.monitoring_history m1 GROUP BY 1
         );
--- a/repmgr--4.2.sql
+++ b/repmgr--4.2.sql
@@ -0,0 +1,197 @@
 -- complain if script is sourced in psql, rather than via CREATE EXTENSION
 \echo Use "CREATE EXTENSION repmgr" to load this file. \quit
 CREATE TABLE repmgr.nodes (
  node_id          INTEGER     PRIMARY KEY,
  upstream_node_id INTEGER     NULL REFERENCES nodes (node_id) DEFERRABLE,
  active           BOOLEAN     NOT NULL DEFAULT TRUE,
  node_name        TEXT        NOT NULL,
  type             TEXT        NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
  location         TEXT        NOT NULL DEFAULT 'default',
  priority         INT         NOT NULL DEFAULT 100,
  conninfo         TEXT        NOT NULL,
  repluser         VARCHAR(63) NOT NULL,
  slot_name        TEXT        NULL,
  config_file      TEXT        NOT NULL
 );
 CREATE TABLE repmgr.events (
  node_id          INTEGER NOT NULL,
  event            TEXT NOT NULL,
  successful       BOOLEAN NOT NULL DEFAULT TRUE,
  event_timestamp  TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
  details          TEXT NULL
 );
 DO $repmgr$
 DECLARE
  DECLARE server_version_num INT;
 BEGIN
  SELECT setting
    FROM pg_catalog.pg_settings
   WHERE name = 'server_version_num'
    INTO server_version_num;
  IF server_version_num >= 90400 THEN
    EXECUTE $repmgr_func$
 CREATE TABLE repmgr.monitoring_history (
  primary_node_id                INTEGER NOT NULL,
  standby_node_id                INTEGER NOT NULL,
  last_monitor_time              TIMESTAMP WITH TIME ZONE NOT NULL,
  last_apply_time                TIMESTAMP WITH TIME ZONE,
  last_wal_primary_location      PG_LSN NOT NULL,
  last_wal_standby_location      PG_LSN,
  replication_lag                BIGINT NOT NULL,
  apply_lag                      BIGINT NOT NULL
 )
    $repmgr_func$;
  ELSE
    EXECUTE $repmgr_func$
 CREATE TABLE repmgr.monitoring_history (
  primary_node_id                INTEGER NOT NULL,
  standby_node_id                INTEGER NOT NULL,
  last_monitor_time              TIMESTAMP WITH TIME ZONE NOT NULL,
  last_apply_time                TIMESTAMP WITH TIME ZONE,
  last_wal_primary_location      TEXT NOT NULL,
  last_wal_standby_location      TEXT,
  replication_lag                BIGINT NOT NULL,
  apply_lag                      BIGINT NOT NULL
 )
    $repmgr_func$;
  END IF;
 END$repmgr$;
 CREATE INDEX idx_monitoring_history_time
          ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
 CREATE VIEW repmgr.show_nodes AS
   SELECT n.node_id,
          n.node_name,
          n.active,
          n.upstream_node_id,
          un.node_name AS upstream_node_name,
          n.type,
          n.priority,
          n.conninfo
     FROM repmgr.nodes n
 LEFT JOIN repmgr.nodes un
       ON un.node_id = n.upstream_node_id;
 /* XXX update upgrade scripts! */
 CREATE TABLE repmgr.voting_term (
  term INT NOT NULL
 );
 CREATE UNIQUE INDEX voting_term_restrict
 ON repmgr.voting_term ((TRUE));
 CREATE RULE voting_term_delete AS
   ON DELETE TO repmgr.voting_term
   DO INSTEAD NOTHING;
 /* ================= */
 /* repmgrd functions */
 /* ================= */
 /* monitoring functions */
 CREATE FUNCTION set_local_node_id(INT)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'set_local_node_id'
  LANGUAGE C STRICT;
 CREATE FUNCTION get_local_node_id()
  RETURNS INT
  AS 'MODULE_PATHNAME', 'get_local_node_id'
  LANGUAGE C STRICT;
 CREATE FUNCTION standby_set_last_updated()
  RETURNS TIMESTAMP WITH TIME ZONE
  AS 'MODULE_PATHNAME', 'standby_set_last_updated'
  LANGUAGE C STRICT;
 CREATE FUNCTION standby_get_last_updated()
  RETURNS TIMESTAMP WITH TIME ZONE
  AS 'MODULE_PATHNAME', 'standby_get_last_updated'
  LANGUAGE C STRICT;
 /* failover functions */
 CREATE FUNCTION notify_follow_primary(INT)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'notify_follow_primary'
  LANGUAGE C STRICT;
 CREATE FUNCTION get_new_primary()
  RETURNS INT
  AS 'MODULE_PATHNAME', 'get_new_primary'
  LANGUAGE C STRICT;
 CREATE FUNCTION reset_voting_status()
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'reset_voting_status'
  LANGUAGE C STRICT;
 CREATE FUNCTION am_bdr_failover_handler(INT)
  RETURNS BOOL
  AS 'MODULE_PATHNAME', 'am_bdr_failover_handler'
  LANGUAGE C STRICT;
 CREATE FUNCTION unset_bdr_failover_handler()
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'unset_bdr_failover_handler'
  LANGUAGE C STRICT;
 CREATE FUNCTION get_repmgrd_pid()
  RETURNS INT
  AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
  LANGUAGE C STRICT;
 CREATE FUNCTION get_repmgrd_pidfile()
  RETURNS TEXT
  AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
  LANGUAGE C STRICT;
 CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
  LANGUAGE C STRICT;
 CREATE FUNCTION repmgrd_is_running()
  RETURNS BOOL
  AS 'MODULE_PATHNAME', 'repmgrd_is_running'
  LANGUAGE C STRICT;
 CREATE FUNCTION repmgrd_pause(BOOL)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'repmgrd_pause'
  LANGUAGE C STRICT;
 CREATE FUNCTION repmgrd_is_paused()
  RETURNS BOOL
  AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
  LANGUAGE C STRICT;
 CREATE VIEW repmgr.replication_status AS
  SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
 	     n.type AS node_type, n.active, last_monitor_time,
         CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
         m.last_wal_standby_location,
         CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
         CASE WHEN n.type='standby' THEN
           CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
           ELSE NULL
         END AS replication_time_lag,
         CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
         AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
    FROM repmgr.monitoring_history m
    JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
   WHERE (m.standby_node_id, m.last_monitor_time) IN (
 	          SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
 			    FROM repmgr.monitoring_history m1 GROUP BY 1
         );
--- a/repmgr--unpackaged--4.0.sql
+++ b/repmgr--unpackaged--4.0.sql
@@ -32,7 +32,7 @@ CREATE TABLE repmgr.nodes (
  upstream_node_id INTEGER     NULL REFERENCES repmgr.nodes (node_id) DEFERRABLE,
  active           BOOLEAN     NOT NULL DEFAULT TRUE,
  node_name        TEXT        NOT NULL,
-  type             TEXT        NOT NULL CHECK (type IN('primary','standby','bdr')),
+  type             TEXT        NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
  location         TEXT        NOT NULL DEFAULT 'default',
  priority         INT         NOT NULL DEFAULT 100,
  conninfo         TEXT        NOT NULL,
@@ -54,8 +54,34 @@ SELECT id, upstream_node_id, active, name,
 ALTER TABLE repmgr.repl_events RENAME TO events;
 -- create new table "repmgr.voting_term"
 CREATE TABLE repmgr.voting_term (
  term INT NOT NULL
 );
 CREATE UNIQUE INDEX voting_term_restrict
 ON repmgr.voting_term ((TRUE));
 CREATE RULE voting_term_delete AS
   ON DELETE TO repmgr.voting_term
   DO INSTEAD NOTHING;
 INSERT INTO repmgr.voting_term (term) VALUES (1);
 -- convert "repmgr_$cluster.repl_monitor" to "monitoring_history"
 DO $repmgr$
 DECLARE
  DECLARE server_version_num INT;
 BEGIN
  SELECT setting
    FROM pg_catalog.pg_settings
   WHERE name = 'server_version_num'
    INTO server_version_num;
  IF server_version_num >= 90400 THEN
    EXECUTE $repmgr_func$
 CREATE TABLE repmgr.monitoring_history (
  primary_node_id                INTEGER NOT NULL,
  standby_node_id                INTEGER NOT NULL,
@@ -65,12 +91,32 @@ CREATE TABLE repmgr.monitoring_history (
  last_wal_standby_location      PG_LSN,
  replication_lag                BIGINT NOT NULL,
  apply_lag                      BIGINT NOT NULL
-);
+)
    $repmgr_func$;
    INSERT INTO repmgr.monitoring_history
      (primary_node_id, standby_node_id, last_monitor_time,  last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
    SELECT primary_node, standby_node, last_monitor_time,  last_apply_time, last_wal_primary_location::pg_lsn, last_wal_standby_location::pg_lsn, replication_lag, apply_lag
      FROM repmgr.repl_monitor;
  ELSE
    EXECUTE $repmgr_func$
 CREATE TABLE repmgr.monitoring_history (
  primary_node_id                INTEGER NOT NULL,
  standby_node_id                INTEGER NOT NULL,
  last_monitor_time              TIMESTAMP WITH TIME ZONE NOT NULL,
  last_apply_time                TIMESTAMP WITH TIME ZONE,
  last_wal_primary_location      TEXT NOT NULL,
  last_wal_standby_location      TEXT,
  replication_lag                BIGINT NOT NULL,
  apply_lag                      BIGINT NOT NULL
 )
    $repmgr_func$;
    INSERT INTO repmgr.monitoring_history
      (primary_node_id, standby_node_id, last_monitor_time,  last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
    SELECT primary_node, standby_node, last_monitor_time,  last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag
      FROM repmgr.repl_monitor;
-INSERT INTO repmgr.monitoring_history
-  (primary_node_id, standby_node_id, last_monitor_time,  last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
-SELECT primary_node, standby_node, last_monitor_time,  last_apply_time, last_wal_primary_location::pg_lsn, last_wal_standby_location::pg_lsn, replication_lag, apply_lag
-  FROM repmgr.repl_monitor;
+  END IF;
+END$repmgr$;
 CREATE INDEX idx_monitoring_history_time
          ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
@@ -95,6 +141,16 @@ LEFT JOIN repmgr.nodes un
 /* monitoring functions */
 CREATE FUNCTION set_local_node_id(INT)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'set_local_node_id'
  LANGUAGE C STRICT;
 CREATE FUNCTION get_local_node_id()
  RETURNS INT
  AS 'MODULE_PATHNAME', 'get_local_node_id'
  LANGUAGE C STRICT;
 CREATE FUNCTION standby_set_last_updated()
  RETURNS TIMESTAMP WITH TIME ZONE
  AS '$libdir/repmgr', 'standby_set_last_updated'
@@ -108,26 +164,6 @@ CREATE FUNCTION standby_get_last_updated()
 /* failover functions */
 CREATE FUNCTION request_vote(INT,INT)
  RETURNS pg_lsn
  AS '$libdir/repmgr', 'request_vote'
  LANGUAGE C STRICT;
 CREATE FUNCTION get_voting_status()
  RETURNS INT
  AS '$libdir/repmgr', 'get_voting_status'
  LANGUAGE C STRICT;
 CREATE FUNCTION set_voting_status_initiated()
  RETURNS INT
  AS '$libdir/repmgr', 'set_voting_status_initiated'
  LANGUAGE C STRICT;
 CREATE FUNCTION other_node_is_candidate(INT, INT)
  RETURNS BOOL
  AS '$libdir/repmgr', 'other_node_is_candidate'
  LANGUAGE C STRICT;
 CREATE FUNCTION notify_follow_primary(INT)
  RETURNS VOID
  AS '$libdir/repmgr', 'notify_follow_primary'
--- a/repmgr-action-bdr.c
+++ b/repmgr-action-bdr.c
@@ -1,9 +1,9 @@
 /*
- * repmgr-action-standby.c
+ * repmgr-action-bdr.c
 *
 * Implements BDR-related actions for the repmgr command line utility
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -28,7 +28,7 @@
 /*
 * do_bdr_register()
 *
- * As each BDR node is its own master, registering a BDR node
+ * As each BDR node is its own primary, registering a BDR node
 * will create the repmgr metadata schema if necessary.
 */
 void
@@ -83,17 +83,50 @@ do_bdr_register(void)
 		exit(ERR_BAD_CONFIG);
 	}
-	if (bdr_nodes.node_count > 2)
+	/* BDR 2 implementation is for 2 nodes only */
 	if (get_bdr_version_num() < 3 && bdr_nodes.node_count > 2)
 	{
-		log_error(_("repmgr can only support BDR clusters with 2 nodes"));
+		log_error(_("repmgr can only support BDR 2.x clusters with 2 nodes"));
 		log_detail(_("this BDR cluster has %i nodes"), bdr_nodes.node_count);
 		PQfinish(conn);
 		pfree(dbname);
 		exit(ERR_BAD_CONFIG);
 	}
-	/* check whether repmgr extension exists, and that any other nodes are BDR */
-	extension_status = get_repmgr_extension_status(conn);
+	/* check for a matching BDR node */
+	{
 		PQExpBufferData bdr_local_node_name;
 		bool		node_match = false;
 		initPQExpBuffer(&bdr_local_node_name);
 		node_match = bdr_node_name_matches(conn, config_file_options.node_name, &bdr_local_node_name);
 		if (node_match == false)
 		{
 			if (strlen(bdr_local_node_name.data))
 			{
 				log_error(_("local node BDR node name is \"%s\", expected: \"%s\""),
 						  bdr_local_node_name.data,
 						  config_file_options.node_name);
 				log_hint(_("\"node_name\" in repmgr.conf must match \"node_name\" in bdr.bdr_nodes"));
 			}
 			else
 			{
 				log_error(_("local node does not report BDR node name"));
 				log_hint(_("ensure this is an active BDR node"));
 			}
 			PQfinish(conn);
 			pfree(dbname);
 			termPQExpBuffer(&bdr_local_node_name);
 			exit(ERR_BAD_CONFIG);
 		}
 		termPQExpBuffer(&bdr_local_node_name);
 	}
 	/* check whether repmgr extension exists, and there are no non-BDR nodes registered */
 	extension_status = get_repmgr_extension_status(conn, NULL);
 	if (extension_status == REPMGR_UNKNOWN)
 	{
@@ -142,17 +175,10 @@ do_bdr_register(void)
 	pfree(dbname);
-	/* check for a matching BDR node */
+	if (bdr_node_has_repmgr_set(conn, config_file_options.node_name) == false)
 	{
-		bool		node_exists = bdr_node_exists(conn, config_file_options.node_name);
-
-		if (node_exists == false)
-		{
-			log_error(_("no BDR node with node_name \"%s\" found"), config_file_options.node_name);
-			log_hint(_("\"node_name\" in repmgr.conf must match \"node_name\" in bdr.bdr_nodes"));
-			PQfinish(conn);
-			exit(ERR_BAD_CONFIG);
-		}
+		log_debug("bdr_node_has_repmgr_set() = false");
+		bdr_node_set_repmgr_set(conn, config_file_options.node_name);
 	}
 	/*
@@ -165,7 +191,7 @@ do_bdr_register(void)
 	{
 		NodeInfoList local_node_records = T_NODE_INFO_LIST_INITIALIZER;
-		get_all_node_records(conn, &local_node_records);
+		(void) get_all_node_records(conn, &local_node_records);
 		if (local_node_records.node_count == 0)
 		{
@@ -177,6 +203,7 @@ do_bdr_register(void)
 			if (bdr_nodes.node_count == 0)
 			{
 				log_error(_("unable to retrieve any BDR node records"));
 				log_detail("%s", PQerrorMessage(conn));
 				PQfinish(conn);
 				exit(ERR_BAD_CONFIG);
 			}
@@ -205,14 +232,14 @@ do_bdr_register(void)
 				}
 				/* check repmgr schema exists, skip if not */
-				other_node_extension_status = get_repmgr_extension_status(bdr_node_conn);
+				other_node_extension_status = get_repmgr_extension_status(bdr_node_conn, NULL);
 				if (other_node_extension_status != REPMGR_INSTALLED)
 				{
 					continue;
 				}
-				get_all_node_records(bdr_node_conn, &existing_nodes);
+				(void) get_all_node_records(bdr_node_conn, &existing_nodes);
 				for (cell = existing_nodes.head; cell; cell = cell->next)
 				{
@@ -228,7 +255,35 @@ do_bdr_register(void)
 	}
 	/* Add the repmgr extension tables to a replication set */
-	add_extension_tables_to_bdr_replication_set(conn);
+
 	if (get_bdr_version_num() < 3)
 	{
 		add_extension_tables_to_bdr_replication_set(conn);
 	}
 	else
 	{
 		/* this is the only table we need to replicate */
 		char *replication_set = get_default_bdr_replication_set(conn);
 		/*
 		 * this probably won't happen, but we need to be sure we're using
 		 * the replication set metadata correctly...
 		 */
 		if (replication_set == NULL)
 		{
 			log_error(_("unable to retrieve default BDR replication set"));
 			log_hint(_("see preceding messages"));
 			log_debug("check query in get_default_bdr_replication_set()");
 			exit(ERR_BAD_CONFIG);
 		}
 		if (is_table_in_bdr_replication_set(conn, "nodes", replication_set) == false)
 		{
 			add_table_to_bdr_replication_set(conn, "nodes", replication_set);
 		}
 		pfree(replication_set);
 	}
 	initPQExpBuffer(&event_details);
@@ -387,7 +442,7 @@ do_bdr_unregister(void)
 		exit(ERR_BAD_CONFIG);
 	}
-	extension_status = get_repmgr_extension_status(conn);
+	extension_status = get_repmgr_extension_status(conn, NULL);
 	if (extension_status != REPMGR_INSTALLED)
 	{
 		log_error(_("repmgr is not installed on database \"%s\""), dbname);
--- a/repmgr-action-bdr.h
+++ b/repmgr-action-bdr.h
@@ -1,6 +1,6 @@
 /*
 * repmgr-action-bdr.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
--- a/repmgr-action-cluster.c
+++ b/repmgr-action-cluster.c
		`@@ -0,0 +1,2 @@`
							`-- complain if script is sourced in psql, rather than via CREATE EXTENSION`
							`\echo Use "CREATE EXTENSION repmgr" to load this file. \quit`