Finalize 4.0.5 release

doc: add notes about package compatibility
We need to emphasise that the repmgr packages are only compatible with packages based on the PGDG filesystem layout; 3rd party vendor packages often put application and data directories elsewhere. See e.g. GitHub #427.
2026-03-23 15:16:29 +00:00 · 2018-05-01 11:26:30 +09:00 · 2018-05-01 11:08:59 +09:00 · 2018-05-01 10:27:59 +09:00 · 2018-05-01 10:13:44 +09:00 · 2018-05-01 09:21:32 +09:00
117 changed files with 16357 additions and 5228 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -39,6 +39,10 @@ lib*.pc

 # test output
 /results/
+/regression.diffs
+/regression.out
+
+/doc/Makefile

 # other
 /.lineno
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -2,7 +2,7 @@ License and Contributions
 =========================

 `repmgr` is licensed under the GPL v3.  All of its code and documentation is
-Copyright 2010-2017, 2ndQuadrant Limited.  See the files COPYRIGHT and LICENSE for
+Copyright 2010-2018, 2ndQuadrant Limited.  See the files COPYRIGHT and LICENSE for
 details.

 The development of repmgr has primarily been sponsored by 2ndQuadrant customers.
@@ -28,4 +28,3 @@ project. For more details see:

 Contributors should reformat their code similarly before submitting code to
 the project, in order to minimize merge conflicts with other work.
->>>>>>> Add further documentation files
--- a/4
+++ b/4
@@ -1,4 +1,4 @@
-Copyright (c) 2010-2017, 2ndQuadrant Limited
+Copyright (c) 2010-2018, 2ndQuadrant Limited
 All rights reserved.

 This program is free software: you can redistribute it and/or modify
@@ -12,5 +12,5 @@ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.

 You should have received a copy of the GNU General Public License
-along with this program.  If not, see http://www.gnu.org/licenses/
+along with this program.  If not, see https://www.gnu.org/licenses/
 to obtain one.
--- a/FAQ.md
+++ b/FAQ.md
@@ -0,0 +1,8 @@
+FAQ - Frequently Asked Questions about repmgr
+=============================================
+
+The repmgr 4 FAQ is located here: [repmgr FAQ (Frequently Asked Questions)](https://repmgr.org/docs/4.0/appendix-faq.html "repmgr FAQ")
+
+The repmgr 3.x FAQ can be found here:
+
+    https://github.com/2ndQuadrant/repmgr/blob/REL3_3_STABLE/FAQ.md
--- a/99
+++ b/99
@@ -1,6 +1,97 @@
-4.0     2017-10-04
-        Complete rewrite with many changes; see file "doc/upgrading-from-repmgr3.md"
-        for details.
+4.0.5   2018-05-02
+        repmgr: poll demoted primary after restart as a standby during a
+          switchover operation; GitHub #408 (Ian)
+        repmgr: add configuration parameter "config_directory"; GitHub #424 (Ian)
+        repmgr: add "dbname=replication" to all replication connection strings;
+          GitHub #421 (Ian)
+        repmgr: add sanity check if --upstream-node-id not supplied when executing
+          "standby register"; GitHub #395 (Ian)
+        repmgr: enable provision of "archive_cleanup_command" in recovery.conf;
+          GitHub #416 (Ian)
+        repmgr: actively check for node to rejoin cluster; GitHub #415 (Ian)
+        repmgr: enable pg_rewind to be used with PostgreSQL 9.3/9.4; GitHub #413 (Ian)
+        repmgr: fix minimum accepted value for "degraded_monitoring_timeout";
+          GitHub #411 (Ian)
+        repmgr: fix superuser password handling; GitHub #400 (Ian)
+        repmgr: fix parsing of "archive_ready_critical" configuration file
+          parameter; GitHub #426 (Ian)
+        repmgr: fix display of conninfo parsing error messages (Ian)
+        repmgr: fix "repmgr cluster crosscheck" output; GitHub #389 (Ian)
+        repmgrd: prevent standby connection handle from going stale (Ian)
+        repmgrd: fix memory leaks in witness code; GitHub #402 (AndrzejNowicki, Martín)
+        repmgrd: handle "pg_ctl promote" timeout; GitHub #425 (Ian)
+        repmgrd: handle failover situation with only two nodes in the primary
+          location, and at least one node in another location; GitHub #407 (Ian)
+        repmgrd: set "connect_timeout=2" when pinging a server (Ian)
+
+4.0.4   2018-03-09
+        repmgr: add "standby clone --recovery-conf-only" option; GitHub #382 (Ian)
+        repmgr: make "standby promote" timeout values configurable; GitHub #387 (Ian)
+        repmgr: improve replication slot warnings generated by "node status";
+          GitHub #385 (Ian)
+        repmgr: remove restriction on replication slots when cloning from
+          a Barman server; GitHub #379 (Ian)
+        repmgr: ensure "node rejoin" honours "--dry-run" option; GitHub #383 (Ian)
+        repmgr: fix --superuser handling when cloning a standby; GitHub #380 (Ian)
+        repmgr: update various help options; GitHub #391, #392 (hasegeli)
+        repmgrd: add event "repmgrd_shutdown"; GitHub #393 (Ian)
+        repmgrd: improve detection of status change from primary to standby (Ian)
+        repmgrd: improve log output in various situations (Ian)
+        repmgrd: improve reconnection to the local node after a failover (Ian)
+        repmgrd: ensure witness server connects to new primary after a failover (Ian)
+
+4.0.3   2018-02-15
+        repmgr: improve switchover handling when "pg_ctl" used to control the
+          server and logging output is not explicitly redirected (Ian)
+        repmgr: improve switchover log messages and exit code when old primary could
+          not be shut down cleanly (Ian)
+        repmgr: check demotion candidate can make a replication connection to the
+          promotion candidate before executing a switchover; GitHub #370 (Ian)
+        repmgr: add check for sufficient walsenders/replication slots before executing
+          a switchover; GitHub #371 (Ian)
+        repmgr: add --dry-run mode to "repmgr standby follow"; GitHub #368 (Ian)
+        repmgr: provide information about the primary node for "standby_register" and
+          "standby_follow" event notifications; GitHub #375 (Ian)
+        repmgr: add "standby_register_sync" event notification; GitHub #374 (Ian)
+        repmgr: output any connection error messages in "cluster show"'s list of
+          warnings; GitHub #369 (Ian)
+        repmgr: ensure an inactive data directory can be deleted; GitHub #366 (Ian)
+        repmgr: fix upstream node display in "repmgr node status"; GitHub #363 (fanf2)
+        repmgr: improve/clarify documentation and update --help output for
+          "primary unregister"; GitHub #373 (Ian)
+        repmgr: allow replication slots when Barman is configured; GitHub #379 (Ian)
+        repmgr: fix parsing of "pg_basebackup_options"; GitHub #376 (Ian)
+        repmgr: ensure "pg_subtrans" directory is created when cloning a standby in
+          Barman mode (Ian)
+        repmgr: fix primary node check in "witness register"; GitHub #377 (Ian)
+
+4.0.2   2018-01-18
+        repmgr: add missing -W option to getopt_long() invocation; GitHub #350 (Ian)
+        repmgr: automatically create slot name if missing; GitHub #343 (Ian)
+        repmgr: fixes to parsing output of remote repmgr invocations; GitHub #349 (Ian)
+        repmgr: BDR support - create missing connection replication set
+          if required; GitHub #347 (Ian)
+        repmgr: handle missing node record in "repmgr node rejoin"; GitHub #358 (Ian)
+        repmgr: enable documentation to be built as single HTML file; GitHub #353 (fanf2)
+        repmgr: recognize "--terse" option for "repmgr cluster event"; GitHub #360 (Ian)
+        repmgr: add "--wait-start" option for "repmgr standby register"; GitHub #356 (Ian)
+        repmgr: add "%p" event notification parameter for "repmgr standby switchover"
+          containing the node ID of the demoted primary (Ian)
+        docs: various fixes and updates (Ian, Daymel, Martín, ams)
+
+4.0.1   2017-12-13
+        repmgr: ensure "repmgr node check --action=" returns appropriate return
+          code; GitHub #340 (Ian)
+        repmgr: add missing schema qualification in get_all_node_records_with_upstream()
+          query; GitHub #341 (Martín)
+        repmgr: initialise "voting_term" table in application, not extension SQL;
+          GitHub #344 (Ian)
+        repmgr: delete any replication slots copied by pg_rewind; GitHub #334 (Ian)
+        repmgr: fix configuration file sanity check; GitHub #342 (Ian)
+
+4.0.0   2017-11-21
+        Complete rewrite with many changes; for details see the repmgr 4.0.0 release
+        notes at: https://repmgr.org/docs/4.0/release-4.0.0.html

 3.3.2   2017-06-01
        Add support for PostgreSQL 10 (Ian)
@@ -223,7 +314,7 @@
        Add a ssh_options parameter (Jay Taylor)

 2.0beta1 2012-07-27
-        Make CLONE command try to make an exact copy including $PGDATA location (Cedric) 
+        Make CLONE command try to make an exact copy including $PGDATA location (Cedric)
        Add detection of master failure (Jaime)
        Add the notion of a witness server (Jaime)
        Add autofailover capabilities (Jaime)
--- a/Makefile.in
+++ b/Makefile.in
@@ -37,9 +37,10 @@ include Makefile.global
 $(info Building against PostgreSQL $(MAJORVERSION))

 REPMGR_CLIENT_OBJS = repmgr-client.o \
-	repmgr-action-primary.o repmgr-action-standby.o repmgr-action-bdr.o repmgr-action-cluster.o repmgr-action-node.o \
+	repmgr-action-primary.o repmgr-action-standby.o repmgr-action-witness.o \
+	repmgr-action-bdr.o repmgr-action-cluster.o repmgr-action-node.o \
 	configfile.o log.o strutil.o controldata.o dirutil.o compat.o dbutils.o
-REPMGRD_OBJS = repmgrd.o repmgrd-physical.o repmgrd-bdr.o configfile.o log.o dbutils.o strutil.o controldata.o
+REPMGRD_OBJS = repmgrd.o repmgrd-physical.o repmgrd-bdr.o configfile.o log.o dbutils.o strutil.o controldata.o compat.o
 DATE=$(shell date "+%Y-%m-%d")

 repmgr_version.h: repmgr_version.h.in
@@ -63,6 +64,12 @@ Makefile: Makefile.in config.status configure
 Makefile.global: Makefile.global.in config.status configure
 	./config.status $@

+doc:
+	$(MAKE) -C doc all
+
+install-doc:
+	$(MAKE) -C doc install
+
 clean: additional-clean

 maintainer-clean: additional-maintainer-clean
@@ -71,6 +78,7 @@ additional-clean:
 	rm -f repmgr-client.o
 	rm -f repmgr-action-primary.o
 	rm -f repmgr-action-standby.o
+	rm -f repmgr-action-witness.o
 	rm -f repmgr-action-bdr.o
 	rm -f repmgr-action-node.o
 	rm -f repmgr-action-cluster.o
--- a/README.md
+++ b/README.md
--- a/TODO.md
+++ b/TODO.md
@@ -0,0 +1,20 @@
+TODO
+====
+
+This file contains a list of improvements which are desirable and/or have
+been requested, and which we aim to address/implement when time and resources
+permit.
+
+It is *not* a roadmap and there's no guarantee of any item being implemented
+within any given timeframe.
+
+
+Enable suspension of repmgrd failover
+-------------------------------------
+
+When performing maintenance, e.g. a switchover, it's necessary to stop all
+repmgrd nodes to prevent unintended failover; this is obviously inconvenient.
+We'll need to implement some way of notifying each repmgrd to suspend automatic
+failover until further notice.
+
+Requested in GitHub #410 ( https://github.com/2ndQuadrant/repmgr/issues/410 )
--- a/compat.c
+++ b/compat.c
@@ -6,7 +6,7 @@
 *    supported PostgreSQL versions. They're unlikely to change but
 *    it would be worth keeping an eye on them for any fixes/improvements.
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
--- a/compat.h
+++ b/compat.h
@@ -1,6 +1,6 @@
 /*
 * compat.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
--- a/config.h.in
+++ b/config.h.in
@@ -1,4 +1,2 @@
 /* config.h.in.  Generated from configure.in by autoheader.  */

-/* Only build repmgr for BDR */
-#undef BDR_ONLY
--- a/configfile.c
+++ b/configfile.c
@@ -1,7 +1,7 @@
 /*
 * config.c - parse repmgr.conf and other configuration-related functionality
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -73,6 +73,59 @@ load_config(const char *config_file, bool verbose, bool terse, t_configuration_o
 		strncpy(config_file_path, config_file, MAXPGPATH);
 		canonicalize_path(config_file_path);

+		/* relative path supplied - convert to absolute path */
+		if (config_file_path[0] != '/')
+		{
+			PQExpBufferData fullpath;
+			char *pwd = NULL;
+
+			initPQExpBuffer(&fullpath);
+
+			/*
+			 * we'll attempt to use $PWD to derive the effective path; getcwd()
+			 * will likely resolve symlinks, which may result in a path which
+			 * isn't permanent (e.g. if filesystem mountpoints change).
+			 */
+			pwd = getenv("PWD");
+
+			if (pwd != NULL)
+			{
+				appendPQExpBuffer(&fullpath,
+								  "%s", pwd);
+			}
+			else
+			{
+				/* $PWD not available - fall back to getcwd() */
+				char cwd[MAXPGPATH] = "";
+
+				if (getcwd(cwd, MAXPGPATH) == NULL)
+				{
+					log_error(_("unable to execute getcwd()"));
+					log_detail("%s", strerror(errno));
+
+					termPQExpBuffer(&fullpath);
+					exit(ERR_BAD_CONFIG);
+				}
+
+				appendPQExpBuffer(&fullpath,
+								  "%s",
+								  cwd);
+			}
+
+			appendPQExpBuffer(&fullpath,
+							  "/%s", config_file_path);
+
+			log_debug("relative configuration file converted to:\n  \"%s\"",
+					  fullpath.data);
+
+			strncpy(config_file_path, fullpath.data, MAXPGPATH);
+
+			termPQExpBuffer(&fullpath);
+
+			canonicalize_path(config_file_path);
+		}
+
+
 		if (stat(config_file_path, &stat_config) != 0)
 		{
 			log_error(_("provided configuration file \"%s\" not found: %s"),
@@ -81,6 +134,7 @@ load_config(const char *config_file, bool verbose, bool terse, t_configuration_o
 			exit(ERR_BAD_CONFIG);
 		}

+
 		if (verbose == true)
 		{
 			log_notice(_("using provided configuration file \"%s\""), config_file);
@@ -234,6 +288,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 	memset(options->node_name, 0, sizeof(options->node_name));
 	memset(options->conninfo, 0, sizeof(options->conninfo));
 	memset(options->data_directory, 0, sizeof(options->data_directory));
+	memset(options->config_directory, 0, sizeof(options->config_directory));
 	memset(options->pg_bindir, 0, sizeof(options->pg_bindir));
 	options->replication_type = REPLICATION_TYPE_PHYSICAL;

@@ -249,7 +304,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 	options->log_status_interval = DEFAULT_LOG_STATUS_INTERVAL;

 	/*-----------------------
-	 * standby action settings
+	 * standby clone settings
 	 *------------------------
 	 */
 	options->use_replication_slots = false;
@@ -260,7 +315,16 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 	options->tablespace_mapping.tail = NULL;
 	memset(options->recovery_min_apply_delay, 0, sizeof(options->recovery_min_apply_delay));
 	options->recovery_min_apply_delay_provided = false;
+	memset(options->archive_cleanup_command, 0, sizeof(options->archive_cleanup_command));
 	options->use_primary_conninfo_password = false;
+	memset(options->passfile, 0, sizeof(options->passfile));
+
+	/*-----------------------
+	 * standby promote settings
+	 *------------------------
+	 */
+	options->promote_check_timeout = DEFAULT_PROMOTE_CHECK_TIMEOUT;
+	options->promote_check_interval = DEFAULT_PROMOTE_CHECK_INTERVAL;

 	/*-----------------
 	 * repmgrd settings
@@ -282,6 +346,13 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 	options->async_query_timeout = DEFAULT_ASYNC_QUERY_TIMEOUT;
 	options->primary_notification_timeout = DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT;
 	options->primary_follow_timeout = DEFAULT_PRIMARY_FOLLOW_TIMEOUT;
+	options->standby_reconnect_timeout = DEFAULT_STANDBY_RECONNECT_TIMEOUT;
+
+	/*-------------
+	 * witness settings
+	 *-------------
+	 */
+	options->witness_sync_interval = DEFAULT_WITNESS_SYNC_INTERVAL;

 	/*-------------
 	 * BDR settings
@@ -394,6 +465,9 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 			strncpy(options->conninfo, value, MAXLEN);
 		else if (strcmp(name, "data_directory") == 0)
 			strncpy(options->data_directory, value, MAXPGPATH);
+		else if (strcmp(name, "config_directory") == 0)
+			strncpy(options->config_directory, value, MAXPGPATH);
+
 		else if (strcmp(name, "replication_user") == 0)
 		{
 			if (strlen(value) < NAMEDATALEN)
@@ -439,13 +513,24 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 			parse_time_unit_parameter(name, value, options->recovery_min_apply_delay, error_list);
 			options->recovery_min_apply_delay_provided = true;
 		}
+		else if (strcmp(name, "archive_cleanup_command") == 0)
+			strncpy(options->archive_cleanup_command, value, MAXLEN);
 		else if (strcmp(name, "use_primary_conninfo_password") == 0)
 			options->use_primary_conninfo_password = parse_bool(value, name, error_list);
+		else if (strcmp(name, "passfile") == 0)
+			strncpy(options->passfile, value, sizeof(options->passfile));
+
+		/* standby promote settings */
+		else if (strcmp(name, "promote_check_timeout") == 0)
+			options->promote_check_timeout = repmgr_atoi(value, name, error_list, 1);
+
+		else if (strcmp(name, "promote_check_interval") == 0)
+			options->promote_check_interval = repmgr_atoi(value, name, error_list, 1);

 		/* node check settings */
 		else if (strcmp(name, "archive_ready_warning") == 0)
 			options->archive_ready_warning = repmgr_atoi(value, name, error_list, 1);
-		else if (strcmp(name, "archive_ready_critcial") == 0)
+		else if (strcmp(name, "archive_ready_critical") == 0)
 			options->archive_ready_critical = repmgr_atoi(value, name, error_list, 1);
 		else if (strcmp(name, "replication_lag_warning") == 0)
 			options->replication_lag_warning = repmgr_atoi(value, name, error_list, 1);
@@ -486,13 +571,19 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 		else if (strcmp(name, "monitoring_history") == 0)
 			options->monitoring_history = parse_bool(value, name, error_list);
 		else if (strcmp(name, "degraded_monitoring_timeout") == 0)
-			options->degraded_monitoring_timeout = repmgr_atoi(value, name, error_list, 1);
+			options->degraded_monitoring_timeout = repmgr_atoi(value, name, error_list, -1);
 		else if (strcmp(name, "async_query_timeout") == 0)
 			options->async_query_timeout = repmgr_atoi(value, name, error_list, 0);
 		else if (strcmp(name, "primary_notification_timeout") == 0)
 			options->primary_notification_timeout = repmgr_atoi(value, name, error_list, 0);
 		else if (strcmp(name, "primary_follow_timeout") == 0)
 			options->primary_follow_timeout = repmgr_atoi(value, name, error_list, 0);
+		else if (strcmp(name, "standby_reconnect_timeout") == 0)
+			options->standby_reconnect_timeout = repmgr_atoi(value, name, error_list, 0);
+
+		/* witness settings */
+		else if (strcmp(name, "witness_sync_interval") == 0)
+			options->witness_sync_interval = repmgr_atoi(value, name, error_list, 1);

 		/* BDR settings */
 		else if (strcmp(name, "bdr_local_monitoring_only") == 0)
@@ -604,7 +695,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 		 * Raise an error if a known parameter is provided with an empty
 		 * value. Currently there's no reason why empty parameters are needed;
 		 * if we want to accept those, we'd need to add stricter default
-		 * checking, as currently e.g. an empty `node` value will be converted
+		 * checking, as currently e.g. an empty `node_id` value will be converted
 		 * to '0'.
 		 */
 		if (known_parameter == true && !strlen(value))
@@ -677,7 +768,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 		item_list_append(error_list,
 						 _("use \"barman_host\" for the hostname of the Barman server"));
 		item_list_append(error_list,
-						 _("use \"barman_server\" for the name of the [server] section in the Barman configururation file"));
+						 _("use \"barman_server\" for the name of the [server] section in the Barman configuration file"));

 	}

@@ -961,7 +1052,7 @@ reload_config(t_configuration_options *orig_options)
 		return false;
 	}

-	if (strcmp(new_options.node_name, orig_options->node_name) != 0)
+	if (strncmp(new_options.node_name, orig_options->node_name, MAXLEN) != 0)
 	{
 		log_warning(_("\"node_name\" cannot be changed, keeping current configuration"));
 		return false;
@@ -1005,7 +1096,7 @@ reload_config(t_configuration_options *orig_options)
 	}

 	/* conninfo */
-	if (strcmp(orig_options->conninfo, new_options.conninfo) != 0)
+	if (strncmp(orig_options->conninfo, new_options.conninfo, MAXLEN) != 0)
 	{
 		/* Test conninfo string works */
 		conn = establish_db_connection(new_options.conninfo, false);
@@ -1032,7 +1123,7 @@ reload_config(t_configuration_options *orig_options)
 	}

 	/* event_notification_command */
-	if (strcmp(orig_options->event_notification_command, new_options.event_notification_command) != 0)
+	if (strncmp(orig_options->event_notification_command, new_options.event_notification_command, MAXLEN) != 0)
 	{
 		strncpy(orig_options->event_notification_command, new_options.event_notification_command, MAXLEN);
 		log_info(_("\"event_notification_command\" is now \"%s\""), new_options.event_notification_command);
@@ -1041,7 +1132,7 @@ reload_config(t_configuration_options *orig_options)
 	}

 	/* event_notifications */
-	if (strcmp(orig_options->event_notifications_orig, new_options.event_notifications_orig) != 0)
+	if (strncmp(orig_options->event_notifications_orig, new_options.event_notifications_orig, MAXLEN) != 0)
 	{
 		strncpy(orig_options->event_notifications_orig, new_options.event_notifications_orig, MAXLEN);
 		log_info(_("\"event_notifications\" is now \"%s\""), new_options.event_notifications_orig);
@@ -1061,7 +1152,7 @@ reload_config(t_configuration_options *orig_options)
 	}

 	/* follow_command */
-	if (strcmp(orig_options->follow_command, new_options.follow_command) != 0)
+	if (strncmp(orig_options->follow_command, new_options.follow_command, MAXLEN) != 0)
 	{
 		strncpy(orig_options->follow_command, new_options.follow_command, MAXLEN);
 		log_info(_("\"follow_command\" is now \"%s\""), new_options.follow_command);
@@ -1098,7 +1189,7 @@ reload_config(t_configuration_options *orig_options)


 	/* promote_command */
-	if (strcmp(orig_options->promote_command, new_options.promote_command) != 0)
+	if (strncmp(orig_options->promote_command, new_options.promote_command, MAXLEN) != 0)
 	{
 		strncpy(orig_options->promote_command, new_options.promote_command, MAXLEN);
 		log_info(_("\"promote_command\" is now \"%s\""), new_options.promote_command);
@@ -1138,18 +1229,18 @@ reload_config(t_configuration_options *orig_options)
 	 */

 	/* log_facility */
-	if (strcmp(orig_options->log_facility, new_options.log_facility) != 0)
+	if (strncmp(orig_options->log_facility, new_options.log_facility, MAXLEN) != 0)
 	{
-		strcpy(orig_options->log_facility, new_options.log_facility);
+		strncpy(orig_options->log_facility, new_options.log_facility, MAXLEN);
 		log_info(_("\"log_facility\" is now \"%s\""), new_options.log_facility);

 		log_config_changed = true;
 	}

 	/* log_file */
-	if (strcmp(orig_options->log_file, new_options.log_file) != 0)
+	if (strncmp(orig_options->log_file, new_options.log_file, MAXLEN) != 0)
 	{
-		strcpy(orig_options->log_file, new_options.log_file);
+		strncpy(orig_options->log_file, new_options.log_file, MAXLEN);
 		log_info(_("\"log_file\" is now \"%s\""), new_options.log_file);

 		log_config_changed = true;
@@ -1157,9 +1248,9 @@ reload_config(t_configuration_options *orig_options)


 	/* log_level */
-	if (strcmp(orig_options->log_level, new_options.log_level) != 0)
+	if (strncmp(orig_options->log_level, new_options.log_level, MAXLEN) != 0)
 	{
-		strcpy(orig_options->log_level, new_options.log_level);
+		strncpy(orig_options->log_level, new_options.log_level, MAXLEN);
 		log_info(_("\"log_level\" is now \"%s\""), new_options.log_level);

 		log_config_changed = true;
@@ -1533,31 +1624,109 @@ clear_event_notification_list(t_configuration_options *options)
 }


-bool
-parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_options *backup_options, int server_version_num, ItemList *error_list)
+int
+parse_output_to_argv(const char *string, char ***argv_array)
 {
 	int			options_len = 0;
 	char	   *options_string = NULL;
 	char	   *options_string_ptr = NULL;
+	int			c = 1,
+	   			argc_item = 1;
+	char	   *argv_item = NULL;
+	char	  **local_argv_array = NULL;
+	ItemListCell *cell;

 	/*
 	 * Add parsed options to this list, then copy to an array to pass to
 	 * getopt
 	 */
-	static ItemList option_argv = {NULL, NULL};
+	ItemList option_argv = {NULL, NULL};

-	char	   *argv_item = NULL;
-	int			c,
-				argc_item = 1;
+	options_len = strlen(string) + 1;
+	options_string = pg_malloc0(options_len);
+	options_string_ptr = options_string;
+
+	/* Copy the string before operating on it with strtok() */
+	strncpy(options_string, string, options_len);
+
+	/* Extract arguments into a list and keep a count of the total */
+	while ((argv_item = strtok(options_string_ptr, " ")) != NULL)
+	{
+		item_list_append(&option_argv, trim(argv_item));
+
+		argc_item++;
+
+		if (options_string_ptr != NULL)
+			options_string_ptr = NULL;
+	}
+
+	pfree(options_string);
+
+	/*
+	 * Array of argument values to pass to getopt_long - this will need to
+	 * include an empty string as the first value (normally this would be the
+	 * program name)
+	 */
+	local_argv_array = pg_malloc0(sizeof(char *) * (argc_item + 2));
+
+	/* Insert a blank dummy program name at the start of the array */
+	local_argv_array[0] = pg_malloc0(1);
+
+	/*
+	 * Copy the previously extracted arguments from our list to the array
+	 */
+	for (cell = option_argv.head; cell; cell = cell->next)
+	{
+		int			argv_len = strlen(cell->string) + 1;
+
+		local_argv_array[c] = (char *)pg_malloc0(argv_len);
+
+		strncpy(local_argv_array[c], cell->string, argv_len);
+
+		c++;
+	}
+
+	local_argv_array[c] = NULL;
+
+	item_list_free(&option_argv);
+
+	*argv_array = local_argv_array;
+
+	return argc_item;
+}
+
+
+void
+free_parsed_argv(char ***argv_array)
+{
+	char	  **local_argv_array = *argv_array;
+	int			i = 0;
+
+	while (local_argv_array[i] != NULL)
+	{
+		pfree((char *)local_argv_array[i]);
+		i++;
+	}
+
+	pfree((char **)local_argv_array);
+	*argv_array = NULL;
+}
+
+
+bool
+parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_options *backup_options, int server_version_num, ItemList *error_list)
+{
+	bool		backup_options_ok = true;
+
+	int			c = 0,
+				argc_item = 0;

 	char	  **argv_array = NULL;
-	ItemListCell *cell = NULL;

 	int			optindex = 0;

 	struct option *long_options = NULL;

-	bool		backup_options_ok = true;

 	/* We're only interested in these options */
 	static struct option long_options_9[] =
@@ -1583,56 +1752,12 @@ parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_opti
 	if (!strlen(pg_basebackup_options))
 		return backup_options_ok;

-	options_len = strlen(pg_basebackup_options) + 1;
-	options_string = pg_malloc(options_len);
-	options_string_ptr = options_string;
-
 	if (server_version_num >= 100000)
 		long_options = long_options_10;
 	else
 		long_options = long_options_9;

-	/* Copy the string before operating on it with strtok() */
-	strncpy(options_string, pg_basebackup_options, options_len);
-
-	/* Extract arguments into a list and keep a count of the total */
-	while ((argv_item = strtok(options_string_ptr, " ")) != NULL)
-	{
-		item_list_append(&option_argv, argv_item);
-
-		argc_item++;
-
-		if (options_string_ptr != NULL)
-			options_string_ptr = NULL;
-	}
-
-	/*
-	 * Array of argument values to pass to getopt_long - this will need to
-	 * include an empty string as the first value (normally this would be the
-	 * program name)
-	 */
-	argv_array = pg_malloc0(sizeof(char *) * (argc_item + 2));
-
-	/* Insert a blank dummy program name at the start of the array */
-	argv_array[0] = pg_malloc0(1);
-
-	c = 1;
-
-	/*
-	 * Copy the previously extracted arguments from our list to the array
-	 */
-	for (cell = option_argv.head; cell; cell = cell->next)
-	{
-		int			argv_len = strlen(cell->string) + 1;
-
-		argv_array[c] = pg_malloc0(argv_len);
-
-		strncpy(argv_array[c], cell->string, argv_len);
-
-		c++;
-	}
-
-	argv_array[c] = NULL;
+	argc_item = parse_output_to_argv(pg_basebackup_options, &argv_array);

 	/* Reset getopt's optind variable */
 	optind = 0;
@@ -1676,15 +1801,7 @@ parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_opti
 		backup_options_ok = false;
 	}

-	pfree(options_string);
-
-	{
-		int			i;
-
-		for (i = 0; i < argc_item + 2; i++)
-			pfree(argv_array[i]);
-	}
-	pfree(argv_array);
+	free_parsed_argv(&argv_array);

 	return backup_options_ok;
 }
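The refactoring above moves the tokenising logic out of parse_pg_basebackup_options() into parse_output_to_argv() and free_parsed_argv(). The following is a minimal standalone sketch of the same idea, not the repmgr code itself: it uses plain malloc/calloc instead of pg_malloc0(), omits the trim() step, and the names xcalloc(), split_to_argv() and free_argv() are hypothetical, chosen only for illustration. As in the diff, slot 0 holds an empty placeholder for the program name and the array is NULL-terminated so it can be freed without knowing its length.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate zeroed memory or abort (hypothetical stand-in for pg_malloc0) */
static void *
xcalloc(size_t n, size_t size)
{
	void	   *p = calloc(n, size);

	if (p == NULL)
		abort();
	return p;
}

/*
 * Split a space-separated option string into a NULL-terminated argv-style
 * array. argv[0] is an empty placeholder for the program name, which is
 * what getopt_long() expects to skip. Returns argc, placeholder included.
 */
static int
split_to_argv(const char *input, char ***argv_out)
{
	size_t		len = 0;
	size_t		max_tokens = 0;
	char	   *copy = NULL;
	char	   *token = NULL;
	char	  **argv = NULL;
	int			argc = 1;

	assert(input != NULL && argv_out != NULL);

	len = strlen(input);

	/* n characters hold at most (n + 1) / 2 space-separated tokens */
	max_tokens = (len + 1) / 2;

	/* slots: placeholder + tokens + terminating NULL */
	argv = xcalloc(max_tokens + 2, sizeof(char *));
	argv[0] = xcalloc(1, 1);

	/* strtok() modifies its input, so work on a private copy */
	copy = xcalloc(len + 1, 1);
	memcpy(copy, input, len + 1);

	for (token = strtok(copy, " "); token != NULL; token = strtok(NULL, " "))
	{
		size_t		token_len = strlen(token) + 1;

		argv[argc] = xcalloc(token_len, 1);
		memcpy(argv[argc], token, token_len);
		argc++;
	}

	/* argv[argc] is already NULL because the array was zero-allocated */
	free(copy);

	*argv_out = argv;
	return argc;
}

/* Free every element up to the NULL terminator, then the array itself */
static void
free_argv(char ***argv_ptr)
{
	char	  **argv = *argv_ptr;
	int			i = 0;

	for (i = 0; argv[i] != NULL; i++)
		free(argv[i]);

	free(argv);
	*argv_ptr = NULL;
}
```

A caller would pass the returned count and array to getopt_long() after resetting optind, then call free_argv() once parsing is finished. Walking to the NULL terminator when freeing avoids depending on a separately tracked element count, which is the pattern the old inline loop used.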
--- a/configfile.h
+++ b/configfile.h
@@ -1,7 +1,7 @@
 /*
 * configfile.h
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 *
 * This program is free software: you can redistribute it and/or modify
@@ -73,6 +73,7 @@ typedef struct
 	char		conninfo[MAXLEN];
 	char		replication_user[NAMEDATALEN];
 	char		data_directory[MAXPGPATH];
+	char		config_directory[MAXPGPATH];
 	char		pg_bindir[MAXPGPATH];
 	int			replication_type;

@@ -82,14 +83,20 @@ typedef struct
 	char		log_file[MAXLEN];
 	int			log_status_interval;

-	/* standby action settings */
+	/* standby clone settings */
 	bool		use_replication_slots;
 	char		pg_basebackup_options[MAXLEN];
 	char		restore_command[MAXLEN];
 	TablespaceList tablespace_mapping;
 	char		recovery_min_apply_delay[MAXLEN];
 	bool		recovery_min_apply_delay_provided;
+	char		archive_cleanup_command[MAXLEN];
 	bool		use_primary_conninfo_password;
+	char		passfile[MAXPGPATH];
+
+	/* standby promote settings */
+	int			promote_check_timeout;
+	int			promote_check_interval;

 	/* node check settings */
 	int			archive_ready_warning;
@@ -97,6 +104,9 @@ typedef struct
 	int			replication_lag_warning;
 	int			replication_lag_critical;

+	/* witness settings */
+	int			witness_sync_interval;
+
 	/* repmgrd settings */
 	failover_mode_opt failover;
 	char		location[MAXLEN];
@@ -111,6 +121,7 @@ typedef struct
 	int			async_query_timeout;
 	int			primary_notification_timeout;
 	int			primary_follow_timeout;
+	int			standby_reconnect_timeout;

 	/* BDR settings */
 	bool		bdr_local_monitoring_only;
@@ -149,14 +160,18 @@ typedef struct

 #define T_CONFIGURATION_OPTIONS_INITIALIZER { \
 		/* node information */ \
-		UNKNOWN_NODE_ID, "", "", "", "", "", REPLICATION_TYPE_PHYSICAL,	\
+		UNKNOWN_NODE_ID, "", "", "", "", "", "", REPLICATION_TYPE_PHYSICAL,	\
 		/* log settings */ \
 		"", "", "", DEFAULT_LOG_STATUS_INTERVAL,	\
-		/* standby action settings */ \
-		false, "", "", { NULL, NULL }, "", false, false, \
+		/* standby clone settings */ \
+		false, "", "", { NULL, NULL }, "", false, "", false, "", \
+		/* standby promote settings */ \
+		DEFAULT_PROMOTE_CHECK_TIMEOUT, DEFAULT_PROMOTE_CHECK_INTERVAL, \
 		/* node check settings */ \
 		DEFAULT_ARCHIVE_READY_WARNING, DEFAULT_ARCHIVE_READY_CRITICAL, \
 		DEFAULT_REPLICATION_LAG_WARNING, DEFAULT_REPLICATION_LAG_CRITICAL, \
+		/* witness settings */ \
+		DEFAULT_WITNESS_SYNC_INTERVAL, \
 		/* repmgrd settings */ \
 		FAILOVER_MANUAL, DEFAULT_LOCATION, DEFAULT_PRIORITY, "", "", \
 		DEFAULT_MONITORING_INTERVAL, \
@@ -166,6 +181,7 @@ typedef struct
 		DEFAULT_ASYNC_QUERY_TIMEOUT, \
 		DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT,	\
 		DEFAULT_PRIMARY_FOLLOW_TIMEOUT,	\
+		DEFAULT_STANDBY_RECONNECT_TIMEOUT,	\
 		/* BDR settings */ \
 		false, DEFAULT_BDR_RECOVERY_TIMEOUT, \
 		/* service settings */ \
@@ -242,7 +258,6 @@ typedef struct
 }


-
 void		set_progname(const char *argv0);
 const char *progname(void);

@@ -257,12 +272,15 @@ int repmgr_atoi(const char *s,
 			ItemList *error_list,
 			int minval);

-
 bool parse_pg_basebackup_options(const char *pg_basebackup_options,
 							t_basebackup_options *backup_options,
 							int server_version_num,
 							ItemList *error_list);

+int parse_output_to_argv(const char *string, char ***argv_array);
+void free_parsed_argv(char ***argv_array);
+
+
 /* called by repmgr-client and repmgrd */
 void		exit_with_cli_errors(ItemList *error_list);
 void		print_item_list(ItemList *item_list);
--- a/configure
+++ b/configure
@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.69 for repmgr 4.0beta1.
+# Generated by GNU Autoconf 2.69 for repmgr 4.0.5.
 #
 # Report bugs to <pgsql-bugs@postgresql.org>.
 #
@@ -11,7 +11,7 @@
 # This configure script is free software; the Free Software Foundation
 # gives unlimited permission to copy, distribute and modify it.
 #
-# Copyright (c) 2010-2017, 2ndQuadrant Ltd.
+# Copyright (c) 2010-2018, 2ndQuadrant Ltd.
 ## -------------------- ##
 ## M4sh Initialization. ##
 ## -------------------- ##
@@ -582,8 +582,8 @@ MAKEFLAGS=
 # Identity of this package.
 PACKAGE_NAME='repmgr'
 PACKAGE_TARNAME='repmgr'
-PACKAGE_VERSION='4.0beta1'
-PACKAGE_STRING='repmgr 4.0beta1'
+PACKAGE_VERSION='4.0.5'
+PACKAGE_STRING='repmgr 4.0.5'
 PACKAGE_BUGREPORT='pgsql-bugs@postgresql.org'
 PACKAGE_URL='https://2ndquadrant.com/en/resources/repmgr/'

@@ -633,7 +633,6 @@ SHELL'
 ac_subst_files=''
 ac_user_opts='
 enable_option_checking
-with_bdr_only
 '
      ac_precious_vars='build_alias
 host_alias
@@ -1179,7 +1178,7 @@ if test "$ac_init_help" = "long"; then
  # Omit some internal or obsolete options to make the list less imposing.
  # This message is too long to be a string in the A/UX 3.1 sh.
  cat <<_ACEOF
-\`configure' configures repmgr 4.0beta1 to adapt to many kinds of systems.
+\`configure' configures repmgr 4.0.5 to adapt to many kinds of systems.

 Usage: $0 [OPTION]... [VAR=VALUE]...

@@ -1240,15 +1239,10 @@ fi

 if test -n "$ac_init_help"; then
  case $ac_init_help in
-     short | recursive ) echo "Configuration of repmgr 4.0beta1:";;
+     short | recursive ) echo "Configuration of repmgr 4.0.5:";;
   esac
  cat <<\_ACEOF

-Optional Packages:
-  --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]
-  --without-PACKAGE       do not use PACKAGE (same as --with-PACKAGE=no)
-  --with-bdr-only         BDR-only build
-
 Some influential environment variables:
  PG_CONFIG   Location to find pg_config for target PostgreSQL (default PATH)

@@ -1319,14 +1313,14 @@ fi
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
  cat <<\_ACEOF
-repmgr configure 4.0beta1
+repmgr configure 4.0.5
 generated by GNU Autoconf 2.69

 Copyright (C) 2012 Free Software Foundation, Inc.
 This configure script is free software; the Free Software Foundation
 gives unlimited permission to copy, distribute and modify it.

-Copyright (c) 2010-2017, 2ndQuadrant Ltd.
+Copyright (c) 2010-2018, 2ndQuadrant Ltd.
 _ACEOF
  exit
 fi
@@ -1338,7 +1332,7 @@ cat >config.log <<_ACEOF
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.

-It was created by repmgr $as_me 4.0beta1, which was
+It was created by repmgr $as_me 4.0.5, which was
 generated by GNU Autoconf 2.69.  Invocation command line was

  $ $0 $@
@@ -1694,20 +1688,6 @@ ac_config_headers="$ac_config_headers config.h"



-
-# Check whether --with-bdr_only was given.
-if test "${with_bdr_only+set}" = set; then :
-  withval=$with_bdr_only;
-fi
-
-if test "x$with_bdr_only" != "x"; then :
-
-$as_echo "#define BDR_ONLY \"1\"" >>confdefs.h
-
-
-fi
-
-
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for a sed that does not truncate output" >&5
 $as_echo_n "checking for a sed that does not truncate output... " >&6; }
 if ${ac_cv_path_SED+:} false; then :
@@ -1871,6 +1851,8 @@ ac_config_files="$ac_config_files Makefile"

 ac_config_files="$ac_config_files Makefile.global"

+ac_config_files="$ac_config_files doc/Makefile"
+
 cat >confcache <<\_ACEOF
 # This file is a shell script that caches the results of configure
 # tests run on this system so they can be shared between configure
@@ -2377,7 +2359,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by repmgr $as_me 4.0beta1, which was
+This file was extended by repmgr $as_me 4.0.5, which was
 generated by GNU Autoconf 2.69.  Invocation command line was

  CONFIG_FILES    = $CONFIG_FILES
@@ -2440,7 +2422,7 @@ _ACEOF
 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
 ac_cs_version="\\
-repmgr config.status 4.0beta1
+repmgr config.status 4.0.5
 configured by $0, generated by GNU Autoconf 2.69,
  with options \\"\$ac_cs_config\\"

@@ -2564,6 +2546,7 @@ do
    "config.h") CONFIG_HEADERS="$CONFIG_HEADERS config.h" ;;
    "Makefile") CONFIG_FILES="$CONFIG_FILES Makefile" ;;
    "Makefile.global") CONFIG_FILES="$CONFIG_FILES Makefile.global" ;;
+    "doc/Makefile") CONFIG_FILES="$CONFIG_FILES doc/Makefile" ;;

  *) as_fn_error $? "invalid argument: \`$ac_config_target'" "$LINENO" 5;;
  esac
--- a/configure.in
+++ b/configure.in
@@ -1,17 +1,11 @@
-AC_INIT([repmgr], [4.0beta1], [pgsql-bugs@postgresql.org], [repmgr], [https://2ndquadrant.com/en/resources/repmgr/])
+AC_INIT([repmgr], [4.0.5], [pgsql-bugs@postgresql.org], [repmgr], [https://2ndquadrant.com/en/resources/repmgr/])

-AC_COPYRIGHT([Copyright (c) 2010-2017, 2ndQuadrant Ltd.])
+AC_COPYRIGHT([Copyright (c) 2010-2018, 2ndQuadrant Ltd.])

 AC_CONFIG_HEADER(config.h)

 AC_ARG_VAR([PG_CONFIG], [Location to find pg_config for target PostgreSQL (default PATH)])

-AC_ARG_WITH([bdr_only], [AS_HELP_STRING([--with-bdr-only], [BDR-only build])])
-AS_IF([test "x$with_bdr_only" != "x"],
-    [AC_DEFINE([BDR_ONLY], ["1"], [Only build repmgr for BDR])]
-)
-
-
 AC_PROG_SED

 if test -z "$PG_CONFIG"; then
@@ -65,5 +59,6 @@ AC_SUBST(vpath_build)

 AC_CONFIG_FILES([Makefile])
 AC_CONFIG_FILES([Makefile.global])
+AC_CONFIG_FILES([doc/Makefile])
 AC_OUTPUT

--- a/controldata.c
+++ b/controldata.c
@@ -1,6 +1,6 @@
 /*
 * controldata.c
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
@@ -37,13 +37,8 @@ get_system_identifier(const char *data_directory)
 	uint64		system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;

 	control_file_info = get_controlfile(data_directory);
+	system_identifier = control_file_info->system_identifier;

-	if (control_file_info->control_file_processed == true)
-		system_identifier = control_file_info->control_file->system_identifier;
-	else
-		system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
-
-	pfree(control_file_info->control_file);
 	pfree(control_file_info);

 	return system_identifier;
@@ -57,13 +52,8 @@ get_db_state(const char *data_directory)

 	control_file_info = get_controlfile(data_directory);

-	if (control_file_info->control_file_processed == true)
-		state = control_file_info->control_file->state;
-	else
-		/* if we were unable to parse the control file, assume DB is shut down */
-		state = DB_SHUTDOWNED;
+	state = control_file_info->state;

-	pfree(control_file_info->control_file);
 	pfree(control_file_info);

 	return state;
@@ -78,12 +68,8 @@ get_latest_checkpoint_location(const char *data_directory)

 	control_file_info = get_controlfile(data_directory);

-	if (control_file_info->control_file_processed == false)
-		return InvalidXLogRecPtr;
+	checkPoint = control_file_info->checkPoint;

-	checkPoint = control_file_info->control_file->checkPoint;
-
-	pfree(control_file_info->control_file);
 	pfree(control_file_info);

 	return checkPoint;
@@ -98,16 +84,8 @@ get_data_checksum_version(const char *data_directory)

 	control_file_info = get_controlfile(data_directory);

-	if (control_file_info->control_file_processed == false)
-	{
-		data_checksum_version = -1;
-	}
-	else
-	{
-		data_checksum_version = (int) control_file_info->control_file->data_checksum_version;
-	}
+	data_checksum_version = (int) control_file_info->data_checksum_version;

-	pfree(control_file_info->control_file);
 	pfree(control_file_info);

 	return data_checksum_version;
@@ -139,33 +117,109 @@ describe_db_state(DBState state)


 /*
- * we maintain our own version of get_controlfile() as we need cross-version
+ * We maintain our own version of get_controlfile() as we need cross-version
 * compatibility, and also don't care if the file isn't readable.
 */
 static ControlFileInfo *
 get_controlfile(const char *DataDir)
 {
 	ControlFileInfo *control_file_info;
-	int			fd;
+	FILE	   *fp = NULL;
+	int			fd, ret, version_num;
+	char		PgVersionPath[MAXPGPATH] = "";
 	char		ControlFilePath[MAXPGPATH] = "";
+	char		file_version_string[64] = "";
+	long		file_major, file_minor;
+	char	   *endptr = NULL;
+	void	   *ControlFileDataPtr = NULL;
+	int			expected_size = 0;

 	control_file_info = palloc0(sizeof(ControlFileInfo));
+
+	/* set default values */
 	control_file_info->control_file_processed = false;
-	control_file_info->control_file = palloc0(sizeof(ControlFileData));
+	control_file_info->system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
+	control_file_info->state = DB_SHUTDOWNED;
+	control_file_info->checkPoint = InvalidXLogRecPtr;
+	control_file_info->data_checksum_version = -1;
+
+	/*
+	 * Read PG_VERSION, as we'll need to determine which struct to read
+	 * the control file contents into
+	 */
+	snprintf(PgVersionPath, MAXPGPATH, "%s/PG_VERSION", DataDir);
+
+	fp = fopen(PgVersionPath, "r");
+
+	if (fp == NULL)
+	{
+		log_warning(_("could not open file \"%s\" for reading"),
+					PgVersionPath);
+		log_detail("%s", strerror(errno));
+		return control_file_info;
+	}
+
+	file_version_string[0] = '\0';
+
+	ret = fscanf(fp, "%63s", file_version_string);
+	fclose(fp);
+
+	file_major = strtol(file_version_string, &endptr, 10);
+
+	if (ret != 1 || endptr == file_version_string)
+	{
+		log_warning(_("unable to determine major version number from PG_VERSION"));
+		return control_file_info;
+	}
+
+	file_minor = 0;
+
+	if (*endptr == '.')
+		file_minor = strtol(endptr + 1, NULL, 10);
+
+	version_num = ((int) file_major * 10000) + ((int) file_minor * 100);
+
+	if (version_num < 90300)
+	{
+		log_warning(_("Data directory appears to be initialised for %s"), file_version_string);
+		return control_file_info;
+	}
+

 	snprintf(ControlFilePath, MAXPGPATH, "%s/global/pg_control", DataDir);

 	if ((fd = open(ControlFilePath, O_RDONLY | PG_BINARY, 0)) == -1)
 	{
-		log_debug("could not open file \"%s\" for reading: %s",
-				  ControlFilePath, strerror(errno));
+		log_warning(_("could not open file \"%s\" for reading"),
+					ControlFilePath);
+		log_detail("%s", strerror(errno));
 		return control_file_info;
 	}

-	if (read(fd, control_file_info->control_file, sizeof(ControlFileData)) != sizeof(ControlFileData))
+
+	if (version_num >= 90500)
 	{
-		log_debug("could not read file \"%s\": %s",
-				  ControlFilePath, strerror(errno));
+		expected_size = sizeof(ControlFileData95);
+		ControlFileDataPtr = palloc0(expected_size);
+	}
+	else if (version_num >= 90400)
+	{
+		expected_size = sizeof(ControlFileData94);
+		ControlFileDataPtr = palloc0(expected_size);
+	}
+	else if (version_num >= 90300)
+	{
+		expected_size = sizeof(ControlFileData93);
+		ControlFileDataPtr = palloc0(expected_size);
+	}
+
+
+	if (read(fd, ControlFileDataPtr, expected_size) != expected_size)
+	{
+		log_warning(_("could not read file \"%s\""), ControlFilePath);
+		log_detail("%s", strerror(errno));
+		pfree(ControlFileDataPtr);
+		close(fd);
 		return control_file_info;
 	}

@@ -173,6 +227,33 @@ get_controlfile(const char *DataDir)

 	control_file_info->control_file_processed = true;

+	if (version_num >= 90500)
+	{
+		ControlFileData95 *ptr = (struct ControlFileData95 *)ControlFileDataPtr;
+		control_file_info->system_identifier = ptr->system_identifier;
+		control_file_info->state = ptr->state;
+		control_file_info->checkPoint = ptr->checkPoint;
+		control_file_info->data_checksum_version = ptr->data_checksum_version;
+	}
+	else if (version_num >= 90400)
+	{
+		ControlFileData94 *ptr = (struct ControlFileData94 *)ControlFileDataPtr;
+		control_file_info->system_identifier = ptr->system_identifier;
+		control_file_info->state = ptr->state;
+		control_file_info->checkPoint = ptr->checkPoint;
+		control_file_info->data_checksum_version = ptr->data_checksum_version;
+	}
+	else if (version_num >= 90300)
+	{
+		ControlFileData93 *ptr = (struct ControlFileData93 *)ControlFileDataPtr;
+		control_file_info->system_identifier = ptr->system_identifier;
+		control_file_info->state = ptr->state;
+		control_file_info->checkPoint = ptr->checkPoint;
+		control_file_info->data_checksum_version = ptr->data_checksum_version;
+	}
+
+	pfree(ControlFileDataPtr);
+
 	/*
 	 * We don't check the CRC here as we're potentially checking a pg_control
 	 * file from a different PostgreSQL version to the one repmgr was compiled
--- a/controldata.h
+++ b/controldata.h
@@ -1,6 +1,6 @@
 /*
 * controldata.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
@@ -12,12 +12,261 @@
 #include "postgres_fe.h"
 #include "catalog/pg_control.h"

+/*
+ * A simplified representation of pg_control containing only those fields
+ * required by repmgr.
+ */
 typedef struct
 {
 	bool		control_file_processed;
-	ControlFileData *control_file;
+	uint64		system_identifier;
+	DBState		state;
+	XLogRecPtr	checkPoint;
+	uint32		data_checksum_version;
 } ControlFileInfo;

+
+
+/* Same for 9.3, 9.4 */
+typedef struct CheckPoint93
+{
+	XLogRecPtr	redo;			/* next RecPtr available when we began to
+								 * create CheckPoint (i.e. REDO start point) */
+	TimeLineID	ThisTimeLineID; /* current TLI */
+	TimeLineID	PrevTimeLineID; /* previous TLI, if this record begins a new
+								 * timeline (equals ThisTimeLineID otherwise) */
+	bool		fullPageWrites; /* current full_page_writes */
+	uint32		nextXidEpoch;	/* higher-order bits of nextXid */
+	TransactionId nextXid;		/* next free XID */
+	Oid			nextOid;		/* next free OID */
+	MultiXactId nextMulti;		/* next free MultiXactId */
+	MultiXactOffset nextMultiOffset;	/* next free MultiXact offset */
+	TransactionId oldestXid;	/* cluster-wide minimum datfrozenxid */
+	Oid			oldestXidDB;	/* database with minimum datfrozenxid */
+	MultiXactId oldestMulti;	/* cluster-wide minimum datminmxid */
+	Oid			oldestMultiDB;	/* database with minimum datminmxid */
+	pg_time_t	time;			/* time stamp of checkpoint */
+
+	TransactionId oldestActiveXid;
+} CheckPoint93;
+
+
+/* Same for 9.5, 9.6, 10, HEAD */
+typedef struct CheckPoint95
+{
+	XLogRecPtr	redo;			/* next RecPtr available when we began to
+								 * create CheckPoint (i.e. REDO start point) */
+	TimeLineID	ThisTimeLineID; /* current TLI */
+	TimeLineID	PrevTimeLineID; /* previous TLI, if this record begins a new
+								 * timeline (equals ThisTimeLineID otherwise) */
+	bool		fullPageWrites; /* current full_page_writes */
+	uint32		nextXidEpoch;	/* higher-order bits of nextXid */
+	TransactionId nextXid;		/* next free XID */
+	Oid			nextOid;		/* next free OID */
+	MultiXactId nextMulti;		/* next free MultiXactId */
+	MultiXactOffset nextMultiOffset;	/* next free MultiXact offset */
+	TransactionId oldestXid;	/* cluster-wide minimum datfrozenxid */
+	Oid			oldestXidDB;	/* database with minimum datfrozenxid */
+	MultiXactId oldestMulti;	/* cluster-wide minimum datminmxid */
+	Oid			oldestMultiDB;	/* database with minimum datminmxid */
+	pg_time_t	time;			/* time stamp of checkpoint */
+	TransactionId oldestCommitTsXid;	/* oldest Xid with valid commit
+										 * timestamp */
+	TransactionId newestCommitTsXid;	/* newest Xid with valid commit
+										 * timestamp */
+
+	TransactionId oldestActiveXid;
+} CheckPoint95;
+
+
+typedef struct ControlFileData93
+{
+	uint64		system_identifier;
+
+	uint32		pg_control_version;		/* PG_CONTROL_VERSION */
+	uint32		catalog_version_no;		/* see catversion.h */
+
+	DBState		state;			/* see enum above */
+	pg_time_t	time;			/* time stamp of last pg_control update */
+	XLogRecPtr	checkPoint;		/* last check point record ptr */
+	XLogRecPtr	prevCheckPoint; /* previous check point record ptr */
+
+	CheckPoint93	checkPointCopy; /* copy of last check point record */
+
+	XLogRecPtr	unloggedLSN;	/* current fake LSN value, for unlogged rels */
+
+	XLogRecPtr	minRecoveryPoint;
+	TimeLineID	minRecoveryPointTLI;
+	XLogRecPtr	backupStartPoint;
+	XLogRecPtr	backupEndPoint;
+	bool		backupEndRequired;
+
+	int			wal_level;
+	int			MaxConnections;
+	int			max_prepared_xacts;
+	int			max_locks_per_xact;
+
+	uint32		maxAlign;		/* alignment requirement for tuples */
+	double		floatFormat;	/* constant 1234567.0 */
+
+	uint32		blcksz;			/* data block size for this DB */
+	uint32		relseg_size;	/* blocks per segment of large relation */
+
+	uint32		xlog_blcksz;	/* block size within WAL files */
+	uint32		xlog_seg_size;	/* size of each WAL segment */
+
+	uint32		nameDataLen;	/* catalog name field width */
+	uint32		indexMaxKeys;	/* max number of columns in an index */
+
+	uint32		toast_max_chunk_size;	/* chunk size in TOAST tables */
+
+	/* flag indicating internal format of timestamp, interval, time */
+	bool		enableIntTimes; /* int64 storage enabled? */
+
+	/* flags indicating pass-by-value status of various types */
+	bool		float4ByVal;	/* float4 pass-by-value? */
+	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
+
+	/* Are data pages protected by checksums? Zero if no checksum version */
+	uint32		data_checksum_version;
+
+} ControlFileData93;
+
+
+/*
+ * Following fields added since 9.3:
+ *
+ *	bool		wal_log_hints;
+ *	int			max_worker_processes;
+ *	uint32		loblksize;
+ *
+ */
+typedef struct ControlFileData94
+{
+	uint64		system_identifier;
+
+	uint32		pg_control_version;		/* PG_CONTROL_VERSION */
+	uint32		catalog_version_no;		/* see catversion.h */
+
+	DBState		state;			/* see enum above */
+	pg_time_t	time;			/* time stamp of last pg_control update */
+	XLogRecPtr	checkPoint;		/* last check point record ptr */
+	XLogRecPtr	prevCheckPoint; /* previous check point record ptr */
+
+	CheckPoint93	checkPointCopy; /* copy of last check point record */
+
+	XLogRecPtr	unloggedLSN;	/* current fake LSN value, for unlogged rels */
+
+	XLogRecPtr	minRecoveryPoint;
+	TimeLineID	minRecoveryPointTLI;
+	XLogRecPtr	backupStartPoint;
+	XLogRecPtr	backupEndPoint;
+	bool		backupEndRequired;
+
+	int			wal_level;
+	bool		wal_log_hints;
+	int			MaxConnections;
+	int			max_worker_processes;
+	int			max_prepared_xacts;
+	int			max_locks_per_xact;
+
+	uint32		maxAlign;		/* alignment requirement for tuples */
+	double		floatFormat;	/* constant 1234567.0 */
+
+	uint32		blcksz;			/* data block size for this DB */
+	uint32		relseg_size;	/* blocks per segment of large relation */
+
+	uint32		xlog_blcksz;	/* block size within WAL files */
+	uint32		xlog_seg_size;	/* size of each WAL segment */
+
+	uint32		nameDataLen;	/* catalog name field width */
+	uint32		indexMaxKeys;	/* max number of columns in an index */
+
+	uint32		toast_max_chunk_size;	/* chunk size in TOAST tables */
+	uint32		loblksize;		/* chunk size in pg_largeobject */
+
+	bool		enableIntTimes; /* int64 storage enabled? */
+
+	bool		float4ByVal;	/* float4 pass-by-value? */
+	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
+
+	/* Are data pages protected by checksums? Zero if no checksum version */
+	uint32		data_checksum_version;
+
+} ControlFileData94;
+
+
+
+/*
+ * Following field added since 9.4:
+ *
+ *	bool		track_commit_timestamp;
+ *
+ * Unchanged in 9.6
+ *
+ * In 10, following field appended *after* "data_checksum_version":
+ *
+ *	char		mock_authentication_nonce[MOCK_AUTH_NONCE_LEN];
+ *
+ * (but we don't care about that)
+ */
+
+typedef struct ControlFileData95
+{
+	uint64		system_identifier;
+
+	uint32		pg_control_version;		/* PG_CONTROL_VERSION */
+	uint32		catalog_version_no;		/* see catversion.h */
+
+	DBState		state;			/* see enum above */
+	pg_time_t	time;			/* time stamp of last pg_control update */
+	XLogRecPtr	checkPoint;		/* last check point record ptr */
+	XLogRecPtr	prevCheckPoint; /* previous check point record ptr */
+
+	CheckPoint95	checkPointCopy; /* copy of last check point record */
+
+	XLogRecPtr	unloggedLSN;	/* current fake LSN value, for unlogged rels */
+
+	XLogRecPtr	minRecoveryPoint;
+	TimeLineID	minRecoveryPointTLI;
+	XLogRecPtr	backupStartPoint;
+	XLogRecPtr	backupEndPoint;
+	bool		backupEndRequired;
+
+	int			wal_level;
+	bool		wal_log_hints;
+	int			MaxConnections;
+	int			max_worker_processes;
+	int			max_prepared_xacts;
+	int			max_locks_per_xact;
+	bool		track_commit_timestamp;
+
+	uint32		maxAlign;		/* alignment requirement for tuples */
+	double		floatFormat;	/* constant 1234567.0 */
+
+	uint32		blcksz;			/* data block size for this DB */
+	uint32		relseg_size;	/* blocks per segment of large relation */
+
+	uint32		xlog_blcksz;	/* block size within WAL files */
+	uint32		xlog_seg_size;	/* size of each WAL segment */
+
+	uint32		nameDataLen;	/* catalog name field width */
+	uint32		indexMaxKeys;	/* max number of columns in an index */
+
+	uint32		toast_max_chunk_size;	/* chunk size in TOAST tables */
+	uint32		loblksize;		/* chunk size in pg_largeobject */
+
+	bool		enableIntTimes; /* int64 storage enabled? */
+
+	bool		float4ByVal;	/* float4 pass-by-value? */
+	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
+
+	uint32		data_checksum_version;
+
+} ControlFileData95;
+
+
+
 extern DBState get_db_state(const char *data_directory);
 extern const char *describe_db_state(DBState state);
 extern int	get_data_checksum_version(const char *data_directory);
--- a/dbutils.c
+++ b/dbutils.c
--- a/dbutils.h
+++ b/dbutils.h
@@ -1,7 +1,7 @@
 /*
 * dbutils.h
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -28,7 +28,7 @@
 #include "strutil.h"
 #include "voting.h"

-#define REPMGR_NODES_COLUMNS "node_id, type, upstream_node_id, node_name, conninfo, repluser, slot_name, location, priority, active, config_file, '' AS upstream_node_name "
+#define REPMGR_NODES_COLUMNS "n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo, n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file, '' AS upstream_node_name "
 #define BDR_NODES_COLUMNS "node_sysid, node_timeline, node_dboid, node_status, node_name, node_local_dsn, node_init_from_dsn, node_read_only, node_seq_id"

 #define ERRBUFF_SIZE 512
@@ -38,6 +38,7 @@ typedef enum
 	UNKNOWN = 0,
 	PRIMARY,
 	STANDBY,
+	WITNESS,
 	BDR
 } t_server_type;

@@ -73,17 +74,18 @@ typedef enum
 {
 	NODE_STATUS_UNKNOWN = -1,
 	NODE_STATUS_UP,
+	NODE_STATUS_SHUTTING_DOWN,
 	NODE_STATUS_DOWN,
 	NODE_STATUS_UNCLEAN_SHUTDOWN
 } NodeStatus;

 typedef enum
 {
-	VR_VOTE_REFUSED = -1,
-	VR_POSITIVE_VOTE,
-	VR_NEGATIVE_VOTE
-} VoteRequestResult;
-
+	CONN_UNKNOWN = -1,
+	CONN_OK,
+	CONN_BAD,
+	CONN_ERROR
+} ConnectionStatus;

 typedef enum
 {
@@ -181,11 +183,13 @@ typedef struct s_event_info
 {
 	char	   *node_name;
 	char	   *conninfo_str;
+	int			node_id;
 } t_event_info;

 #define T_EVENT_INFO_INITIALIZER { \
 	NULL, \
- 	NULL \
+	NULL, \
+	UNKNOWN_NODE_ID \
 }


@@ -339,9 +343,6 @@ bool		atobool(const char *value);
 PGconn *establish_db_connection(const char *conninfo,
 						const bool exit_on_error);
 PGconn	   *establish_db_connection_quiet(const char *conninfo);
-PGconn *establish_db_connection_as_user(const char *conninfo,
-								const char *user,
-								const bool exit_on_error);

 PGconn *establish_db_connection_by_params(t_conninfo_param_list *param_list,
 								  const bool exit_on_error);
@@ -352,6 +353,7 @@ PGconn	   *get_primary_connection(PGconn *standby_conn, int *primary_id, char *p
 PGconn	   *get_primary_connection_quiet(PGconn *standby_conn, int *primary_id, char *primary_conninfo_out);

 bool		is_superuser_connection(PGconn *conn, t_connection_user *userinfo);
+void		close_connection(PGconn **conn);

 /* conninfo manipulation functions */
 bool		get_conninfo_value(const char *conninfo, const char *keyword, char *output);
@@ -363,8 +365,9 @@ void		conn_to_param_list(PGconn *conn, t_conninfo_param_list *param_list);
 void		param_set(t_conninfo_param_list *param_list, const char *param, const char *value);
 void		param_set_ine(t_conninfo_param_list *param_list, const char *param, const char *value);
 char	   *param_get(t_conninfo_param_list *param_list, const char *param);
-bool		parse_conninfo_string(const char *conninfo_str, t_conninfo_param_list *param_list, char *errmsg, bool ignore_local_params);
+bool		parse_conninfo_string(const char *conninfo_str, t_conninfo_param_list *param_list, char **errmsg, bool ignore_local_params);
 char	   *param_list_to_string(t_conninfo_param_list *param_list);
+bool		has_passfile(void);

 /* transaction functions */
 bool		begin_transaction(PGconn *conn);
@@ -375,10 +378,8 @@ bool		check_cluster_schema(PGconn *conn);
 /* GUC manipulation functions */
 bool		set_config(PGconn *conn, const char *config_param, const char *config_value);
 bool		set_config_bool(PGconn *conn, const char *config_param, bool state);
-int guc_set(PGconn *conn, const char *parameter, const char *op,
-		const char *value);
-int guc_set_typed(PGconn *conn, const char *parameter, const char *op,
-			  const char *value, const char *datatype);
+int			guc_set(PGconn *conn, const char *parameter, const char *op, const char *value);
+int			guc_set_typed(PGconn *conn, const char *parameter, const char *op, const char *value, const char *datatype);
 bool		get_pg_setting(PGconn *conn, const char *setting, char *output);

 /* server information functions */
@@ -386,10 +387,10 @@ bool		get_cluster_size(PGconn *conn, char *size);
 int			get_server_version(PGconn *conn, char *server_version);
 RecoveryType get_recovery_type(PGconn *conn);
 int			get_primary_node_id(PGconn *conn);
-bool		can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *reason);
 int			get_ready_archive_files(PGconn *conn, const char *data_directory);
 bool		identify_system(PGconn *repl_conn, t_system_identification *identification);
 bool		repmgrd_set_local_node_id(PGconn *conn, int local_node_id);
+int			repmgrd_get_local_node_id(PGconn *conn);

 /* extension functions */
 ExtensionStatus get_repmgr_extension_status(PGconn *conn);
@@ -404,6 +405,8 @@ t_server_type parse_node_type(const char *type);
 const char *get_node_type_string(t_server_type type);

 RecordStatus get_node_record(PGconn *conn, int node_id, t_node_info *node_info);
+RecordStatus get_node_record_with_upstream(PGconn *conn, int node_id, t_node_info *node_info);
+
 RecordStatus get_node_record_by_name(PGconn *conn, const char *node_name, t_node_info *node_info);
 t_node_info *get_node_record_pointer(PGconn *conn, int node_id);

@@ -414,17 +417,23 @@ void		get_all_node_records(PGconn *conn, NodeInfoList *node_list);
 void		get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *nodes);
 void		get_active_sibling_node_records(PGconn *conn, int node_id, int upstream_node_id, NodeInfoList *node_list);
 void		get_node_records_by_priority(PGconn *conn, NodeInfoList *node_list);
-void		get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list);
+bool		get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list);
+bool		get_downstream_nodes_with_missing_slot(PGconn *conn, int this_node_id, NodeInfoList *node_list);

 bool		create_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info);
 bool		update_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info);
 bool		delete_node_record(PGconn *conn, int node);
+bool		truncate_node_records(PGconn *conn);

 bool		update_node_record_set_active(PGconn *conn, int this_node_id, bool active);
 bool		update_node_record_set_primary(PGconn *conn, int this_node_id);
+bool		update_node_record_set_active_standby(PGconn *conn, int this_node_id);
 bool		update_node_record_set_upstream(PGconn *conn, int this_node_id, int new_upstream_node_id);
 bool		update_node_record_status(PGconn *conn, int this_node_id, char *type, int upstream_node_id, bool active);
 bool		update_node_record_conn_priority(PGconn *conn, t_configuration_options *options);
+bool		update_node_record_slot_name(PGconn *primary_conn, int node_id, char *slot_name);
+
+bool		witness_copy_node_records(PGconn *primary_conn, PGconn *witness_conn);

 void		clear_node_info_list(NodeInfoList *nodes);

@@ -438,11 +447,14 @@ void		config_file_list_add(t_configfile_list *list, const char *file, const char
 bool		create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
 bool		create_event_notification(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
 bool		create_event_notification_extended(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details, t_event_info *event_info);
+PGresult   *get_event_records(PGconn *conn, int node_id, const char *node_name, const char *event, bool all, int limit);

 /* replication slot functions */
+void		create_slot_name(char *slot_name, int node_id);
 bool		create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg);
 bool		drop_replication_slot(PGconn *conn, char *slot_name);
 RecordStatus get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record);
+int			get_free_replication_slots(PGconn *conn);

 /* tablespace functions */
 bool		get_tablespace_name_by_location(PGconn *conn, const char *location, char *name);
@@ -453,6 +465,8 @@ int			wait_connection_availability(PGconn *conn, long long timeout);

 /* node availability functions */
 bool		is_server_available(const char *conninfo);
+bool		is_server_available_params(t_conninfo_param_list *param_list);
+void		connection_ping(PGconn *conn);

 /* monitoring functions  */
 void
@@ -474,9 +488,9 @@ bool		delete_monitoring_records(PGconn *primary_conn, int keep_history);


 /* node voting functions */
-NodeVotingStatus get_voting_status(PGconn *conn);
-VoteRequestResult request_vote(PGconn *conn, t_node_info *this_node, t_node_info *other_node, int electoral_term);
-int			set_voting_status_initiated(PGconn *conn);
+void		initialize_voting_term(PGconn *conn);
+int			get_current_term(PGconn *conn);
+void		increment_current_term(PGconn *conn);
 bool		announce_candidature(PGconn *conn, t_node_info *this_node, t_node_info *other_node, int electoral_term);
 void		notify_follow_primary(PGconn *conn, int primary_node_id);
 bool		get_new_primary(PGconn *conn, int *primary_node_id);
@@ -487,24 +501,30 @@ XLogRecPtr	get_current_wal_lsn(PGconn *conn);
 XLogRecPtr	get_last_wal_receive_location(PGconn *conn);
 bool		get_replication_info(PGconn *conn, ReplInfo *replication_info);
 int			get_replication_lag_seconds(PGconn *conn);
-void		get_node_replication_stats(PGconn *conn, t_node_info *node_info);
+void		get_node_replication_stats(PGconn *conn, int server_version_num, t_node_info *node_info);
 bool		is_downstream_node_attached(PGconn *conn, char *node_name);

 /* BDR functions */
 void		get_all_bdr_node_records(PGconn *conn, BdrNodeInfoList *node_list);
 RecordStatus get_bdr_node_record_by_name(PGconn *conn, const char *node_name, t_bdr_node_info *node_info);
 bool		is_bdr_db(PGconn *conn, PQExpBufferData *output);
+bool		is_bdr_db_quiet(PGconn *conn);
 bool		is_active_bdr_node(PGconn *conn, const char *node_name);
 bool		is_bdr_repmgr(PGconn *conn);
 bool		is_table_in_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
 bool		add_table_to_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
 void		add_extension_tables_to_bdr_replication_set(PGconn *conn);
-
-bool		bdr_node_exists(PGconn *conn, const char *node_name);
+bool		bdr_node_name_matches(PGconn *conn, const char *node_name, PQExpBufferData *bdr_local_node_name);
 ReplSlotStatus get_bdr_node_replication_slot_status(PGconn *conn, const char *node_name);
 void		get_bdr_other_node_name(PGconn *conn, int node_id, char *name_buf);

 bool		am_bdr_failover_handler(PGconn *conn, int node_id);
 void		unset_bdr_failover_handler(PGconn *conn);
+bool		bdr_node_has_repmgr_set(PGconn *conn, const char *node_name);
+bool		bdr_node_set_repmgr_set(PGconn *conn, const char *node_name);
+
+/* miscellaneous debugging functions */
+const char *print_node_status(NodeStatus node_status);
+const char *print_pqping_status(PGPing ping_status);

 #endif							/* _REPMGR_DBUTILS_H_ */
--- a/dirutil.c
+++ b/dirutil.c
@@ -3,7 +3,7 @@
 * dirmod.c
 *	  directory handling functions
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -21,6 +21,7 @@

 #include <unistd.h>
 #include <dirent.h>
+#include <signal.h>
 #include <sys/stat.h>
 #include <errno.h>
 #include <stdio.h>
@@ -34,34 +35,33 @@
 #include "dirutil.h"
 #include "strutil.h"
 #include "log.h"
+#include "controldata.h"

 static int	unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf);

+/* PID can be negative if backend is standalone */
+typedef long pgpid_t;


 /*
- * make sure the directory either doesn't exist or is empty
- * we use this function to check the new data directory and
- * the directories for tablespaces
+ * Check if a directory exists, and if so whether it is empty.
 *
- * This is the same check initdb does on the new PGDATA dir
- *
- * Returns 0 if nonexistent, 1 if exists and empty, 2 if not empty,
- * or -1 if trouble accessing directory
+ * This function is used for checking both the data directory
+ * and tablespace directories.
 */
-int
+DataDirState
 check_dir(char *path)
 {
-	DIR		   *chkdir;
-	struct dirent *file;
-	int			result = 1;
+	DIR		   *chkdir = NULL;
+	struct dirent *file = NULL;
+	int			result = DIR_EMPTY;

 	errno = 0;

 	chkdir = opendir(path);

 	if (!chkdir)
-		return (errno == ENOENT) ? 0 : -1;
+		return (errno == ENOENT) ? DIR_NOENT : DIR_ERROR;

 	while ((file = readdir(chkdir)) != NULL)
 	{
@@ -73,25 +73,15 @@ check_dir(char *path)
 		}
 		else
 		{
-			result = 2;			/* not empty */
+			result = DIR_NOT_EMPTY;
 			break;
 		}
 	}

-#ifdef WIN32
-
-	/*
-	 * This fix is in mingw cvs (runtime/mingwex/dirent.c rev 1.4), but not in
-	 * released version
-	 */
-	if (GetLastError() == ERROR_NO_MORE_FILES)
-		errno = 0;
-#endif
-
 	closedir(chkdir);

 	if (errno != 0)
-		return -1;				/* some kind of I/O error? */
+		return DIR_ERROR;				/* some kind of I/O error? */

 	return result;
 }
@@ -106,12 +96,13 @@ create_dir(char *path)
 	if (mkdir_p(path, 0700) == 0)
 		return true;

-	log_error(_("unable to create directory \"%s\": %s"),
-			  path, strerror(errno));
+	log_error(_("unable to create directory \"%s\""), path);
+	log_detail("%s", strerror(errno));

 	return false;
 }

+
 bool
 set_dir_permissions(char *path)
 {
@@ -146,26 +137,6 @@ mkdir_p(char *path, mode_t omode)
 	oumask = 0;
 	retval = 0;

-#ifdef WIN32
-	/* skip network and drive specifiers for win32 */
-	if (strlen(p) >= 2)
-	{
-		if (p[0] == '/' && p[1] == '/')
-		{
-			/* network drive */
-			p = strstr(p + 2, "/");
-			if (p == NULL)
-				return 1;
-		}
-		else if (p[1] == ':' &&
-				 ((p[0] >= 'a' && p[0] <= 'z') ||
-				  (p[0] >= 'A' && p[0] <= 'Z')))
-		{
-			/* local drive */
-			p += 2;
-		}
-	}
-#endif

 	if (p[0] == '/')			/* Skip leading '/'. */
 		++p;
@@ -242,17 +213,91 @@ is_pg_dir(char *path)
 	return false;
 }

+/*
+ * Attempt to determine if a PostgreSQL data directory is in use
+ * by reading the pidfile. This is the same mechanism used by
+ * "pg_ctl".
+ *
+ * This function will abort with appropriate log messages if a file error
+ * is encountered, as the user will need to address the situation before
+ * any further useful progress can be made.
+ */
+PgDirState
+is_pg_running(char *path)
+{
+	long		pid;
+	FILE	   *pidf;
+
+	char pid_file[MAXPGPATH];
+
+	/* it's reasonable to assume the pidfile name will not change */
+	snprintf(pid_file, MAXPGPATH, "%s/postmaster.pid", path);
+
+	pidf = fopen(pid_file, "r");
+	if (pidf == NULL)
+	{
+		/*
+		 * No PID file - PostgreSQL shouldn't be running. From 9.3 (the
+		 * earliest version we care about) removal of the PID file will
+		 * cause the postmaster to shut down, so it's highly unlikely
+		 * that PostgreSQL will still be running.
+		 */
+		if (errno == ENOENT)
+		{
+			return PG_DIR_NOT_RUNNING;
+		}
+		else
+		{
+			log_error(_("unable to open PostgreSQL PID file \"%s\""), pid_file);
+			log_detail("%s", strerror(errno));
+			exit(ERR_BAD_CONFIG);
+		}
+	}
+
+
+	/*
+	 * In the unlikely event we're unable to extract a PID from the PID file,
+	 * log a warning but assume we're not dealing with a running instance
+	 * as PostgreSQL should have shut itself down in these cases anyway.
+	 */
+	if (fscanf(pidf, "%ld", &pid) != 1)
+	{
+		/* Is the file empty? */
+		if (ftell(pidf) == 0 && feof(pidf))
+		{
+			log_warning(_("PostgreSQL PID file \"%s\" is empty"), pid_file);
+		}
+		else
+		{
+			log_warning(_("invalid data in PostgreSQL PID file \"%s\""), pid_file);
+		}
+		fclose(pidf);
+		return PG_DIR_NOT_RUNNING;
+	}
+
+	fclose(pidf);
+
+	if (pid == getpid())
+		return PG_DIR_NOT_RUNNING;
+
+	if (pid == getppid())
+		return PG_DIR_NOT_RUNNING;
+
+	if (kill(pid, 0) == 0)
+		return PG_DIR_RUNNING;
+
+	return PG_DIR_NOT_RUNNING;
+}
+

 bool
 create_pg_dir(char *path, bool force)
 {
-	bool		pg_dir = false;
-
-	/* Check this directory could be used as a PGDATA dir */
+	/* Check this directory can be used as a PGDATA dir */
 	switch (check_dir(path))
 	{
-		case 0:
-			/* dir not there, must create it */
+		case DIR_NOENT:
+			/* directory does not exist, attempt to create it */
 			log_info(_("creating directory \"%s\"..."), path);

 			if (!create_dir(path))
@@ -262,55 +307,62 @@ create_pg_dir(char *path, bool force)
 				return false;
 			}
 			break;
-		case 1:
-			/* Present but empty, fix permissions and use it */
-			log_info(_("checking and correcting permissions on existing directory %s"),
+		case DIR_EMPTY:
+			/* exists but empty, fix permissions and use it */
+			log_info(_("checking and correcting permissions on existing directory \"%s\""),
 					 path);

 			if (!set_dir_permissions(path))
 			{
-				log_error(_("unable to change permissions of directory \"%s\":\n  %s"),
-						  path, strerror(errno));
+				log_error(_("unable to change permissions of directory \"%s\""), path);
+				log_detail("%s", strerror(errno));
 				return false;
 			}
 			break;
-		case 2:
-			/* Present and not empty */
+		case DIR_NOT_EMPTY:
+			/* exists but is not empty */
 			log_warning(_("directory \"%s\" exists but is not empty"),
 						path);

-			pg_dir = is_pg_dir(path);
-
-			if (pg_dir && force)
+			if (is_pg_dir(path))
 			{
-				/* TODO: check DB state, if not running overwrite */
-
-				if (false)
+				if (force == true)
 				{
-					log_notice(_("deleting existing data directory \"%s\""), path);
+					log_notice(_("-F/--force provided - deleting existing data directory \"%s\""), path);
 					nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
+					return true;
 				}
-				/* Let it continue */
-				break;
-			}
-			else if (pg_dir && !force)
-			{
-				log_hint(_("This looks like a PostgreSQL directory.\n"
-						   "If you are sure you want to clone here, "
-						   "please check there is no PostgreSQL server "
-						   "running and use the -F/--force option"));
+
 				return false;
 			}
-
-			return false;
-		default:
+			else
+			{
+				if (force == true)
+				{
+					log_notice(_("deleting existing directory \"%s\""), path);
+					nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
+					return true;
+				}
+				return false;
+			}
+			break;
+		case DIR_ERROR:
 			log_error(_("could not access directory \"%s\": %s"),
 					  path, strerror(errno));
 			return false;
 	}
+
 	return true;
 }

+
+
+int
+rmdir_recursive(char *path)
+{
+	return nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
+}
+
 static int
 unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf)
 {
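The PID-file check that `is_pg_running()` introduces in dirutil.c can be sketched in isolation as below. This is a minimal illustration, not the repmgr implementation: the name `pid_file_process_alive` and its integer return codes are hypothetical, errors are returned to the caller rather than causing an exit, and the handling of a negative PID (negating it, as `pg_ctl` is understood to do for a standalone backend) is an assumption.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of repmgr.
 *
 * Returns 1 if the PID on the first line of "pid_file" refers to a live
 * process other than the caller or its parent, 0 if it does not (or the
 * file is missing, empty or malformed), and -1 on any other error while
 * opening the file.
 */
int
pid_file_process_alive(const char *pid_file)
{
	long		pid = 0;
	int			fields;
	FILE	   *pidf = fopen(pid_file, "r");

	if (pidf == NULL)
		return (errno == ENOENT) ? 0 : -1;

	fields = fscanf(pidf, "%ld", &pid);

	/* close the file on every path, including the parse-failure one */
	fclose(pidf);

	if (fields != 1 || pid == 0)
		return 0;

	/* assumption: a negative PID marks a standalone backend, as in pg_ctl */
	if (pid < 0)
		pid = -pid;

	if (pid == (long) getpid() || pid == (long) getppid())
		return 0;

	/* signal 0 sends nothing; it only checks the process can be signalled */
	return (kill((pid_t) pid, 0) == 0) ? 1 : 0;
}
```

A caller would pass the full path to `postmaster.pid` inside the data directory. A result of 0 only means that no live process was found for the recorded PID, not that the directory is safe to overwrite.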
--- a/dirutil.h
+++ b/dirutil.h
@@ -1,6 +1,6 @@
 /*
 * dirutil.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -19,12 +19,29 @@
 #ifndef _DIRUTIL_H_
 #define _DIRUTIL_H_

+typedef enum
+{
+	DIR_ERROR = -1,
+	DIR_NOENT,
+	DIR_EMPTY,
+	DIR_NOT_EMPTY
+} DataDirState;
+
+typedef enum
+{
+	PG_DIR_ERROR = -1,
+	PG_DIR_NOT_RUNNING,
+	PG_DIR_RUNNING
+} PgDirState;
+
 extern int	mkdir_p(char *path, mode_t omode);
 extern bool set_dir_permissions(char *path);

-extern int	check_dir(char *path);
+extern DataDirState	check_dir(char *path);
 extern bool create_dir(char *path);
 extern bool is_pg_dir(char *path);
+extern PgDirState is_pg_running(char *path);
 extern bool create_pg_dir(char *path, bool force);
+extern int rmdir_recursive(char *path);

 #endif
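The `DataDirState` enum replaces the bare integers (0, 1, 2, -1) previously returned by `check_dir()`. The sketch below shows how a caller might switch on the named states. It is illustrative only: the enum is copied from the dirutil.h hunk above, while `check_dir_sketch` is a simplified stand-in for `check_dir()` and `describe_dir_state` is a hypothetical helper that does not exist in repmgr.

```c
#include <dirent.h>
#include <errno.h>
#include <string.h>

/* copied from the dirutil.h hunk in this patch */
typedef enum
{
	DIR_ERROR = -1,
	DIR_NOENT,
	DIR_EMPTY,
	DIR_NOT_EMPTY
} DataDirState;

/* simplified stand-in for check_dir(), for illustration only */
DataDirState
check_dir_sketch(const char *path)
{
	DIR		   *chkdir = NULL;
	struct dirent *file = NULL;
	DataDirState result = DIR_EMPTY;

	errno = 0;
	chkdir = opendir(path);

	if (chkdir == NULL)
		return (errno == ENOENT) ? DIR_NOENT : DIR_ERROR;

	/* reset errno so a NULL from readdir() can be told apart from an error */
	errno = 0;

	while ((file = readdir(chkdir)) != NULL)
	{
		/* "." and ".." are always present and do not count as content */
		if (strcmp(file->d_name, ".") != 0 && strcmp(file->d_name, "..") != 0)
		{
			result = DIR_NOT_EMPTY;
			break;
		}
	}

	/* readdir() returned NULL with errno set: treat as an I/O error */
	if (result == DIR_EMPTY && errno != 0)
		result = DIR_ERROR;

	closedir(chkdir);

	return result;
}

/* hypothetical helper: map each state to a human-readable description */
const char *
describe_dir_state(DataDirState state)
{
	switch (state)
	{
		case DIR_NOENT:
			return "does not exist";
		case DIR_EMPTY:
			return "exists and is empty";
		case DIR_NOT_EMPTY:
			return "exists but is not empty";
		case DIR_ERROR:
			return "could not be accessed";
	}

	return "unknown";
}
```

Because every enum value has its own `case`, compilers that warn about unhandled enumerators in a `switch` can flag a state that is added later and not handled, which was not possible with the integer return codes.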
--- a/doc/.gitignore
+++ b/doc/.gitignore
@@ -0,0 +1,7 @@
+HTML.index
+bookindex.sgml
+html-stamp
+html/
+nochunks.dsl
+repmgr.html
+version.sgml
--- a/doc/Makefile.in
+++ b/doc/Makefile.in
@@ -0,0 +1,76 @@
+repmgr_subdir = doc
+repmgr_top_builddir = ..
+include $(repmgr_top_builddir)/Makefile.global
+
+ifndef JADE
+JADE = $(missing) jade
+endif
+
+SGMLINCLUDE = -D . -D ${srcdir}
+
+SPFLAGS += -wall -wno-unused-param -wno-empty -wfully-tagged
+
+JADE.html.call = $(JADE) $(JADEFLAGS) $(SPFLAGS) $(SGMLINCLUDE) $(CATALOG) -t sgml -i output-html
+
+ALLSGML := $(wildcard $(srcdir)/*.sgml)
+# to build bookindex
+ALMOSTALLSGML := $(filter-out %bookindex.sgml,$(ALLSGML))
+GENERATED_SGML = version.sgml bookindex.sgml
+
+Makefile: Makefile.in
+	cd $(repmgr_top_builddir) && ./config.status doc/Makefile
+
+all: html
+
+html: html-stamp
+
+html-stamp: repmgr.sgml $(ALLSGML) $(GENERATED_SGML) stylesheet.dsl website-docs.css
+	$(MKDIR_P) html
+	$(JADE.html.call) -d stylesheet.dsl -i include-index $<
+	cp $(srcdir)/stylesheet.css $(srcdir)/website-docs.css html/
+	touch $@
+
+repmgr.html: repmgr.sgml $(ALLSGML) $(GENERATED_SGML) stylesheet.dsl website-docs.css
+	sed '/html-index-filename/a\
+(define nochunks  #t)' <stylesheet.dsl >nochunks.dsl
+	$(JADE.html.call) -d nochunks.dsl -i include-index $< >repmgr.html
+
+version.sgml: ${repmgr_top_builddir}/repmgr_version.h
+	{ \
+	  echo "<!ENTITY repmgrversion \"$(REPMGR_VERSION)\">"; \
+	} > $@
+
+HTML.index: repmgr.sgml $(ALMOSTALLSGML) stylesheet.dsl
+	@$(MKDIR_P) html
+	$(JADE.html.call) -d stylesheet.dsl -V html-index $<
+
+website-docs.css:
+	@$(MKDIR_P) html
+	curl http://www.postgresql.org/media/css/docs.css > ${srcdir}/website-docs.css
+
+bookindex.sgml: HTML.index
+ifdef COLLATEINDEX
+	LC_ALL=C $(PERL) $(COLLATEINDEX) -f -g -i 'bookindex' -o $@ $<
+else
+	@$(missing) collateindex.pl $< $@
+endif
+
+clean:
+	rm -f html-stamp
+	rm -f HTML.index $(GENERATED_SGML)
+
+maintainer-clean:
+	rm -rf html
+	rm -rf Makefile
+
+zip: html
+	cp -r html repmgr-docs-$(REPMGR_VERSION)
+	zip -r repmgr-docs-$(REPMGR_VERSION).zip repmgr-docs-$(REPMGR_VERSION)
+	rm -rf repmgr-docs-$(REPMGR_VERSION)
+
+install: html
+	@$(MKDIR_P) $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
+	@$(INSTALL_DATA) $(wildcard html/*.html) $(wildcard html/*.css) $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
+	@echo Installed docs to $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
+
+.PHONY: html all
--- a/doc/appendix-faq.sgml
+++ b/doc/appendix-faq.sgml
@@ -0,0 +1,307 @@
+<appendix id="appendix-faq" xreflabel="FAQ">
+ <indexterm>
+  <primary>FAQ (Frequently Asked Questions)</primary>
+ </indexterm>
+
+ <title>FAQ (Frequently Asked Questions)</title>
+
+ <sect1 id="faq-general" xreflabel="General">
+  <title>General</title>
+
+  <sect2 id="faq-xrepmgr-version-diff" xreflabel="Version differences">
+    <title>What's the difference between the repmgr versions?</title>
+    <para>
+      &repmgr; 4 is a complete rewrite of the existing &repmgr; code base
+      and implements &repmgr; as a PostgreSQL extension. It
+      supports all PostgreSQL versions from 9.3 (although some &repmgr;
+      features are not available for PostgreSQL 9.3 and 9.4).
+     </para>
+     <para>
+      &repmgr; 3.x builds on the improved replication facilities added
+      in PostgreSQL 9.3, as well as improved automated failover support
+      via <application>repmgrd</application>, and is not compatible with PostgreSQL 9.2
+      and earlier. We recommend upgrading to &repmgr; 4, as the &repmgr; 3.x
+      series will no longer be actively maintained.
+     </para>
+     <para>
+      &repmgr; 2.x supports PostgreSQL 9.0 to 9.3. While it is compatible
+      with PostgreSQL 9.3, we recommend using &repmgr; 4.x. &repmgr; 2.x is
+      no longer maintained.
+     </para>
+  </sect2>
+
+  <sect2 id="faq-replication-slots-advantage" xreflabel="Advantages of replication slots">
+   <title>What's the advantage of using replication slots?</title>
+   <para>
+    Replication slots, introduced in PostgreSQL 9.4, ensure that the
+    primary server will retain WAL files until they have been consumed
+    by all standby servers. This makes WAL file management much easier;
+    when replication slots are in use, &repmgr; no longer insists on a fixed
+    minimum number (default: 5000) of WAL files being retained.
+   </para>
+   <para>
+    However, this does mean that if a standby is no longer connected to the
+    primary, the presence of the replication slot will cause WAL files
+    to be retained indefinitely.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-replication-slots-number" xreflabel="Number of replication slots">
+   <title>How many replication slots should I define in <varname>max_replication_slots</varname>?</title>
+   <para>
+    Normally at least the same number as the number of standbys which will connect
+    to the node. Note that changes to <varname>max_replication_slots</varname> require a server
+    restart to take effect, and as there is no particular penalty for unused
+    replication slots, setting a higher figure will make adding new nodes
+    easier.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-hash-index" xreflabel="Hash indexes">
+   <title>Does &repmgr; support hash indexes?</title>
+   <para>
+    Before PostgreSQL 10, hash indexes were not WAL-logged and are therefore not suitable
+    for use in streaming replication in PostgreSQL 9.6 and earlier. See the
+    <ulink url="https://www.postgresql.org/docs/9.6/static/sql-createindex.html#AEN80279">PostgreSQL documentation</ulink>
+    for details.
+   </para>
+   <para>
+    From PostgreSQL 10, this restriction has been lifted and hash indexes can be used
+    in a streaming replication cluster.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-upgrades" xreflabel="Upgrading PostgreSQL with repmgr">
+   <title>Can &repmgr; assist with upgrading a PostgreSQL cluster?</title>
+   <para>
+     For <emphasis>minor</emphasis> version upgrades, e.g. from 9.6.7 to 9.6.8, a common
+     approach is to upgrade a standby to the latest version, perform a
+     <link linkend="performing-switchover">switchover</link> promoting it to a primary,
+     then upgrade the former primary.
+   </para>
+   <para>
+     For <emphasis>major</emphasis> version upgrades (e.g. from PostgreSQL 9.6 to PostgreSQL 10),
+     the traditional approach is to "reseed" a cluster by upgrading a single
+     node with <ulink url="https://www.postgresql.org/docs/current/static/pgupgrade.html">pg_upgrade</ulink>
+     and recloning standbys from this.
+   </para>
+   <para>
+     To minimize downtime during major upgrades, for more recent PostgreSQL
+     versions (PostgreSQL 9.4 and later),
+     <ulink url="https://www.2ndquadrant.com/en/resources/pglogical/">pglogical</ulink>
+     can be used to set up a parallel cluster using the newer PostgreSQL version,
+     which can be kept in sync with the existing production cluster until the
+     new cluster is ready to be put into production.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-libdir-repmgr-error">
+   <title>What does this error mean: <literal>ERROR: could not access file "$libdir/repmgr"</literal>?</title>
+   <para>
+     It means the &repmgr; extension code is not installed in the
+     PostgreSQL application directory. This typically happens when using PostgreSQL
+     packages provided by a third-party vendor, which often have different
+     filesystem layouts.
+   </para>
+   <para>
+     Use PostgreSQL packages provided by the community or by 2ndQuadrant; if this
+     is not possible, contact your vendor for assistance.
+   </para>
+  </sect2>
+ </sect1>
+
+ <sect1 id="faq-repmgr" xreflabel="repmgr">
+  <title><command>repmgr</command></title>
+
+  <sect2 id="faq-register-existing-node" xreflabel="registering an existing node">
+   <title>Can I register an existing PostgreSQL server with repmgr?</title>
+   <para>
+    Yes, any existing PostgreSQL server which is part of the same replication
+    cluster can be registered with &repmgr;. There's no requirement for a
+    standby to have been cloned using &repmgr;.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgr-clone-other-source" >
+   <title>Can I use a standby not cloned by &repmgr; as a &repmgr; node?</title>
+
+   <para>
+     For a standby which has been manually cloned or recovered from an external
+     backup manager such as Barman, the command
+     <command><link linkend="repmgr-standby-clone">repmgr standby clone --recovery-conf-only</link></command>
+     can be used to create the correct <filename>recovery.conf</filename> file for
+     use with &repmgr; (and will create a replication slot if required). Once this has been done,
+     <link linkend="repmgr-standby-register">register the node</link> as usual.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgr-recovery-conf" >
+    <title>What does &repmgr; write in <filename>recovery.conf</filename>, and what options can be set there?</title>
+   <para>
+     See section <link linkend="repmgr-standby-clone-recovery-conf">Customising recovery.conf</link>.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgr-failed-primary-standby" xreflabel="Reintegrate a failed primary as a standby">
+   <title>How can a failed primary be re-added as a standby?</title>
+   <para>
+    This is a two-stage process. First, the failed primary's data directory
+    must be re-synced with the current primary; second, the failed primary
+    needs to be re-registered as a standby.
+   </para>
+   <para>
+    It's possible to use <command>pg_rewind</command> to re-synchronise the existing data
+    directory, which will usually be much
+    faster than re-cloning the server. However, <command>pg_rewind</command> can only
+    be used if PostgreSQL either has <varname>wal_log_hints</varname> enabled, or
+    data checksums were enabled when the cluster was initialized.
+   </para>
+   <para>
+     Note that <command>pg_rewind</command> is available as part of the core PostgreSQL
+     distribution from PostgreSQL 9.5, and as a third-party utility for PostgreSQL 9.3 and 9.4.
+   </para>
+   <para>
+    &repmgr; provides the command <command>repmgr node rejoin</command> which can
+    optionally execute <command>pg_rewind</command>; see the <xref linkend="repmgr-node-rejoin">
+    documentation for details, in particular the section <xref linkend="repmgr-node-rejoin-pg-rewind">.
+   </para>
+   <para>
+    If <command>pg_rewind</command> cannot be used, then the data directory will need
+    to be re-cloned from scratch.
+   </para>
+
+  </sect2>
+
+  <sect2 id="faq-repmgr-check-configuration" xreflabel="Check PostgreSQL configuration">
+   <title>Is there an easy way to check my primary server is correctly configured for use with &repmgr;?</title>
+   <para>
+    Execute <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>
+    with the <literal>--dry-run</literal> option; this will report any configuration problems
+    which need to be rectified.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgr-clone-skip-config-files" xreflabel="">
+   <title>When cloning a standby, how can I get &repmgr; to copy
+     <filename>postgresql.conf</filename> and <filename>pg_hba.conf</filename> from the PostgreSQL configuration
+     directory in <filename>/etc</filename>?</title>
+   <para>
+    Use the command line option <literal>--copy-external-config-files</literal>. For more details
+    see <xref linkend="repmgr-standby-clone-config-file-copying">.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgr-shared-preload-libaries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
+    <title>Do I need to include <literal>shared_preload_libraries = 'repmgr'</literal>
+      in <filename>postgresql.conf</filename> if I'm not using <application>repmgrd</application>?</title>
+   <para>
+    No, the <literal>repmgr</literal> shared library is only needed when running <application>repmgrd</application>.
+    If you later decide to run <application>repmgrd</application>, you just need to add
+    <literal>shared_preload_libraries = 'repmgr'</literal> and restart PostgreSQL.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgr-permissions" xreflabel="Replication permission problems">
+   <title>I've provided replication permission for the <literal>repmgr</literal> user in <filename>pg_hba.conf</filename>
+     but <command>repmgr</command>/<application>repmgrd</application> complains it can't connect to the server... Why?</title>
+   <para>
+    <command>repmgr</command> and <application>repmgrd</application> need to be able to connect to the repmgr database
+    with a normal connection to query metadata. The <literal>replication</literal> connection
+    permission is for PostgreSQL's streaming replication (and the replication user doesn't necessarily need to be the <literal>repmgr</literal> user).
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgr-clone-provide-primary-conninfo" xreflabel="Providing primary connection parameters">
+   <title>When cloning a standby, why do I need to provide the connection parameters
+     for the primary server on the command line, not in the configuration file?</title>
+   <para>
+    Cloning a standby is a one-time action; the role of the server being cloned
+    from could change, so fixing it in the configuration file would create
+    confusion. If &repmgr; needs to establish a connection to the primary
+    server, it can retrieve this from the <literal>repmgr.nodes</literal> table on the local
+    node, and if necessary scan the replication cluster until it locates the active primary.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgr-clone-waldir-xlogdir" xreflabel="Providing a custom WAL directory">
+   <title>When cloning a standby, how do I ensure the WAL files are placed in a custom directory?</title>
+   <para>
+     Provide the option <literal>--waldir</literal> (<literal>--xlogdir</literal> in PostgreSQL 9.6
+     and earlier) with the absolute path to the WAL directory in <varname>pg_basebackup_options</varname>.
+     For more details see <xref linkend="cloning-advanced-pg-basebackup-options">.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgr-events-no-fkey" xreflabel="No foreign key on node_id in repmgr.events">
+   <title>Why is there no foreign key on the <literal>node_id</literal> column in the <literal>repmgr.events</literal>
+     table?</title>
+   <para>
+     Under some circumstances event notifications can be generated for servers
+     which have not yet been registered; it's also useful to retain a record
+     of events which includes servers removed from the replication cluster
+     which no longer have an entry in the <literal>repmgr.nodes</literal> table.
+   </para>
+  </sect2>
+
+
+
+
+ </sect1>
+
+ <sect1 id="faq-repmgrd" xreflabel="repmgrd">
+  <title><application>repmgrd</application></title>
+
+
+  <sect2 id="faq-repmgrd-prevent-promotion" xreflabel="Prevent standby from being promoted to primary">
+   <title>How can I prevent a node from ever being promoted to primary?</title>
+   <para>
+    In <filename>repmgr.conf</filename>, set the node's <varname>priority</varname> to a value of 0 or less; apply the changed setting with
+    <command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>.
+   </para>
+   <para>
+    Additionally, if <varname>failover</varname> is set to <literal>manual</literal>, the node will never
+    be considered as a promotion candidate.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgrd-delayed-standby" xreflabel="Delayed standby support">
+   <title>Does <application>repmgrd</application> support delayed standbys?</title>
+   <para>
+    <application>repmgrd</application> can monitor delayed standbys - those set up with
+    <varname>recovery_min_apply_delay</varname> set to a non-zero value
+    in <filename>recovery.conf</filename> - but as it's not currently possible
+    to directly examine the value applied to the standby, <application>repmgrd</application>
+    may not be able to properly evaluate the node as a promotion candidate.
+   </para>
+   <para>
+    We recommend that delayed standbys are explicitly excluded from promotion
+    by setting <varname>priority</varname> to <literal>0</literal> in
+    <filename>repmgr.conf</filename>.
+   </para>
+   <para>
+    Note that after registering a delayed standby, <application>repmgrd</application> will only start
+    once the metadata added on the primary node has been replicated.
+   </para>
+  </sect2>
+
+  <sect2 id="faq-repmgrd-logfile-rotate" xreflabel="repmgrd logfile rotation">
+   <title>How can I get <application>repmgrd</application> to rotate its logfile?</title>
+   <para>
+     Configure your system's <literal>logrotate</literal> service to do this; see <xref linkend="repmgrd-log-rotation">.
+   </para>
+
+  </sect2>
+
+  <sect2 id="faq-repmgrd-recloned-no-start" xreflabel="repmgrd not restarting after node cloned">
+   <title>I've recloned a failed primary as a standby, but <application>repmgrd</application> refuses to start. Why?</title>
+   <para>
+    Check that you registered the standby after recloning. If unregistered, the standby
+    cannot be considered as a promotion candidate even if <varname>failover</varname> is set to
+    <literal>automatic</literal>, which is probably not what you want. <application>repmgrd</application> will start if
+    <varname>failover</varname> is set to <literal>manual</literal> so the node's replication status can still
+    be monitored, if desired.
+   </para>
+  </sect2>
+
+ </sect1>
+</appendix>
--- a/doc/appendix-packages.sgml
+++ b/doc/appendix-packages.sgml
@@ -0,0 +1,366 @@
+<appendix id="appendix-packages" xreflabel="Package details">
+  <indexterm>
+    <primary>packages</primary>
+  </indexterm>
+
+  <title>&repmgr; package details</title>
+  <para>
+    This section provides technical details about various &repmgr; binary
+    packages, such as the location of the installed binaries and
+    configuration files.
+  </para>
+
+  <sect1 id="packages-centos" xreflabel="CentOS packages">
+    <title>CentOS Packages</title>
+    <indexterm>
+      <primary>packages</primary>
+      <secondary>CentOS packages</secondary>
+    </indexterm>
+    <para>
+      Currently, &repmgr; RPM packages are provided for versions 6.x and 7.x of CentOS. These should also
+      work on matching versions of Red Hat Enterprise Linux, Scientific Linux and Oracle Enterprise Linux;
+      together with CentOS, these are the same Red Hat-based distributions for which the main community project
+      (PGDG) provides packages (see the <ulink url="https://yum.postgresql.org/">PostgreSQL RPM Building Project</ulink>
+      page for details).
+    </para>
+
+    <para>
+      Note these &repmgr; RPM packages are not designed to work with SuSE/OpenSuSE.
+    </para>
+
+    <note>
+      <para>
+        &repmgr; packages are designed to be compatible with community-provided PostgreSQL packages.
+        They may not work with vendor-specific packages such as those provided by Red Hat for RHEL
+        customers, as the filesystem layout may be different to the community RPMs.
+        Please contact your support vendor for assistance.
+      </para>
+    </note>
+
+    <sect2 id="packages-centos-repositories">
+      <title>CentOS repositories</title>
+
+      <para>
+        &repmgr; packages are available from the 2ndQuadrant repository, and also the PostgreSQL
+        community repository. The 2ndQuadrant repository is updated immediately after each
+        &repmgr; release.
+      </para>
+
+      <table id="centos-2ndquadrant-repository">
+        <title>2ndQuadrant repository</title>
+        <tgroup cols="2">
+          <tbody>
+            <row>
+              <entry>Repository URL:</entry>
+              <entry><ulink url="http://packages.2ndquadrant.com/repmgr/">http://packages.2ndquadrant.com/repmgr/</ulink></entry>
+            </row>
+            <row>
+              <entry>Repository documentation:</entry>
+              <entry><ulink url="https://repmgr.org/docs/4.0/installation-packages.html#INSTALLATION-PACKAGES-REDHAT-2NDQ">https://repmgr.org/docs/4.0/installation-packages.html#INSTALLATION-PACKAGES-REDHAT-2NDQ</ulink></entry>
+            </row>
+          </tbody>
+        </tgroup>
+      </table>
+
+      <table id="centos-pgdg-repository">
+        <title>PostgreSQL community repository (PGDG)</title>
+        <tgroup cols="2">
+          <tbody>
+            <row>
+              <entry>Repository URL:</entry>
+              <entry><ulink url="https://yum.postgresql.org/repopackages.php">https://yum.postgresql.org/repopackages.php</ulink></entry>
+            </row>
+            <row>
+              <entry>Repository documentation:</entry>
+              <entry><ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink></entry>
+            </row>
+          </tbody>
+        </tgroup>
+      </table>
+
+    </sect2>
+
+    <sect2 id="packages-centos-details">
+      <title>CentOS package details</title>
+
+      <para>
+        The two tables below list relevant information, paths, commands etc. for the &repmgr; packages on
+        CentOS 7 (with systemd) and CentOS 6 (no systemd). Substitute the appropriate PostgreSQL major
+        version number for your installation.
+      </para>
+
+      <note>
+        <para>
+          For PostgreSQL 9.6 and lower, the CentOS packages use a mixture of <literal>9.6</literal>
+          and <literal>96</literal> in various places to designate the major version; e.g. the
+          package name is <literal>repmgr96</literal>, but the data directory is
+          <filename>/var/lib/pgsql/9.6/data</filename>.
+        </para>
+        <para>
+          From PostgreSQL 10, the first part of the version number (e.g. <literal>10</literal>) is
+          the major version, so there is more consistency in file/path/package naming
+          (package <literal>repmgr10</literal>, data directory <filename>/var/lib/pgsql/10/data</filename>).
+        </para>
+      </note>
+
+
+  <table id="centos-7-packages">
+   <title>CentOS 7 packages</title>
+
+   <tgroup cols="2">
+    <tbody>
+
+     <row>
+      <entry>Package name example:</entry>
+      <entry><filename>repmgr10-4.0.4-1.rhel7.x86_64</filename></entry>
+     </row>
+
+     <row>
+      <entry>Metapackage:</entry>
+      <entry>(none)</entry>
+     </row>
+
+     <row>
+      <entry>Installation command:</entry>
+      <entry><literal>yum install repmgr10</literal></entry>
+     </row>
+
+     <row>
+      <entry>Binary location:</entry>
+      <entry><filename>/usr/pgsql-10/bin</filename></entry>
+     </row>
+
+     <row>
+      <entry>repmgr in default path:</entry>
+      <entry>NO</entry>
+     </row>
+
+     <row>
+      <entry>Configuration file location:</entry>
+      <entry><filename>/etc/repmgr/10/repmgr.conf</filename></entry>
+     </row>
+
+     <row>
+      <entry>Data directory:</entry>
+      <entry><filename>/var/lib/pgsql/10/data</filename></entry>
+     </row>
+
+     <row>
+      <entry>repmgrd service command:</entry>
+      <entry><command>systemctl [start|stop|restart|reload] repmgr10</command></entry>
+     </row>
+
+     <row>
+      <entry>repmgrd service file location:</entry>
+      <entry><filename>/usr/lib/systemd/system/repmgr10.service</filename></entry>
+     </row>
+
+     <row>
+      <entry>repmgrd log file location:</entry>
+      <entry>(not specified by package; set in <filename>repmgr.conf</filename>)</entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
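+
+  <para>
+   Because the package does not place &repmgr; in the default path, it has to be invoked
+   with its full path (or the binary location added to <varname>PATH</varname>). As a
+   sketch for the PostgreSQL 10 package on CentOS 7, using the binary and configuration
+   file locations listed in the table above (the <command>cluster show</command> action is
+   chosen purely for illustration, and the <option>-f</option> option may be unnecessary
+   where the package already supplies the configuration file location):
+   <programlisting>
+    /usr/pgsql-10/bin/repmgr -f /etc/repmgr/10/repmgr.conf cluster show</programlisting>
+  </para>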
+
+  <table id="centos-6-packages">
+   <title>CentOS 6 packages</title>
+
+   <tgroup cols="2">
+    <tbody>
+
+     <row>
+      <entry>Package name example:</entry>
+      <entry><filename>repmgr96-4.0.4-1.rhel6.x86_64</filename></entry>
+     </row>
+
+     <row>
+      <entry>Metapackage:</entry>
+      <entry>(none)</entry>
+     </row>
+
+     <row>
+      <entry>Installation command:</entry>
+      <entry><literal>yum install repmgr96</literal></entry>
+     </row>
+
+     <row>
+      <entry>Binary location:</entry>
+      <entry><filename>/usr/pgsql-9.6/bin</filename></entry>
+     </row>
+
+     <row>
+      <entry>repmgr in default path:</entry>
+      <entry>NO</entry>
+     </row>
+
+     <row>
+      <entry>Configuration file location:</entry>
+      <entry><filename>/etc/repmgr/9.6/repmgr.conf</filename></entry>
+     </row>
+
+     <row>
+      <entry>Data directory:</entry>
+      <entry><filename>/var/lib/pgsql/9.6/data</filename></entry>
+     </row>
+
+     <row>
+      <entry>repmgrd service command:</entry>
+      <entry><literal>service repmgr-9.6 [start|stop|restart|reload]</literal></entry>
+     </row>
+
+     <row>
+      <entry>repmgrd service file location:</entry>
+      <entry><literal>/etc/init.d/repmgr-9.6</literal></entry>
+     </row>
+
+     <row>
+      <entry>repmgrd log file location:</entry>
+      <entry><filename>/var/log/repmgr/repmgrd-9.6.log</filename></entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+    </sect2>
+ </sect1>
+
+
+
+
+  <sect1 id="packages-debian-ubuntu" xreflabel="Debian/Ubuntu packages">
+    <title>Debian/Ubuntu Packages</title>
+    <indexterm>
+      <primary>packages</primary>
+      <secondary>Debian/Ubuntu packages</secondary>
+    </indexterm>
+    <para>
+      &repmgr; <literal>.deb</literal> packages are provided via the
+      PostgreSQL Community APT repository, and are available for each community-supported
+      PostgreSQL version, currently supported Debian releases, and currently supported
+      Ubuntu LTS releases.
+    </para>
+
+    <sect2 id="packages-apt-repository">
+      <title>APT repository</title>
+
+      <para>
+        &repmgr; packages are available from the PostgreSQL Community APT repository,
+        which is updated immediately after each &repmgr; release.
+      </para>
+
+
+      <table id="apt-repository">
+        <title>PostgreSQL Community APT repository (PGDG)</title>
+        <tgroup cols="2">
+          <tbody>
+            <row>
+              <entry>Repository URL:</entry>
+              <entry><ulink url="http://apt.postgresql.org/">http://apt.postgresql.org/</ulink></entry>
+            </row>
+            <row>
+              <entry>Repository documentation:</entry>
+              <entry><ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink></entry>
+            </row>
+          </tbody>
+        </tgroup>
+      </table>
+    </sect2>
+
+   <sect2 id="packages-debian-details">
+      <title>Debian/Ubuntu package details</title>
+
+      <para>
+        The table below lists relevant information, paths, commands etc. for the &repmgr; packages on
+        Debian 9.x ("Stretch"). Substitute the appropriate PostgreSQL major
+        version number for your installation.
+      </para>
+      <para>
+        See also <xref linkend="repmgrd-configuration-debian-ubuntu"> for some specifics related
+        to configuring the <application>repmgrd</application> daemon.
+      </para>
+
+      <table id="debian-9-packages">
+        <title>Debian 9.x packages</title>
+
+        <tgroup cols="2">
+          <tbody>
+
+            <row>
+              <entry>Package name example:</entry>
+              <entry><filename>postgresql-10-repmgr</filename></entry>
+            </row>
+
+            <row>
+              <entry>Metapackage:</entry>
+              <entry><filename>repmgr-common</filename></entry>
+            </row>
+
+            <row>
+              <entry>Installation command:</entry>
+              <entry><literal>apt-get install postgresql-10-repmgr</literal></entry>
+            </row>
+
+            <row>
+              <entry>Binary location:</entry>
+              <entry><filename>/usr/lib/postgresql/10/bin</filename></entry>
+            </row>
+
+            <row>
+              <entry>repmgr in default path:</entry>
+              <entry>Yes (via wrapper script <filename>/usr/bin/repmgr</filename>)</entry>
+            </row>
+
+            <row>
+              <entry>Configuration file location:</entry>
+              <entry>(not set by package)</entry>
+            </row>
+
+            <row>
+              <entry>Data directory:</entry>
+              <entry><filename>/var/lib/postgresql/10/main</filename></entry>
+            </row>
+
+            <row>
+              <entry>PostgreSQL service command:</entry>
+              <entry><command>systemctl [start|stop|restart|reload] postgresql@10-main</command></entry>
+
+            </row>
+
+            <row>
+              <entry>repmgrd service command:</entry>
+              <entry><command>systemctl [start|stop|restart|reload] repmgrd</command></entry>
+            </row>
+
+            <row>
+              <entry>repmgrd service file location:</entry>
+              <entry><filename>/etc/init.d/repmgrd</filename> (defaults in: <filename>/etc/default/repmgrd</filename>)</entry>
+            </row>
+
+            <row>
+              <entry>repmgrd log file location:</entry>
+              <entry>(not specified by package; set in <filename>repmgr.conf</filename>)</entry>
+            </row>
+
+          </tbody>
+        </tgroup>
+      </table>
+      <note>
+        <para>
+          Instead of using the <application>systemd</application> service command directly,
+          it's recommended to execute <command>pg_ctlcluster</command> (as <literal>root</literal>,
+          either directly or via <command>sudo</command>), e.g.:
+          <programlisting>
+            <command>pg_ctlcluster 10 main [start|stop|restart|reload]</command></programlisting>
+        </para>
+        <para>
+          For pre-<application>systemd</application> systems, <command>pg_ctlcluster</command>
+          can be executed directly by the <literal>postgres</literal> user.
+        </para>
+      </note>
+   </sect2>
+
+  </sect1>
+</appendix>
--- a/doc/appendix-release-notes.sgml
+++ b/doc/appendix-release-notes.sgml
@@ -0,0 +1,956 @@
+<appendix id="appendix-release-notes">
+  <title>Release notes</title>
+  <indexterm>
+    <primary>Release notes</primary>
+  </indexterm>
+
+  <para>
+    Changes to each &repmgr; release are documented in the release notes.
+    Please read the release notes for all versions between
+    your current version and the version you plan to upgrade to
+    before performing an upgrade, as there may be version-specific upgrade steps.
+  </para>
+
+  <para>
+    See also: <xref linkend="upgrading-repmgr">
+  </para>
+
+  <sect1 id="release-4.0.5">
+    <title>Release 4.0.5</title>
+    <para><emphasis>Wed May 2, 2018</emphasis></para>
+    <para>
+      &repmgr; 4.0.5 contains a number of usability enhancements related to
+      <application>pg_rewind</application> usage, <filename>recovery.conf</filename>
+      generation and (in <application>repmgrd</application>) handling of various
+      corner-case situations, as well as a number of bug fixes.
+    </para>
+    <sect2>
+      <title>Usability enhancements</title>
+
+      <para>
+        <itemizedlist>
+          <listitem>
+            <para>
+              Various documentation improvements, with particular emphasis on
+              the importance of setting appropriate <link linkend="configuration-service-commands">service commands</link>
+              instead of relying on <application>pg_ctl</application>.
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Poll demoted primary after restart as a standby during a switchover operation (GitHub #408).
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Add configuration parameter <option>config_directory</option> (GitHub #424).
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Add sanity check if <option>--upstream-node-id</option> not supplied when executing
+              <xref linkend="repmgr-standby-register"> (GitHub #395).
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Enable <link linkend="repmgr-node-rejoin-pg-rewind">pg_rewind</link> to be used with
+              PostgreSQL 9.3/9.4 (GitHub #413).
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              When generating replication connection strings, set <literal>dbname=replication</literal>
+              if appropriate (GitHub #421).
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Enable provision of <option>archive_cleanup_command</option> in <filename>recovery.conf</filename>
+              (GitHub #416).
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Actively check for node to <link linkend="repmgr-node-rejoin">rejoin</link> cluster (GitHub #415).
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              <application>repmgrd</application>: set <literal>connect_timeout=2</literal> (if not explicitly set)
+              when pinging a server.
+           </para>
+          </listitem>
+
+        </itemizedlist>
+      </para>
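+      <para>
+        As a rough illustration of the new <option>config_directory</option> parameter, a
+        <filename>repmgr.conf</filename> entry might look like the following. This assumes the
+        parameter points at the directory holding <filename>postgresql.conf</filename> when
+        that lies outside the data directory; the path shown is an assumption based on the
+        PGDG Debian layout for PostgreSQL 10 and should be adjusted for the actual installation:
+        <programlisting>
+          # assumed location of postgresql.conf outside the data directory
+          config_directory='/etc/postgresql/10/main'</programlisting>
+      </para>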
+
+    </sect2>
+
+   <sect2>
+      <title>Bug fixes</title>
+      <para>
+
+        <itemizedlist>
+
+          <listitem>
+            <para>
+              Fix display of conninfo parsing error messages.
+           </para>
+          </listitem>
+
+
+          <listitem>
+            <para>
+              Fix minimum accepted value for <varname>degraded_monitoring_timeout</varname> (GitHub #411).
+           </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Fix superuser password handling (GitHub #400).
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Fix parsing of <varname>archive_ready_critical</varname> configuration file parameter (GitHub #426).
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Fix <command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command>
+              output (GitHub #389).
+           </para>
+          </listitem>
+
+          <listitem>
+            <para>
+               Fix memory leaks in witness code (GitHub #402).
+           </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              <application>repmgrd</application>: handle <command>pg_ctl promote</command> timeout (GitHub #425).
+           </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              <application>repmgrd</application>: handle failover situation with only two nodes in the primary
+          location, and at least one node in another location (GitHub #407).
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+               <application>repmgrd</application>: prevent standby connection handle from going stale.
+           </para>
+          </listitem>
+
+
+
+        </itemizedlist>
+      </para>
+   </sect2>
+
+  </sect1>
+
+
+  <sect1 id="release-4.0.4">
+    <title>Release 4.0.4</title>
+    <para><emphasis>Fri Mar 9, 2018</emphasis></para>
+
+    <para>
+      &repmgr; 4.0.4 contains some bug fixes and a number of
+      usability enhancements related to logging/diagnostics,
+      event notifications and pre-action checks.
+    </para>
+    <para>
+      This release can be installed as a simple package upgrade from repmgr 4.0 ~ 4.0.3;
+      <application>repmgrd</application> (if running) should be restarted. See <xref linkend="upgrading-repmgr">
+      for more details.
+    </para>
+
+    <note>
+      <para>
+        It is not possible to perform a switchover where the demotion candidate is
+        running &repmgr; 4.0.2 or lower; all nodes should be upgraded to the latest version (4.0.4).
+        This is due to additional checks introduced in 4.0.3 which require the presence of
+        4.0.3 or later versions on all nodes.
+      </para>
+    </note>
+
+    <sect2>
+      <title>Usability enhancements</title>
+
+      <para>
+        <itemizedlist>
+
+          <listitem>
+            <para>
+              add <command><link linkend="repmgr-standby-clone">repmgr standby clone --recovery-conf-only</link></command>
+              option to enable integration of a standby cloned from another source into a &repmgr; cluster (GitHub #382)
+            </para>
+          </listitem>
+
+         <listitem>
+            <para>
+              remove restriction on using replication slots when cloning from a Barman server (GitHub #379)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              make <command><link linkend="repmgr-standby-promote">repmgr standby promote</link></command>
+              timeout values configurable (GitHub #387)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              add missing options to main <literal>--help</literal> output (GitHub #391, #392)
+            </para>
+          </listitem>
+
+        </itemizedlist>
+      </para>
+
+    </sect2>
+
+    <sect2>
+      <title>Bug fixes</title>
+      <para>
+
+        <itemizedlist>
+
+          <listitem>
+            <para>
+              ensure <command><link linkend="repmgr-node-rejoin">repmgr node rejoin</link></command>
+              honours the <option>--dry-run</option> option (GitHub #383)
+           </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              improve replication slot warnings generated by
+              <command><link linkend="repmgr-node-status">repmgr node status</link></command>
+              (GitHub #385)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              fix --superuser handling when cloning a standby (GitHub #380)
+           </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              <application>repmgrd</application>: improve detection of status change from primary to
+              standby
+           </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              <application>repmgrd</application>:  improve reconnection to the local node after a
+              failover (previously a connection error due to the node starting up was being
+              interpreted as the node being unavailable)
+           </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              <application>repmgrd</application>: when running on a witness server, correctly connect
+              to new primary after a failover
+           </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              <application>repmgrd</application>: add <link linkend="event-notifications">event notification</link>
+              <literal>repmgrd_shutdown</literal> (GitHub #393)
+           </para>
+          </listitem>
+
+        </itemizedlist>
+
+      </para>
+    </sect2>
+
+  </sect1>
+
+  <sect1 id="release-4.0.3">
+    <title>Release 4.0.3</title>
+    <para><emphasis>Thu Feb 15, 2018</emphasis></para>
+
+    <para>
+      &repmgr; 4.0.3 contains some bug fixes and a number of
+      usability enhancements related to logging/diagnostics,
+      event notifications and pre-action checks.
+    </para>
+
+    <para>
+      This release can be installed as a simple package upgrade from repmgr 4.0 ~ 4.0.2;
+      repmgrd (if running) should be restarted.
+    </para>
+    <note>
+      <para>
+        It is not possible to perform a switchover where the demotion candidate is
+        running &repmgr; 4.0.2 or lower; all nodes should be upgraded to 4.0.3. This is due
+        to additional checks introduced in 4.0.3 which require the presence of
+        4.0.3 or later versions on all nodes.
+      </para>
+    </note>
+    <sect2>
+      <title>Usability enhancements</title>
+
+      <para>
+        <itemizedlist>
+
+          <listitem>
+            <para>
+              improve <command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
+              behaviour when <command>pg_ctl</command> is used to control the server and logging output is
+              not explicitly redirected
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              improve <command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
+              log messages and provide new exit code <literal>ERR_SWITCHOVER_INCOMPLETE</literal> when old primary could
+              not be shut down cleanly
+            </para>
+          </listitem>
+
+         <listitem>
+            <para>
+              add check to verify the demotion candidate can make a replication connection to the
+              promotion candidate before executing a switchover (GitHub #370)
+            </para>
+         </listitem>
+
+         <listitem>
+            <para>
+              add check for sufficient walsenders and replication slots on the promotion candidate  before executing
+              <command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
+              (GitHub #371)
+            </para>
+         </listitem>
+
+          <listitem>
+            <para>
+              add --dry-run mode to <command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command>
+              (GitHub #368)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              provide information about the primary node for
+              <command><link linkend="repmgr-standby-register">repmgr standby register</link></command> and
+              <command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command> event notifications (GitHub #375)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              add <literal>standby_register_sync</literal> <link linkend="event-notifications">event notification</link>, which is fired when
+              <command><link linkend="repmgr-standby-register">repmgr standby register</link></command>
+              is run with the <option>--wait-sync</option> option and the new or updated standby node
+              record has synchronised to the standby (GitHub #374)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              when running <command><link linkend="repmgr-cluster-show">repmgr cluster show</link></command>,
+              if any node is unreachable, output the error message encountered in the list of warnings
+              (GitHub #369)
+            </para>
+          </listitem>
+
+        </itemizedlist>
+      </para>
+    </sect2>
+
+    <sect2>
+      <title>Bug fixes</title>
+
+      <para>
+        <itemizedlist>
+          <listitem>
+            <para>
+              ensure an inactive data directory can be overwritten when
+              cloning a standby (GitHub #366)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              <command><link linkend="repmgr-node-status">repmgr node status</link></command>
+              upstream node display fixed (GitHub #363)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              <command><link linkend="repmgr-primary-unregister">repmgr primary unregister</link></command>:
+              clarify usage and fix <literal>--help</literal> output (GitHub #373)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              parsing of <varname>pg_basebackup_options</varname> fixed (GitHub #376)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              ensure the <filename>pg_subtrans</filename> directory is created when cloning a
+              standby in Barman mode
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              <command><link linkend="repmgr-witness-register">repmgr witness register</link></command>:
+              fix primary node check (GitHub #377).
+            </para>
+          </listitem>
+        </itemizedlist>
+
+      </para>
+    </sect2>
+
+  </sect1>
+
+
+  <sect1 id="release-4.0.2">
+    <title>Release 4.0.2</title>
+    <para><emphasis>Thu Jan 18, 2018</emphasis></para>
+
+    <para>
+      &repmgr; 4.0.2 contains some bug fixes and small usability enhancements.
+    </para>
+    <para>
+      This release can be installed as a simple package upgrade from &repmgr; 4.0.1 or 4.0;
+      <application>repmgrd</application> (if running) should be restarted.
+    </para>
+
+    <sect2>
+      <title>Usability enhancements</title>
+
+      <para>
+        <itemizedlist>
+          <listitem>
+            <para>
+              Recognize the <option>-t</option>/<option>--terse</option> option for
+              <command><link linkend="repmgr-cluster-event">repmgr cluster event</link></command> to hide
+              the <literal>Details</literal> column (GitHub #360)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Add "--wait-start" option for
+              <command><link linkend="repmgr-standby-register">repmgr standby register</link></command>
+              (GitHub #356)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Add <literal>%p</literal> <link linkend="event-notifications">event notification parameter</link>
+              for <command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
+            </para>
+          </listitem>
+        </itemizedlist>
+      </para>
+
+    </sect2>
+
+    <sect2>
+      <title>Bug fixes</title>
+
+      <para>
+        <itemizedlist>
+          <listitem>
+            <para>
+              Add missing -W option to <literal>getopt_long()</literal> invocation (GitHub #350)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Automatically create slot name if missing (GitHub #343)
+            </para>
+          </listitem>
+
+          <listitem>
+            <para>
+              Fixes to parsing output of remote repmgr invocations (GitHub #349)
+            </para>
+          </listitem>
+
+
+          <listitem>
+            <para>
+              When registering BDR nodes, automatically create missing connection replication set (GitHub #347)
+            </para>
+          </listitem>
+
+
+          <listitem>
+            <para>
+              Handle missing node record in <command><link linkend="repmgr-node-rejoin">repmgr node rejoin</link></command>
+              (GitHub #358)
+            </para>
+          </listitem>
+
+
+        </itemizedlist>
+      </para>
+
+    </sect2>
+
+    <sect2>
+      <title>Documentation</title>
+
+      <para>
+        <itemizedlist>
+          <listitem>
+            <para>
+              The documentation can now be built as a single HTML file (GitHub pull request #353)
+            </para>
+          </listitem>
+        </itemizedlist>
+      </para>
+    </sect2>
+
+  </sect1>
+
+ <sect1 id="release-4.0.1">
+  <title>Release 4.0.1</title>
+
+  <para><emphasis>Wed Dec 13, 2017</emphasis></para>
+
+  <para>
+    &repmgr; 4.0.1 is a bugfix release.
+  </para>
+  <sect2>
+    <title>Bug fixes</title>
+    <para>
+      <itemizedlist>
+        <listitem>
+          <para>
+            ensure correct return codes are returned for
+            <command><link linkend="repmgr-node-check">repmgr node check --action=</link></command> operations
+            (GitHub #340)
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Fix <xref linkend="repmgr-cluster-show"> when <literal>repmgr</literal> schema not set in search path
+            (GitHub #341)
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            When using <literal>--force-rewind</literal> with <xref linkend="repmgr-node-rejoin">
+            delete any replication slots copied by <application>pg_rewind</application>
+            (GitHub #334)
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Only perform sanity check on accessibility of configuration files outside
+            the data directory when <literal>--copy-external-config-files</literal>
+            provided (GitHub #342)
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Initialise "voting_term" table in application, not extension SQL
+            (GitHub #344)
+          </para>
+        </listitem>
+
+      </itemizedlist>
+    </para>
+  </sect2>
+ </sect1>
+
+
+
+ <sect1 id="release-4.0.0">
+  <title>Release 4.0.0</title>
+
+  <para><emphasis>Tue Nov 21, 2017</emphasis></para>
+
+  <para>
+    repmgr 4.0 is an entirely new version of &repmgr;, implementing &repmgr;
+    as a native PostgreSQL extension, adding new and improving existing features,
+    and making &repmgr; more user-friendly and intuitive to use. The new code base
+    will make it easier to add additional functionality for future releases.
+  </para>
+  <note>
+    <simpara>
+      With the new version, the opportunity has been taken to
+      make some changes in the way &repmgr; is set up and
+      configured. In particular, changes have been made to some
+      configuration file settings for consistency and clarity.
+      Changes are covered in detail below.
+    </simpara>
+    <simpara>
+      To standardise terminology, from this release <literal>primary</literal> is used to
+      denote the read/write node in a streaming replication cluster. <literal>master</literal>
+      is still accepted as an alias for &repmgr; commands
+      (e.g. <link linkend="repmgr-primary-register"><command>repmgr master register</command></link>).
+    </simpara>
+  </note>
+
+  <para>
+    For detailed instructions on upgrading from repmgr 3.x, see <xref linkend="upgrading-from-repmgr-3">.
+  </para>
+
+  <sect2>
+    <title>Features and improvements</title>
+    <para>
+
+      <itemizedlist>
+        <listitem>
+          <para>
+            <emphasis>improved switchover</emphasis>:
+            the <command>switchover</command> process has been improved and streamlined,
+            making it faster; it can also instruct other standbys
+            to follow the new primary once the switchover has completed. See
+             <xref linkend="performing-switchover"> for more details.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+           <emphasis>"--dry-run" option</emphasis>: many &repmgr; commands now provide
+           a <literal>--dry-run</literal> option which will execute the command as far
+           as possible without making any changes, which will enable possible issues
+           to be identified before the intended operation is actually carried out.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <emphasis>easier upgrades</emphasis>: &repmgr; is now implemented as a native
+            PostgreSQL extension, which means future upgrades can be carried out by
+            installing the upgraded package and issuing
+            <ulink url="https://www.postgresql.org/docs/current/static/sql-alterextension.html">ALTER EXTENSION repmgr UPDATE</ulink>.
+          </para>
+        </listitem>
+
+
+        <listitem>
+          <para>
+            <emphasis>improved logging output</emphasis>:
+            &repmgr; (and <application>repmgrd</application>) now provide more explicit
+            logging output giving a better picture of what is going on. Where appropriate,
+            <literal>DETAIL</literal> and <literal>HINT</literal> log lines provide additional
+            detail and suggestions for resolving problems. Additionally, <application>repmgrd</application>
+            now emits informational log lines at regular, configurable intervals
+            to confirm that it's running correctly and which node(s) it's monitoring.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <emphasis>automatic configuration file location in packages</emphasis>:
+            Many operating system packages place the &repmgr; configuration files
+            in a version-specific subdirectory, e.g. <filename>/etc/repmgr/9.6/repmgr.conf</filename>;
+            &repmgr; now makes it easy for package maintainers to provide a patch
+            with the actual file location, meaning <filename>repmgr.conf</filename>
+            does not need to be provided explicitly. This is currently the case
+            for 2ndQuadrant-provided <literal>.deb</literal> and <literal>.rpm</literal> packages.
+          </para>
+        </listitem>
+
+
+        <listitem>
+          <para>
+            <emphasis>monitoring and status checks</emphasis>:
+            New commands <xref linkend="repmgr-node-check"> and
+            <xref linkend="repmgr-node-status"> provide information
+            about a node's status and replication-related monitoring
+            output.
+          </para>
+        </listitem>
+
+
+        <listitem>
+          <para>
+            <emphasis>node rejoin</emphasis>:
+            New command <xref linkend="repmgr-node-rejoin"> enables a failed
+            primary to be rejoined to a replication cluster, optionally using
+            <application>pg_rewind</application> to synchronise its data
+            (note that <application>pg_rewind</application> may not be usable
+            in some circumstances).
+          </para>
+        </listitem>
+
+
+        <listitem>
+          <para>
+            <emphasis>automatic failover</emphasis>:
+            improved detection of node status; promotion decision based on a consensual
+            model, with the promoted primary explicitly informing other standbys to
+            follow it. The <application>repmgrd</application> daemon will continue
+            functioning even if the monitored PostgreSQL instance is down, and resume
+            monitoring if it reappears. Additionally, if the instance's role has changed
+            (typically from a primary to a standby, e.g. following reintegration of a
+            failed primary using <xref linkend="repmgr-node-rejoin">) <application>repmgrd</application>
+            will automatically resume monitoring it as a standby.
+          </para>
+        </listitem>
+
+
+
+        <listitem>
+          <para>
+            <emphasis>new documentation</emphasis>:
+            the existing documentation spread over multiple text files
+            has been consolidated into DocBook format (as used by the
+            main PostgreSQL project) and is now available online in
+            HTML format.
+          </para>
+          <para>
+            The DocBook files can easily be used to create versions
+            of the documentation in other formats such as PDF.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+    </para>
+  </sect2>
+  <sect2>
+    <title>New command line options</title>
+    <para>
+      <itemizedlist>
+
+        <listitem><para>
+          <literal>--dry-run</literal>: &repmgr; will attempt to perform
+          the action as far as possible without making any changes to the
+          database
+        </para></listitem>
+
+        <listitem>
+          <para>
+            <literal>--upstream-node-id</literal>: use to specify the upstream node
+            the standby will connect to and later stream from, when <link linkend="repmgr-standby-clone">cloning</link>
+            and <link linkend="repmgr-standby-register">registering</link> a standby.
+          </para>
+          <para>
+            This replaces the configuration file parameter <varname>upstream_node</varname>,
+            as the upstream node is set when the standby is initially cloned, but can change
+            over the lifetime of an installation (due to failovers, switchovers etc.), so
+            keeping the original value around in <filename>repmgr.conf</filename> would be confusing.
+        </para></listitem>
+
+      </itemizedlist>
+    </para>
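+    <para>
+      As an illustrative sketch, both options can be combined when cloning a standby.
+      The host name <literal>node1</literal>, the database and user names, and the
+      upstream node ID <literal>1</literal> are assumptions made for this example only:
+      <programlisting>
+        repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run --upstream-node-id=1</programlisting>
+      If the dry run reports no problems, the same command can be repeated without
+      <literal>--dry-run</literal> to perform the actual clone.
+    </para>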
+  </sect2>
+
+  <sect2>
+    <title>Changed command line options</title>
+    <para>
+      <application>repmgr</application>
+      <itemizedlist>
+
+        <listitem><para>
+            <literal>--replication-user</literal> has been deprecated; it has been replaced
+            by the configuration file option <varname>replication_user</varname>.
+            The value (which defaults to the user provided in the <varname>conninfo</varname>
+            string) will be stored in the &repmgr; metadata for use by
+            <xref linkend="repmgr-standby-clone"> and <xref linkend="repmgr-standby-follow">.
+        </para></listitem>
+
+        <listitem><para>
+            <literal>--recovery-min-apply-delay</literal> is now a configuration file parameter
+            <varname>recovery_min_apply_delay</varname>, to ensure the setting does not get lost
+            when a standby follows a new upstream.
+        </para></listitem>
+
+        <listitem><para>
+            <literal>--no-conninfo-password</literal> is deprecated; a password included in
+            the environment variable <varname>PGPASSWORD</varname> will no longer be added
+            to <varname>primary_conninfo</varname> by default; to force the inclusion
+            of a password (not recommended), use the new configuration file parameter
+            <varname>use_primary_conninfo_password</varname>. For details, see section
+            <xref linkend="cloning-advanced-managing-passwords">.
+        </para></listitem>
+
+      </itemizedlist>
+    </para>
+
+    <para>
+      <application>repmgrd</application>
+      <itemizedlist>
+
+        <listitem><para>
+            <literal>--monitoring-history</literal> is deprecated and is replaced by the
+            configuration file option <varname>monitoring_history</varname>.
+            This enables the setting to be changed without having to modify system service
+            files.
+        </para></listitem>
+
+      </itemizedlist>
+    </para>
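+    <para>
+      A minimal sketch of how the former command line options might appear in
+      <filename>repmgr.conf</filename>. The values shown (user name, delay) are
+      illustrative assumptions, not recommended defaults:
+      <programlisting>
+        # replaces --replication-user (assumed user name)
+        replication_user='repl_user'
+        # replaces --recovery-min-apply-delay (assumed delay)
+        recovery_min_apply_delay=5min
+        # replaces the repmgrd option --monitoring-history
+        monitoring_history=yes</programlisting>
+    </para>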
+
+  </sect2>
+
+  <sect2>
+    <title>Configuration file changes</title>
+
+    <para><emphasis>Required settings</emphasis></para>
+    <para>The following 4 parameters are mandatory in <filename>repmgr.conf</filename>:
+      <itemizedlist spacing="compact" mark="bullet">
+
+        <listitem>
+          <simpara>node_id</simpara>
+        </listitem>
+
+        <listitem>
+          <simpara>node_name</simpara>
+        </listitem>
+
+        <listitem>
+          <simpara>conninfo</simpara>
+        </listitem>
+
+        <listitem>
+          <simpara>data_directory</simpara>
+        </listitem>
+      </itemizedlist>
+    </para>
+
+   <para><emphasis>Renamed settings</emphasis></para>
+   <para>
+     Some settings have been renamed for clarity and consistency:
+     <itemizedlist spacing="compact" mark="bullet">
+
+       <listitem>
+         <simpara><varname>node</varname> is now <varname>node_id</varname></simpara>
+       </listitem>
+
+       <listitem>
+         <simpara><varname>name</varname> is now <varname>node_name</varname></simpara>
+       </listitem>
+
+       <listitem>
+         <simpara><varname>barman_server</varname> is now <varname>barman_host</varname></simpara>
+       </listitem>
+
+       <listitem>
+         <simpara><varname>master_response_timeout</varname> is now
+           <varname>async_query_timeout</varname> (to better indicate its purpose)
+         </simpara>
+       </listitem>
+
+     </itemizedlist>
+   </para>
+
+   <para>
+     The following configuration file parameters have been renamed for consistency
+     with other parameters (and conform to the pattern used by PostgreSQL itself,
+     which uses the prefix <varname>log_</varname> for logging parameters):
+
+    <itemizedlist spacing="compact" mark="bullet">
+
+      <listitem>
+        <simpara><varname>loglevel</varname> is now <varname>log_level</varname></simpara>
+      </listitem>
+
+      <listitem>
+        <simpara><varname>logfile</varname> is now <varname>log_file</varname></simpara>
+      </listitem>
+
+      <listitem>
+        <simpara><varname>logfacility</varname> is now <varname>log_facility</varname></simpara>
+      </listitem>
+
+    </itemizedlist>
+   </para>
+
+   <para><emphasis>Removed settings</emphasis></para>
+   <para>
+     <itemizedlist spacing="compact" mark="bullet">
+
+      <listitem>
+        <simpara><varname>cluster</varname> has been removed</simpara>
+      </listitem>
+      <listitem>
+        <simpara><varname>upstream_node</varname> - see note about
+          <literal>--upstream-node-id</literal> above</simpara>
+      </listitem>
+
+      <listitem>
+        <simpara><varname>retry_promote_interval_secs</varname> - this is now redundant due
+          to changes in the failover/promotion mechanism; the new equivalent is
+          <varname>primary_notification_timeout</varname> </simpara>
+      </listitem>
+     </itemizedlist>
+   </para>
+
+   <para><emphasis>Logging changes</emphasis></para>
+   <para>
+     <itemizedlist spacing="compact" mark="bullet">
+
+      <listitem>
+        <simpara>
+          default value for <varname>log_level</varname> is <literal>INFO</literal>
+          rather than <literal>NOTICE</literal>.
+        </simpara>
+      </listitem>
+
+      <listitem>
+        <simpara>
+          new parameter <varname>log_status_interval</varname>, which causes
+          <application>repmgrd</application> to emit a status log
+          line at the specified interval
+        </simpara>
+      </listitem>
+
+     </itemizedlist>
+
+   </para>
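+   <para>
+     Putting the changes above together, a converted <filename>repmgr.conf</filename>
+     might look something like the following sketch. All host names, paths and
+     values are assumptions chosen for illustration and should be adjusted for the
+     local environment:
+     <programlisting>
+       # mandatory settings (formerly "node" and "name")
+       node_id=1
+       node_name='node1'
+       conninfo='host=node1 dbname=repmgr user=repmgr connect_timeout=2'
+       data_directory='/var/lib/postgresql/data'
+
+       # renamed logging settings (formerly "loglevel" and "logfile")
+       log_level=INFO
+       log_file='/var/log/repmgr/repmgr.log'
+
+       # new: interval (in seconds) between repmgrd status log lines
+       log_status_interval=300</programlisting>
+   </para>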
+
+  </sect2>
+  <sect2>
+    <title>repmgrd</title>
+    <para>
+      The shared library has been renamed from <literal>repmgr_funcs</literal> to
+      <literal>repmgr</literal>, meaning <varname>shared_preload_libraries</varname>
+      in <filename>postgresql.conf</filename> needs to be updated to the new name:
+      <programlisting>
+        shared_preload_libraries = 'repmgr'</programlisting>
+    </para>
+  </sect2>
+
+ </sect1>
+
+</appendix>
--- a/doc/appendix-signatures.sgml
+++ b/doc/appendix-signatures.sgml
@@ -0,0 +1,66 @@
+<appendix id="appendix-signatures" xreflabel="Verifying digital signatures">
+ <title>Verifying digital signatures</title>
+
+ <sect1 id="repmgr-source-key" xreflabel="repmgr source key">
+   <title>repmgr source code signing key</title>
+   <para>
+     The signing key ID used for <application>repmgr</application> source code bundles is:
+     <ulink url="http://packages.2ndquadrant.com/repmgr/SOURCE-GPG-KEY-repmgr">
+       <literal>0x297F1DCC</literal></ulink>.
+   </para>
+
+   <para>
+     To download the <application>repmgr</application> source key to your computer:
+     <programlisting>
+       curl -s http://packages.2ndquadrant.com/repmgr/SOURCE-GPG-KEY-repmgr | gpg --import
+       gpg --fingerprint 0x297F1DCC
+     </programlisting>
+     then verify that the fingerprint is the expected value:
+     <programlisting>
+       085A BE38 6FD9 72CE 6365  340D 8365 683D 297F 1DCC</programlisting>
+   </para>
+
+   <para>
+     For checking tarballs, first download and import the <application>repmgr</application>
+     source signing key as shown above. Then download both the source tarball and the detached
+     signature (e.g. <filename>repmgr-4.0beta1.tar.gz</filename> and
+     <filename>repmgr-4.0beta1.tar.gz.asc</filename>) from
+     <ulink url="https://repmgr.org/download/">https://repmgr.org/download/</ulink>
+     and use <application>gpg</application> to verify the signature, e.g.:
+     <programlisting>
+       gpg --verify repmgr-4.0beta1.tar.gz.asc</programlisting>
+   </para>
+
+ </sect1>
+
+ <sect1 id="repmgr-rpm-key" xreflabel="repmgr rpm key">
+   <title>repmgr RPM signing key</title>
+   <para>
+     The signing key ID used for <application>repmgr</application> RPM packages is:
+     <ulink url="http://packages.2ndquadrant.com/repmgr/RPM-GPG-KEY-repmgr">
+       <literal>0x702D883A</literal></ulink>.
+   </para>
+
+   <para>
+     To download the <application>repmgr</application> RPM signing key to your computer:
+     <programlisting>
+       curl -s http://packages.2ndquadrant.com/repmgr/RPM-GPG-KEY-repmgr | gpg --import
+       gpg --fingerprint 0x702D883A
+     </programlisting>
+     then verify that the fingerprint is the expected value:
+     <programlisting>
+       AE4E 390E A58E 0037 6148  3F29 888D 018B 702D 883A</programlisting>
+   </para>
+
+   <para>
+     To check a repository RPM, use <application>rpmkeys</application> to load the
+     package signing key into the RPM database, then use <literal>rpm -K</literal>, e.g.:
+     <programlisting>
+       sudo rpmkeys --import http://packages.2ndquadrant.com/repmgr/RPM-GPG-KEY-repmgr
+       rpm -K postgresql-bdr94-2ndquadrant-redhat-1.0-2.noarch.rpm
+     </programlisting>
+   </para>
+
+ </sect1>
+
+</appendix>
--- a/doc/bdr-failover.md
+++ b/doc/bdr-failover.md
@@ -1,288 +1,8 @@
 BDR failover with repmgrd
 =========================

-`repmgr 4` provides support for monitoring BDR nodes and taking action in case
-one of the nodes fails.
+This document has been integrated into the main `repmgr` documentation
+and is now located here:

-    *NOTE* Due to the nature of BDR, it's only safe to use this solution for
-    a two-node scenario. Introducing additional nodes will create an inherent
-    risk of node desynchronisation if a node goes down without being cleanly
-    removed from the cluster.
+> [BDR failover with repmgrd](https://repmgr.org/docs/4.0/repmgrd-bdr.html)

-In contrast to streaming replication, there's no concept of "promoting" a new
-primary node with BDR. Instead, "failover" involves monitoring both nodes
-with `repmgrd` and redirecting queries from the failed node to the remaining
-active node. This can be done by using the event notification script generated by
-`repmgrd` to dynamically reconfigure a proxy server/connection pooler such
-as PgBouncer.
-
-
-Prerequisites
-------------
-
-`repmgr 4` requires PostgreSQL 9.6 with the BDR 2 extension enabled and
-configured for a two-node BDR network. `repmgr 4` packages
-must be installed on each node before attempting to configure repmgr.
-
-    *NOTE* `repmgr 4` will refuse to install if it detects more than two
-    BDR nodes.
-
-Application database connections *must* be passed through a proxy server/
-connection pooler such as PgBouncer, and it must be possible to dynamically
-reconfigure that from `repmgrd`. The example demonstrated in this document
-will use PgBouncer.
-
-The proxy server / connection poolers must not be installed on the database
-servers.
-
-For this example, it's assumed password-less SSH connections are available
-from the PostgreSQL servers to the servers where PgBouncer runs, and
-that the user on those servers has permission to alter the PgBouncer
-configuration files.
-
-PostgreSQL connections must be possible between each node, and each node
-must be able to connect to each PgBouncer instance.
-
-
-Configuration
-------------
-
-Sample configuration for `repmgr.conf`:
-
-    node_id=1
-    node_name='node1'
-    conninfo='host=node1 dbname=bdrtest user=repmgr connect_timeout=2'
-    replication_type='bdr'
-
-    event_notifications=bdr_failover
-    event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a" >> /tmp/bdr-failover.log 2>&1'
-
-    # repmgrd options
-    monitor_interval_secs=5
-    reconnect_attempts=6
-    reconnect_interval=5
-
-Adjust settings as appropriate; copy and adjust for the second node (particularly
-the values `node_id`, `node_name` and `conninfo`).
-
-Note that the values provided for the `conninfo` string must be valid for
-connections from *both* nodes in the cluster. The database must be the BDR
-database.
-
-If defined, `event_notifications` will restrict execution of `event_notification_command`
-to the specified events.
-
-`event_notification_command` is the script which does the actual "heavy lifting"
-of reconfiguring the proxy server/ connection pooler. It is fully user-definable;
-a sample implementation is documented below.
-
-
-repmgr user permissions
-----------------------
-
-`repmgr` will create an extension in the BDR database containing objects
-for administering `repmgr` metadata. The user defined in the `conninfo`
-setting must be able to access all objects. Additionally, superuser permissions
-are required to install the `repmgr` extension. The easiest way to do this
-is create the `repmgr` user as a superuser, however if this is not
-desirable, the `repmgr` user can be created as a normal user and a
-superuser specified with `--superuser` when registering a BDR node.
-
-repmgr setup
------------
-
-Register both nodes:
-
-    $ repmgr -f /etc/repmgr.conf bdr register
-    NOTICE: attempting to install extension "repmgr"
-    NOTICE: "repmgr" extension successfully installed
-    NOTICE: node record created for node 'node1' (ID: 1)
-    NOTICE: BDR node 1 registered (conninfo: host=localhost dbname=bdrtest user=repmgr port=5501)
-
-    $ repmgr -f /etc/repmgr.conf bdr register
-    NOTICE: node record created for node 'node2' (ID: 2)
-    NOTICE: BDR node 2 registered (conninfo: host=localhost dbname=bdrtest user=repmgr port=5502)
-
-The `repmgr` extension will be automatically created when the first
-node is registered, and will be propagated to the second node.
-
-    *IMPORTANT* ensure the repmgr package is available on both nodes before
-    attempting to register the first node
-
-
-At this point the meta data for both nodes has been created; executing
-`repmgr cluster show` (on either node) should produce output like this:
-
-    $ repmgr -f /etc/repmgr.conf cluster show
-     ID | Name  | Role | Status    | Upstream | Connection string
-    ----+-------+------+-----------+----------+--------------------------------------------------------
-     1  | node1 | bdr  | * running |          | host=node1 dbname=bdrtest user=repmgr connect_timeout=2
-     2  | node2 | bdr  | * running |          | host=node2 dbname=bdrtest user=repmgr connect_timeout=2
-
-Additionally it's possible to see a log of significant events; so far
-this will only record the two node registrations (in reverse chronological order):
-
-     Node ID | Event        | OK | Timestamp           | Details
-    ---------+--------------+----+---------------------+----------------------------------------------
-     2       | bdr_register | t  | 2017-07-27 17:51:48 | node record created for node 'node2' (ID: 2)
-     1       | bdr_register | t  | 2017-07-27 17:51:00 | node record created for node 'node1' (ID: 1)
-
-
-Defining the "event_notification_command"
-----------------------------------------
-
-Key to "failover" execution is the `event_notification_command`, which is a
-user-definable script which should reconfigure the  proxy server/
-connection pooler.
-
-Each time `repmgr` (or `repmgrd`) records an event, it can optionally
-execute the script defined in `event_notification_command` to
-take further action; details of the event will be passed as parameters.
-Following placeholders are available to the script:
-
-    %n - node ID
-    %e - event type
-    %s - success (1 or 0)
-    %t - timestamp
-    %d - details
-    %c - conninfo string of the next available node
-    %a - name of the next available node
-
-Note that `%c` and `%a` will only be provided during `bdr_failover`
-events, which is what is of interest here.
-
-The provided sample script (`scripts/bdr-pgbouncer.sh`) is configured like
-this:
-
-    event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a"'
-
-and parses the configures parameters like this:
-
-    NODE_ID=$1
-    EVENT_TYPE=$2
-    SUCCESS=$3
-    NEXT_CONNINFO=$4
-    NEXT_NODE_NAME=$5
-
-It also contains some hard-coded values about the PgBouncer configuration for
-both nodes; these will need to be adjusted for your local environment of course
-(ideally the scripts would be maintained as templates and generated by some
-kind of provisioning system).
-
-The script performs following steps:
-
- - pauses PgBouncer on all nodes
- - recreates the PgBouncer configuration file on each node using the information
-   provided by `repmgrd` (mainly the `conninfo` string) to configure PgBouncer
-   to point to the remaining node
- - reloads the PgBouncer configuration
- - resumes PgBouncer
-
-From that point, any connections to PgBouncer on the failed BDR node will be redirected
-to the active node.
-
-
-repmgrd
-------
-
-
-
-Node monitoring and failover
----------------------------
-
-At the intervals specified by `monitor_interval_secs` in `repmgr.conf`, `repmgrd`
-will ping each node to check if it's available. If a node isn't available,
-`repmgrd` will enter failover mode and  check `reconnect_attempts` times
-at intervals of `reconnect_interval` to confirm the node is definitely unreachable.
-This buffer period is necessary to avoid false positives caused by transient
-network outages.
-
-If the node is still unavailable, `repmgrd` will enter failover mode and execute
-the script defined in `event_notification_command`; an entry will be logged
-in the `repmgr.events` table and `repmgrd` will (unless otherwise configured)
-resume monitoring of the node in "degraded" mode until it reappears.
-
-`repmgrd` logfile output during a failover event will look something like this
-one one node (usually the node which has failed, here "node2"):
-
-    ...
-    [2017-07-27 21:08:39] [INFO] starting continuous BDR node monitoring
-    [2017-07-27 21:08:39] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
-    [2017-07-27 21:08:55] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
-    [2017-07-27 21:09:11] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
-    [2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
-    [2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
-    [2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
-    [2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
-    [2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
-    [2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
-    [2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
-    [2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
-    [2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
-    [2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
-    [2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
-    [2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
-    [2017-07-27 21:09:28] [NOTICE] setting node record for node 2 to inactive
-    [2017-07-27 21:09:28] [INFO] executing notification command for event "bdr_failover"
-    [2017-07-27 21:09:28] [DETAIL] command is:
-      /path/to/bdr-pgbouncer.sh 2 bdr_failover 1 "host=host=node1 dbname=bdrtest user=repmgr connect_timeout=2" "node1"
-    [2017-07-27 21:09:28] [INFO] node 'node2' (ID: 2) detected as failed; next available node is 'node1' (ID: 1)
-    [2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
-    [2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
-    ...
-
-Output on the other node ("node1") during the same event will look like this:
-
-    [2017-07-27 21:08:35] [INFO] starting continuous BDR node monitoring
-    [2017-07-27 21:08:35] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
-    [2017-07-27 21:08:51] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
-    [2017-07-27 21:09:07] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
-    [2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
-    [2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
-    [2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
-    [2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
-    [2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
-    [2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
-    [2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
-    [2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
-    [2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
-    [2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
-    [2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
-    [2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
-    [2017-07-27 21:09:28] [NOTICE] other node's repmgrd is handling failover
-    [2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
-    [2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
-
-This assumes only the PostgreSQL instance on "node2" has failed. In this case the
-`repmgrd` instance running on "node2" has performed the failover. However if
-the entire server becomes unavailable, `repmgrd` on "node1" will perform
-the failover.
-
-
-Node recovery
-------------
-
-Following failure of a BDR node, if the node subsequently becomes available again,
-a `bdr_recovery` event will be generated. This could potentially be used to
-reconfigure PgBouncer automatically to bring the node back into the available pool,
-however it would be prudent to manually verify the node's status before
-exposing it to the application.
-
-If the failed node comes back up and connects correctly, output similar to this
-will be visible in the `repmgrd` log:
-
-    [2017-07-27 21:25:30] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
-    [2017-07-27 21:25:46] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
-    [2017-07-27 21:25:46] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
-    [2017-07-27 21:25:55] [INFO] active replication slot for node "node1" found after 1 seconds
-    [2017-07-27 21:25:55] [NOTICE] node "node2" (ID: 2) has recovered after 986 seconds
-
-
-Shutdown of both nodes
----------------------
-
-If both PostgreSQL instances are shut down, `repmgrd` will try and handle the
-situation as gracefully as possible, though with no failover candidates available
-there's not much it can do. Should this case ever occur, we recommend shutting
-down `repmgrd` on both nodes and restarting it once the PostgreSQL instances
-are running properly.
--- a/doc/changes-in-repmgr4.md
+++ b/doc/changes-in-repmgr4.md
@@ -1,106 +1,7 @@
+Changes in repmgr 4
+===================

-Standardisation on `primary`
----------------------------
+This document has been integrated into the main `repmgr` documentation
+and is now located here:

-To standardise terminology, `primary` is used to denote the read/write
-node in a streaming replication cluster. `master` is still accepted
-as a synonym (e.g. `repmgr master register`).
-
-
-New command line options
------------------------
-
- `--dry-run`: repmgr will attempt to perform the action as far as possible
-   without making any changes to the database
-
- `--upstream-node-id`: use to specify the upstream node the standby will
-  connect later stream from, when cloning a standby. This replaces the configuration
-  file parameter `upstream_node`, as the upstream node is set when the standby
-  is initially cloned, but can change over the lifetime of an installation (due
-  to failovers, switchovers etc.) so it's pointless/confusing keeping the original
-  value around in the config file.
-
-Changed command line options
----------------------------
-
-### repmgr
-
- `--replication-user` has been deprecated; it has been replaced by the
-  configuration file option `replication_user`.  The value (which defaults
-  to the user in the `conninfo` string) will be stored in the repmgr metadata
-  for use by  standby clone/follow..
-
- `--recovery-min-apply-delay` is now a configuration file parameter
-  `recovery_min_apply_delay, to ensure the setting does not get lost when
-  a standby follows a new upstream.
-
-### repmgrd
-
- `--monitoring-history` is deprecated and has been replaced by the
-  configuration file option `monitoring_history`. This enables the
-  setting to be changed without having to modify system service files.
-
-Changes to repmgr commands
--------------------------
-
-
-### `repmgr cluster show`
-
-This now displays the role of each node (e.g. `primary`, `standby`)
-and its status in separate columns.
-
-The `--csv` option now emits a third column indicating the recovery
-status of the node.
-
-
-Configuration file changes
--------------------------
-
-### Required settings
-
-The following 4 parameters are mandatory in `repmgr.conf`:
-
- `node_id`
- `node_name`
- `conninfo`
- `data_directory`
-
-
-### Renamed settings
-
-Some settings have been renamed for clarity and consistency:
-
- `node`: now `node_id`
- `name`: now `node_name`
- `master_reponse_timeout`: now `async_query_timeout` to better indicate its
-   purpose
-
- The following configuration file parameters have been renamed for consistency
-  with other parameters (and conform to the pattern used by PostgreSQL itself,
-  which uses the prefix `log_` for logging parameters):
-  - `loglevel` has been renamed to `log_level`
-  - `logfile` has been renamed to `log_file`
-  - `logfacility` has been renamed to `log_facility`
-
-### Removed settings
-
- `cluster`: has been removed
- `upstream_node`: see note about `--upstream-node-id` above.
- `retry_promote_interval_secs`: this is now redundant due to changes in the
-   failover/promotion mechanism; the new equivalent is `primary_notification_timeout`
-
-
-### Logging changes
-
- default value for `log_level` is `INFO` rather than `NOTICE`.
- new parameter `log_status_interval`, which causes `repmgrd` to emit a status log
-  line at the specified interval
-
-
-repmgrd
-------
-
-The `repmgr` shared library has been renamed from `repmgr_funcs` to `repmgr`,
-meaning `shared_preload_libraries` needs to be updated to the new name:
-
-    shared_preload_libraries = 'repmgr'
+> [Release notes](https://repmgr.org/docs/4.0/release-4.0.html)
--- a/doc/cloning-standbys.sgml
+++ b/doc/cloning-standbys.sgml
@@ -0,0 +1,455 @@
+<chapter id="cloning-standbys" xreflabel="cloning standbys">
+ <title>Cloning standbys</title>
+
+ <sect1 id="cloning-from-barman" xreflabel="Cloning from Barman">
+   <indexterm>
+    <primary>cloning</primary>
+    <secondary>from Barman</secondary>
+   </indexterm>
+   <indexterm>
+    <primary>Barman</primary>
+    <secondary>cloning a standby</secondary>
+   </indexterm>
+
+   <title>Cloning a standby from Barman</title>
+   <para>
+    <xref linkend="repmgr-standby-clone"> can use
+    <ulink url="https://www.2ndquadrant.com/">2ndQuadrant</ulink>'s
+    <ulink url="https://www.pgbarman.org/">Barman</ulink> application
+    to clone a standby (and also as a fallback source for WAL files).
+   </para>
+   <tip>
+    <simpara>
+     Barman (aka PgBarman) should be considered as an integral part of any
+     PostgreSQL replication cluster. For more details see:
+     <ulink url="https://www.pgbarman.org/">https://www.pgbarman.org/</ulink>.
+    </simpara>
+   </tip>
+   <para>
+    Barman support provides the following advantages:
+    <itemizedlist spacing="compact" mark="bullet">
+     <listitem>
+      <para>
+       the primary node does not need to perform a new backup every time a
+       new standby is cloned
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       a standby node can be disconnected for longer periods without losing
+       the ability to catch up, and without causing accumulation of WAL
+       files on the primary node
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       WAL management on the primary becomes much easier as there's no need
+       to use replication slots, and <varname>wal_keep_segments</varname>
+       does not need to be set.
+     </para>
+    </listitem>
+   </itemizedlist>
+   </para>
+
+  <sect2 id="cloning-from-barman-prerequisites">
+   <title>Prerequisites for cloning from Barman</title>
+   <para>
+    In order to enable Barman support for <command>repmgr standby clone</command>, the following
+    prerequisites must be met:
+   <itemizedlist spacing="compact" mark="bullet">
+     <listitem>
+      <para>
+        the <varname>barman_server</varname> setting in <filename>repmgr.conf</filename> is the same as the
+        server configured in Barman;
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+        the <varname>barman_host</varname> setting in <filename>repmgr.conf</filename> is set to the SSH
+        hostname of the Barman server;
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+        the <varname>restore_command</varname> setting in <filename>repmgr.conf</filename> is configured to
+        use a copy of the <command>barman-wal-restore</command> script shipped with the
+        <literal>barman-cli</literal> package (see section <xref linkend="cloning-from-barman-restore-command">
+        below).
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+        the Barman catalogue includes at least one valid backup for this server.
+      </para>
+     </listitem>
+   </itemizedlist>
+   </para>
+   <note>
+    <simpara>
+     Barman support is automatically enabled if <varname>barman_server</varname>
+     is set. Normally it is good practice to use Barman, for instance
+     when fetching a base backup while cloning a standby; in any case,
+     Barman mode can be disabled using the <literal>--without-barman</literal>
+     command line option.
+    </simpara>
+   </note>
+   <tip>
+    <simpara>
+      If you have a non-default SSH configuration on the Barman
+      server, e.g. using a port other than 22, then you can set those
+      parameters in a dedicated Host section in <filename>~/.ssh/config</filename>
+      corresponding to the value of <varname>barman_host</varname> in
+      <filename>repmgr.conf</filename>. See the <literal>Host</literal>
+      section in <command>man 5 ssh_config</command> for more details.
+    </simpara>
+   </tip>
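+   <para>
+    As a minimal sketch of such a <literal>Host</literal> section, assuming
+    <varname>barman_host</varname> is set to <literal>barmansrv</literal> and
+    the Barman server accepts SSH connections on port 2222 (both values are
+    purely illustrative), <filename>~/.ssh/config</filename> might contain:
+    <programlisting>
+    Host barmansrv
+        Port 2222</programlisting>
+   </para>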
+   <para>
+    It's now possible to clone a standby from Barman, e.g.:
+    <programlisting>
+    NOTICE: using configuration file "/etc/repmgr.conf"
+    NOTICE: destination directory "/var/lib/postgresql/data" provided
+    INFO: connecting to Barman server to verify backup for test_cluster
+    INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/data"
+    INFO: creating directory "/var/lib/postgresql/data/repmgr"...
+    INFO: connecting to Barman server to fetch server parameters
+    INFO: connecting to upstream node
+    INFO: connected to source node, checking its state
+    INFO: successfully connected to source node
+    DETAIL: current installation size is 29 MB
+    NOTICE: retrieving backup from Barman...
+    receiving file list ...
+    (...)
+    NOTICE: standby clone (from Barman) complete
+    NOTICE: you can now start your PostgreSQL server
+    HINT: for example: pg_ctl -D /var/lib/postgresql/data start</programlisting>
+
+   </para>
+  </sect2>
+  <sect2 id="cloning-from-barman-restore-command" xreflabel="Using Barman as a WAL file source">
+  <indexterm>
+    <primary>Barman</primary>
+    <secondary>fetching archived WAL</secondary>
+   </indexterm>
+
+   <title>Using Barman as a WAL file source</title>
+   <para>
+    As a fallback in case streaming replication is interrupted, PostgreSQL can optionally
+    retrieve WAL files from an archive, such as that provided by Barman. This is done by
+    setting <varname>restore_command</varname> in <filename>recovery.conf</filename> to
+    a valid shell command which can retrieve a specified WAL file from the archive.
+   </para>
+   <para>
+     <command>barman-wal-restore</command> is a Python script provided as part of the <literal>barman-cli</literal>
+     package (Barman 2.0 and later; for Barman 1.x the script is provided separately as
+     <command>barman-wal-restore.py</command>) which performs this function for Barman.
+   </para>
+   <para>
+    To use <command>barman-wal-restore</command> with &repmgr;
+    and assuming Barman is located on the <literal>barmansrv</literal> host
+    and that <command>barman-wal-restore</command> is located as an executable at
+    <filename>/usr/bin/barman-wal-restore</filename>,
+    <filename>repmgr.conf</filename> should include the following lines:
+    <programlisting>
+    barman_host=barmansrv
+    barman_server=somedb
+    restore_command=/usr/bin/barman-wal-restore barmansrv somedb %f %p</programlisting>
+   </para>
+   <note>
+    <simpara>
+      <command>barman-wal-restore</command> supports command line switches to
+      control parallelism (<literal>--parallel=N</literal>) and compression
+      (<literal>--bzip2</literal>, <literal>--gzip</literal>).
+    </simpara>
+   </note>
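+   <para>
+    For instance, extending the example above, these switches might be combined as
+    follows; the parallelism value of <literal>4</literal> is an arbitrary
+    illustration rather than a recommendation:
+    <programlisting>
+    restore_command=/usr/bin/barman-wal-restore --parallel=4 --gzip barmansrv somedb %f %p</programlisting>
+   </para>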
+   <note>
+    <para>
+     To use a non-default Barman configuration file on the Barman server,
+     specify this in <filename>repmgr.conf</filename> with <varname>barman_config</varname>:
+     <programlisting>
+      barman_config=/path/to/barman.conf</programlisting>
+    </para>
+   </note>
+  </sect2>
+ </sect1>
+
+<sect1 id="cloning-replication-slots" xreflabel="Cloning and replication slots">
+   <indexterm>
+     <primary>cloning</primary>
+     <secondary>replication slots</secondary>
+   </indexterm>
+
+   <indexterm>
+     <primary>replication slots</primary>
+     <secondary>cloning</secondary>
+   </indexterm>
+   <title>Cloning and replication slots</title>
+   <para>
+    Replication slots were introduced with PostgreSQL 9.4 and are designed to ensure
+    that any standby connected to the primary using a replication slot will always
+    be able to retrieve the required WAL files. This removes the need to manually
+    manage WAL file retention by estimating the number of WAL files that need to
+    be maintained on the primary using <varname>wal_keep_segments</varname>.
+    Do however be aware that if a standby is disconnected, WAL will continue to
+    accumulate on the primary until either the standby reconnects or the replication
+    slot is dropped.
+   </para>
+   <para>
+     To enable &repmgr; to use replication slots, set the boolean parameter
+     <varname>use_replication_slots</varname> in <filename>repmgr.conf</filename>:
+     <programlisting>
+       use_replication_slots=true</programlisting>
+   </para>
+   <para>
+    Replication slots must be enabled in <filename>postgresql.conf</filename> by
+    setting the parameter <varname>max_replication_slots</varname> to at least the
+    number of expected standbys (changes to this parameter require a server restart).
+   </para>
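+   <para>
+    For example, a cluster with two standbys would need a value of at least
+    <literal>2</literal> in <filename>postgresql.conf</filename>; the figure here
+    is illustrative only, and some headroom above the minimum may be sensible:
+    <programlisting>
+    max_replication_slots = 2</programlisting>
+   </para>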
+   <para>
+    When cloning a standby, &repmgr; will automatically generate an appropriate
+    slot name, which is stored in the <literal>repmgr.nodes</literal> table, and create the slot
+    on the upstream node:
+     <programlisting>
+    repmgr=# SELECT node_id, upstream_node_id, active, node_name, type, priority, slot_name
+               FROM repmgr.nodes ORDER BY node_id;
+     node_id | upstream_node_id | active | node_name |  type   | priority |   slot_name
+    ---------+------------------+--------+-----------+---------+----------+---------------
+           1 |                  | t      | node1     | primary |      100 | repmgr_slot_1
+           2 |                1 | t      | node2     | standby |      100 | repmgr_slot_2
+           3 |                1 | t      | node3     | standby |      100 | repmgr_slot_3
+     (3 rows)</programlisting>
+
+    <programlisting>
+    repmgr=# SELECT slot_name, slot_type, active, active_pid FROM pg_replication_slots ;
+       slot_name   | slot_type | active | active_pid
+    ---------------+-----------+--------+------------
+     repmgr_slot_2 | physical  | t      |      23658
+     repmgr_slot_3 | physical  | t      |      23687
+    (2 rows)</programlisting>
+   </para>
+   <para>
+    Note that a slot name will be created by default for the primary but not
+    actually used unless the primary is converted to a standby using e.g.
+    <command>repmgr standby switchover</command>.
+   </para>
+   <para>
+    Further information on replication slots is available in the PostgreSQL documentation:
+    <ulink url="https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS">https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS</ulink>
+   </para>
+   <tip>
+    <simpara>
+     While replication slots can be useful for streaming replication, it's
+     recommended to monitor for inactive slots as these will cause WAL files to
+     build up indefinitely, possibly leading to server failure.
+    </simpara>
+    <simpara>
+     As an alternative we recommend using 2ndQuadrant's <ulink url="https://www.pgbarman.org/">Barman</ulink>,
+     which offloads WAL management to a separate server, negating the need to use replication
+     slots to reserve WAL. See section <xref linkend="cloning-from-barman">
+     for more details on using &repmgr; together with Barman.
+    </simpara>
+   </tip>
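+   <para>
+    One possible way to check for inactive slots is a query along the following
+    lines, executed on the node where the slots were created. This is only a sketch
+    using columns of the <literal>pg_replication_slots</literal> view shown earlier;
+    how the result is fed into a monitoring system is left open:
+    <programlisting>
+    repmgr=# SELECT slot_name, slot_type, active FROM pg_replication_slots WHERE NOT active;</programlisting>
+   </para>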
+ </sect1>
+
+ <sect1 id="cloning-cascading" xreflabel="Cloning and cascading replication">
+   <indexterm>
+     <primary>cloning</primary>
+     <secondary>cascading replication</secondary>
+   </indexterm>
+   <title>Cloning and cascading replication</title>
+   <para>
+    Cascading replication, introduced with PostgreSQL 9.2, enables a standby server
+    to replicate from another standby server rather than directly from the primary,
+    meaning replication changes "cascade" down through a hierarchy of servers. This
+    can be used to reduce load on the primary and minimize bandwidth usage between
+    sites. For more details, see the
+    <ulink url="https://www.postgresql.org/docs/current/static/warm-standby.html#CASCADING-REPLICATION">
+    PostgreSQL cascading replication documentation</ulink>.
+   </para>
+   <para>
+    &repmgr; supports cascading replication. When cloning a standby,
+    set the command-line parameter <literal>--upstream-node-id</literal> to the
+    <varname>node_id</varname> of the server the standby should connect to, and
+    &repmgr; will create <filename>recovery.conf</filename> to point to it. Note
+    that if <literal>--upstream-node-id</literal> is not explicitly provided,
+    &repmgr; will set the standby's <filename>recovery.conf</filename> to
+    point to the primary node.
+   </para>
+   <para>
+    To demonstrate cascading replication, first ensure you have a primary and standby
+    set up as shown in the <xref linkend="quickstart">.
+    Then create an additional standby server with <filename>repmgr.conf</filename> looking
+    like this:
+    <programlisting>
+    node_id=3
+    node_name=node3
+    conninfo='host=node3 user=repmgr dbname=repmgr'
+    data_directory='/var/lib/postgresql/data'</programlisting>
+   </para>
+   <para>
+    Clone this standby (using the connection parameters for the existing standby),
+    ensuring <literal>--upstream-node-id</literal> is provided with the <varname>node_id</varname>
+    of the previously created standby (if following the example, this will be <literal>2</literal>):
+    <programlisting>
+    $ repmgr -h node2 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --upstream-node-id=2
+    NOTICE: using configuration file "/etc/repmgr.conf"
+    NOTICE: destination directory "/var/lib/postgresql/data" provided
+    INFO: connecting to upstream node
+    INFO: connected to source node, checking its state
+    NOTICE: checking for available walsenders on upstream node (2 required)
+    INFO: sufficient walsenders available on upstream node (2 required)
+    INFO: successfully connected to source node
+    DETAIL: current installation size is 29 MB
+    INFO: creating directory "/var/lib/postgresql/data"...
+    NOTICE: starting backup (using pg_basebackup)...
+    HINT: this may take some time; consider using the -c/--fast-checkpoint option
+    INFO: executing: 'pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h node2 -U repmgr -X stream '
+    NOTICE: standby clone (using pg_basebackup) complete
+    NOTICE: you can now start your PostgreSQL server
+    HINT: for example: pg_ctl -D /var/lib/postgresql/data start</programlisting>
+
+    then register it (note that <literal>--upstream-node-id</literal> must be provided here
+    too):
+    <programlisting>
+     $ repmgr -f /etc/repmgr.conf standby register --upstream-node-id=2
+     NOTICE: standby node "node3" (ID: 3) successfully registered
+    </programlisting>
+   </para>
+   <para>
+    After starting the standby, the cluster will look like this, showing that <literal>node3</literal>
+    is attached to <literal>node2</literal>, not the primary (<literal>node1</literal>).
+    <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Location | Connection string
+    ----+-------+---------+-----------+----------+----------+--------------------------------------
+     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
+     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr
+     3  | node3 | standby |   running | node2    | default  | host=node3 dbname=repmgr user=repmgr
+    </programlisting>
+   </para>
+   <tip>
+    <simpara>
+     Under some circumstances when setting up a cascading replication
+     cluster, you may wish to clone a downstream standby whose upstream node
+     does not yet exist. In this case you can clone from the primary (or
+     another upstream node); provide the parameter <literal>--upstream-conninfo</literal>
+     to explicitly set the upstream's <varname>primary_conninfo</varname> string
+     in <filename>recovery.conf</filename>.
+    </simpara>
+   </tip>
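+   <para>
+    A sketch of such an invocation, assuming the standby is cloned from
+    <literal>node1</literal> but should eventually stream from a not-yet-existing
+    <literal>node2</literal> (the host names and connection parameters are
+    illustrative and follow the earlier examples):
+    <programlisting>
+    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone \
+        --upstream-conninfo='host=node2 user=repmgr dbname=repmgr'</programlisting>
+   </para>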
+ </sect1>
+
+ <sect1 id="cloning-advanced" xreflabel="Advanced cloning options">
+   <indexterm>
+     <primary>cloning</primary>
+     <secondary>advanced options</secondary>
+   </indexterm>
+   <title>Advanced cloning options</title>
+
+   <sect2 id="cloning-advanced-pg-basebackup-options" xreflabel="pg_basebackup options when cloning a standby">
+    <title>pg_basebackup options when cloning a standby</title>
+    <para>
+      As &repmgr; uses <command>pg_basebackup</command> to clone a standby, it's possible to
+      provide additional parameters for <command>pg_basebackup</command> to customise the
+      cloning process.
+    </para>
+    <para>
+     By default, <command>pg_basebackup</command> performs a checkpoint before beginning the backup
+     process. However, a normal checkpoint may take some time to complete;
+     a fast checkpoint can be forced with the <literal>-c/--fast-checkpoint</literal> option.
+     Note that this may impact performance of the server being cloned from (typically the primary)
+     so should be used with care.
+    </para>
+    <tip>
+      <simpara>
+        If <application>Barman</application> is set up for the cluster, it's possible to
+        clone the standby directly from Barman, without any impact on the server the standby
+        is being cloned from. For more details see <xref linkend="cloning-from-barman">.
+      </simpara>
+    </tip>
+    <para>
+      Other options can be passed to <command>pg_basebackup</command> by including them
+      in the <filename>repmgr.conf</filename> setting <varname>pg_basebackup_options</varname>.
+    </para>
+    <para>
+      If using a separate directory to store WAL files, provide the option <literal>--waldir</literal>
+      (<literal>--xlogdir</literal> in PostgreSQL 9.6 and earlier) with the absolute path to the
+      WAL directory. Any WALs generated during the cloning process will be copied here, and
+      a symlink will automatically be created from the main data directory.
+    </para>
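+    <para>
+      As an illustration, the following <filename>repmgr.conf</filename> entry would
+      pass a separate WAL directory to <command>pg_basebackup</command>; the path
+      <filename>/var/lib/postgresql/wal</filename> is a hypothetical location chosen
+      for this sketch:
+      <programlisting>
+      pg_basebackup_options='--waldir=/var/lib/postgresql/wal'</programlisting>
+    </para>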
+    <para>
+     See the <ulink url="https://www.postgresql.org/docs/current/static/app-pgbasebackup.html">PostgreSQL pg_basebackup documentation</ulink>
+     for more details of available options.
+    </para>
+   </sect2>
+
+   <sect2 id="cloning-advanced-managing-passwords" xreflabel="Managing passwords">
+    <title>Managing passwords</title>
+    <indexterm>
+      <primary>cloning</primary>
+      <secondary>using passwords</secondary>
+    </indexterm>
+
+    <para>
+     If replication connections to a standby's upstream server are password-protected,
+     the standby must be able to provide the password so it can begin streaming replication.
+    </para>
+
+    <para>
+     The recommended way to do this is to store the password in the <literal>postgres</literal> system
+     user's <filename>~/.pgpass</filename> file. It's also possible to store the password in the
+     environment variable <varname>PGPASSWORD</varname>, however this is not recommended for
+     security reasons. For more details see the
+     <ulink url="https://www.postgresql.org/docs/current/static/libpq-pgpass.html">PostgreSQL password file documentation</ulink>.
+    </para>
+
+    <note>
+      <para>
+        If using a <filename>pgpass</filename> file, an entry for the replication user (by default the
+        user who connects to the <literal>repmgr</literal> database) <emphasis>must</emphasis>
+        be provided, with database name set to <literal>replication</literal>, e.g.:
+        <programlisting>
+          node1:5432:replication:repmgr:12345</programlisting>
+      </para>
+    </note>
+
+    <para>
+     If, for whatever reason, you wish to include the password in <filename>recovery.conf</filename>,
+     set <varname>use_primary_conninfo_password</varname> to <literal>true</literal> in
+     <filename>repmgr.conf</filename>. This will read a password set in <varname>PGPASSWORD</varname>
+     (but not <filename>~/.pgpass</filename>) and place it into the <varname>primary_conninfo</varname>
+     string in <filename>recovery.conf</filename>. Note that <varname>PGPASSWORD</varname>
+     will need to be set during any action which causes <filename>recovery.conf</filename> to be
+     rewritten, e.g. <xref linkend="repmgr-standby-follow">.
+    </para>
+    <para>
+     It is of course also possible to include the password value in the <varname>conninfo</varname>
+     string for each node, but this is obviously a security risk and should be avoided.
+    </para>
+    <para>
+      From PostgreSQL 9.6, <application>libpq</application> supports the <varname>passfile</varname>
+      parameter in connection strings, which can be used to specify a password file other than
+      the default <filename>~/.pgpass</filename>.
+    </para>
+    <para>
+      To have &repmgr; write a custom password file in <varname>primary_conninfo</varname>,
+      specify its location in <varname>passfile</varname> in <filename>repmgr.conf</filename>.
+    </para>
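+    <para>
+      For example, assuming a dedicated password file has been created at the
+      hypothetical location <filename>/var/lib/postgresql/.pgpass_repmgr</filename>,
+      the <filename>repmgr.conf</filename> entry might look like this:
+      <programlisting>
+      passfile='/var/lib/postgresql/.pgpass_repmgr'</programlisting>
+    </para>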
+   </sect2>
+
+   <sect2 id="cloning-advanced-replication-user" xreflabel="Separate replication user">
+    <title>Separate replication user</title>
+    <para>
+     In some circumstances it might be desirable to create a dedicated replication-only
+     user (in addition to the user who manages the &repmgr; metadata). In this case,
+     the replication user should be set in <filename>repmgr.conf</filename> via the parameter
+     <varname>replication_user</varname>; &repmgr; will use this value when making
+     replication connections and generating <filename>recovery.conf</filename>. This
+     value will also be stored in the <literal>repmgr.nodes</literal>
+     table for each node; it no longer needs to be explicitly specified when
+     cloning a node or executing <xref linkend="repmgr-standby-follow">.
+    </para>
+   </sect2>
+ </sect1>
+
+
+</chapter>
--- a/doc/configuration-file-settings.sgml
+++ b/doc/configuration-file-settings.sgml
@@ -0,0 +1,122 @@
+<sect1 id="configuration-file-settings" xreflabel="configuration file settings">
+  <indexterm>
+    <primary>repmgr.conf</primary>
+    <secondary>basic settings</secondary>
+  </indexterm>
+
+ <title>Basic configuration file settings</title>
+ <para>
+   Each <filename>repmgr.conf</filename> file must contain the following parameters:
+ </para>
+ <para>
+  <variablelist>
+   <varlistentry id="repmgr-conf-node-id" xreflabel="node_id">
+    <term><varname>node_id</varname> (<type>int</type>)
+     <indexterm>
+      <primary><varname>node_id</varname> configuration file parameter</primary>
+     </indexterm>
+    </term>
+    <listitem>
+     <para>
+      A unique integer greater than zero which identifies the node.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry id="repmgr-conf-node-name" xreflabel="node_name">
+    <term><varname>node_name</varname> (<type>string</type>)
+     <indexterm>
+      <primary><varname>node_name</varname> configuration file parameter</primary>
+     </indexterm>
+    </term>
+    <listitem>
+     <para>
+       An arbitrary (but unique) string; we recommend using the server's hostname
+       or another identifier unambiguously associated with the server to avoid
+       confusion. Avoid choosing names which reflect the node's current role,
+       e.g. <varname>primary</varname> or <varname>standby1</varname>
+       as roles can change and if you end up in a situation where the current primary is
+       called <varname>standby1</varname> (for example), things will be confusing
+       to say the least.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry id="repmgr-conf-conninfo" xreflabel="conninfo">
+    <term><varname>conninfo</varname> (<type>string</type>)
+     <indexterm>
+      <primary><varname>conninfo</varname> configuration file parameter</primary>
+     </indexterm>
+    </term>
+    <listitem>
+     <para>
+      Database connection information as a conninfo string.
+      All servers in the cluster must be able to connect to
+      the local node using this string.
+     </para>
+     <para>
+       For details on conninfo strings, see section <ulink
+       url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING">Connection Strings</>
+        in the PostgreSQL documentation.
+     </para>
+     <para>
+        If repmgrd is in use, consider explicitly setting
+        <varname>connect_timeout</varname> in the <varname>conninfo</varname>
+        string to determine the length of time which elapses before a network
+        connection attempt is abandoned; for details see <ulink
+        url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNECT-CONNECT-TIMEOUT">
+        the PostgreSQL documentation</>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry id="repmgr-conf-data-directory" xreflabel="data_directory">
+    <term><varname>data_directory</varname> (<type>string</type>)
+     <indexterm>
+      <primary><varname>data_directory</varname> configuration file parameter</primary>
+     </indexterm>
+    </term>
+    <listitem>
+     <para>
+       The node's data directory. This is needed by repmgr
+       when performing operations when the PostgreSQL instance
+       is not running and there's no other way of determining
+       the data directory.
+     </para>
+    </listitem>
+   </varlistentry>
+
+
+  </variablelist>
+ </para>
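+ <para>
+   Putting these together, a minimal <filename>repmgr.conf</filename> might look
+   like the following sketch. The host name, user, database and data directory
+   mirror examples used elsewhere in this documentation, and the
+   <varname>connect_timeout</varname> value of <literal>2</literal> seconds is an
+   arbitrary illustration:
+   <programlisting>
+    node_id=1
+    node_name=node1
+    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
+    data_directory='/var/lib/postgresql/data'</programlisting>
+ </para>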
+
+  <para>
+    For a full list of annotated configuration items, see the file
+    <ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>.
+  </para>
+  <para>
+    For <application>repmgrd</application>-specific settings, see <xref linkend="repmgrd-configuration">.
+  </para>
+
+  <note>
+    <para>
+    The following parameters in the configuration file can be overridden with
+    command line options:
+    <itemizedlist>
+     <listitem>
+       <simpara>
+         <literal>-L/--log-level</literal> overrides <literal>log_level</literal> in
+         <filename>repmgr.conf</filename>
+       </simpara>
+     </listitem>
+     <listitem>
+       <simpara>
+         <literal>-b/--pg_bindir</literal> overrides <literal>pg_bindir</literal> in
+         <filename>repmgr.conf</filename>
+       </simpara>
+     </listitem>
+    </itemizedlist>
+    </para>
+  </note>
+
+</sect1>
--- a/doc/configuration-file.sgml
+++ b/doc/configuration-file.sgml
@@ -0,0 +1,69 @@
+<sect1 id="configuration-file" xreflabel="configuration file location">
+  <indexterm>
+    <primary>repmgr.conf</primary>
+    <secondary>location</secondary>
+  </indexterm>
+
+  <indexterm>
+    <primary>configuration</primary>
+    <secondary>repmgr.conf location</secondary>
+  </indexterm>
+
+  <title>Configuration file location</title>
+  <para>
+    <application>repmgr</application> and <application>repmgrd</application>
+    use a common configuration file, by default called
+    <filename>repmgr.conf</filename> (although any name can be used if explicitly specified).
+    <filename>repmgr.conf</filename> must contain a number of required parameters, including
+    the database connection string for the local node and the location
+    of its data directory; other values will be inferred from defaults if
+    not explicitly supplied. See section <xref linkend="configuration-file-settings">
+    for more details.
+  </para>
+
+  <para>
+   The configuration file will be searched for in the following locations:
+   <itemizedlist spacing="compact" mark="bullet">
+    <listitem>
+     <para>a configuration file specified by the <literal>-f/--config-file</literal> command line option</para>
+    </listitem>
+    <listitem>
+     <para>
+      a location specified by the package maintainer (if <application>repmgr</application>
+      was installed from a package and the package maintainer has specified the configuration
+      file location)
+     </para>
+    </listitem>
+    <listitem>
+     <para><filename>repmgr.conf</filename> in the local directory</para>
+    </listitem>
+    <listitem>
+      <para><filename>/etc/repmgr.conf</filename></para>
+    </listitem>
+    <listitem>
+     <para>the directory reported by <application>pg_config --sysconfdir</application></para>
+    </listitem>
+   </itemizedlist>
+  </para>
+
+  <para>
+   Note that if a file is explicitly specified with <literal>-f/--config-file</literal>,
+   an error will be raised if it is not found or not readable, and no attempt will be made to
+   check default locations; this is to prevent <application>repmgr</application> unexpectedly
+   reading the wrong configuration file.
+  </para>
+
+  <note>
+    <para>
+      If providing the configuration file location with <literal>-f/--config-file</literal>,
+      avoid using a relative path, particularly when executing <xref linkend="repmgr-primary-register">
+      and <xref linkend="repmgr-standby-register">, as &repmgr; stores the configuration file location
+      in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
+      <xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert the
+      relative path into an absolute one, but this may not be the same as the path you
+      would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
+      to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
+      <filename>/path/to/repmgr.conf</filename>).
+    </para>
+  </note>
+</sect1>
--- a/doc/configuration-service-commands.sgml
+++ b/doc/configuration-service-commands.sgml
@@ -0,0 +1,115 @@
+<sect1 id="configuration-service-commands" xreflabel="service command settings">
+  <indexterm>
+    <primary>repmgr.conf</primary>
+    <secondary>service command settings</secondary>
+  </indexterm>
+  <indexterm>
+    <primary>service command settings</primary>
+    <secondary>configuration in repmgr.conf</secondary>
+  </indexterm>
+  <title>Service command settings</title>
+
+  <para>
+    In some circumstances, &repmgr; (and <application>repmgrd</application>) need to
+    be able to stop, start or restart PostgreSQL. &repmgr; commands which need to do this
+    include <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>,
+    <link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link> and
+    <link linkend="repmgr-node-rejoin"><command>repmgr node rejoin</command></link>.
+  </para>
+  <para>
+    By default, &repmgr; will use PostgreSQL's <command>pg_ctl</command> to control the PostgreSQL
+    server. However this can lead to various problems, particularly when PostgreSQL has been
+    installed from packages, and especially so if <application>systemd</application> is in use.
+  </para>
+
+
+  <note>
+    <para>
+      If using <application>systemd</application>, ensure you have <varname>RemoveIPC</varname> set to <literal>off</literal>.
+      See the <ulink url="https://wiki.postgresql.org/wiki/Systemd">systemd</ulink>
+      entry in the <ulink url="https://wiki.postgresql.org/wiki/Main_Page">PostgreSQL wiki</ulink> for details.
+    </para>
+  </note>
+
+
+  <para>
+    With this in mind, we recommend <emphasis>always</emphasis> configuring &repmgr; to use the
+    available system service commands.
+  </para>
+
+  <para>
+    To do this, specify the appropriate command for each action
+    in <filename>repmgr.conf</filename> using the following configuration
+    parameters:
+    <programlisting>
+    service_start_command
+    service_stop_command
+    service_restart_command
+    service_reload_command</programlisting>
+  </para>
+
+  <note>
+    <para>
+      It's also possible to specify a <varname>service_promote_command</varname>;
+      this overrides any value contained in the setting <varname>promote_command</varname>.
+      This is intended for systems which provide a package-level promote command,
+      such as Debian's <application>pg_ctlcluster</application>.
+    </para>
+  </note>
+
+  <para>
+    To confirm which command &repmgr; will execute for each action, use
+    <command>repmgr node service --list --action=...</command>, e.g.:
+    <programlisting>
+      repmgr -f /etc/repmgr.conf node service --list --action=stop
+      repmgr -f /etc/repmgr.conf node service --list --action=start
+      repmgr -f /etc/repmgr.conf node service --list --action=restart
+      repmgr -f /etc/repmgr.conf node service --list --action=reload</programlisting>
+  </para>
+
+  <para>
+     These commands will be executed by the system user which &repmgr; runs as (usually <literal>postgres</literal>)
+     and will probably require passwordless sudo access to be able to execute the command.
+  </para>
+  <para>
+    For example, using <application>systemd</application> on CentOS 7, the service commands can be
+    set as follows:
+    <programlisting>
+      service_start_command   = 'sudo systemctl start postgresql-9.6'
+      service_stop_command    = 'sudo systemctl stop postgresql-9.6'
+      service_restart_command = 'sudo systemctl restart postgresql-9.6'
+      service_reload_command  = 'sudo systemctl reload postgresql-9.6'</programlisting>
+    and <filename>/etc/sudoers</filename> should be set as follows:
+    <programlisting>
+      Defaults:postgres !requiretty
+      postgres ALL = NOPASSWD: /usr/bin/systemctl stop postgresql-9.6, \
+        /usr/bin/systemctl start postgresql-9.6, \
+        /usr/bin/systemctl restart postgresql-9.6, \
+        /usr/bin/systemctl reload postgresql-9.6</programlisting>
+  </para>
+
+  <important>
+    <indexterm>
+      <primary>pg_ctlcluster</primary>
+      <secondary>service command settings</secondary>
+    </indexterm>
+    <para>
+      Debian/Ubuntu users: instead of calling <command>sudo systemctl</command> directly, use
+      <command>sudo pg_ctlcluster</command>, e.g.:
+    <programlisting>
+      service_start_command   = 'sudo pg_ctlcluster 9.6 main start'
+      service_stop_command    = 'sudo pg_ctlcluster 9.6 main stop'
+      service_restart_command = 'sudo pg_ctlcluster 9.6 main restart'
+      service_reload_command  = 'sudo pg_ctlcluster 9.6 main reload'</programlisting>
+      and set <filename>/etc/sudoers</filename> accordingly.
+    </para>
+    <para>
+      While <command>pg_ctlcluster</command> will work when executed as user <literal>postgres</literal>,
+      it's strongly recommended to use <command>sudo pg_ctlcluster</command> on <application>systemd</application>
+      systems, to ensure <application>systemd</application> has a correct picture of
+      the PostgreSQL application state.
+    </para>
+
+  </important>
+
+</sect1>
--- a/doc/configuration.sgml
+++ b/doc/configuration.sgml
@@ -0,0 +1,25 @@
+<chapter id="configuration" xreflabel="Configuration">
+  <title>repmgr configuration</title>
+
+  &configuration-file;
+  &configuration-file-settings;
+  &configuration-service-commands;
+
+  <sect1 id="configuration-permissions" xreflabel="User permissions">
+    <indexterm>
+      <primary>configuration</primary>
+      <secondary>user permissions</secondary>
+    </indexterm>
+
+    <title>repmgr user permissions</title>
+    <para>
+      &repmgr; will create an extension database containing objects
+      for administering &repmgr; metadata. The user defined in the <varname>conninfo</varname>
+      setting must be able to access all objects. Additionally, superuser permissions
+      are required to install the &repmgr; extension. The easiest way to do this
+      is to create the &repmgr; user as a superuser; however, if this is not
+      desirable, the &repmgr; user can be created as a normal user and a
+      superuser specified with <literal>--superuser</literal> when registering a &repmgr; node.
+    </para>
+  </sect1>
+</chapter>
--- a/doc/configuring-witness-server.sgml
+++ b/doc/configuring-witness-server.sgml
@@ -0,0 +1,86 @@
+<chapter id="using-witness-server">
+ <indexterm>
+  <primary>witness server</primary>
+  <seealso>Using a witness server with repmgrd</seealso>
+ </indexterm>
+
+
+ <title>Using a witness server</title>
+ <para>
+   A <xref linkend="witness-server"> is a normal PostgreSQL instance which
+   is not part of the streaming replication cluster; its purpose is, if a
+   failover situation occurs, to provide proof that the primary server
+   itself is unavailable.
+ </para>
+
+ <para>
+   A typical use case for a witness server is a two-node streaming replication
+   setup, where the primary and standby are in different locations (data centres).
+   By creating a witness server in the same location as the primary, if the primary
+   becomes unavailable, it's possible for the standby to decide whether it can
+   promote itself without risking a "split brain" scenario: if it can't see either the
+   witness or the primary server, it's likely there's a network-level interruption
+   and it should not promote itself. If it can see the witness but not the primary,
+   this proves there is no network interruption and the primary itself is unavailable,
+   and it can therefore promote itself (and ideally take action to fence the
+   former primary).
+ </para>
+ <para>
+   For more complex replication scenarios, e.g. with multiple datacentres, it may
+   be preferable to use location-based failover, which ensures that only nodes
+   in the same location as the primary will ever be promotion candidates;
+   see <xref linkend="repmgrd-network-split"> for more details.
+ </para>
+
+ <note>
+   <simpara>
+     A witness server will only be useful if <application>repmgrd</application>
+     is in use.
+   </simpara>
+ </note>
+
+ <sect1 id="creating-witness-server">
+   <title>Creating a witness server</title>
+ <para>
+   To create a witness server, set up a normal PostgreSQL instance on a server
+   in the same physical location as the cluster's primary server.
+ </para>
+ <para>
+   This instance should <emphasis>not</emphasis> be on the same physical host as the primary server,
+   as otherwise if the primary server fails due to hardware issues, the witness
+   server will be lost too.
+ </para>
+ <note>
+   <simpara>
+     &repmgr; 3.3 and earlier provided a <command>repmgr witness create</command>
+     command, which would automatically create a PostgreSQL instance. However
+     this often resulted in an unsatisfactory, hard-to-customise instance.
+   </simpara>
+ </note>
+ <para>
+   The witness server should be configured in the same way as a normal
+   &repmgr; node; see section <xref linkend="configuration">.
+ </para>
+ <para>
+   Register the witness server with <xref linkend="repmgr-witness-register">.
+   This will create the &repmgr; extension on the witness server, and make
+   a copy of the &repmgr; metadata.
+ </para>
+ <note>
+   <simpara>
+    As the witness server is not part of the replication cluster, further
+    changes to the &repmgr; metadata will be synchronised by
+    <application>repmgrd</application>.
+   </simpara>
+ </note>
+ <para>
+   Once the witness server has been configured, <application>repmgrd</application>
+   should be started; for more details see <xref linkend="repmgrd-witness-server">.
+ </para>
+
+ <para>
+  To unregister a witness server, use <xref linkend="repmgr-witness-unregister">.
+ </para>
+
+ </sect1>
+</chapter>
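The promotion reasoning described in the witness server chapter above can be sketched as a small shell function. This is only an illustration of the decision table, not how <application>repmgrd</application> implements it; the function name `should_promote` and its "yes"/"no" arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper illustrating the witness-based promotion decision.
# $1: "yes" if the standby can reach the primary, otherwise "no"
# $2: "yes" if the standby can reach the witness, otherwise "no"
should_promote() {
    primary_visible="$1"
    witness_visible="$2"

    if [ "$primary_visible" = "yes" ]; then
        # The primary is still reachable, so no failover is needed
        echo "no"
    elif [ "$witness_visible" = "yes" ]; then
        # Witness reachable but primary not: the primary itself is unavailable
        echo "yes"
    else
        # Neither is reachable: probably a network interruption, do not promote
        echo "no"
    fi
}
```

For example, `should_promote no yes` prints `yes`, while `should_promote no no` prints `no`, which is the case that avoids a "split brain" scenario.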
--- a/doc/event-notifications.sgml
+++ b/doc/event-notifications.sgml
@@ -0,0 +1,251 @@
+<chapter id="event-notifications" xreflabel="event notifications">
+
+ <indexterm>
+   <primary>event notifications</primary>
+ </indexterm>
+
+ <title>Event Notifications</title>
+ <para>
+  Each time &repmgr; or <application>repmgrd</application> performs a significant event, a record
+  of that event is written into the <literal>repmgr.events</literal> table together with
+  a timestamp, an indication of failure or success, and further details
+  if appropriate. This is useful for gaining an overview of events
+  affecting the replication cluster. Note, however, that this table is
+  advisory in nature and should be used in combination with the &repmgr;
+  and PostgreSQL logs to obtain details of any events.
+ </para>
+ <para>
+  Example output after a primary was registered and a standby cloned
+  and registered:
+  <programlisting>
+    repmgr=# SELECT * from repmgr.events ;
+     node_id |      event       | successful |        event_timestamp        |                                       details
+    ---------+------------------+------------+-------------------------------+-------------------------------------------------------------------------------------
+           1 | primary_register | t          | 2016-01-08 15:04:39.781733+09 |
+           2 | standby_clone    | t          | 2016-01-08 15:04:49.530001+09 | Cloned from host 'repmgr_node1', port 5432; backup method: pg_basebackup; --force: N
+           2 | standby_register | t          | 2016-01-08 15:04:50.621292+09 |
+    (3 rows)</programlisting>
+ </para>
+ <para>
+  Alternatively, use <xref linkend="repmgr-cluster-event"> to output a
+  formatted list of events.
+ </para>
+ <para>
+  Additionally, event notifications can be passed to a user-defined program
+  or script which can take further action, e.g. send email notifications.
+  This is done by setting the <literal>event_notification_command</literal> parameter in
+  <filename>repmgr.conf</filename>.
+ </para>
+ <para>
+  The following format placeholders are provided for all event notifications:
+ </para>
+
+ <variablelist>
+  <varlistentry>
+   <term><option>%n</option></term>
+   <listitem>
+    <para>
+      node ID
+    </para>
+   </listitem>
+  </varlistentry>
+
+  <varlistentry>
+   <term><option>%e</option></term>
+   <listitem>
+    <para>
+     event type
+    </para>
+   </listitem>
+  </varlistentry>
+
+  <varlistentry>
+   <term><option>%s</option></term>
+   <listitem>
+    <para>
+     success (1) or failure (0)
+    </para>
+   </listitem>
+  </varlistentry>
+  <varlistentry>
+   <term><option>%t</option></term>
+   <listitem>
+    <para>
+     timestamp
+    </para>
+   </listitem>
+  </varlistentry>
+
+  <varlistentry>
+   <term><option>%d</option></term>
+   <listitem>
+    <para>
+     details
+    </para>
+   </listitem>
+  </varlistentry>
+ </variablelist>
+
+ <para>
+  The values provided for <literal>%t</literal> and <literal>%d</literal>
+  will probably contain spaces, so should be quoted in the provided command
+  configuration, e.g.:
+  <programlisting>
+    event_notification_command='/path/to/some/script %n %e %s "%t" "%d"'
+  </programlisting>
+ </para>
+
+ <para>
+   The following parameters are provided for a subset of event notifications:
+ </para>
+
+ <variablelist>
+  <varlistentry>
+   <term><option>%p</option></term>
+   <listitem>
+    <para>
+     node ID of the current primary (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
+    </para>
+    <para>
+     node ID of the demoted primary (<xref linkend="repmgr-standby-switchover"> only)
+    </para>
+   </listitem>
+  </varlistentry>
+  <varlistentry>
+   <term><option>%c</option></term>
+   <listitem>
+    <para>
+     <literal>conninfo</literal> string of the primary node
+     (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
+    </para>
+    <para>
+      <literal>conninfo</literal> string of the next available node
+      (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
+    </para>
+   </listitem>
+  </varlistentry>
+
+  <varlistentry>
+   <term><option>%a</option></term>
+   <listitem>
+    <para>
+     name of the current primary node (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
+    </para>
+    <para>
+     name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
+    </para>
+   </listitem>
+  </varlistentry>
+
+ </variablelist>
+
+ <para>
+  The values provided for <literal>%c</literal> and <literal>%a</literal>
+  will probably contain spaces, so should always be quoted.
+ </para>
+
+ <para>
+  By default, all notification types will be passed to the designated script;
+  the notification types can be filtered to explicitly named ones using the
+  <varname>event_notifications</varname> parameter:
+
+  <itemizedlist spacing="compact" mark="bullet">
+
+   <listitem>
+    <simpara><literal>primary_register</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>primary_unregister</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>standby_register</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>standby_register_sync</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>standby_unregister</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>standby_clone</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>standby_promote</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>standby_follow</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>standby_disconnect_manual</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>standby_failure</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>standby_recovery</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>witness_register</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>witness_unregister</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>node_rejoin</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>repmgrd_start</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>repmgrd_shutdown</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>repmgrd_failover_promote</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>repmgrd_failover_follow</literal></simpara>
+   </listitem>
+   <listitem>
+     <simpara><literal>repmgrd_failover_aborted</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>repmgrd_upstream_disconnect</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>repmgrd_upstream_reconnect</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>repmgrd_promote_error</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>bdr_failover</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>bdr_reconnect</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>bdr_recovery</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>bdr_register</literal></simpara>
+   </listitem>
+   <listitem>
+    <simpara><literal>bdr_unregister</literal></simpara>
+   </listitem>
+
+  </itemizedlist>
+ </para>
+
+ <para>
+  Note that under some circumstances (e.g. when no replication cluster primary
+  could be located), it will not be possible to write an entry into the
+  <literal>repmgr.events</literal>
+  table, in which case executing a script via <varname>event_notification_command</varname>
+  can serve as a fallback by generating some form of notification.
+ </para>
+
+
+</chapter>
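A minimal sketch of a script that could be named in <varname>event_notification_command</varname>, assuming the arguments arrive in the order `%n %e %s "%t" "%d"` used in the example setting in the chapter above. The function name `format_event` and the output format are hypothetical; a real handler might send email or write to a monitoring system instead.

```shell
#!/bin/sh
# Hypothetical event notification handler.
# Expected arguments (matching '%n %e %s "%t" "%d"'):
#   $1 node ID, $2 event type, $3 success (1) or failure (0),
#   $4 timestamp, $5 details
format_event() {
    node_id="$1"
    event="$2"
    success="$3"
    timestamp="$4"
    details="$5"

    # Translate the numeric success flag into a readable status
    if [ "$success" = "1" ]; then
        status="OK"
    else
        status="FAILED"
    fi

    printf '%s node=%s event=%s status=%s details=%s\n' \
        "$timestamp" "$node_id" "$event" "$status" "$details"
}

# When invoked with all five placeholders, emit one line for the event
if [ "$#" -ge 5 ]; then
    format_event "$@"
fi
```

Because the timestamp and details values usually contain spaces, the handler only receives them as single arguments if they are quoted in the `event_notification_command` setting.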
--- a/doc/filelist.sgml
+++ b/doc/filelist.sgml
@@ -0,0 +1,87 @@
+<!-- doc/filelist.sgml -->
+
+<!ENTITY legal      SYSTEM "legal.sgml">
+
+<!ENTITY bookindex  SYSTEM "bookindex.sgml">
+
+<!--
+ Some parts of the documentation are also source for some plain-text
+ files used during installation.  To selectively ignore or include
+ some parts (e.g., external xref's) when generating these files we use
+ these parameter entities.  See also standalone-install.sgml.
+ -->
+<!ENTITY % standalone-ignore  "INCLUDE">
+<!ENTITY % standalone-include "IGNORE">
+
+<!-- doc/filelist.sgml -->
+
+<!--
+ By default, no index is included.  Use -i include-index on the command line
+ to include it.
+ -->
+<!ENTITY % include-index "IGNORE">
+
+<!--
+ Create empty index element for processing by XSLT stylesheet.
+ -->
+<!ENTITY % include-xslt-index "IGNORE">
+
+<!--
+ Include external documentation sections
+ -->
+
+<!ENTITY overview      SYSTEM "overview.sgml">
+<!ENTITY install SYSTEM "install.sgml">
+<!ENTITY install-requirements      SYSTEM "install-requirements.sgml">
+<!ENTITY install-packages      SYSTEM "install-packages.sgml">
+<!ENTITY install-source      SYSTEM "install-source.sgml">
+<!ENTITY quickstart      SYSTEM "quickstart.sgml">
+<!ENTITY configuration      SYSTEM "configuration.sgml">
+<!ENTITY configuration-file      SYSTEM "configuration-file.sgml">
+<!ENTITY configuration-file-settings      SYSTEM "configuration-file-settings.sgml">
+<!ENTITY configuration-service-commands   SYSTEM "configuration-service-commands.sgml">
+<!ENTITY cloning-standbys  SYSTEM "cloning-standbys.sgml">
+<!ENTITY promoting-standby  SYSTEM "promoting-standby.sgml">
+<!ENTITY follow-new-primary  SYSTEM "follow-new-primary.sgml">
+<!ENTITY switchover  SYSTEM "switchover.sgml">
+<!ENTITY configuring-witness-server SYSTEM "configuring-witness-server.sgml">
+
+<!ENTITY event-notifications  SYSTEM "event-notifications.sgml">
+<!ENTITY upgrading-repmgr  SYSTEM "upgrading-repmgr.sgml">
+
+<!ENTITY repmgrd-automatic-failover SYSTEM "repmgrd-automatic-failover.sgml">
+<!ENTITY repmgrd-configuration SYSTEM "repmgrd-configuration.sgml">
+<!ENTITY repmgrd-demonstration SYSTEM "repmgrd-demonstration.sgml">
+<!ENTITY repmgrd-monitoring SYSTEM "repmgrd-monitoring.sgml">
+<!ENTITY repmgrd-degraded-monitoring SYSTEM "repmgrd-degraded-monitoring.sgml">
+<!ENTITY repmgrd-cascading-replication SYSTEM "repmgrd-cascading-replication.sgml">
+<!ENTITY repmgrd-network-split SYSTEM "repmgrd-network-split.sgml">
+<!ENTITY repmgrd-witness-server SYSTEM "repmgrd-witness-server.sgml">
+<!ENTITY repmgrd-bdr SYSTEM "repmgrd-bdr.sgml">
+
+<!ENTITY repmgr-primary-register SYSTEM "repmgr-primary-register.sgml">
+<!ENTITY repmgr-primary-unregister SYSTEM "repmgr-primary-unregister.sgml">
+<!ENTITY repmgr-standby-clone SYSTEM "repmgr-standby-clone.sgml">
+<!ENTITY repmgr-standby-register SYSTEM "repmgr-standby-register.sgml">
+<!ENTITY repmgr-standby-unregister SYSTEM "repmgr-standby-unregister.sgml">
+<!ENTITY repmgr-standby-promote SYSTEM "repmgr-standby-promote.sgml">
+<!ENTITY repmgr-standby-follow SYSTEM "repmgr-standby-follow.sgml">
+<!ENTITY repmgr-standby-switchover SYSTEM "repmgr-standby-switchover.sgml">
+<!ENTITY repmgr-witness-register SYSTEM "repmgr-witness-register.sgml">
+<!ENTITY repmgr-witness-unregister SYSTEM "repmgr-witness-unregister.sgml">
+<!ENTITY repmgr-node-status SYSTEM "repmgr-node-status.sgml">
+<!ENTITY repmgr-node-check SYSTEM "repmgr-node-check.sgml">
+<!ENTITY repmgr-node-rejoin SYSTEM "repmgr-node-rejoin.sgml">
+<!ENTITY repmgr-cluster-show SYSTEM "repmgr-cluster-show.sgml">
+<!ENTITY repmgr-cluster-matrix SYSTEM "repmgr-cluster-matrix.sgml">
+<!ENTITY repmgr-cluster-crosscheck SYSTEM "repmgr-cluster-crosscheck.sgml">
+<!ENTITY repmgr-cluster-event SYSTEM "repmgr-cluster-event.sgml">
+<!ENTITY repmgr-cluster-cleanup SYSTEM "repmgr-cluster-cleanup.sgml">
+
+<!ENTITY appendix-release-notes  SYSTEM "appendix-release-notes.sgml">
+<!ENTITY appendix-faq      SYSTEM "appendix-faq.sgml">
+<!ENTITY appendix-signatures      SYSTEM "appendix-signatures.sgml">
+<!ENTITY appendix-packages      SYSTEM "appendix-packages.sgml">
+
+<!ENTITY bookindex  SYSTEM "bookindex.sgml">
+
--- a/doc/follow-new-primary.sgml
+++ b/doc/follow-new-primary.sgml
@@ -0,0 +1,48 @@
+<chapter id="follow-new-primary">
+ <indexterm>
+  <primary>Following a new primary</primary>
+  <seealso>repmgr standby follow</seealso>
+ </indexterm>
+
+ <title>Following a new primary</title>
+ <para>
+   Following the failure or removal of the replication cluster's existing primary
+   server, <xref linkend="repmgr-standby-follow"> can be used to make 'orphaned' standbys
+   follow the new primary and catch up to its current state.
+ </para>
+ <para>
+  To demonstrate this, assuming a replication cluster in the same state as the
+  end of the preceding section (<xref linkend="promoting-standby">),
+  execute this:
+  <programlisting>
+    $ repmgr -f /etc/repmgr.conf standby follow
+    INFO: changing node 3's primary to node 2
+    NOTICE: restarting server using "pg_ctl -l /var/log/postgresql/startup.log -w -D '/var/lib/postgresql/data' restart"
+    waiting for server to shut down......... done
+    server stopped
+    waiting for server to start.... done
+    server started
+    NOTICE: STANDBY FOLLOW successful
+    DETAIL: node 3 is now attached to node 2
+  </programlisting>
+ </para>
+ <para>
+   The standby is now replicating from the new primary and
+   <command><link linkend="repmgr-cluster-show">repmgr cluster show</link></command>
+   output reflects this:
+   <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Location | Connection string
+    ----+-------+---------+-----------+----------+----------+--------------------------------------
+     1  | node1 | primary | - failed  |          | default  | host=node1 dbname=repmgr user=repmgr
+     2  | node2 | primary | * running |          | default  | host=node2 dbname=repmgr user=repmgr
+     3  | node3 | standby |   running | node2    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
+ </para>
+ <para>
+  Note that with cascading replication, <command>repmgr standby follow</command> can also be
+  used to detach a standby from its current upstream server and follow the
+  primary. However it's currently not possible to have it follow another standby;
+  we hope to improve this in a future release.
+ </para>
+
+</chapter>
--- a/doc/install-packages.sgml
+++ b/doc/install-packages.sgml
@@ -0,0 +1,173 @@
+<sect1 id="installation-packages" xreflabel="Installing from packages">
+ <title>Installing &repmgr; from packages</title>
+ <para>
+  We recommend installing &repmgr; using the available packages for your
+  system.
+ </para>
+
+ <sect2 id="installation-packages-redhat" xreflabel="Installing from packages on RHEL, Fedora and CentOS">
+
+  <indexterm>
+   <primary>installation</primary>
+   <secondary>on Red Hat/CentOS/Fedora etc.</secondary>
+  </indexterm>
+
+  <title>RedHat/Fedora/CentOS</title>
+  <para>
+   RPM packages for &repmgr; are available via Yum through
+   the PostgreSQL Global Development Group RPM repository
+   (<ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink>).
+   Follow the instructions for your distribution (RedHat, CentOS,
+   Fedora, etc.) and architecture as detailed there.
+  </para>
+  <para>
+   <ulink url="https://2ndquadrant.com">2ndQuadrant</ulink> also provides its
+   own RPM packages which are made available
+   at the same time as each &repmgr; release, as it can take some days for
+   them to become available via the main PGDG repository. See the following section for details.
+  </para>
+  <note>
+    <para>
+      &repmgr; packages are designed to be compatible with the community-provided PostgreSQL packages.
+      They may not work with vendor-specific packages such as those provided by RedHat for RHEL
+      customers, as the filesystem layout may be different to the community RPMs.
+      Please contact your support vendor for assistance.
+    </para>
+  </note>
+
+  <para>
+    For more information on the package contents, including details of installation
+    paths and relevant <link linkend="configuration-service-commands">service commands</link>,
+    see the appendix section <xref linkend="packages-centos">.
+  </para>
+
+
+  <sect3 id="installation-packages-redhat-2ndq">
+    <title>2ndQuadrant repmgr yum repository</title>
+    <para>
+      Beginning with <ulink url="http://repmgr.org/release-notes-3.1.3.html">repmgr 3.1.3</ulink>,
+      <ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides a dedicated <literal>yum</literal>
+      repository for &repmgr; releases. This repository complements the main
+      <ulink url="https://yum.postgresql.org/repopackages.php">PGDG community repository</ulink>,
+      but enables repmgr users to access the latest &repmgr; packages before they are
+      available via the PGDG repository, which can take several days to be updated following
+      a fresh &repmgr; release.
+    </para>
+    <para>
+      <emphasis>Installation</emphasis>
+
+      <itemizedlist>
+        <listitem>
+          <para>
+            Import the repository public key (optional but recommended):
+            <programlisting>
+              rpm --import http://packages.2ndquadrant.com/repmgr/RPM-GPG-KEY-repmgr</programlisting>
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Install the repository RPM for your distribution (this enables the 2ndQuadrant
+            repository as a source of repmgr packages):
+            <itemizedlist>
+              <listitem>
+                <simpara>
+                  <emphasis>Fedora:</emphasis>
+                  <ulink url="http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-fedora-1.0-1.noarch.rpm">http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-fedora-1.0-1.noarch.rpm</ulink>
+                </simpara>
+              </listitem>
+              <listitem>
+                <simpara>
+                  <emphasis>RHEL, CentOS etc:</emphasis>
+                  <ulink url="http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm">http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm</ulink>
+                </simpara>
+              </listitem>
+            </itemizedlist>
+          </para>
+          <para>
+            e.g.:
+            <programlisting>
+              $ yum install http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm</programlisting>
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Install the repmgr package appropriate for your PostgreSQL version (e.g. <literal>repmgr96</literal> for PostgreSQL 9.6):
+            <programlisting>
+              $ yum install repmgr96</programlisting>
+          </para>
+        </listitem>
+      </itemizedlist>
+    </para>
+
+    <para>
+      <emphasis>Compatibility with PGDG Repositories</emphasis>
+    </para>
+    <para>
+        The 2ndQuadrant &repmgr; yum repository uses exactly the same package definitions as the
+        main PGDG repository and is effectively a selective mirror for &repmgr; packages only.
+    </para>
+    <para>
+        Normally yum should prioritize the repository with the most recent &repmgr; version.
+        Once the PGDG repository has been updated, it doesn't matter which repository
+        the packages are installed from.
+    </para>
+    <para>
+      To ensure the 2ndQuadrant repository is always prioritised, install <literal>yum-plugin-priorities</literal>
+      and set the repository priorities accordingly.
+    </para>
+
+    <para>
+      <emphasis>Installing a specific package version</emphasis>
+    </para>
+    <para>
+      To install a specific package version, execute <command>yum --showduplicates list</command>
+      for the package in question:
+      <programlisting>
+        [root@localhost ~]# yum --showduplicates list repmgr96
+        Loaded plugins: fastestmirror
+        Loading mirror speeds from cached hostfile
+         * base: ftp.iij.ad.jp
+         * extras: ftp.iij.ad.jp
+         * updates: ftp.iij.ad.jp
+        Available Packages
+        repmgr96.x86_64               3.2-1.el6                    2ndquadrant-repmgr
+        repmgr96.x86_64               3.2.1-1.el6                  2ndquadrant-repmgr
+        repmgr96.x86_64               3.3-1.el6                    2ndquadrant-repmgr
+        repmgr96.x86_64               3.3.1-1.el6                  2ndquadrant-repmgr
+        repmgr96.x86_64               3.3.2-1.el6                  2ndquadrant-repmgr
+        repmgr96.x86_64               3.3.2-1.rhel6                pgdg96
+        repmgr96.x86_64               4.0.0-1.el6                  2ndquadrant-repmgr
+        repmgr96.x86_64               4.0.0-1.rhel6                pgdg96</programlisting>
+      then append the appropriate version number to the package name with a hyphen, e.g.:
+      <programlisting>
+        [root@localhost ~]# yum install repmgr96-3.3.2-1.el6</programlisting>
+    </para>
+  </sect3>
+ </sect2>
+
+
+
+ <sect2 id="installation-packages-debian" xreflabel="Installing from packages on Debian or Ubuntu">
+
+  <indexterm>
+   <primary>installation</primary>
+   <secondary>on Debian/Ubuntu etc.</secondary>
+  </indexterm>
+
+  <title>Debian/Ubuntu</title>
+  <para>.deb packages for &repmgr; are available from the
+  PostgreSQL Community APT repository (<ulink url="http://apt.postgresql.org/">http://apt.postgresql.org/</ulink>).
+  Instructions can be found in the APT section of the PostgreSQL Wiki
+  (<ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink>).
+  </para>
+  <para>
+    For more information on the package contents, including details of installation
+    paths and relevant <link linkend="configuration-service-commands">service commands</link>,
+    see the appendix section <xref linkend="packages-debian-ubuntu">.
+  </para>
+
+ </sect2>
+
+</sect1>
--- a/doc/install-requirements.sgml
+++ b/doc/install-requirements.sgml
@@ -0,0 +1,72 @@
+<sect1 id="install-requirements" xreflabel="installation requirements">
+
+  <indexterm>
+   <primary>installation</primary>
+   <secondary>requirements</secondary>
+  </indexterm>
+
+  <title>Requirements for installing repmgr</title>
+  <para>
+    repmgr is developed and tested on Linux and OS X, but should work on any
+    UNIX-like system supported by PostgreSQL itself. There is no support for
+    Microsoft Windows.
+  </para>
+
+  <para>
+   From version 4.0, repmgr is compatible with all PostgreSQL versions from 9.3, including PostgreSQL 10.
+   Note that some &repmgr; functionality is not available in PostgreSQL 9.3 and PostgreSQL 9.4.
+  </para>
+
+  <note>
+   <simpara>
+    If upgrading from &repmgr; 3.x, please see the section <xref linkend="upgrading-from-repmgr-3">.
+   </simpara>
+  </note>
+
+  <para>
+   All servers in the replication cluster must be running the same major version of
+   PostgreSQL, and we recommend that they also run the same minor version.
+  </para>
+
+  <para>
+   &repmgr; must be installed on each server in the replication cluster.
+   If installing repmgr from packages, the package version must match the PostgreSQL
+   version. If installing from source, repmgr must be compiled against the same
+   major version.
+  </para>
+
+  <para>
+   A dedicated system user for &repmgr; is <emphasis>not</emphasis> required; as many &repmgr; and
+   <application>repmgrd</application> actions require direct access to the PostgreSQL data directory,
+   these commands should be executed by the <literal>postgres</literal> user.
+  </para>
+
+  <para>
+   Passwordless <command>ssh</command> connectivity between all servers in the replication cluster
+   is not required, but is necessary in the following cases:
+   <itemizedlist>
+     <listitem>
+       <simpara>if you need &repmgr; to copy configuration files from outside the PostgreSQL
+       data directory (in which case <command>rsync</command> is also required)</simpara>
+     </listitem>
+     <listitem>
+       <simpara>to perform <link linkend="performing-switchover">switchover operations</link></simpara>
+     </listitem>
+     <listitem>
+       <simpara>
+        when executing <command><link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link></command>
+        and <command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command>
+       </simpara>
+     </listitem>
+   </itemizedlist>
+  </para>
+
+  <tip>
+   <simpara>
+    We recommend using a session multiplexer utility such as <command>screen</command> or
+    <command>tmux</command> when performing long-running actions (such as cloning a database)
+    on a remote server - this will ensure the &repmgr; action won't be prematurely
+    terminated if your <command>ssh</command> session to the server is interrupted or closed.
+    </simpara>
+  </tip>
+ </sect1>
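Where passwordless <command>ssh</command> is needed (see the cases listed in the requirements above), connectivity can be checked before running &repmgr;. The following is a rough sketch, not part of &repmgr; itself: the function name `check_passwordless_ssh`, the `SSH_CMD` override and the host names are assumptions for illustration. It would normally be run as the `postgres` user, since that is the user executing &repmgr; commands.

```shell
#!/bin/sh
# Hypothetical check: report which hosts accept passwordless ssh.
# SSH_CMD may be overridden (e.g. for a dry run); it defaults to "ssh".
SSH_CMD="${SSH_CMD:-ssh}"

check_passwordless_ssh() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail rather than prompt for a password
        if $SSH_CMD -o BatchMode=yes -o ConnectTimeout=5 "$host" true >/dev/null 2>&1; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    # Non-zero return if any host could not be reached without a password
    return "$failed"
}
```

A typical call would be `check_passwordless_ssh node1 node2 node3`, with the host names replaced by the servers in the replication cluster.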
--- a/doc/install-source.sgml
+++ b/doc/install-source.sgml
@@ -0,0 +1,175 @@
+<sect1 id="installation-source" xreflabel="Installing from source code">
+  <indexterm>
+   <primary>installation</primary>
+   <secondary>from source</secondary>
+  </indexterm>
+
+ <title>Installing &repmgr; from source</title>
+
+ <sect2 id="installation-source-prereqs">
+  <title>Prerequisites for installing from source</title>
+  <para>
+   To install &repmgr; the prerequisites for compiling
+   &postgres; must be installed. These are described in &postgres;'s
+   documentation
+   on <ulink url="https://www.postgresql.org/docs/current/install-requirements.html">build requirements</ulink>
+   and <ulink url="https://www.postgresql.org/docs/current/docguide-toolsets.html">build requirements for documentation</ulink>.
+  </para>
+
+  <para>
+   Most mainstream Linux distributions and other UNIX variants provide simple
+   ways to install the prerequisites from packages.
+   <itemizedlist spacing="compact" mark="bullet">
+    <listitem>
+     <para>
+      <literal>Debian</literal> and <literal>Ubuntu</literal>: First
+      add the <ulink
+      url="http://apt.postgresql.org/">apt.postgresql.org</ulink>
+      repository to your <filename>sources.list</filename> if you
+      have not already done so. Then install the prerequisites for
+      building PostgreSQL with:
+      <programlisting>
+       sudo apt-get update
+       sudo apt-get build-dep postgresql-9.6</programlisting>
+      </para>
+    </listitem>
+    <listitem>
+     <para>
+      <literal>RHEL or CentOS 6.x or 7.x</literal>: install the appropriate repository RPM
+      for your system from <ulink url="https://yum.postgresql.org/repopackages.php">
+      yum.postgresql.org</ulink>. Then install the prerequisites for building
+      PostgreSQL with:
+      <programlisting>
+       sudo yum check-update
+       sudo yum groupinstall "Development Tools"
+       sudo yum install yum-utils openjade docbook-dtds docbook-style-dsssl docbook-style-xsl
+       sudo yum-builddep postgresql96</programlisting>
+     </para>
+    </listitem>
+   </itemizedlist>
+  </para>
+
+  <note>
+    <simpara>
+      Select the appropriate PostgreSQL versions for your target repmgr version.
+    </simpara>
+  </note>
+ </sect2>
+
+
+<sect2 id="installation-get-source">
+  <title>Getting &repmgr; source code</title>
+
+  <para>
+   There are two ways to get the &repmgr; source code: with git, or by downloading tarballs of released versions.
+  </para>
+
+  <sect3>
+   <title>Using <application>git</application> to get the &repmgr; sources</title>
+
+   <para>
+    Use <application><ulink url="https://git-scm.com">git</ulink></application> if you expect
+    to update often, you want to keep track of development or if you want to contribute
+    changes to &repmgr;. There is no reason <emphasis>not</emphasis> to use <application>git</application>
+    if you're familiar with it.
+   </para>
+
+   <para>
+    The source for &repmgr; is maintained at
+    <ulink url="https://github.com/2ndQuadrant/repmgr">https://github.com/2ndQuadrant/repmgr</ulink>.
+   </para>
+
+   <para>
+    There are also tags for each &repmgr; release, e.g. <filename>REL4_0_STABLE</filename>.
+   </para>
+
+   <para>
+    Clone the source code using <application>git</application>:
+    <programlisting>
+     git clone https://github.com/2ndQuadrant/repmgr</programlisting>
+   </para>
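+
+   <para>
+    As a minimal sketch, assuming you want to build from the
+    <filename>REL4_0_STABLE</filename> ref mentioned above rather than the
+    default development branch, check it out after cloning:
+    <programlisting>
+     cd repmgr
+     git checkout REL4_0_STABLE</programlisting>
+   </para>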
+
+   <para>
+    For more information on using <application>git</application> see
+    <ulink url="https://git-scm.com/">git-scm.com</ulink>.
+   </para>
+
+  </sect3>
+
+  <sect3>
+   <title>Downloading release source tarballs</title>
+
+   <para>
+    Official release source code is uploaded as tarballs to the
+    &repmgr; website along with a tarball checksum and a matching GnuPG
+    signature. See
+    <ulink url="http://repmgr.org/">http://repmgr.org/</ulink>
+    for the download information. See <xref linkend="appendix-signatures">
+    for information on verifying digital signatures.
+   </para>
+
+   <para>
+    You will need to download the repmgr source, e.g. <filename>repmgr-4.0.tar.gz</filename>.
+    You may optionally verify the package checksums from the
+    <literal>.md5</literal> files and/or verify the GnuPG signatures
+    per <xref linkend="appendix-signatures">.
+   </para>
+
+   <para>
+    After you unpack the source code archives using <literal>tar xf</literal>
+    the installation process is the same as if you were installing from a git
+    clone.
+   </para>
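+
+   <para>
+    For illustration, unpacking the example archive named above might look
+    like this (the name of the unpacked directory is an assumption; check
+    the output of <literal>tar tf</literal> if unsure):
+    <programlisting>
+     tar xf repmgr-4.0.tar.gz
+     cd repmgr-4.0</programlisting>
+   </para>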
+
+  </sect3>
+
+ </sect2>
+
+ <sect2 id="installation-repmgr-source">
+  <title>Installation of &repmgr; from source</title>
+  <para>
+   To install &repmgr; from source, execute:
+
+   <programlisting>
+    ./configure && make install</programlisting>
+
+   Ensure <command>pg_config</command> for the target PostgreSQL version is in
+   <varname>$PATH</varname>.
+  </para>
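+  <para>
+   As a sketch, on a system where the PostgreSQL 9.6 binaries live in
+   <filename>/usr/lib/postgresql/9.6/bin</filename> (an assumed location; adjust
+   it for your installation), the following makes the intended
+   <command>pg_config</command> visible, confirms its version, and then builds:
+   <programlisting>
+    export PATH=/usr/lib/postgresql/9.6/bin:$PATH
+    pg_config --version
+    ./configure && make install</programlisting>
+  </para>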
+ </sect2>
+
+
+
+ <sect2 id="installation-build-repmgr-docs">
+   <title>Building &repmgr; documentation</title>
+   <para>
+    The &repmgr; documentation is (like the main PostgreSQL project)
+    written in DocBook format. To build it locally as HTML, you'll need to
+    install the required packages as described in the
+    <ulink url="https://www.postgresql.org/docs/9.6/static/docguide-toolsets.html">
+      PostgreSQL documentation</ulink> then execute:
+   <programlisting>
+    ./configure && make install-doc</programlisting>
+   </para>
+   <para>
+     The generated HTML files will be placed in the <filename>doc/html</filename>
+     subdirectory of your source tree.
+   </para>
+
+   <para>
+     To build the documentation as a single HTML file, execute:
+   <programlisting>
+    cd doc/ && make repmgr.html</programlisting>
+   </para>
+
+   <note>
+     <simpara>
+       Due to changes in PostgreSQL's documentation build system from PostgreSQL 10,
+       the documentation can currently only be built against PostgreSQL 9.6 or earlier.
+       This limitation will be fixed when time and resources permit.
+     </simpara>
+   </note>
+ </sect2>
+
+
+</sect1>
--- a/doc/install.sgml
+++ b/doc/install.sgml
@@ -0,0 +1,28 @@
+<chapter id="installation" xreflabel="Installation">
+ <indexterm>
+  <primary>installation</primary>
+ </indexterm>
+
+ <title>Installation</title>
+
+ <para>
+  &repmgr; can be installed from binary packages provided by your operating
+  system's packaging system, or from source.
+ </para>
+ <para>
+  In general we recommend using binary packages, unless none are available for your operating system.
+ </para>
+ <para>
+  Source installs are mainly useful if you want to keep track of the very
+  latest repmgr development and contribute to development.  They're also the
+  only option if there are no packages for your operating system yet.
+ </para>
+ <para>
+  Before installing &repmgr; make sure you satisfy the <xref linkend="install-requirements">.
+ </para>
+
+ &install-requirements;
+ &install-packages;
+ &install-source;
+
+</chapter>
--- a/doc/legal.sgml
+++ b/doc/legal.sgml
@@ -0,0 +1,37 @@
+<!-- doc/legal.sgml -->
+
+<date>2017</date>
+
+<copyright>
+ <year>2010-2018</year>
+ <holder>2ndQuadrant, Ltd.</holder>
+</copyright>
+
+<legalnotice id="legalnotice">
+ <title>Legal Notice</title>
+
+ <para>
+  <productname>repmgr</productname> is Copyright &copy; 2010-2018
+  by 2ndQuadrant, Ltd. All rights reserved.
+ </para>
+
+ <para>
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+ </para>
+ <para>
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+ </para>
+ <para>
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see
+   <ulink url="https://www.gnu.org/licenses/">https://www.gnu.org/licenses/</ulink>
+   to obtain one.
+ </para>
+
+</legalnotice>
--- a/doc/overview.sgml
+++ b/doc/overview.sgml
@@ -0,0 +1,241 @@
+<chapter id="overview" xreflabel="Overview">
+ <title>repmgr overview</title>
+
+ <para>
+  This chapter provides a high-level overview of &repmgr;'s components and
+  functionality.
+ </para>
+ <sect1 id="repmgr-concepts" xreflabel="Concepts">
+
+  <indexterm>
+    <primary>concepts</primary>
+  </indexterm>
+
+  <title>Concepts</title>
+
+  <para>
+   This guide assumes that you are familiar with PostgreSQL administration and
+   streaming replication concepts. For further details on streaming
+   replication, see the PostgreSQL documentation section on <ulink
+   url="https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION">
+   streaming replication</>.
+  </para>
+  <para>
+   The following terms are used throughout the &repmgr; documentation.
+   <variablelist>
+    <varlistentry>
+     <term>replication cluster</term>
+     <listitem>
+      <simpara>
+       In the &repmgr; documentation, "replication cluster" refers to the network
+       of PostgreSQL servers connected by streaming replication.
+      </simpara>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term>node</term>
+     <listitem>
+      <simpara>
+       A node is a single PostgreSQL server within a replication cluster.
+      </simpara>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term>upstream node</term>
+     <listitem>
+      <simpara>
+       The node a standby server connects to, in order to receive streaming replication.
+       This is either the primary server, or in the case of cascading replication, another
+       standby.
+      </simpara>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term>failover</term>
+     <listitem>
+      <simpara>
+       This is the action which occurs if a primary server fails and a suitable standby
+       is promoted to become the new primary. The <application>repmgrd</application> daemon supports automatic failover
+       to minimise downtime.
+      </simpara>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term>switchover</term>
+     <listitem>
+      <simpara>
+       In certain circumstances, such as hardware or operating system maintenance,
+       it's necessary to take a primary server offline; in this case a controlled
+       switchover is necessary, whereby a suitable standby is promoted and the
+       existing primary removed from the replication cluster in a controlled manner.
+       The &repmgr; command line client provides this functionality.
+      </simpara>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term>fencing</term>
+     <listitem>
+      <simpara>
+       In a failover situation, following the promotion of a standby to primary, it's
+       essential that the previous primary does not unexpectedly come back on
+       line, which would result in a split-brain situation. To prevent this,
+       the failed primary should be isolated from applications, i.e. "fenced off".
+      </simpara>
+     </listitem>
+    </varlistentry>
+   <varlistentry id="witness-server">
+     <term>witness server</term>
+     <listitem>
+      <para>
+        &repmgr; provides functionality to set up a so-called "witness server" to
+        assist in determining a new primary server in a failover situation with more
+        than one standby. The witness server itself is not part of the replication
+        cluster, although it does contain a copy of the repmgr metadata schema.
+      </para>
+      <para>
+        The purpose of a witness server is to provide a "casting vote" where servers
+        in the replication cluster are split over more than one location. In the event
+        of a loss of connectivity between locations, the presence or absence of
+        the witness server will decide whether a server at that location is promoted
+        to primary; this is to prevent a "split-brain" situation where an isolated
+        location interprets a network outage as a failure of the (remote) primary and
+        promotes a (local) standby.
+      </para>
+      <para>
+        A witness server only needs to be created if <application>repmgrd</application>
+        is in use.
+      </para>
+     </listitem>
+    </varlistentry>
+   </variablelist>
+  </para>
+ </sect1>
+ <sect1 id="repmgr-components" xreflabel="Components">
+  <title>Components</title>
+  <para>
+  &repmgr; is a suite of open-source tools to manage replication and failover
+  within a cluster of PostgreSQL servers. It supports and enhances PostgreSQL's
+  built-in streaming replication, which provides a single read/write primary server
+  and one or more read-only standbys containing near-real time copies of the primary
+  server's database. It provides two main tools:
+   <variablelist>
+    <varlistentry>
+     <term>repmgr</term>
+     <listitem>
+      <para>
+       A command-line tool used to perform administrative tasks such as:
+       <itemizedlist>
+        <listitem>
+          <simpara>setting up standby servers</simpara>
+        </listitem>
+        <listitem>
+          <simpara>promoting a standby server to primary</simpara>
+        </listitem>
+        <listitem>
+          <simpara>switching over primary and standby servers</simpara>
+        </listitem>
+        <listitem>
+          <simpara>displaying the status of servers in the replication cluster</simpara>
+        </listitem>
+       </itemizedlist>
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term>repmgrd</term>
+     <listitem>
+      <para>
+       A daemon which actively monitors servers in a replication cluster
+       and performs the following tasks:
+       <itemizedlist>
+        <listitem>
+          <simpara>monitoring and recording replication performance</simpara>
+        </listitem>
+        <listitem>
+          <simpara>performing failover by detecting failure of the primary and
+            promoting the most suitable standby server
+          </simpara>
+        </listitem>
+        <listitem>
+          <simpara>providing notifications about events in the cluster to a user-defined
+      script which can perform tasks such as sending alerts by email</simpara>
+        </listitem>
+       </itemizedlist>
+      </para>
+     </listitem>
+    </varlistentry>
+   </variablelist>
+  </para>
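+  <para>
+   As an illustration of the notification mechanism, <application>repmgrd</application>
+   can be pointed at a script via <filename>repmgr.conf</filename>. The script path
+   below is hypothetical, and the parameter name and placeholders should be checked
+   against <filename>repmgr.conf.sample</filename> for your version:
+   <programlisting>
+    # hypothetical script; %n = node ID, %e = event type, %s = success flag,
+    # %t = timestamp, %d = details
+    event_notification_command='/usr/local/bin/repmgr-notify.sh %n %e %s "%t" "%d"'</programlisting>
+  </para>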
+ </sect1>
+
+ <sect1 id="repmgr-user-metadata" xreflabel="Repmgr user and metadata">
+  <title>Repmgr user and metadata</title>
+  <para>
+   In order to effectively manage a replication cluster, &repmgr; needs to store
+   information about the servers in the cluster in a dedicated database schema.
+   This schema is automatically created by the &repmgr; extension, which is installed
+   during the first step in initializing a &repmgr;-administered cluster
+   (<command><link linkend="repmgr-primary-register">repmgr primary register</link></command>)
+   and contains the following objects:
+   <variablelist>
+    <varlistentry>
+     <term>Tables</term>
+     <listitem>
+      <para>
+       <itemizedlist>
+        <listitem>
+          <simpara><literal>repmgr.events</literal>: records events of interest</simpara>
+        </listitem>
+        <listitem>
+          <simpara><literal>repmgr.nodes</literal>: connection and status information for each server in the
+    replication cluster</simpara>
+        </listitem>
+        <listitem>
+          <simpara><literal>repmgr.monitoring_history</literal>: historical standby monitoring information
+            written by <application>repmgrd</application></simpara>
+        </listitem>
+       </itemizedlist>
+      </para>
+     </listitem>
+    </varlistentry>
+    <varlistentry>
+     <term>Views</term>
+     <listitem>
+      <para>
+       <itemizedlist>
+        <listitem>
+          <simpara><literal>repmgr.show_nodes</literal>: based on the table <literal>repmgr.nodes</literal>, additionally showing the
+           name of the server's upstream node</simpara>
+        </listitem>
+        <listitem>
+          <simpara><literal>repmgr.replication_status</literal>: when <application>repmgrd</application>'s monitoring is enabled, shows
+            current monitoring status for each standby.</simpara>
+        </listitem>
+       </itemizedlist>
+      </para>
+     </listitem>
+    </varlistentry>
+   </variablelist>
+  </para>
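+  <para>
+   Once the extension is installed, these objects can be inspected with ordinary
+   SQL. A brief sketch, assuming a <application>psql</application> session
+   connected to the database holding the &repmgr; metadata:
+   <programlisting>
+    SELECT * FROM repmgr.show_nodes;
+    SELECT * FROM repmgr.events;</programlisting>
+  </para>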
+
+  <para>
+   The &repmgr; metadata schema can be stored in an existing database or in its own
+   dedicated database. Note that the &repmgr; metadata schema cannot reside on a database
+   server which is not part of the replication cluster managed by &repmgr;.
+  </para>
+  <para>
+   A database user must be available for &repmgr; to access this database and perform
+   necessary changes. This user does not need to be a superuser; however, some operations
+   such as initial installation of the &repmgr; extension will require a superuser
+   connection (this can be specified where required with the command line option
+   <literal>--superuser</literal>).
+  </para>
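+  <para>
+   For example, assuming a superuser named <literal>postgres</literal> exists and
+   can connect (both are assumptions about your setup), registration could be run as:
+   <programlisting>
+    $ repmgr -f /etc/repmgr.conf primary register --superuser=postgres</programlisting>
+  </para>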
+ </sect1>
+
+</chapter>
--- a/doc/promoting-standby.sgml
+++ b/doc/promoting-standby.sgml
@@ -0,0 +1,79 @@
+<chapter id="promoting-standby" xreflabel="Promoting a standby">
+ <indexterm>
+   <primary>promoting a standby</primary>
+   <seealso>repmgr standby promote</seealso>
+ </indexterm>
+ <title>Promoting a standby server with repmgr</title>
+ <para>
+   If a primary server fails or needs to be removed from the replication cluster,
+   a new primary server must be designated, to ensure the cluster continues
+   to function correctly. This can be done with <xref linkend="repmgr-standby-promote">,
+   which promotes the standby on the current server to primary.
+ </para>
+
+ <para>
+  To demonstrate this, set up a replication cluster with a primary and two attached
+  standby servers so that the cluster looks like this:
+  <programlisting>
+     $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Location | Connection string
+    ----+-------+---------+-----------+----------+----------+--------------------------------------
+     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
+     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr
+     3  | node3 | standby |   running | node1    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
+ </para>
+ <para>
+  Stop the current primary with e.g.:
+  <programlisting>
+   $ pg_ctl -D /var/lib/postgresql/data -m fast stop</programlisting>
+ </para>
+ <para>
+  At this point the replication cluster will be in a partially disabled state, with
+  both standbys accepting read-only connections while attempting to connect to the
+  stopped primary. Note that the &repmgr; metadata table will not yet have been updated;
+  executing <xref linkend="repmgr-cluster-show"> will note the discrepancy:
+  <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status        | Upstream | Location | Connection string
+    ----+-------+---------+---------------+----------+----------+--------------------------------------
+     1  | node1 | primary | ? unreachable |          | default  | host=node1 dbname=repmgr user=repmgr
+     2  | node2 | standby |   running     | node1    | default  | host=node2 dbname=repmgr user=repmgr
+     3  | node3 | standby |   running     | node1    | default  | host=node3 dbname=repmgr user=repmgr
+
+    WARNING: following issues were detected
+    node "node1" (ID: 1) is registered as an active primary but is unreachable</programlisting>
+ </para>
+ <para>
+  Now promote the first standby with:
+  <programlisting>
+   $ repmgr -f /etc/repmgr.conf standby promote</programlisting>
+ </para>
+ <para>
+  This will produce output similar to the following:
+  <programlisting>
+    INFO: connecting to standby database
+    NOTICE: promoting standby
+    DETAIL: promoting server using "pg_ctl -l /var/log/postgresql/startup.log -w -D '/var/lib/postgresql/data' promote"
+    server promoting
+    INFO: reconnecting to promoted server
+    NOTICE: STANDBY PROMOTE successful
+    DETAIL: node 2 was successfully promoted to primary</programlisting>
+ </para>
+ <para>
+  Executing <xref linkend="repmgr-cluster-show"> will show the current state; as there is now an
+  active primary, the previous warning will not be displayed:
+  <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Location | Connection string
+    ----+-------+---------+-----------+----------+----------+--------------------------------------
+     1  | node1 | primary | - failed  |          | default  | host=node1 dbname=repmgr user=repmgr
+     2  | node2 | primary | * running |          | default  | host=node2 dbname=repmgr user=repmgr
+     3  | node3 | standby |   running | node1    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
+ </para>
+ <para>
+  However the sole remaining standby (<literal>node3</literal>) is still trying to replicate from the failed
+  primary; <xref linkend="repmgr-standby-follow"> must now be executed to rectify this situation
+  (see <xref linkend="follow-new-primary"> for example).
+ </para>
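+ <para>
+  A minimal sketch of that final step, run on <literal>node3</literal> and
+  assuming the same configuration file location used in the examples above:
+  <programlisting>
+   $ repmgr -f /etc/repmgr.conf standby follow</programlisting>
+ </para>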
+</chapter>
+
--- a/doc/quickstart.sgml
+++ b/doc/quickstart.sgml
@@ -0,0 +1,455 @@
+<chapter id="quickstart" xreflabel="Quick-start guide">
+ <title>Quick-start guide</title>
+
+ <para>
+  This section gives a quick introduction to &repmgr;, including setting up a
+  sample &repmgr; installation and a basic replication cluster.
+ </para>
+ <para>
+  These instructions are for demonstration purposes only and are not suitable for a production
+  install, as issues such as account security and system administration
+  best practices are omitted.
+ </para>
+ <note>
+   <simpara>
+     To upgrade an existing &repmgr; 3.x installation, see section
+     <xref linkend="upgrading-from-repmgr-3">.
+   </simpara>
+ </note>
+
+ <sect1 id="quickstart-prerequisites">
+   <title>Prerequisites for setting up a basic replication cluster with &repmgr;</title>
+    <para>
+     The following section will describe how to set up a basic replication cluster
+     with a primary and a standby server using the <application>repmgr</application>
+     command line tool.
+    </para>
+    <para>
+      We'll assume the primary is called <literal>node1</literal> with IP address
+      <literal>192.168.1.11</literal>, and the standby is called <literal>node2</literal>
+      with IP address <literal>192.168.1.12</literal>.
+    </para>
+    <para>
+     The following software must be installed on both servers:
+     <itemizedlist spacing="compact" mark="bullet">
+      <listitem>
+       <simpara><application>PostgreSQL</application></simpara>
+      </listitem>
+      <listitem>
+       <simpara>
+        <application>repmgr</application> (matching the installed
+        <application>PostgreSQL</application> major version)
+       </simpara>
+      </listitem>
+     </itemizedlist>
+    </para>
+
+    <para>
+      At network level, connections to the PostgreSQL port (default: <literal>5432</literal>)
+      must be possible between the servers in both directions.
+    </para>
+    <para>
+      If you want <application>repmgr</application> to copy configuration files which are
+      located outside the PostgreSQL data directory, and/or to test <command>switchover</command>
+      functionality, you will also need passwordless SSH connections between both servers, and
+      <application>rsync</application> should be installed.
+    </para>
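+    <para>
+      A quick check, assuming the <literal>postgres</literal> system user is the
+      account used on both servers (an assumption; substitute your own):
+      <programlisting>
+       # run on node1; should succeed without prompting for a password
+       ssh -o BatchMode=yes postgres@node2 true
+       # confirm rsync is available on the remote side
+       ssh -o BatchMode=yes postgres@node2 rsync --version</programlisting>
+      Repeat the same checks from <literal>node2</literal> towards <literal>node1</literal>.
+    </para>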
+    <tip>
+     <simpara>
+      For testing <application>repmgr</application>, it's possible to use multiple PostgreSQL
+      instances running on different ports on the same computer, with
+      passwordless SSH access to <filename>localhost</filename> enabled.
+     </simpara>
+    </tip>
+ </sect1>
+
+ <sect1 id="quickstart-postgresql-configuration">
+   <title>PostgreSQL configuration</title>
+   <para>
+    On the primary server, a PostgreSQL instance must be initialised and running.
+    The following replication settings may need to be adjusted:
+   </para>
+   <programlisting>
+
+    # Enable replication connections; set this figure to at least one more
+    # than the number of standbys which will connect to this server
+    # (note that repmgr will execute `pg_basebackup` in WAL streaming mode,
+    # which requires two free WAL senders)
+
+    max_wal_senders = 10
+
+    # Ensure WAL files contain enough information to enable read-only queries
+    # on the standby.
+    #
+    #  PostgreSQL 9.5 and earlier: one of 'hot_standby' or 'logical'
+    #  PostgreSQL 9.6 and later: one of 'replica' or 'logical'
+    #    ('hot_standby' will still be accepted as an alias for 'replica')
+    #
+    # See: https://www.postgresql.org/docs/current/static/runtime-config-wal.html#GUC-WAL-LEVEL
+
+    wal_level = 'hot_standby'
+
+    # Enable read-only queries on a standby
+    # (Note: this will be ignored on a primary but we recommend including
+    # it anyway)
+
+    hot_standby = on
+
+    # Enable WAL file archiving
+    archive_mode = on
+
+    # Set archive command to a script or application that will safely store
+    # your WALs in a secure place. /bin/true is an example of a command that
+    # ignores archiving. Use something more sensible.
+    archive_command = '/bin/true'
+
+    # If you have configured "pg_basebackup_options"
+    # in "repmgr.conf" to include the setting "--xlog-method=fetch" (from
+    # PostgreSQL 10 "--wal-method=fetch"), *and* you have not set
+    # "restore_command" in "repmgr.conf" to fetch WAL files from another
+    # source such as Barman, you'll need to set "wal_keep_segments" to a
+    # high enough value to ensure that all WAL files generated while
+    # the standby is being cloned are retained until the standby starts up.
+    #
+    # wal_keep_segments = 5000
+   </programlisting>
+   <tip>
+    <simpara>
+      Rather than editing these settings in the default <filename>postgresql.conf</filename>
+     file, create a separate file such as <filename>postgresql.replication.conf</filename> and
+      include it from the end of the main configuration file with:
+     <command>include 'postgresql.replication.conf'</command>.
+    </simpara>
+   </tip>
+   <para>
+     Additionally, if you are intending to use <application>pg_rewind</application>,
+     and the cluster was not initialised using data checksums, you may want to consider enabling
+     <varname>wal_log_hints</varname>; for more details see <xref linkend="repmgr-node-rejoin-pg-rewind">.
+   </para>
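+   <para>
+     A sketch of the relevant check and setting: <literal>SHOW data_checksums</literal>
+     reports whether the cluster was initialised with checksums, and if it was not,
+     <varname>wal_log_hints</varname> can be enabled in the configuration file
+     (a server restart is needed for the change to take effect):
+   </para>
+   <programlisting>
+    # check first, in psql:  SHOW data_checksums;
+    wal_log_hints = on</programlisting>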
+ </sect1>
+
+ <sect1 id="quickstart-repmgr-user-database">
+  <title>Create the repmgr user and database</title>
+  <para>
+   Create a dedicated PostgreSQL superuser account and a database for
+   the &repmgr; metadata, e.g.
+  </para>
+  <programlisting>
+   createuser -s repmgr
+   createdb repmgr -O repmgr
+  </programlisting>
+
+  <para>
+   For the examples in this document, the name <literal>repmgr</literal> will be
+   used for both user and database, but any names can be used.
+  </para>
+  <note>
+   <para>
+    For the sake of simplicity, the <literal>repmgr</literal> user is created
+    as a superuser. If desired, it's possible to create the <literal>repmgr</literal>
+    user as a normal user. However for certain operations superuser permissions
+    are required; in this case the command line option <command>--superuser</command>
+    can be provided to specify a superuser.
+   </para>
+   <para>
+    It's also assumed that the <literal>repmgr</literal> user will be used to make the
+    replication connection from the standby to the primary; again this can be
+    overridden by specifying a separate replication user when registering each node.
+   </para>
+  </note>
+
+  <tip>
+    <para>
+     &repmgr; will install the <literal>repmgr</literal> extension, which creates a
+     <literal>repmgr</literal> schema containing &repmgr;'s metadata tables as
+     well as other functions and views. We also recommend that you set the
+     <literal>repmgr</literal> user's search path to include this schema name, e.g.
+     <programlisting>
+       ALTER USER repmgr SET search_path TO repmgr, "$user", public;</programlisting>
+    </para>
+  </tip>
+
+ </sect1>
+
+ <sect1 id="quickstart-authentication">
+  <title>Configuring authentication in pg_hba.conf</title>
+  <para>
+   Ensure the <literal>repmgr</literal> user has appropriate permissions in <filename>pg_hba.conf</filename> and
+   can connect in replication mode; <filename>pg_hba.conf</filename> should contain entries
+   similar to the following:
+  </para>
+  <programlisting>
+    local   replication   repmgr                              trust
+    host    replication   repmgr      127.0.0.1/32            trust
+    host    replication   repmgr      192.168.1.0/24          trust
+
+    local   repmgr        repmgr                              trust
+    host    repmgr        repmgr      127.0.0.1/32            trust
+    host    repmgr        repmgr      192.168.1.0/24          trust
+  </programlisting>
+  <para>
+   Note that these are simple settings for testing purposes.
+   Adjust according to your network environment and authentication requirements.
+  </para>
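+  <para>
+   Changes to <filename>pg_hba.conf</filename> only take effect once the server
+   configuration is reloaded. One way to do this, assuming a superuser
+   <application>psql</application> session on the primary:
+  </para>
+  <programlisting>
+    SELECT pg_reload_conf();</programlisting>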
+ </sect1>
+
+ <sect1 id="quickstart-standby-preparation">
+  <title>Preparing the standby</title>
+  <para>
+   On the standby, do not create a PostgreSQL instance, but do ensure the destination
+   data directory (and any other directories which you want PostgreSQL to use)
+   exist and are owned by the <literal>postgres</literal> system user. Permissions
+   must be set to <literal>0700</literal> (<literal>drwx------</literal>).
+  </para>
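+  <para>
+   For example, using the data directory path from the examples in this guide
+   (run as <literal>root</literal>; the path and the <literal>postgres</literal>
+   group name are assumptions to adapt to your system):
+  </para>
+  <programlisting>
+    mkdir -p /var/lib/postgresql/data
+    chown postgres:postgres /var/lib/postgresql/data
+    chmod 0700 /var/lib/postgresql/data</programlisting>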
+  <para>
+   Check the primary database is reachable from the standby using <application>psql</application>:
+  </para>
+  <programlisting>
+    psql 'host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlisting>
+
+  <note>
+   <para>
+    &repmgr; stores connection information as <ulink
+    url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING">libpq
+    connection strings</ulink> throughout. This documentation refers to them as <literal>conninfo</literal>
+    strings; an alternative name is <literal>DSN</literal> (<literal>data source name</literal>).
+    We'll use these in place of the <command>-h hostname -d databasename -U username</command> syntax.
+   </para>
+  </note>
+ </sect1>
+
+ <sect1 id="quickstart-repmgr-conf">
+  <title>repmgr configuration file</title>
+  <para>
+   Create a <filename>repmgr.conf</filename> file on the primary server. The file must
+   contain at least the following parameters:
+  </para>
+  <programlisting>
+    node_id=1
+    node_name=node1
+    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
+    data_directory='/var/lib/postgresql/data'
+  </programlisting>
+
+  <para>
+   <filename>repmgr.conf</filename> should not be stored inside the PostgreSQL data directory,
+   as it could be overwritten when setting up or reinitialising the PostgreSQL
+   server. See sections on <xref linkend="configuration-file"> and <xref linkend="configuration-file-settings">
+   for further details about <filename>repmgr.conf</filename>.
+  </para>
+  <tip>
+   <simpara>
+    For Debian-based distributions we recommend explicitly setting
+    <literal>pg_bindir</literal> to the directory where <command>pg_ctl</command> and other binaries
+    not in the standard path are located. For PostgreSQL 9.6 this would be <filename>/usr/lib/postgresql/9.6/bin/</filename>.
+   </simpara>
+  </tip>
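+  <para>
+   With that setting added, the primary's <filename>repmgr.conf</filename> from the
+   example above would look like the following sketch (the <literal>pg_bindir</literal>
+   value assumes a Debian-style PostgreSQL 9.6 installation):
+  </para>
+  <programlisting>
+    node_id=1
+    node_name=node1
+    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
+    data_directory='/var/lib/postgresql/data'
+    pg_bindir='/usr/lib/postgresql/9.6/bin/'</programlisting>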
+
+  <para>
+   See the file
+   <ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</>
+    for details of all available configuration parameters.
+  </para>
+
+ </sect1>
+
+
+ <sect1 id="quickstart-primary-register">
+  <title>Register the primary server</title>
+  <para>
+   To enable &repmgr; to support a replication cluster, the primary node must
+   be registered with &repmgr;. This installs the <literal>repmgr</literal>
+   extension and metadata objects, and adds a metadata record for the primary server:
+  </para>
+
+  <programlisting>
+    $ repmgr -f /etc/repmgr.conf primary register
+    INFO: connecting to primary database...
+    NOTICE: attempting to install extension "repmgr"
+    NOTICE: "repmgr" extension successfully installed
+    NOTICE: primary node record (id: 1) registered</programlisting>
+
+  <para>
+    Verify status of the cluster like this:
+  </para>
+  <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Connection string
+    ----+-------+---------+-----------+----------+--------------------------------------------------------
+     1  | node1 | primary | * running |          | host=node1 dbname=repmgr user=repmgr connect_timeout=2
+  </programlisting>
+  <para>
+    The record in the <literal>repmgr</literal> metadata table will look like this:
+  </para>
+  <programlisting>
+    repmgr=# SELECT * FROM repmgr.nodes;
+    -[ RECORD 1 ]----+-------------------------------------------------------
+    node_id          | 1
+    upstream_node_id |
+    active           | t
+    node_name        | node1
+    type             | primary
+    location         | default
+    priority         | 100
+    conninfo         | host=node1 dbname=repmgr user=repmgr connect_timeout=2
+    repluser         | repmgr
+    slot_name        |
+    config_file      | /etc/repmgr.conf</programlisting>
+  <para>
+    Each server in the replication cluster will have its own record. If <application>repmgrd</application>
+    is in use, the fields <literal>upstream_node_id</literal>, <literal>active</literal> and
+    <literal>type</literal> will be updated when the node's status or role changes.
+  </para>
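+  <para>
+    To follow these fields without the expanded record layout, a narrower
+    query over the columns shown above can be used, for example:
+  </para>
+  <programlisting>
+    SELECT node_id, node_name, type, upstream_node_id, active
+      FROM repmgr.nodes
+     ORDER BY node_id;</programlisting>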
+ </sect1>
+
+ <sect1 id="quickstart-standby-clone">
+  <title>Clone the standby server</title>
+  <para>
+   Create a <filename>repmgr.conf</filename> file on the standby server. It must contain at
+   least the same parameters as the primary's <filename>repmgr.conf</filename>, but with
+   the mandatory values <literal>node_id</literal>, <literal>node_name</literal>, <literal>conninfo</literal>
+   (and possibly <literal>data_directory</literal>) adjusted accordingly, e.g.:
+  </para>
+  <programlisting>
+    node_id=2
+    node_name=node2
+    conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
+    data_directory='/var/lib/postgresql/data'</programlisting>
+  <para>
+   Use the <command>--dry-run</command> option to check the standby can be cloned:
+  </para>
+  <programlisting>
+    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run
+    NOTICE: using provided configuration file "/etc/repmgr.conf"
+    NOTICE: destination directory "/var/lib/postgresql/data" provided
+    INFO: connecting to source node
+    NOTICE: checking for available walsenders on source node (2 required)
+    INFO: sufficient walsenders available on source node (2 required)
+    NOTICE: standby will attach to upstream node 1
+    HINT: consider using the -c/--fast-checkpoint option
+    INFO: all prerequisites for "standby clone" are met</programlisting>
+  <para>
+    If no problems are reported, the standby can then be cloned with:
+  </para>
+  <programlisting>
+    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone
+
+    NOTICE: using configuration file "/etc/repmgr.conf"
+    NOTICE: destination directory "/var/lib/postgresql/data" provided
+    INFO: connecting to source node
+    NOTICE: checking for available walsenders on source node (2 required)
+    INFO: sufficient walsenders available on source node (2 required)
+    INFO: creating directory "/var/lib/postgresql/data"...
+    NOTICE: starting backup (using pg_basebackup)...
+    HINT: this may take some time; consider using the -c/--fast-checkpoint option
+    INFO: executing:
+      pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h node1 -U repmgr -X stream
+    NOTICE: standby clone (using pg_basebackup) complete
+    NOTICE: you can now start your PostgreSQL server
+    HINT: for example: pg_ctl -D /var/lib/postgresql/data start
+  </programlisting>
+  <para>
+   This has cloned the PostgreSQL data directory files from the primary <literal>node1</literal>
+   using PostgreSQL's <command>pg_basebackup</command> utility. A <filename>recovery.conf</filename>
+   file containing the correct parameters to start streaming from this primary server will be created
+   automatically.
+  </para>
+  <note>
+   <simpara>
+    By default, any configuration files in the primary's data directory will be
+    copied to the standby. Typically these will be <filename>postgresql.conf</filename>,
+    <filename>postgresql.auto.conf</filename>, <filename>pg_hba.conf</filename> and
+    <filename>pg_ident.conf</filename>. These may require modification before the standby
+    is started.
+   </simpara>
+  </note>
+  <para>
+   Make any adjustments to the standby's PostgreSQL configuration files now,
+   then start the server.
+  </para>
+  <para>
+   For more details on <command>repmgr standby clone</command>, see the
+   <link linkend="repmgr-standby-clone">command reference</link>.
+   A more detailed overview of cloning options is available in the
+   <link linkend="cloning-standbys">administration manual</link>.
+  </para>
+ </sect1>
+
+ <sect1 id="quickstart-verify-replication">
+  <title>Verify replication is functioning</title>
+  <para>
+   Connect to the primary server and execute:
+   <programlisting>
+    repmgr=# SELECT * FROM pg_stat_replication;
+    -[ RECORD 1 ]----+------------------------------
+    pid              | 19111
+    usesysid         | 16384
+    usename          | repmgr
+    application_name | node2
+    client_addr      | 192.168.1.12
+    client_hostname  |
+    client_port      | 50378
+    backend_start    | 2017-08-28 15:14:19.851581+09
+    backend_xmin     |
+    state            | streaming
+    sent_location    | 0/7000318
+    write_location   | 0/7000318
+    flush_location   | 0/7000318
+    replay_location  | 0/7000318
+    sync_priority    | 0
+    sync_state       | async</programlisting>
+   This shows that the previously cloned standby (<literal>node2</literal> shown in the field
+   <literal>application_name</literal>) has connected to the primary from IP address
+   <literal>192.168.1.12</literal>.
+  </para>
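+  <para>
+   The location columns shown above can be compared to estimate how far the standby is
+   behind the primary. The following is a minimal sketch, not part of &repmgr;: it assumes
+   the usual PostgreSQL LSN text format of two hexadecimal numbers separated by a slash
+   (high 32 bits, then low 32 bits), and the helper names <literal>lsn_to_int</literal> and
+   <literal>lag_bytes</literal> are hypothetical.
+  </para>

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/7000318' into an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash holds the high 32 bits, the part after it the low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(sent_location: str, replay_location: str) -> int:
    """Bytes of WAL sent by the primary but not yet replayed on the standby."""
    return lsn_to_int(sent_location) - lsn_to_int(replay_location)


if __name__ == "__main__":
    # Values taken from the pg_stat_replication record shown above.
    print(lag_bytes("0/7000318", "0/7000318"))
```

+  <para>
+   With the values from the record above, sent and replayed locations are identical,
+   so the computed lag is zero bytes.
+  </para>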
+  <para>
+    From PostgreSQL 9.6 you can also use the view
+    <ulink url="https://www.postgresql.org/docs/current/static/monitoring-stats.html#PG-STAT-WAL-RECEIVER-VIEW">
+    <literal>pg_stat_wal_receiver</literal></ulink> to check the replication status from the standby.
+
+   <programlisting>
+    repmgr=# SELECT * FROM pg_stat_wal_receiver;
+    Expanded display is on.
+    -[ RECORD 1 ]---------+--------------------------------------------------------------------------------
+    pid                   | 18236
+    status                | streaming
+    receive_start_lsn     | 0/3000000
+    receive_start_tli     | 1
+    received_lsn          | 0/7000538
+    received_tli          | 1
+    last_msg_send_time    | 2017-08-28 15:21:26.465728+09
+    last_msg_receipt_time | 2017-08-28 15:21:26.465774+09
+    latest_end_lsn        | 0/7000538
+    latest_end_time       | 2017-08-28 15:20:56.418735+09
+    slot_name             |
+    conninfo              | user=repmgr dbname=replication host=node1 application_name=node2
+   </programlisting>
+   Note that the <varname>conninfo</varname> value is the one generated in <filename>recovery.conf</filename>
+   and will differ slightly from the primary's <varname>conninfo</varname> as set in <filename>repmgr.conf</filename>;
+   among other things, it will contain the connecting node's name as <varname>application_name</varname>.
+  </para>
+ </sect1>
+
+ <sect1 id="quickstart-register-standby">
+  <title>Register the standby</title>
+  <para>
+    Register the standby server with:
+    <programlisting>
+    $ repmgr -f /etc/repmgr.conf standby register
+    NOTICE: standby node "node2" (ID: 2) successfully registered</programlisting>
+  </para>
+  <para>
+    Check the node is registered by executing <command>repmgr cluster show</command> on the standby:
+    <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Location | Connection string
+    ----+-------+---------+-----------+----------+----------+--------------------------------------
+     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
+     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr</programlisting>
+  </para>
+  <para>
+   Both nodes are now registered with &repmgr; and the records have been copied to the standby server.
+  </para>
+ </sect1>
+
+</chapter>
--- a/doc/repmgr-cluster-cleanup.sgml
+++ b/doc/repmgr-cluster-cleanup.sgml
@@ -0,0 +1,41 @@
+<refentry id="repmgr-cluster-cleanup">
+  <indexterm>
+    <primary>repmgr cluster cleanup</primary>
+  </indexterm>
+ <refmeta>
+    <refentrytitle>repmgr cluster cleanup</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr cluster cleanup</refname>
+    <refpurpose>purge monitoring history</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      Purges monitoring history from the <literal>repmgr.monitoring_history</literal> table to
+      prevent excessive table growth. Use the <literal>-k/--keep-history</literal> option to specify the
+      number of days of monitoring history to retain. This command can be used
+      manually or as a cron job.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Usage</title>
+    <para>
+      This command requires a valid <filename>repmgr.conf</filename> file for the node on which it is
+      executed; no additional arguments are required.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Notes</title>
+
+    <para>
+      Monitoring history will only be written if <application>repmgrd</application> is active, and
+      <varname>monitoring_history</varname> is set to <literal>true</literal> in
+      <filename>repmgr.conf</filename>.
+    </para>
+  </refsect1>
+</refentry>
--- a/doc/repmgr-cluster-crosscheck.sgml
+++ b/doc/repmgr-cluster-crosscheck.sgml
@@ -0,0 +1,42 @@
+<refentry id="repmgr-cluster-crosscheck">
+  <indexterm>
+    <primary>repmgr cluster crosscheck</primary>
+  </indexterm>
+
+
+  <refmeta>
+    <refentrytitle>repmgr cluster crosscheck</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr cluster crosscheck</refname>
+    <refpurpose>cross-checks connections between each combination of nodes</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      <command>repmgr cluster crosscheck</command> is similar to <xref linkend="repmgr-cluster-matrix">,
+        but cross-checks connections between each combination of nodes. In "Example 3" in
+        <xref linkend="repmgr-cluster-matrix"> we have no information about the state of <literal>node3</literal>.
+        However, by running <command>repmgr cluster crosscheck</command> it's possible to get a better
+        overview of the cluster situation:
+          <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster crosscheck
+
+    Name   | Id |  1 |  2 |  3
+    -------+----+----+----+----
+     node1 |  1 |  * |  * |  x
+     node2 |  2 |  * |  * |  *
+     node3 |  3 |  * |  * |  *</programlisting>
+    </para>
+    <para>
+      What happened is that <command>repmgr cluster crosscheck</command> merged its own
+      <command><link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link></command> with the
+      <command>repmgr cluster matrix</command> output from <literal>node2</literal>; the latter is
+      able to connect to <literal>node3</literal>
+      and therefore determine the state of outbound connections from that node.
+    </para>
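+    <para>
+      The merge described above can be pictured as a cell-by-cell combination of two matrices,
+      where a known result replaces an unknown one. The following is an illustrative sketch only;
+      it is not taken from the &repmgr; source code, and the function name
+      <literal>merge_matrices</literal> and the dictionary layout are assumptions made for this example.
+    </para>

```python
def merge_matrices(local: dict, remote: dict) -> dict:
    """Combine two connection matrices cell by cell.

    Each matrix maps a source node name to {target node ID: cell}, where a
    cell is '*' (connection OK), 'x' (connection failed) or '?' (unknown).
    A known local result is kept; an unknown one is filled in from the
    remote matrix where possible.
    """
    merged = {}
    for node, row in local.items():
        merged[node] = {}
        for target, cell in row.items():
            other = remote.get(node, {}).get(target, "?")
            merged[node][target] = cell if cell != "?" else other
    return merged


if __name__ == "__main__":
    ok = {"1": "*", "2": "*", "3": "*"}
    unknown = {"1": "?", "2": "?", "3": "?"}
    blocked = {"1": "*", "2": "*", "3": "x"}
    # Matrix as seen from node1 (cannot reach node3) and as seen from node2.
    from_node1 = {"node1": blocked, "node2": ok, "node3": unknown}
    from_node2 = {"node1": blocked, "node2": ok, "node3": ok}
    for name, row in merge_matrices(from_node1, from_node2).items():
        print(name, row)
```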
+  </refsect1>
+</refentry>
+
--- a/doc/repmgr-cluster-event.sgml
+++ b/doc/repmgr-cluster-event.sgml
@@ -0,0 +1,63 @@
+<refentry id="repmgr-cluster-event">
+  <indexterm>
+    <primary>repmgr cluster event</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr cluster event</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr cluster event</refname>
+    <refpurpose>output a formatted list of cluster events</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+
+    <para>
+      Outputs a formatted list of cluster events, as stored in the <literal>repmgr.events</literal> table.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Usage</title>
+
+    <para>
+      Output is in reverse chronological order, and
+      can be filtered with the following options:
+      <itemizedlist spacing="compact" mark="bullet">
+        <listitem>
+          <simpara><literal>--all</literal>: outputs all entries</simpara>
+        </listitem>
+        <listitem>
+          <simpara><literal>--limit</literal>: set the maximum number of entries to output (default: 20)</simpara>
+        </listitem>
+        <listitem>
+          <simpara><literal>--node-id</literal>: restrict entries to node with this ID</simpara>
+        </listitem>
+        <listitem>
+          <simpara><literal>--node-name</literal>: restrict entries to node with this name</simpara>
+        </listitem>
+        <listitem>
+          <simpara><literal>--event</literal>: filter specific event (see <xref linkend="event-notifications"> for a full list)</simpara>
+        </listitem>
+      </itemizedlist>
+    </para>
+    <para>
+      The "Details" column can be omitted by providing <literal>--terse</literal>.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Example</title>
+    <para>
+      <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster event --event=standby_register
+     Node ID | Name  | Event            | OK | Timestamp           | Details
+    ---------+-------+------------------+----+---------------------+--------------------------------
+     3       | node3 | standby_register | t  | 2017-08-17 10:28:55 | standby registration succeeded
+     2       | node2 | standby_register | t  | 2017-08-17 10:28:53 | standby registration succeeded</programlisting>
+    </para>
+  </refsect1>
+</refentry>
--- a/doc/repmgr-cluster-matrix.sgml
+++ b/doc/repmgr-cluster-matrix.sgml
@@ -0,0 +1,101 @@
+<refentry id="repmgr-cluster-matrix">
+  <indexterm>
+    <primary>repmgr cluster matrix</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr cluster matrix</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr cluster matrix</refname>
+    <refpurpose>
+      runs repmgr cluster show on each node and summarizes output
+    </refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      <command>repmgr cluster matrix</command> runs <command><link linkend="repmgr-cluster-show">repmgr cluster show</link></command> on each
+      node and arranges the results in a matrix, recording success or failure.
+    </para>
+    <para>
+      <command>repmgr cluster matrix</command> requires a valid <filename>repmgr.conf</filename>
+      file on each node. Additionally, passwordless <command>ssh</command> connections are required between
+      all nodes.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Example</title>
+    <para>
+    Example 1 (all nodes up):
+    <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster matrix
+
+    Name   | Id |  1 |  2 |  3
+    -------+----+----+----+----
+     node1 |  1 |  * |  * |  *
+     node2 |  2 |  * |  * |  *
+     node3 |  3 |  * |  * |  *</programlisting>
+  </para>
+  <para>
+    Example 2 (<literal>node1</literal> and <literal>node2</literal> up, <literal>node3</literal> down):
+    <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster matrix
+
+    Name   | Id |  1 |  2 |  3
+    -------+----+----+----+----
+     node1 |  1 |  * |  * |  x
+     node2 |  2 |  * |  * |  x
+     node3 |  3 |  ? |  ? |  ?
+    </programlisting>
+  </para>
+  <para>
+   Each row corresponds to one server, and indicates the result of
+   testing an outbound connection from that server.
+  </para>
+  <para>
+    Since <literal>node3</literal> is down, all the entries in its row are filled with
+    <literal>?</literal>, meaning that we cannot test outbound connections from it.
+  </para>
+  <para>
+    The other two nodes are up; the corresponding rows have <literal>x</literal> in the
+    column corresponding to <literal>node3</literal>, meaning that inbound connections to
+    that node have failed, and <literal>*</literal> in the columns corresponding to
+    <literal>node1</literal> and <literal>node2</literal>, meaning that inbound connections
+    to these nodes have succeeded.
+  </para>
+  <para>
+    Example 3 (all nodes up, firewall dropping packets originating
+    from <literal>node1</literal> and directed to port 5432 on <literal>node3</literal>) -
+    running <command>repmgr cluster matrix</command> from <literal>node1</literal> gives the following output:
+    <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster matrix
+
+    Name   | Id |  1 |  2 |  3
+    -------+----+----+----+----
+     node1 |  1 |  * |  * |  x
+     node2 |  2 |  * |  * |  *
+     node3 |  3 |  ? |  ? |  ?</programlisting>
+  </para>
+  <para>
+    Note this may take some time depending on the <varname>connect_timeout</varname>
+    setting in the node <varname>conninfo</varname> strings; the default is
+    <literal>1 minute</literal>, which means that without modification the above
+    command would take around 2 minutes to run (see the comment elsewhere about setting
+    <varname>connect_timeout</varname>).
+  </para>
+  <para>
+   The matrix tells us that we cannot connect from <literal>node1</literal> to <literal>node3</literal>,
+   and that (therefore) we don't know the state of any outbound
+   connection from <literal>node3</literal>.
+  </para>
+  <para>
+    In this case, the <xref linkend="repmgr-cluster-crosscheck"> command will produce a more
+    useful result.
+  </para>
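+  <para>
+    For scripted checks, the matrix layout shown in the examples can be read row by row.
+    The following is a minimal sketch that assumes the output format is exactly as in the
+    examples above (pipe-separated columns with a header row); the function names
+    <literal>parse_matrix</literal> and <literal>failed_connections</literal> are hypothetical
+    and not part of &repmgr;.
+  </para>

```python
def parse_matrix(output: str) -> dict:
    """Parse 'repmgr cluster matrix' style output into {source name: {target ID: cell}}."""
    # The separator line contains no '|' characters, so it is skipped here.
    lines = [line for line in output.splitlines() if "|" in line]
    header = [cell.strip() for cell in lines[0].split("|")]
    target_ids = header[2:]  # columns after "Name" and "Id"
    matrix = {}
    for line in lines[1:]:
        cells = [cell.strip() for cell in line.split("|")]
        matrix[cells[0]] = dict(zip(target_ids, cells[2:]))
    return matrix


def failed_connections(matrix: dict) -> list:
    """Return (source name, target ID) pairs whose connection test failed ('x')."""
    return [
        (source, target)
        for source, row in matrix.items()
        for target, cell in row.items()
        if cell == "x"
    ]


if __name__ == "__main__":
    # Output copied from Example 2 above.
    example_2 = """
    Name   | Id |  1 |  2 |  3
    -------+----+----+----+----
     node1 |  1 |  * |  * |  x
     node2 |  2 |  * |  * |  x
     node3 |  3 |  ? |  ? |  ?
    """
    print(failed_connections(parse_matrix(example_2)))
```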
+  </refsect1>
+</refentry>
+
--- a/doc/repmgr-cluster-show.sgml
+++ b/doc/repmgr-cluster-show.sgml
@@ -0,0 +1,116 @@
+<refentry id="repmgr-cluster-show">
+  <indexterm>
+    <primary>repmgr cluster show</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr cluster show</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr cluster show</refname>
+    <refpurpose>display information about each registered node in the replication cluster</refpurpose>
+  </refnamediv>
+
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      Displays information about each registered node in the replication cluster. This
+      command polls each registered server and shows its role (<literal>primary</literal> /
+      <literal>standby</literal> / <literal>bdr</literal>) and status. It polls each server
+      directly and can be run on any node in the cluster; this is also useful when analyzing
+      connectivity from a particular node.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Execution</title>
+    <para>
+      This command requires either a valid <filename>repmgr.conf</filename> file or a database
+      connection string to one of the registered nodes; no additional arguments are needed.
+    </para>
+
+    <para>
+      To show database connection errors when polling nodes, run the command in
+      <literal>--verbose</literal> mode.
+    </para>
+
+  </refsect1>
+
+  <refsect1>
+    <title>Example</title>
+    <para>
+    <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+
+     ID | Name  | Role    | Status    | Upstream | Location | Connection string
+    ----+-------+---------+-----------+----------+----------+-----------------------------------------
+     1  | node1 | primary | * running |          | default  | host=db_node1 dbname=repmgr user=repmgr
+     2  | node2 | standby |   running | node1    | default  | host=db_node2 dbname=repmgr user=repmgr
+     3  | node3 | standby |   running | node1    | default  | host=db_node3 dbname=repmgr user=repmgr</programlisting>
+  </para>
+  </refsect1>
+  <refsect1>
+    <title>Notes</title>
+    <para>
+      The column <literal>Role</literal> shows the expected server role according to the
+      &repmgr; metadata. <literal>Status</literal> shows whether the server is running or unreachable.
+      If the node has an unexpected role not reflected in the &repmgr; metadata, e.g. a node was manually
+      promoted to primary, this will be highlighted with an exclamation mark, e.g.:
+      <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+
+     ID | Name  | Role    | Status               | Upstream | Location | Connection string
+    ----+-------+---------+----------------------+----------+----------+-----------------------------------------
+     1  | node1 | primary | ? unreachable        |          | default  | host=db_node1 dbname=repmgr user=repmgr
+     2  | node2 | standby | ! running as primary | node1    | default  | host=db_node2 dbname=repmgr user=repmgr
+     3  | node3 | standby |   running            | node1    | default  | host=db_node3 dbname=repmgr user=repmgr
+
+    WARNING: following issues were detected
+      node "node1" (ID: 1) is registered as an active primary but is unreachable
+      node "node2" (ID: 2) is registered as standby but running as primary</programlisting>
+    </para>
+    <para>
+      Node availability is tested by connecting from the node where
+      <command>repmgr cluster show</command> is executed; an unreachable status does not
+      necessarily imply the node is down. See <xref linkend="repmgr-cluster-matrix"> and
+      <xref linkend="repmgr-cluster-crosscheck"> to get a better overview of connections between nodes.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Options</title>
+    <para>
+      <command>repmgr cluster show</command> accepts an optional parameter <literal>--csv</literal>, which
+      outputs the replication cluster's status in a simple CSV format, suitable for
+      parsing by scripts:
+      <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show --csv
+    1,-1,-1
+    2,0,0
+    3,0,1</programlisting>
+    </para>
+    <para>
+      The columns have the following meanings:
+      <itemizedlist spacing="compact" mark="bullet">
+        <listitem>
+          <simpara>
+            node ID
+          </simpara>
+        </listitem>
+        <listitem>
+          <simpara>
+            availability (0 = available, -1 = unavailable)
+          </simpara>
+        </listitem>
+        <listitem>
+          <simpara>
+            recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown)
+          </simpara>
+        </listitem>
+      </itemizedlist>
+    </para>
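+    <para>
+      As an illustration of how a script might consume this output, the following sketch
+      maps each CSV row to the three columns described above. It assumes exactly three
+      integer columns per row, as in the example; the names <literal>NodeState</literal>
+      and <literal>parse_cluster_show_csv</literal> are hypothetical and not part of &repmgr;.
+    </para>

```python
import csv
import io
from typing import NamedTuple

# Meaning of the third column, as listed above.
RECOVERY_STATES = {0: "not in recovery", 1: "in recovery", -1: "unknown"}


class NodeState(NamedTuple):
    node_id: int
    available: bool
    recovery: str


def parse_cluster_show_csv(output: str) -> list:
    """Parse 'repmgr cluster show --csv' style output into NodeState records."""
    states = []
    for row in csv.reader(io.StringIO(output.strip())):
        if not row:
            continue  # ignore blank lines
        node_id, availability, recovery = (int(value) for value in row)
        # Availability: 0 = available, -1 = unavailable.
        states.append(NodeState(node_id, availability == 0, RECOVERY_STATES[recovery]))
    return states


if __name__ == "__main__":
    # Output copied from the example above.
    for state in parse_cluster_show_csv("1,-1,-1\n2,0,0\n3,0,1\n"):
        print(state)
```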
+  </refsect1>
+
+</refentry>
--- a/doc/repmgr-node-check.sgml
+++ b/doc/repmgr-node-check.sgml
@@ -0,0 +1,87 @@
+<refentry id="repmgr-node-check">
+  <indexterm>
+    <primary>repmgr node check</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr node check</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr node check</refname>
+    <refpurpose>performs some health checks on a node from a replication perspective</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      Performs some health checks on a node from a replication perspective.
+      This command must be run on the local node.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Example</title>
+    <para>
+      <programlisting>
+       $ repmgr -f /etc/repmgr.conf node check
+       Node "node1":
+            Server role: OK (node is primary)
+            Replication lag: OK (N/A - node is primary)
+            WAL archiving: OK (0 pending files)
+            Downstream servers: OK (2 of 2 downstream nodes attached)
+            Replication slots: OK (node has no replication slots)</programlisting>
+    </para>
+  </refsect1>
+  <refsect1>
+    <title>Individual checks</title>
+    <para>
+      Each check can be performed individually by supplying
+      an additional command line parameter, e.g.:
+      <programlisting>
+        $ repmgr node check --role
+        OK (node is primary)</programlisting>
+    </para>
+    <para>
+   Parameters for individual checks are as follows:
+    <itemizedlist spacing="compact" mark="bullet">
+
+     <listitem>
+      <simpara>
+        <literal>--role</literal>: checks if the node has the expected role
+      </simpara>
+     </listitem>
+
+     <listitem>
+      <simpara>
+        <literal>--replication-lag</literal>: checks if the node is lagging by more than
+        <varname>replication_lag_warning</varname> or <varname>replication_lag_critical</varname>
+      </simpara>
+     </listitem>
+
+     <listitem>
+      <simpara>
+        <literal>--archive-ready</literal>: checks for WAL files which have not yet been archived
+      </simpara>
+     </listitem>
+
+     <listitem>
+      <simpara>
+        <literal>--downstream</literal>: checks that the expected downstream nodes are attached
+      </simpara>
+     </listitem>
+
+     <listitem>
+      <simpara>
+        <literal>--slots</literal>: checks there are no inactive replication slots
+      </simpara>
+     </listitem>
+
+    </itemizedlist>
+  </para>
+  <para>
+   Individual checks can also be output in a Nagios-compatible format by additionally
+   providing the option <literal>--nagios</literal>.
+  </para>
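+  <para>
+   Where the default (non-Nagios) output is to be processed by a script, the per-check lines
+   can be split into a status and a detail string. The following is a minimal sketch that assumes
+   the output has the shape shown in the example above, i.e. <literal>name: STATUS (detail)</literal>;
+   the function name <literal>parse_node_check</literal> is hypothetical and not part of &repmgr;.
+  </para>

```python
import re

# Matches lines such as "Server role: OK (node is primary)".
CHECK_LINE = re.compile(r"^\s*(?P<name>.+?):\s+(?P<status>\S+)\s+\((?P<detail>.*)\)\s*$")


def parse_node_check(output: str) -> dict:
    """Parse 'repmgr node check' style output into {check name: (status, detail)}."""
    checks = {}
    for line in output.splitlines():
        match = CHECK_LINE.match(line)
        if match:  # the 'Node "node1":' heading line does not match and is skipped
            checks[match.group("name")] = (match.group("status"), match.group("detail"))
    return checks


if __name__ == "__main__":
    # Output copied from the example above.
    example = """
    Node "node1":
         Server role: OK (node is primary)
         Replication lag: OK (N/A - node is primary)
         WAL archiving: OK (0 pending files)
         Downstream servers: OK (2 of 2 downstream nodes attached)
         Replication slots: OK (node has no replication slots)
    """
    checks = parse_node_check(example)
    not_ok = [name for name, (status, _) in checks.items() if status != "OK"]
    print(not_ok)
```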
+  </refsect1>
+</refentry>
--- a/doc/repmgr-node-rejoin.sgml
+++ b/doc/repmgr-node-rejoin.sgml
@@ -0,0 +1,233 @@
+<refentry id="repmgr-node-rejoin">
+
+  <indexterm>
+    <primary>repmgr node rejoin</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr node rejoin</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr node rejoin</refname>
+    <refpurpose>rejoin a dormant (stopped) node to the replication cluster</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      Enables a dormant (stopped) node to be rejoined to the replication cluster.
+    </para>
+    <para>
+      This can optionally use <application>pg_rewind</application> to re-integrate
+      a node which has diverged from the rest of the cluster, typically a failed primary.
+    </para>
+
+    <tip>
+      <para>
+        If the node is running and needs to be attached to the current primary, use
+        <xref linkend="repmgr-standby-follow">.
+      </para>
+    </tip>
+  </refsect1>
+
+
+  <refsect1>
+    <title>Usage</title>
+
+    <para>
+      <programlisting>
+      repmgr node rejoin -d '$conninfo'</programlisting>
+
+      where <literal>$conninfo</literal> is the conninfo string of any reachable node in the cluster.
+      <filename>repmgr.conf</filename> for the stopped node <emphasis>must</emphasis> be supplied explicitly if not
+      otherwise available.
+    </para>
+  </refsect1>
+
+  <refsect1>
+
+    <title>Options</title>
+    <variablelist>
+
+      <varlistentry>
+        <term><option>--dry-run</option></term>
+        <listitem>
+          <para>
+            Check prerequisites but don't actually execute the rejoin.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--force-rewind[=/path/to/pg_rewind]</option></term>
+        <listitem>
+          <para>
+            Execute <application>pg_rewind</application> if necessary.
+          </para>
+          <para>
+            It is only necessary to provide the path to <application>pg_rewind</application>
+            if using PostgreSQL 9.3 or 9.4 and <application>pg_rewind</application>
+            is not installed in the PostgreSQL <filename>bin</filename> directory.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--config-files</option></term>
+        <listitem>
+          <para>
+            Comma-separated list of configuration files to retain after
+            executing <application>pg_rewind</application>.
+          </para>
+          <para>
+            Currently <application>pg_rewind</application> will overwrite
+            the local node's configuration files with the files from the source node,
+            so it's advisable to use this option to ensure they are kept.
+          </para>
+        </listitem>
+      </varlistentry>
+
+
+      <varlistentry>
+        <term><option>--config-archive-dir</option></term>
+        <listitem>
+          <para>
+            Directory to temporarily store configuration files specified with
+            <option>--config-files</option>; default: <filename>/tmp</filename>.
+          </para>
+        </listitem>
+      </varlistentry>
+
+
+      <varlistentry>
+        <term><option>-W/--no-wait</option></term>
+        <listitem>
+          <para>
+            Don't wait for the node to rejoin cluster.
+          </para>
+          <para>
+            If this option is supplied, &repmgr; will restart the node but
+            not wait for it to connect to the primary.
+          </para>
+        </listitem>
+      </varlistentry>
+
+    </variablelist>
+  </refsect1>
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      A <literal>node_rejoin</literal> <link linkend="event-notifications">event notification</link> will be generated.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Notes</title>
+    <para>
+      Currently <command>repmgr node rejoin</command> can only be used to attach
+      a standby to the current primary, not another standby.
+    </para>
+    <para>
+      The node must have been shut down cleanly; if this was not the case, it will
+      need to be manually started (remove any existing <filename>recovery.conf</filename> file first)
+      until it has reached a consistent recovery point, then shut down cleanly.
+    </para>
+    <tip>
+      <para>
+        If <application>PostgreSQL</application> is started in single-user mode and
+        input is directed from <filename>/dev/null</filename>, it will perform recovery
+        then immediately quit, and will then be in a state suitable for use by
+        <application>pg_rewind</application>.
+        <programlisting>
+          rm -f /var/lib/pgsql/data/recovery.conf
+          postgres --single -D /var/lib/pgsql/data/ &lt; /dev/null</programlisting>
+      </para>
+    </tip>
+  </refsect1>
+
+  <refsect1 id="repmgr-node-rejoin-pg-rewind" xreflabel="Using pg_rewind">
+
+   <indexterm>
+      <primary>pg_rewind</primary>
+      <secondary>using with "repmgr node rejoin"</secondary>
+    </indexterm>
+
+    <title>Using <command>pg_rewind</command></title>
+    <para>
+      <command>repmgr node rejoin</command> can optionally use <command>pg_rewind</command> to re-integrate a
+      node which has diverged from the rest of the cluster, typically a failed primary.
+      <command>pg_rewind</command> is available in PostgreSQL 9.5 and later as part of the core distribution,
+      and can be installed from external sources for PostgreSQL 9.3 and 9.4.
+    </para>
+    <note>
+      <para>
+        <command>pg_rewind</command> <emphasis>requires</emphasis> that either
+        <varname>wal_log_hints</varname> is enabled, or that
+        data checksums were enabled when the cluster was initialized. See the
+        <ulink url="https://www.postgresql.org/docs/current/static/app-pgrewind.html"><command>pg_rewind</command> documentation</ulink> for details.
+      </para>
+    </note>
+
+    <para>
+      To have <command>repmgr node rejoin</command> use <command>pg_rewind</command> if required,
+      pass the command line option <literal>--force-rewind</literal>, which will tell &repmgr;
+      to execute <command>pg_rewind</command> to ensure the node can be rejoined successfully.
+    </para>
+
+    <para>
+      Be aware that if <command>pg_rewind</command> is executed and actually performs a
+      rewind operation, any configuration files in the PostgreSQL data directory will be
+      overwritten with those from the source server.
+    </para>
+    <para>
+      To prevent this happening, provide a comma-separated list of files to retain
+      using the <literal>--config-files</literal> command line option; the specified files
+      will be archived in a temporary directory (whose parent directory can be specified with
+      <literal>--config-archive-dir</literal>) and restored once the rewind operation is
+      complete.
+    </para>
+
+    <para>
+      Example, first using <literal>--dry-run</literal>, then actually executing the
+      <literal>node rejoin</literal> command:
+    <programlisting>
+    $ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node1 dbname=repmgr user=repmgr' \
+         --force-rewind --config-files=postgresql.local.conf,postgresql.conf --verbose --dry-run
+    NOTICE: using provided configuration file "/etc/repmgr.conf"
+    INFO: prerequisites for using pg_rewind are met
+    INFO: file "postgresql.local.conf" would be copied to "/tmp/repmgr-config-archive-node1/postgresql.local.conf"
+    INFO: file "postgresql.conf" would be copied to "/tmp/repmgr-config-archive-node1/postgresql.conf"
+    INFO: 2 files would have been copied to "/tmp/repmgr-config-archive-node1"
+    INFO: directory "/tmp/repmgr-config-archive-node1" deleted
+    INFO: pg_rewind would now be executed
+    DETAIL: pg_rewind command is:
+      pg_rewind -D '/var/lib/postgresql/data' --source-server='host=node1 dbname=repmgr user=repmgr'</programlisting>
+    <programlisting>
+    $ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node1 dbname=repmgr user=repmgr' \
+         --force-rewind --config-files=postgresql.local.conf,postgresql.conf --verbose
+    NOTICE: using provided configuration file "/etc/repmgr.conf"
+    INFO: prerequisites for using pg_rewind are met
+    INFO: 2 files copied to "/tmp/repmgr-config-archive-node1"
+    NOTICE: executing pg_rewind
+    NOTICE: 2 files copied to /var/lib/pgsql/data
+    INFO: directory "/tmp/repmgr-config-archive-node1" deleted
+    INFO: deleting "recovery.done"
+    INFO: setting node 1's primary to node 2
+    NOTICE: starting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' start"
+    waiting for server to start.... done
+    server started
+    NOTICE: NODE REJOIN successful
+    DETAIL: node 1 is now attached to node 2</programlisting>
+    </para>
+
+  </refsect1>
+
+  <refsect1>
+    <title>See also</title>
+    <para>
+     <xref linkend="repmgr-standby-follow">
+    </para>
+  </refsect1>
+</refentry>
--- a/doc/repmgr-node-status.sgml
+++ b/doc/repmgr-node-status.sgml
@@ -0,0 +1,47 @@
+<refentry id="repmgr-node-status">
+  <indexterm>
+    <primary>repmgr node status</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr node status</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr node status</refname>
+    <refpurpose>show overview of a node's basic information and replication status</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      Displays an overview of a node's basic information and replication
+      status. This command must be run on the local node.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Example</title>
+    <para>
+    <programlisting>
+        $ repmgr -f /etc/repmgr.conf node status
+        Node "node1":
+            PostgreSQL version: 10beta1
+            Total data size: 30 MB
+            Conninfo: host=node1 dbname=repmgr user=repmgr connect_timeout=2
+            Role: primary
+            WAL archiving: off
+            Archive command: (none)
+            Replication connections: 2 (of maximal 10)
+            Replication slots: 0 (of maximal 10)
+            Replication lag: n/a</programlisting>
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>See also</title>
+    <para>
+      See <xref linkend="repmgr-node-check"> to diagnose issues.
+    </para>
+  </refsect1>
+</refentry>
--- a/doc/repmgr-primary-register.sgml
+++ b/doc/repmgr-primary-register.sgml
@@ -0,0 +1,85 @@
+<refentry id="repmgr-primary-register">
+  <indexterm>
+    <primary>repmgr primary register</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr primary register</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr primary register</refname>
+    <refpurpose>initialise a repmgr installation and register the primary node</refpurpose>
+  </refnamediv>
+
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      <command>repmgr primary register</command> registers a primary node in a
+      streaming replication cluster, and configures it for use with repmgr, including
+      installing the &repmgr; extension. This command needs to be executed before any
+      standby nodes are registered.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Execution</title>
+    <para>
+      Execute with the <option>--dry-run</option> option to check what would happen without
+      actually registering the primary.
+    </para>
+    <para>
+      <command>repmgr master register</command> can be used as an alias for
+      <command>repmgr primary register</command>.
+    </para>
+
+    <note>
+    <para>
+      If providing the configuration file location with <option>-f/--config-file</option>,
+      avoid using a relative path, as &repmgr; stores the configuration file location
+      in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
+      <xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert
+        a relative path into an absolute one, but this may not be the same as the path you
+        would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
+        to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
+        <filename>/path/to/repmgr.conf</filename>).
+    </para>
+    </note>
+  </refsect1>
+
+  <refsect1>
+
+    <title>Options</title>
+    <variablelist>
+
+      <varlistentry>
+        <term><option>--dry-run</option></term>
+        <listitem>
+          <para>
+            Check prerequisites but don't actually register the primary.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+       <term><option>-F</option>, <option>--force</option></term>
+        <listitem>
+          <para>
+            Overwrite an existing node record.
+          </para>
+        </listitem>
+      </varlistentry>
+
+    </variablelist>
+  </refsect1>
+
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      A <literal>primary_register</literal> <link linkend="event-notifications">event notification</link> will be generated.
+    </para>
+  </refsect1>
+
+</refentry>
--- a/doc/repmgr-primary-unregister.sgml
+++ b/doc/repmgr-primary-unregister.sgml
@@ -0,0 +1,74 @@
+<refentry id="repmgr-primary-unregister">
+  <indexterm>
+    <primary>repmgr primary unregister</primary>
+  </indexterm>
+  <refmeta>
+    <refentrytitle>repmgr primary unregister</refentrytitle>
+  </refmeta>
+  <refnamediv>
+    <refname>repmgr primary unregister</refname>
+    <refpurpose>unregister an inactive primary node</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      <command>repmgr primary unregister</command> unregisters an inactive primary node
+      from the &repmgr; metadata. This is typically done when the primary has failed and is
+      being removed from the cluster after a new primary has been promoted.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Execution</title>
+    <para>
+      <command>repmgr primary unregister</command> can be run on any active &repmgr; node,
+      with the ID of the node to unregister passed as <option>--node-id</option>.
+    </para>
+    <para>
+      Execute with the <literal>--dry-run</literal> option to check what would happen without
+      actually unregistering the node.
+    </para>
+
+    <para>
+      <command>repmgr master unregister</command> can be used as an alias for
+      <command>repmgr primary unregister</command>.
+    </para>
+  </refsect1>
+
+  <refsect1>
+
+    <title>Options</title>
+
+    <variablelist>
+
+      <varlistentry>
+        <term><option>--dry-run</option></term>
+        <listitem>
+          <para>
+            Check prerequisites but don't actually unregister the primary.
+          </para>
+        </listitem>
+      </varlistentry>
+
+     <varlistentry>
+        <term><option>--node-id</option></term>
+        <listitem>
+          <para>
+            ID of the inactive primary to be unregistered.
+          </para>
+        </listitem>
+      </varlistentry>
+
+    </variablelist>
+
+  </refsect1>
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      A <literal>primary_unregister</literal> <link linkend="event-notifications">event notification</link> will be generated.
+    </para>
+  </refsect1>
+
+</refentry>
--- a/doc/repmgr-standby-clone.sgml
+++ b/doc/repmgr-standby-clone.sgml
@@ -0,0 +1,340 @@
+<refentry id="repmgr-standby-clone">
+  <indexterm>
+    <primary>repmgr standby clone</primary>
+    <seealso>cloning</seealso>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr standby clone</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr standby clone</refname>
+    <refpurpose>clone a PostgreSQL standby node from another PostgreSQL node</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      <command>repmgr standby clone</command> clones a PostgreSQL node from another
+      PostgreSQL node, typically the primary, but optionally from any other node in
+      the cluster or from Barman. It creates the <filename>recovery.conf</filename> file required
+      to attach the cloned node to the primary node (or another standby, if cascading replication
+      is in use).
+    </para>
+    <note>
+      <simpara>
+        <command>repmgr standby clone</command> does not start the standby, and after cloning
+        a standby, the command <command>repmgr standby register</command> must be executed to
+        notify &repmgr; of its existence.
+      </simpara>
+    </note>
+
+  </refsect1>
+
+
+  <refsect1 id="repmgr-standby-clone-config-file-copying" xreflabel="Copying configuration files">
+   <title>Handling configuration files</title>
+
+   <para>
+    Note that by default, all configuration files in the source node's data
+    directory will be copied to the cloned node.  Typically these will be
+    <filename>postgresql.conf</filename>, <filename>postgresql.auto.conf</filename>,
+    <filename>pg_hba.conf</filename> and <filename>pg_ident.conf</filename>.
+    These may require modification before the standby is started.
+   </para>
+   <para>
+    In some cases (e.g. on Debian or Ubuntu Linux installations), PostgreSQL's
+    configuration files are located outside of the data directory and will
+    not be copied by default. &repmgr; can copy these files, either to the same
+    location on the standby server (provided appropriate directory and file permissions
+    are available), or into the standby's data directory. This requires passwordless
+    SSH access to the primary server. Add the option <literal>--copy-external-config-files</literal>
+    to the <command>repmgr standby clone</command> command; by default files will be copied to
+    the same path as on the upstream server. Note that the user executing <command>repmgr</command>
+    must have write access to those directories.
+   </para>
+   <para>
+    To have the configuration files placed in the standby's data directory, specify
+    <literal>--copy-external-config-files=pgdata</literal>, but note that
+    any include directives in the copied files may need to be updated.
+   </para>
+   <tip>
+    <simpara>
+     For reliable configuration file management we recommend using a
+     configuration management tool such as Ansible, Chef, Puppet or Salt.
+    </simpara>
+   </tip>
+  </refsect1>
+
+  <refsect1 id="repmgr-standby-clone-recovery-conf">
+   <indexterm>
+     <primary>recovery.conf</primary>
+     <secondary>customising with "repmgr standby clone"</secondary>
+   </indexterm>
+
+   <title>Customising recovery.conf</title>
+   <para>
+     By default, &repmgr; will create a minimal <filename>recovery.conf</filename>
+     containing the following parameters:
+   </para>
+
+   <itemizedlist spacing="compact" mark="bullet">
+
+     <listitem>
+       <simpara><varname>standby_mode</varname> (always <literal>'on'</literal>)</simpara>
+     </listitem>
+
+     <listitem>
+       <simpara><varname>recovery_target_timeline</varname> (always <literal>'latest'</literal>)</simpara>
+     </listitem>
+
+     <listitem>
+       <simpara><varname>primary_conninfo</varname></simpara>
+     </listitem>
+
+     <listitem>
+       <simpara><varname>primary_slot_name</varname> (if replication slots in use)</simpara>
+     </listitem>
+
+   </itemizedlist>
+
+   <para>
+     The following additional parameters can be specified in <filename>repmgr.conf</filename>
+     for inclusion in <filename>recovery.conf</filename>:
+   </para>
+
+   <itemizedlist spacing="compact" mark="bullet">
+
+     <listitem>
+       <simpara><varname>restore_command</varname></simpara>
+     </listitem>
+
+     <listitem>
+       <simpara><varname>archive_cleanup_command</varname></simpara>
+     </listitem>
+
+     <listitem>
+       <simpara><varname>recovery_min_apply_delay</varname></simpara>
+     </listitem>
+
+   </itemizedlist>
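+
+   <para>
+     As an illustrative sketch only, these parameters might be set in
+     <filename>repmgr.conf</filename> as follows. The archive path and the delay
+     value are placeholders assumed for this example, not recommended settings:
+     <programlisting>
+      # hypothetical WAL archive location; adjust to your environment
+      restore_command='cp /path/to/wal_archive/%f %p'
+      archive_cleanup_command='pg_archivecleanup /path/to/wal_archive %r'
+      # assumed example value: delay WAL replay on the standby
+      recovery_min_apply_delay='5min'</programlisting>
+   </para>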
+
+   <note>
+     <para>
+       We recommend using <ulink url="https://www.pgbarman.org/">Barman</ulink> to manage
+       WAL file archiving. For more details on combining &repmgr; and <application>Barman</application>,
+       in particular using <varname>restore_command</varname> to configure Barman as a backup source of
+       WAL files, see <xref linkend="cloning-from-barman">.
+     </para>
+   </note>
+
+  </refsect1>
+
+  <refsect1 id="repmgr-standby-clone-wal-management">
+   <title>Managing WAL during the cloning process</title>
+   <para>
+    When initially cloning a standby, you will need to ensure
+    that all required WAL files remain available while the cloning is taking
+    place. To ensure this happens when using the default <command>pg_basebackup</command> method,
+    &repmgr; will set <command>pg_basebackup</command>'s <literal>--xlog-method</literal>
+    parameter to <literal>stream</literal>,
+    which will ensure all WAL files generated during the cloning process are
+    streamed in parallel with the main backup. Note that this requires two
+    replication connections to be available (&repmgr; will verify sufficient
+    connections are available before attempting to clone, and this can be checked
+    before performing the clone using the <literal>--dry-run</literal> option).
+   </para>
+   <para>
+    To override this behaviour, in <filename>repmgr.conf</filename> set
+    <command>pg_basebackup</command>'s <literal>--xlog-method</literal>
+    parameter to <literal>fetch</literal>:
+    <programlisting>
+      pg_basebackup_options='--xlog-method=fetch'</programlisting>
+
+    and ensure that <literal>wal_keep_segments</literal> is set to an appropriately high value.
+    See the <ulink url="https://www.postgresql.org/docs/current/static/app-pgbasebackup.html">
+    pg_basebackup</ulink> documentation for details.
+   </para>
+
+   <note>
+    <simpara>
+      From PostgreSQL 10, <command>pg_basebackup</command>'s
+      <literal>--xlog-method</literal> parameter has been renamed to
+      <literal>--wal-method</literal>.
+    </simpara>
+   </note>
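+   <para>
+     On PostgreSQL 10 and later, an equivalent override would presumably use the
+     renamed parameter. A minimal sketch, assuming the same
+     <filename>repmgr.conf</filename> setting is used to pass the option:
+     <programlisting>
+      pg_basebackup_options='--wal-method=fetch'</programlisting>
+   </para>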
+  </refsect1>
+
+
+  <refsect1 id="repmgr-standby-create-recovery-conf">
+
+   <indexterm>
+     <primary>recovery.conf</primary>
+     <secondary>generating for a standby cloned by another method</secondary>
+   </indexterm>
+
+   <title>Using a standby cloned by another method</title>
+   <para>
+     &repmgr; supports standbys cloned by another method (e.g. using <application>barman</application>'s
+     <command>barman recover</command> command).
+   </para>
+   <para>
+     To integrate the standby as a &repmgr; node, ensure the <filename>repmgr.conf</filename>
+     file is created for the node, then execute the command
+     <command>repmgr standby clone --recovery-conf-only</command>.
+     This will create the <filename>recovery.conf</filename> file needed to attach
+     the node to its upstream, and will also create a replication slot on the
+     upstream node if required.
+   </para>
+   <para>
+     Note that the upstream node must be running. An existing
+     <filename>recovery.conf</filename> will not be overwritten unless the
+     <option>-F/--force</option> option is provided.
+   </para>
+   <para>
+     Execute <command>repmgr standby clone --recovery-conf-only --dry-run</command>
+     to check the prerequisites for creating the <filename>recovery.conf</filename> file,
+     and display the contents of the file without actually creating it.
+   </para>
+
+   <note>
+     <para>
+       <option>--recovery-conf-only</option> was introduced in &repmgr; <link linkend="release-4.0.4">4.0.4</link>.
+     </para>
+   </note>
+
+  </refsect1>
+
+  <refsect1>
+
+    <title>Options</title>
+
+    <variablelist>
+
+      <varlistentry>
+        <term><option>--dry-run</option></term>
+        <listitem>
+          <para>
+            Check prerequisites but don't actually clone the standby.
+          </para>
+          <para>
+            If <option>--recovery-conf-only</option> is specified, the contents of
+            the generated <filename>recovery.conf</filename> file will be displayed
+            but the file itself will not be written.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>-c, --fast-checkpoint</option></term>
+        <listitem>
+          <para>
+            Force fast checkpoint (not effective when cloning from Barman).
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--copy-external-config-files[={samepath|pgdata}]</option></term>
+        <listitem>
+          <para>
+            Copy configuration files located outside the data directory on the source
+            node to the same path on the standby (default) or to the
+            PostgreSQL data directory.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--no-upstream-connection</option></term>
+        <listitem>
+          <para>
+            When using Barman, do not connect to upstream node.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>-R, --remote-user=USERNAME</option></term>
+        <listitem>
+          <para>
+            Remote system username for SSH operations (default: current local system username).
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--recovery-conf-only</option></term>
+        <listitem>
+          <para>
+            Create <filename>recovery.conf</filename> file for a previously cloned instance. &repmgr; 4.0.4 and later.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--replication-user</option></term>
+        <listitem>
+          <para>
+            User to make replication connections with (optional, not usually required).
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--superuser</option></term>
+        <listitem>
+          <para>
+            If the &repmgr; user is not a superuser, the name of a valid superuser must
+            be provided with this option.
+          </para>
+        </listitem>
+      </varlistentry>
+
+
+      <varlistentry>
+        <term><option>--upstream-conninfo</option></term>
+        <listitem>
+          <para>
+            <literal>primary_conninfo</literal> value to write in <filename>recovery.conf</filename>
+            when the intended upstream server does not yet exist.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--upstream-node-id</option></term>
+        <listitem>
+          <para>
+            ID of the upstream node to replicate from (optional, defaults to primary node).
+          </para>
+        </listitem>
+      </varlistentry>
+      <varlistentry>
+        <term><option>--without-barman</option></term>
+        <listitem>
+          <para>
+            Do not use Barman even if configured.
+          </para>
+        </listitem>
+      </varlistentry>
+
+    </variablelist>
+  </refsect1>
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      A <literal>standby_clone</literal> <link linkend="event-notifications">event notification</link> will be generated.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>See also</title>
+    <para>
+      See <xref linkend="cloning-standbys"> for details about various aspects of cloning.
+    </para>
+  </refsect1>
+</refentry>
+
--- a/doc/repmgr-standby-follow.sgml
+++ b/doc/repmgr-standby-follow.sgml
@@ -0,0 +1,108 @@
+<refentry id="repmgr-standby-follow">
+  <indexterm>
+    <primary>repmgr standby follow</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr standby follow</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr standby follow</refname>
+    <refpurpose>attach a standby to a new primary</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+
+    <para>
+      Attaches the standby to a new primary. This command requires a valid
+      <filename>repmgr.conf</filename> file for the standby, either specified
+      explicitly with <literal>-f/--config-file</literal> or located in a
+      default location; no additional arguments are required.
+    </para>
+    <para>
+      This command will force a restart of the standby server, which must be
+      running. It can only be used to attach an active standby to the current primary node
+      (and not to another standby).
+    </para>
+    <para>
+      To re-add an inactive node to the replication cluster, see
+      <xref linkend="repmgr-node-rejoin">.
+    </para>
+
+  </refsect1>
+
+  <refsect1>
+    <title>Example</title>
+    <para>
+      <programlisting>
+      $ repmgr -f /etc/repmgr.conf standby follow
+      INFO: setting node 3's primary to node 2
+      NOTICE: restarting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/postgres/data' restart"
+      waiting for server to shut down........ done
+      server stopped
+      waiting for server to start.... done
+      server started
+      NOTICE: STANDBY FOLLOW successful
+      DETAIL: node 3 is now attached to node 2</programlisting>
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Options</title>
+    <variablelist>
+
+      <varlistentry>
+        <term><option>--dry-run</option></term>
+        <listitem>
+          <para>
+            Check prerequisites but don't actually follow a new primary.
+          </para>
+          <important>
+            <para>
+              This does not guarantee the standby can follow the primary; in
+              particular, whether the primary and standby timelines have diverged
+              can currently only be determined by actually attempting to
+              attach the standby to the primary.
+            </para>
+          </important>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>-w</option></term>
+        <term><option>--wait</option></term>
+        <listitem>
+          <para>
+            Wait for a primary to appear. &repmgr; will wait for up to
+            <varname>primary_follow_timeout</varname> seconds
+            (default: 60 seconds) to verify that the standby is following the new primary.
+            This value can be defined in <filename>repmgr.conf</filename>.
+          </para>
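+          <para>
+            As a sketch, a <filename>repmgr.conf</filename> entry that simply restates
+            the default given above would look like this:
+            <programlisting>
+      primary_follow_timeout=60</programlisting>
+          </para>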
+        </listitem>
+      </varlistentry>
+
+    </variablelist>
+  </refsect1>
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      A <literal>standby_follow</literal> <link linkend="event-notifications">event notification</link> will be generated.
+    </para>
+    <para>
+      If provided, &repmgr; will substitute the placeholders <literal>%p</literal> with the node ID of the primary
+      being followed, <literal>%c</literal> with its <literal>conninfo</literal> string, and
+      <literal>%a</literal> with its node name.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>See also</title>
+    <para>
+     <xref linkend="repmgr-node-rejoin">
+    </para>
+  </refsect1>
+</refentry>
+
--- a/doc/repmgr-standby-promote.sgml
+++ b/doc/repmgr-standby-promote.sgml
@@ -0,0 +1,59 @@
+<refentry id="repmgr-standby-promote">
+  <indexterm>
+    <primary>repmgr standby promote</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr standby promote</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr standby promote</refname>
+    <refpurpose>promote a standby to a primary</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      Promotes a standby to a primary if the current primary has failed. This
+      command requires a valid <filename>repmgr.conf</filename> file for the standby, either
+      specified explicitly with <literal>-f/--config-file</literal> or located in a
+      default location; no additional arguments are required.
+    </para>
+    <para>
+      If the standby promotion succeeds, the server will not need to be
+      restarted. However, any other standbys will need to follow the new primary,
+      by using <xref linkend="repmgr-standby-follow">; if <application>repmgrd</application>
+        is active, it will handle this automatically.
+    </para>
+    <para>
+      Note that &repmgr; will wait for up to <varname>promote_check_timeout</varname> seconds
+      (default: 60 seconds) to verify that the standby has been promoted, and will
+      check the promotion every <varname>promote_check_interval</varname> seconds (default: 1 second).
+      Both values can be defined in <filename>repmgr.conf</filename>.
+    </para>
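+    <para>
+      As a sketch, the following <filename>repmgr.conf</filename> entries restate the
+      defaults given above; the values are shown for illustration and can be adjusted:
+      <programlisting>
+      promote_check_timeout=60
+      promote_check_interval=1</programlisting>
+    </para>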
+  </refsect1>
+
+  <refsect1>
+    <title>Example</title>
+    <para>
+      <programlisting>
+      $ repmgr -f /etc/repmgr.conf standby promote
+      NOTICE: promoting standby to primary
+      DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/postgres/data' promote"
+      server promoting
+      DEBUG: setting node 2 as primary and marking existing primary as failed
+      NOTICE: STANDBY PROMOTE successful
+      DETAIL: server "node2" (ID: 2) was successfully promoted to primary</programlisting>
+    </para>
+  </refsect1>
+
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      A <literal>standby_promote</literal> <link linkend="event-notifications">event notification</link> will be generated.
+    </para>
+  </refsect1>
+
+</refentry>
--- a/doc/repmgr-standby-register.sgml
+++ b/doc/repmgr-standby-register.sgml
@@ -0,0 +1,183 @@
+<refentry id="repmgr-standby-register" xreflabel="repmgr standby register">
+  <indexterm>
+    <primary>repmgr standby register</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr standby register</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr standby register</refname>
+    <refpurpose>add a standby's information to the &repmgr; metadata</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      <command>repmgr standby register</command> adds a standby's information to
+      the &repmgr; metadata. This command needs to be executed to enable
+      promote/follow operations and to allow <application>repmgrd</application> to work with the node.
+      An existing standby can be registered using this command. Execute with the
+      <literal>--dry-run</literal> option to check what would happen without actually registering the
+      standby.
+    </para>
+
+    <note>
+      <para>
+        If providing the configuration file location with <literal>-f/--config-file</literal>,
+        avoid using a relative path, as &repmgr; stores the configuration file location
+        in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
+        <xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert
+          a relative path into an absolute one, but this may not be the same as the path you
+          would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
+          to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
+          <filename>/path/to/repmgr.conf</filename>).
+      </para>
+    </note>
+  </refsect1>
+
+  <refsect1 id="repmgr-standby-register-wait-start" xreflabel="repmgr standby register --wait-start">
+   <title>Waiting for the standby to start</title>
+   <para>
+     By default, &repmgr; will wait 30 seconds for the standby to become available before
+     aborting with a connection error. This is useful when setting up a standby from a script,
+     as the standby may not have fully started up by the time <command>repmgr standby register</command>
+     is executed.
+   </para>
+   <para>
+     To change the timeout, pass the desired value with the <literal>--wait-start</literal> option.
+     A value of <literal>0</literal> will disable the timeout.
+   </para>
+   <para>
+     The timeout will be ignored if <literal>-F/--force</literal> was provided.
+   </para>
+  </refsect1>
+
+  <refsect1 id="repmgr-standby-register-wait-sync" xreflabel="repmgr standby register --wait-sync">
+   <title>Waiting for the registration to propagate to the standby</title>
+   <para>
+     Depending on your environment and workload, it may take some time for the standby's node record
+     to propagate from the primary to the standby. Some actions (such as starting
+     <application>repmgrd</application>) require that the standby's node record
+     is present and up-to-date to function correctly.
+   </para>
+   <para>
+    By providing the option <option>--wait-sync</option> to the
+    <command>repmgr standby register</command> command, &repmgr; will wait
+    until the record is synchronised before exiting. An optional timeout (in
+    seconds) can be added to this option (e.g. <option>--wait-sync=60</option>).
+   </para>
+  </refsect1>
+
+  <refsect1 id="repmgr-standby-register-inactive-node" xreflabel="Registering an inactive node">
+   <title>Registering an inactive node</title>
+   <para>
+    Under some circumstances you may wish to register a standby which is not
+    yet running; this can be the case when using provisioning tools to create
+    a complex replication cluster. In this case, by using the <option>-F/--force</option>
+    option and providing the connection parameters to the primary server,
+    the standby can be registered.
+   </para>
+   <para>
+    Similarly, with cascading replication it may be necessary to register
+    a standby whose upstream node has not yet been registered - in this case,
+    using <option>-F/--force</option> will result in the creation of an inactive placeholder
+    record for the upstream node, which will however later need to be registered
+    with the <option>-F/--force</option> option too.
+   </para>
+   <para>
+    When used with <command>repmgr standby register</command>, care should be taken that use of the
+    <option>-F/--force</option> option does not result in an incorrectly configured cluster.
+   </para>
+  </refsect1>
+
+  <refsect1 id="repmgr-standby-register-node-cloned-other-source">
+    <title>Registering a node not cloned by repmgr</title>
+    <para>
+      If you've cloned a standby using another method (e.g. <application>barman</application>'s
+     <command>barman recover</command> command), first execute
+     <link linkend="repmgr-standby-create-recovery-conf">repmgr standby clone --recovery-conf-only</link>
+     to add the <filename>recovery.conf</filename> file, then register the standby as usual.
+    </para>
+  </refsect1>
+
+  <refsect1>
+
+    <title>Options</title>
+
+    <variablelist>
+
+      <varlistentry>
+        <term><option>--dry-run</option></term>
+        <listitem>
+          <para>
+            Check prerequisites but don't actually register the standby.
+          </para>
+        </listitem>
+      </varlistentry>
+
+
+      <varlistentry>
+       <term><option>-F</option>, <option>--force</option></term>
+        <listitem>
+          <para>
+            Overwrite an existing node record.
+          </para>
+        </listitem>
+      </varlistentry>
+
+
+      <varlistentry>
+        <term><option>--upstream-node-id</option></term>
+        <listitem>
+          <para>
+            ID of the upstream node to replicate from (optional).
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--wait-start</option></term>
+        <listitem>
+          <para>
+            Wait for the standby to start (timeout in seconds, default 30 seconds).
+          </para>
+        </listitem>
+      </varlistentry>
+
+     <varlistentry>
+        <term><option>--wait-sync</option></term>
+        <listitem>
+          <para>
+            Wait for the node record to synchronise to the standby (optional timeout in seconds).
+          </para>
+        </listitem>
+      </varlistentry>
+
+
+    </variablelist>
+  </refsect1>
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      A <literal>standby_register</literal> <link linkend="event-notifications">event notification</link>
+      will be generated immediately after the node record is updated on the primary.
+    </para>
+
+    <para>
+      If the <option>--wait-sync</option> option is provided, a <literal>standby_register_sync</literal>
+      event notification will be generated immediately after the node record has synchronised to the
+      standby.
+    </para>
+
+    <para>
+      If provided, &repmgr; will substitute the placeholders <literal>%p</literal> with the node ID of the
+      primary node, <literal>%c</literal> with its <literal>conninfo</literal> string, and
+      <literal>%a</literal> with its node name.
+    </para>
+
+  </refsect1>
+
+</refentry>
--- a/doc/repmgr-standby-switchover.sgml
+++ b/doc/repmgr-standby-switchover.sgml
@@ -0,0 +1,245 @@
+<refentry id="repmgr-standby-switchover">
+  <indexterm>
+    <primary>repmgr standby switchover</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr standby switchover</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr standby switchover</refname>
+    <refpurpose>promote a standby to primary and demote the existing primary to a standby</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+
+    <para>
+      Promotes a standby to primary and demotes the existing primary to a standby.
+      This command must be run on the standby to be promoted, and requires a
+      passwordless SSH connection to the current primary.
+    </para>
+    <para>
+      If other standbys are connected to the demotion candidate, &repmgr; can instruct
+      these to follow the new primary if the option <literal>--siblings-follow</literal>
+      is specified. This requires a passwordless SSH connection between the promotion
+      candidate (new primary) and the standbys attached to the demotion candidate
+      (existing primary).
+    </para>
+    <note>
+      <para>
+        Performing a switchover is a non-trivial operation. In particular it
+        relies on the current primary being able to shut down cleanly and quickly.
+        &repmgr; will attempt to check for potential issues but cannot guarantee
+        a successful switchover.
+      </para>
+    </note>
+    <para>
+      For more details on performing a switchover, including preparation and configuration,
+      see section <xref linkend="performing-switchover">.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Options</title>
+    <variablelist>
+
+      <varlistentry>
+        <term><option>--always-promote</option></term>
+        <listitem>
+          <para>
+            Promote standby to primary, even if it is behind original primary
+            (original primary will be shut down in any case).
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--dry-run</option></term>
+        <listitem>
+          <para>
+            Check prerequisites but don't actually execute a switchover.
+          </para>
+          <important>
+            <para>
+              Success of <option>--dry-run</option> does not imply the switchover will
+              complete successfully, only that
+              the prerequisites for performing the operation are met.
+            </para>
+          </important>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>-F</option></term>
+        <term><option>--force</option></term>
+        <listitem>
+          <para>
+            Ignore warnings and continue anyway.
+          </para>
+          <para>
+            Specifically, if a problem is encountered when shutting down the current primary,
+            using <option>-F/--force</option> will cause &repmgr; to continue by promoting
+            the standby to be the new primary, and if <option>--siblings-follow</option> is
+            specified, attach any other standbys to the new primary.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>--force-rewind[=/path/to/pg_rewind]</option></term>
+        <listitem>
+          <para>
+            Use <application>pg_rewind</application> to reintegrate the old primary if necessary
+            (and the prerequisites for using <application>pg_rewind</application> are met).
+            If using PostgreSQL 9.3 or 9.4, and the <application>pg_rewind</application>
+            binary is not installed in the PostgreSQL <filename>bin</filename> directory,
+            provide its full path. For more details see also <xref linkend="switchover-pg-rewind">.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>-R</option></term>
+        <term><option>--remote-user</option></term>
+        <listitem>
+          <para>
+            System username for remote SSH operations (defaults to local system user).
+          </para>
+        </listitem>
+      </varlistentry>
+
+     <varlistentry>
+        <term><option>--siblings-follow</option></term>
+        <listitem>
+          <para>
+            Have standbys attached to the old primary follow the new primary.
+          </para>
+        </listitem>
+      </varlistentry>
+    </variablelist>
+
+  </refsect1>
+
+  <refsect1>
+    <title>Configuration file settings</title>
+
+    <para>
+     Note that the following parameters in <filename>repmgr.conf</filename> are relevant to the
+     switchover operation:
+     <itemizedlist spacing="compact" mark="bullet">
+       <listitem>
+         <simpara>
+           <literal>reconnect_attempts</literal>: number of times to check the original primary
+           for a clean shutdown after executing the shutdown command, before aborting
+         </simpara>
+       </listitem>
+       <listitem>
+         <simpara>
+           <literal>reconnect_interval</literal>: interval (in seconds) to check the original
+           primary for a clean shutdown after executing the shutdown command (up to a maximum
+           of <literal>reconnect_attempts</literal> tries)
+         </simpara>
+       </listitem>
+       <listitem>
+         <simpara>
+           <literal>replication_lag_critical</literal>:
+           if replication lag (in seconds) on the standby exceeds this value, the
+           switchover will be aborted (unless the <literal>-F/--force</literal> option
+           is provided)
+         </simpara>
+       </listitem>
+
+       <listitem>
+         <simpara>
+           <literal>standby_reconnect_timeout</literal>:
+           Number of seconds to attempt to reconnect to the demoted primary
+           once it has been restarted.
+         </simpara>
+       </listitem>
+
+     </itemizedlist>
+    </para>
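+    <para>
+      As a rough illustration, these settings might appear in <filename>repmgr.conf</filename>
+      as shown here. The values are assumptions chosen for the example, not recommendations;
+      adjust them for your environment:
+      <programlisting>
+        # Illustrative values only
+
+        # check the original primary up to 6 times for a clean shutdown
+        reconnect_attempts=6
+        # wait 5 seconds between those checks
+        reconnect_interval=5
+        # abort the switchover if standby lag exceeds 600 seconds (unless -F/--force is used)
+        replication_lag_critical=600
+        # try for up to 60 seconds to reconnect to the restarted, demoted primary
+        standby_reconnect_timeout=60</programlisting>
+    </para>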
+  </refsect1>
+
+
+  <refsect1>
+    <title>Execution</title>
+
+    <para>
+      Execute with the <literal>--dry-run</literal> option to test the switchover as far as
+      possible without actually changing the status of either node.
+    </para>
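+    <para>
+      A dry run might be invoked along these lines. This is a sketch which assumes the
+      configuration file is located at <filename>/etc/repmgr.conf</filename> and that
+      other standbys should follow the new primary; omit <option>--siblings-follow</option>
+      if that does not apply:
+      <programlisting>
+        $ repmgr standby switchover -f /etc/repmgr.conf --siblings-follow --dry-run</programlisting>
+    </para>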
+    <para>
+      <application>repmgrd</application> should not be active on any nodes while a switchover is being
+      executed. This restriction may be lifted in a later version.
+    </para>
+    <para>
+      External database connections, e.g. from an application, should not be permitted while
+      the switchover is taking place. In particular, active transactions on the primary
+      can potentially disrupt the shutdown process.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      <literal>standby_switchover</literal> and <literal>standby_promote</literal>
+      <link linkend="event-notifications">event notifications</link> will be generated for the new primary,
+      and a <literal>node_rejoin</literal> event notification for the former primary (new standby).
+    </para>
+    <para>
+      If using an event notification script, <literal>standby_switchover</literal>
+      will populate the placeholder parameter <literal>%p</literal> with the node ID of
+      the former primary.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Exit codes</title>
+    <para>
+      The following exit codes can be emitted by <literal>repmgr standby switchover</literal>:
+    </para>
+    <variablelist>
+
+      <varlistentry>
+        <term><option>SUCCESS (0)</option></term>
+        <listitem>
+          <para>
+            The switchover completed successfully.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>ERR_SWITCHOVER_FAIL (18)</option></term>
+        <listitem>
+          <para>
+            The switchover could not be executed.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>ERR_SWITCHOVER_INCOMPLETE (22)</option></term>
+        <listitem>
+          <para>
+            The switchover was executed but a problem was encountered.
+            Typically this means the former primary could not be reattached
+            as a standby.
+          </para>
+        </listitem>
+      </varlistentry>
+
+   </variablelist>
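+    <para>
+      A calling script could branch on these exit codes. The following is a minimal
+      shell sketch only; the messages are illustrative and the configuration file
+      path is an assumption:
+      <programlisting>
+        repmgr standby switchover -f /etc/repmgr.conf
+        rc=$?
+
+        case "$rc" in
+            0)  echo "switchover completed" ;;
+            18) echo "switchover could not be executed" ;;
+            22) echo "switchover executed, but incomplete; check the former primary" ;;
+            *)  echo "unexpected exit code: $rc" ;;
+        esac</programlisting>
+    </para>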
+  </refsect1>
+
+  <refsect1>
+    <title>See also</title>
+    <para>
+      For more details see the section <xref linkend="performing-switchover">.
+    </para>
+  </refsect1>
+
+</refentry>
--- a/doc/repmgr-standby-unregister.sgml
+++ b/doc/repmgr-standby-unregister.sgml
@@ -0,0 +1,70 @@
+<refentry id="repmgr-standby-unregister">
+  <indexterm>
+    <primary>repmgr standby unregister</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr standby unregister</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr standby unregister</refname>
+    <refpurpose>remove a standby's information from the &repmgr; metadata</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      Unregisters a standby with &repmgr;. This command does not affect the actual
+      replication, just removes the standby's entry from the &repmgr; metadata.
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Execution</title>
+    <para>
+      To unregister a running standby, execute:
+      <programlisting>
+        repmgr standby unregister -f /etc/repmgr.conf</programlisting>
+    </para>
+    <para>
+      This will remove the standby record from &repmgr;'s internal metadata
+      table (<literal>repmgr.nodes</literal>). A <literal>standby_unregister</literal>
+      event notification will be recorded in the <literal>repmgr.events</literal> table.
+    </para>
+    <para>
+      If the standby is not running, the command can be executed on another
+      node by providing the id of the node to be unregistered using
+      the command line parameter <literal>--node-id</literal>, e.g. executing the following
+      command on the primary server will unregister the standby with
+      id <literal>3</literal>:
+      <programlisting>
+        repmgr standby unregister -f /etc/repmgr.conf --node-id=3</programlisting>
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Options</title>
+    <variablelist>
+
+      <varlistentry>
+        <term><option>--node-id</option></term>
+        <listitem>
+          <para>
+            <varname>node_id</varname> of the node to unregister (optional)
+          </para>
+        </listitem>
+      </varlistentry>
+
+    </variablelist>
+  </refsect1>
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      A <literal>standby_unregister</literal> <link linkend="event-notifications">event notification</link> will be generated.
+    </para>
+  </refsect1>
+
+</refentry>
+
--- a/doc/repmgr-witness-register.sgml
+++ b/doc/repmgr-witness-register.sgml
@@ -0,0 +1,60 @@
+<refentry id="repmgr-witness-register">
+  <indexterm>
+    <primary>repmgr witness register</primary>
+    <seealso>witness server</seealso>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr witness register</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr witness register</refname>
+    <refpurpose>add a witness node's information to the &repmgr; metadata</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      <command>repmgr witness register</command> adds a witness server's node
+      record to the &repmgr; metadata, and if necessary initialises the witness
+      node by installing the &repmgr; extension and copying the &repmgr; metadata
+      to the witness server. This command needs to be executed to enable
+      use of the witness server with <application>repmgrd</application>.
+    </para>
+    <para>
+      When executing <command>repmgr witness register</command>, connection information
+      for the cluster primary server must also be provided. &repmgr; will automatically
+      use the <varname>user</varname> and <varname>dbname</varname> values defined
+      in the <varname>conninfo</varname> string defined in the  witness node's
+      <filename>repmgr.conf</filename>, if these are not explicitly provided.
+    </para>
+    <para>
+      Execute with the <literal>--dry-run</literal> option to check what would happen
+      without actually registering the witness server.
+    </para>
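+    <para>
+      A dry run might look like this, assuming (as in the example in this section) that
+      the configuration file is <filename>/etc/repmgr.conf</filename> and the primary
+      is reachable as <literal>node1</literal>:
+      <programlisting>
+    $ repmgr -f /etc/repmgr.conf witness register -h node1 --dry-run</programlisting>
+    </para>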
+  </refsect1>
+  <refsect1>
+    <title>Example</title>
+    <para>
+      <programlisting>
+    $ repmgr -f /etc/repmgr.conf witness register -h node1
+    INFO: connecting to witness node "node3" (ID: 3)
+    INFO: connecting to primary node
+    NOTICE: attempting to install extension "repmgr"
+    NOTICE: "repmgr" extension successfully installed
+    INFO: witness registration complete
+    NOTICE: witness node "node3" (ID: 3) successfully registered
+      </programlisting>
+    </para>
+  </refsect1>
+
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      A <literal>witness_register</literal> <link linkend="event-notifications">event notification</link> will be generated.
+    </para>
+  </refsect1>
+
+</refentry>
--- a/doc/repmgr-witness-unregister.sgml
+++ b/doc/repmgr-witness-unregister.sgml
@@ -0,0 +1,73 @@
+<refentry id="repmgr-witness-unregister" xreflabel="repmgr witness unregister">
+  <indexterm>
+    <primary>repmgr witness unregister</primary>
+  </indexterm>
+
+  <refmeta>
+    <refentrytitle>repmgr witness unregister</refentrytitle>
+  </refmeta>
+
+  <refnamediv>
+    <refname>repmgr witness unregister</refname>
+    <refpurpose>remove a witness node's information from the &repmgr; metadata</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>Description</title>
+    <para>
+      <command>repmgr witness unregister</command> removes a witness server's node
+      record from the &repmgr; metadata.
+    </para>
+    <para>
+      The node does not have to be running to be unregistered; however, if it is not
+      running, connection information for the primary server must be provided.
+    </para>
+    <para>
+      Execute with the <literal>--dry-run</literal> option to check what would happen
+      without actually unregistering the witness server.
+    </para>
+  </refsect1>
+  <refsect1>
+    <title>Examples</title>
+    <para>
+      Unregistering a running witness node:
+      <programlisting>
+    $ repmgr -f /etc/repmgr.conf witness unregister
+    INFO: connecting to witness node "node3" (ID: 3)
+    INFO: unregistering witness node 3
+    INFO: witness unregistration complete
+    DETAIL: witness node with id 3 (conninfo: host=node3 dbname=repmgr user=repmgr port=5499) successfully unregistered</programlisting>
+    </para>
+    <para>
+      Unregistering a non-running witness node:
+      <programlisting>
+        $ repmgr -f /etc/repmgr.conf witness unregister -h node1 -p 5501  -F
+        INFO: connecting to witness node "node3" (ID: 3)
+        NOTICE: unable to connect to witness node "node3" (ID: 3), removing node record on cluster primary only
+        INFO: unregistering witness node 3
+        INFO: witness unregistration complete
+        DETAIL: witness node with id 3 (conninfo: host=node3 dbname=repmgr user=repmgr port=5499) successfully unregistered</programlisting>
+    </para>
+  </refsect1>
+
+  <refsect1>
+    <title>Notes</title>
+    <para>
+      This command will not make any changes to the witness node itself and will neither
+      remove any data from the witness database nor stop the PostgreSQL instance.
+    </para>
+    <para>
+      A witness node which has been unregistered can be re-registered with
+      <link linkend="repmgr-witness-register">repmgr witness register --force</link>.
+    </para>
+  </refsect1>
+
+
+  <refsect1>
+    <title>Event notifications</title>
+    <para>
+      A <literal>witness_unregister</literal> <link linkend="event-notifications">event notification</link> will be generated.
+    </para>
+  </refsect1>
+
+</refentry>
--- a/doc/repmgr.sgml
+++ b/doc/repmgr.sgml
@@ -0,0 +1,125 @@
+<!-- doc/src/sgml/postgres.sgml -->
+
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
+
+          <!ENTITY % version SYSTEM "version.sgml">
+          %version;
+
+          <!ENTITY % filelist SYSTEM "filelist.sgml">
+          %filelist;
+
+          <!ENTITY repmgr "<productname>repmgr</productname>">
+          <!ENTITY postgres "<productname>PostgreSQL</productname>">
+]>
+
+<book id="repmgr">
+ <title>repmgr &repmgrversion; Documentation</title>
+
+ <bookinfo>
+  <corpauthor>2ndQuadrant Ltd</corpauthor>
+  <productname>repmgr</productname>
+  <productnumber>&repmgrversion;</productnumber>
+  &legal;
+
+  <abstract>
+   <para>
+   This is the official documentation of &repmgr; &repmgrversion; for
+   use with PostgreSQL 9.3 - PostgreSQL 10.
+   It describes the functionality supported by the current version of &repmgr;.
+   </para>
+
+   <para>
+    &repmgr; was developed by
+    <ulink url="https://2ndquadrant.com">2ndQuadrant</ulink>
+    along with contributions from other individuals and companies.
+    Contributions from the community are appreciated and welcome - get
+    in touch via <ulink url="https://github.com/2ndQuadrant/repmgr">github</>
+    or <ulink url="https://groups.google.com/group/repmgr">the mailing list/forum</>.
+    Multiple 2ndQuadrant customers contribute funding
+    to make repmgr development possible.
+   </para>
+
+   <para>
+    2ndQuadrant, a Platinum sponsor of the PostgreSQL project,
+    continues to develop repmgr to meet internal needs and those of customers.
+     Other companies as well as individual developers
+    are welcome to participate in the efforts.
+   </para>
+  </abstract>
+
+  <keywordset>
+   <keyword>repmgr</keyword>
+   <keyword>PostgreSQL</keyword>
+   <keyword>replication</keyword>
+   <keyword>asynchronous</keyword>
+   <keyword>HA</keyword>
+   <keyword>high-availability</keyword>
+  </keywordset>
+ </bookinfo>
+
+
+ <part id="getting-started">
+  <title>Getting started</title>
+  &overview;
+  &install;
+  &quickstart;
+ </part>
+
+ <part id="repmgr-administration-manual">
+  <title>repmgr administration manual</title>
+
+  &configuration;
+  &cloning-standbys;
+  &promoting-standby;
+  &follow-new-primary;
+  &switchover;
+  &configuring-witness-server;
+  &event-notifications;
+  &upgrading-repmgr;
+ </part>
+
+ <part id="using-repmgrd">
+  <title>Using repmgrd</title>
+  &repmgrd-automatic-failover;
+  &repmgrd-configuration;
+  &repmgrd-demonstration;
+  &repmgrd-cascading-replication;
+  &repmgrd-network-split;
+  &repmgrd-witness-server;
+  &repmgrd-degraded-monitoring;
+  &repmgrd-monitoring;
+  &repmgrd-bdr;
+ </part>
+
+ <part id="repmgr-command-reference">
+  <title>repmgr command reference</title>
+
+  &repmgr-primary-register;
+  &repmgr-primary-unregister;
+  &repmgr-standby-clone;
+  &repmgr-standby-register;
+  &repmgr-standby-unregister;
+  &repmgr-standby-promote;
+  &repmgr-standby-follow;
+  &repmgr-standby-switchover;
+  &repmgr-witness-register;
+  &repmgr-witness-unregister;
+  &repmgr-node-status;
+  &repmgr-node-check;
+  &repmgr-node-rejoin;
+  &repmgr-cluster-show;
+  &repmgr-cluster-matrix;
+  &repmgr-cluster-crosscheck;
+  &repmgr-cluster-event;
+  &repmgr-cluster-cleanup;
+ </part>
+
+ &appendix-release-notes;
+ &appendix-signatures;
+ &appendix-faq;
+ &appendix-packages;
+
+ <![%include-index;[&bookindex;]]>
+ <![%include-xslt-index;[<index id="bookindex"></index>]]>
+
+</book>
--- a/doc/repmgrd-automatic-failover.sgml
+++ b/doc/repmgrd-automatic-failover.sgml
@@ -0,0 +1,17 @@
+<chapter id="repmgrd-automatic-failover" xreflabel="Automatic failover with repmgrd">
+ <indexterm>
+   <primary>repmgrd</primary>
+   <secondary>automatic failover</secondary>
+ </indexterm>
+
+ <title>Automatic failover with repmgrd</title>
+
+ <para>
+  <application>repmgrd</application> is a management and monitoring daemon which runs
+  on each node in a replication cluster. It can automate actions such as
+  failover and updating standbys to follow the new primary, as well as
+  providing monitoring information about the state of each standby.
+ </para>
+
+
+</chapter>
--- a/doc/repmgrd-bdr.sgml
+++ b/doc/repmgrd-bdr.sgml
@@ -0,0 +1,414 @@
+<chapter id="repmgrd-bdr">
+  <indexterm>
+    <primary>repmgrd</primary>
+    <secondary>BDR</secondary>
+  </indexterm>
+
+  <indexterm>
+    <primary>BDR</primary>
+  </indexterm>
+
+  <title>BDR failover with repmgrd</title>
+  <para>
+    &repmgr; 4.x provides support for monitoring BDR nodes and taking action in
+    case one of the nodes fails.
+  </para>
+  <note>
+    <simpara>
+      Due to the nature of BDR, it's only safe to use this solution for
+      a two-node scenario. Introducing additional nodes will create an inherent
+      risk of node desynchronisation if a node goes down without being cleanly
+      removed from the cluster.
+    </simpara>
+  </note>
+  <para>
+    In contrast to streaming replication, there's no concept of "promoting" a new
+    primary node with BDR. Instead, "failover" involves monitoring both nodes
+    with <application>repmgrd</application> and redirecting queries from the failed node to the remaining
+    active node. This can be done by using an
+    <link linkend="event-notifications">event notification</link> script
+    which is called by <application>repmgrd</application> to dynamically
+    reconfigure a proxy server/connection pooler such as <application>PgBouncer</application>.
+  </para>
+
+  <sect1 id="bdr-prerequisites" xreflabel="BDR prerequisites">
+    <title>Prerequisites</title>
+    <para>
+      &repmgr; 4 requires PostgreSQL 9.4 or 9.6 with the BDR 2 extension
+      enabled and configured for a two-node BDR network. &repmgr; 4 packages
+      must be installed on each node before attempting to configure
+      <application>repmgr</application>.
+    </para>
+    <note>
+      <simpara>
+        &repmgr; 4 will refuse to install if it detects more than two BDR nodes.
+      </simpara>
+    </note>
+    <para>
+      Application database connections <emphasis>must</emphasis> be passed through a proxy server/
+      connection pooler such as <application>PgBouncer</application>, and it must be possible to dynamically
+      reconfigure that from <application>repmgrd</application>. The example demonstrated in this document
+      will use <application>PgBouncer</application>.
+    </para>
+    <para>
+      The proxy server / connection poolers must <emphasis>not</emphasis>
+      be installed on the database servers.
+    </para>
+    <para>
+      For this example, it's assumed password-less SSH connections are available
+      from the PostgreSQL servers to the servers where <application>PgBouncer</application>
+      runs, and that the user on those servers has permission to alter the
+      <application>PgBouncer</application> configuration files.
+    </para>
+    <para>
+      PostgreSQL connections must be possible between each node, and each node
+      must be able to connect to each PgBouncer instance.
+    </para>
+  </sect1>
+
+  <sect1 id="bdr-configuration" xreflabel="BDR configuration">
+    <title>Configuration</title>
+    <para>
+      A sample configuration for <filename>repmgr.conf</filename> on each
+      BDR node would look like this:
+      <programlisting>
+        # Node information
+        node_id=1
+        node_name='node1'
+        conninfo='host=node1 dbname=bdrtest user=repmgr connect_timeout=2'
+        data_directory='/var/lib/postgresql/data'
+        replication_type='bdr'
+
+        # Event notification configuration
+        event_notifications=bdr_failover
+        event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a" >> /tmp/bdr-failover.log 2>&1'
+
+        # repmgrd options
+        monitor_interval_secs=5
+        reconnect_attempts=6
+        reconnect_interval=5</programlisting>
+    </para>
+    <para>
+      Adjust settings as appropriate; copy and adjust for the second node (particularly
+      the values <varname>node_id</varname>, <varname>node_name</varname>
+      and <varname>conninfo</varname>).
+    </para>
+    <para>
+      Note that the values provided for the <varname>conninfo</varname> string
+      must be valid for connections from <emphasis>both</emphasis> nodes in the
+      replication cluster. The database must be the BDR-enabled database.
+    </para>
+    <para>
+      If defined, the <varname>event_notifications</varname> parameter
+      will restrict execution of <varname>event_notification_command</varname>
+      to the specified event(s).
+    </para>
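+    <para>
+      For instance, to have the command executed for both BDR event types mentioned in this
+      chapter, the setting might look like the following. This is a sketch which assumes
+      the events are given as a comma-separated list:
+      <programlisting>
+        # Only call event_notification_command for these events
+        event_notifications=bdr_failover,bdr_recovery</programlisting>
+    </para>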
+    <note>
+      <simpara>
+        <varname>event_notification_command</varname> is the script which does the actual "heavy lifting"
+        of reconfiguring the proxy server/ connection pooler. It is fully
+        user-definable; a reference implementation is documented below.
+      </simpara>
+    </note>
+
+  </sect1>
+
+  <sect1 id="bdr-repmgr-setup" xreflabel="repmgr setup with BDR">
+    <title>repmgr setup</title>
+    <para>
+      Register both nodes; example on <literal>node1</literal>:
+      <programlisting>
+        $ repmgr -f /etc/repmgr.conf bdr register
+        NOTICE: attempting to install extension "repmgr"
+        NOTICE: "repmgr" extension successfully installed
+        NOTICE: node record created for node 'node1' (ID: 1)
+        NOTICE: BDR node 1 registered (conninfo: host=node1 dbname=bdrtest user=repmgr)</programlisting>
+    </para>
+    <para>
+      and on <literal>node2</literal>:
+      <programlisting>
+        $ repmgr -f /etc/repmgr.conf bdr register
+        NOTICE: node record created for node 'node2' (ID: 2)
+        NOTICE: BDR node 2 registered (conninfo: host=node2 dbname=bdrtest user=repmgr)</programlisting>
+    </para>
+    <para>
+      The <literal>repmgr</literal> extension will be automatically created
+      when the first node is registered, and will be propagated to the second
+      node.
+    </para>
+    <important>
+      <simpara>
+        Ensure the &repmgr; package is available on both nodes before
+        attempting to register the first node.
+      </simpara>
+    </important>
+    <para>
+      At this point the metadata for both nodes has been created; executing
+      <xref linkend="repmgr-cluster-show"> (on either node) should produce output like this:
+      <programlisting>
+        $ repmgr -f /etc/repmgr.conf cluster show
+        ID | Name  | Role | Status    | Upstream | Location | Connection string
+       ----+-------+------+-----------+----------+--------------------------------------------------------
+        1  | node1 | bdr  | * running |          | default  | host=node1 dbname=bdrtest user=repmgr connect_timeout=2
+        2  | node2 | bdr  | * running |          | default  | host=node2 dbname=bdrtest user=repmgr connect_timeout=2</programlisting>
+    </para>
+    <para>
+      Additionally, it's possible to display a log of significant events; executing
+      <xref linkend="repmgr-cluster-event"> (on either node) should produce output like this:
+      <programlisting>
+        $ repmgr -f /etc/repmgr.conf cluster event
+        Node ID | Event        | OK | Timestamp           | Details
+       ---------+--------------+----+---------------------+----------------------------------------------
+        2       | bdr_register | t  | 2017-07-27 17:51:48 | node record created for node 'node2' (ID: 2)
+        1       | bdr_register | t  | 2017-07-27 17:51:00 | node record created for node 'node1' (ID: 1)
+      </programlisting>
+    </para>
+    <para>
+      At this point there will only be records for the two node registrations (displayed here
+      in reverse chronological order).
+    </para>
+  </sect1>
+
+  <sect1 id="bdr-event-notification-command" xreflabel="BDR failover event notification command">
+    <title>Defining the "event_notification_command"</title>
+    <para>
+      Key to "failover" execution is the <literal>event_notification_command</literal>,
+      which is a user-definable script specified in <filename>repmgr.conf</filename>
+      and which can use a &repmgr; <link linkend="event-notifications">event notification</link>
+      to reconfigure the proxy server / connection pooler so it points to the other, still-active node.
+      Details of the event will be passed as parameters to the script.
+    </para>
+    <para>
+      The following parameter placeholders are available for the script definition in <filename>repmgr.conf</filename>;
+      these will be replaced with the appropriate value when the script is executed:
+    </para>
+
+    <variablelist>
+      <varlistentry>
+        <term><option>%n</option></term>
+        <listitem>
+          <para>
+            node ID
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>%e</option></term>
+        <listitem>
+          <para>
+            event type
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>%s</option></term>
+        <listitem>
+          <para>
+            success (1 or 0)
+          </para>
+        </listitem>
+      </varlistentry>
+      <varlistentry>
+        <term><option>%t</option></term>
+        <listitem>
+          <para>
+            timestamp
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><option>%d</option></term>
+        <listitem>
+          <para>
+            details
+          </para>
+        </listitem>
+      </varlistentry>
+      <varlistentry>
+        <term><option>%c</option></term>
+        <listitem>
+          <para>
+            conninfo string of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
+          </para>
+        </listitem>
+      </varlistentry>
+      <varlistentry>
+        <term><option>%a</option></term>
+        <listitem>
+          <para>
+            name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
+          </para>
+        </listitem>
+      </varlistentry>
+    </variablelist>
+
+    <para>
+      Note that <literal>%c</literal> and <literal>%a</literal> are only provided with
+      particular failover events, in this case <varname>bdr_failover</varname>.
+    </para>
+    <para>
+      The provided sample script
+     (<literal><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/scripts/bdr-pgbouncer.sh">scripts/bdr-pgbouncer.sh</ulink></literal>)
+      is configured as follows:
+      <programlisting>
+        event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a"'</programlisting>
+    </para>
+    <para>
+      and parses the placeholder parameters like this:
+      <programlisting>
+        NODE_ID=$1
+        EVENT_TYPE=$2
+        SUCCESS=$3
+        NEXT_CONNINFO=$4
+        NEXT_NODE_NAME=$5</programlisting>
+    </para>
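+    <para>
+      A custom script could then dispatch on the event type. The following is only a
+      sketch and does not reproduce the logic of the sample script;
+      <literal>reconfigure_pgbouncer</literal> is a hypothetical helper function
+      which you would need to define yourself:
+      <programlisting>
+        # Sketch only: reconfigure_pgbouncer is a hypothetical helper defined elsewhere in the script
+        case "$EVENT_TYPE" in
+            bdr_failover)
+                # point the connection pooler at the remaining active node
+                reconfigure_pgbouncer "$NEXT_CONNINFO" "$NEXT_NODE_NAME"
+                ;;
+            *)
+                # ignore events this script does not handle
+                ;;
+        esac</programlisting>
+    </para>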
+    <note>
+      <para>
+        The sample script also contains some hard-coded values for the <application>PgBouncer</application>
+        configuration for both nodes; these will need to be adjusted for your local environment
+        (ideally the scripts would be maintained as templates and generated by some
+        kind of provisioning system).
+      </para>
+    </note>
+
+    <para>
+      The script performs the following steps:
+      <itemizedlist spacing="compact" mark="bullet">
+        <listitem>
+          <simpara>pauses <application>PgBouncer</application> on all nodes</simpara>
+        </listitem>
+        <listitem>
+          <simpara>recreates the <application>PgBouncer</application> configuration file on each
+            node using the information provided by <application>repmgrd</application>
+            (primarily the <varname>conninfo</varname> string) to configure
+            <application>PgBouncer</application></simpara>
+        </listitem>
+        <listitem>
+          <simpara>reloads the <application>PgBouncer</application> configuration</simpara>
+        </listitem>
+        <listitem>
+          <simpara>executes the <command>RESUME</command> command (in <application>PgBouncer</application>)</simpara>
+        </listitem>
+      </itemizedlist>
+    </para>
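The configuration-rewriting step listed above might be sketched as follows. This is an assumption-laden illustration, not the sample script itself: the helper names `conninfo_value` and `render_databases_section` are hypothetical, the conninfo string is assumed to contain only simple `keyword=value` pairs without quoted values, and only the host is carried over into the PgBouncer `[databases]` entry. The surrounding `PAUSE`, `RELOAD` and `RESUME` steps would be issued through the PgBouncer admin console and are not shown.

```shell
#!/bin/sh
# Hypothetical helpers; illustrative only.

# Print the value of one keyword from a space-separated conninfo string.
# Assumes no quoted values containing spaces.
conninfo_value() {
    key=$1
    conninfo=$2
    for pair in $conninfo; do
        case "$pair" in
            "$key"=*)
                echo "${pair#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Print a PgBouncer [databases] section which points the given
# database at the host named in the conninfo string.
render_databases_section() {
    dbname=$1
    conninfo=$2
    host=$(conninfo_value host "$conninfo") || return 1
    printf '[databases]\n%s = host=%s dbname=%s\n' "$dbname" "$host" "$dbname"
}
```

For example, `render_databases_section bdrtest "$NEXT_CONNINFO"` would produce a section directing connections for `bdrtest` to the next available node.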
+    <para>
+      Following successful script execution, any connections to PgBouncer on the failed BDR node
+      will be redirected to the active node.
+    </para>
+  </sect1>
+
+  <sect1 id="bdr-monitoring-failover" xreflabel="Node monitoring and failover">
+    <title>Node monitoring and failover</title>
+    <para>
+      At the intervals specified by <varname>monitor_interval_secs</varname>
+      in <filename>repmgr.conf</filename>, <application>repmgrd</application>
+      will ping each node to check if it's available. If a node isn't available,
+      <application>repmgrd</application> will enter failover mode and check <varname>reconnect_attempts</varname>
+      times at intervals of <varname>reconnect_interval</varname> to confirm the node is definitely unreachable.
+      This buffer period is necessary to avoid false positives caused by transient
+      network outages.
+    </para>
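The reconnection behaviour described above can be pictured as a bounded retry loop. The sketch below is not how <application>repmgrd</application> is implemented; the function name `check_with_retries` is hypothetical, and the check command is whatever the caller supplies.

```shell
#!/bin/sh
# Hypothetical retry loop; illustrative only.
# Usage: check_with_retries ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, or 1 once all attempts have failed.
check_with_retries() {
    attempts=$1
    interval=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep between attempts, but not after the final one
        if [ "$i" -lt "$attempts" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    return 1
}
```

With the values seen in the log output below (5 attempts at 1 second intervals), a call such as `check_with_retries 5 1 pg_isready -h node2 -q` would report failure only after roughly four seconds of unsuccessful checks, which is the buffer that filters out brief network interruptions.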
+    <para>
+      If the node is still unavailable, <application>repmgrd</application> will enter failover mode and execute
+      the script defined in <varname>event_notification_command</varname>; an entry will be logged
+      in the <literal>repmgr.events</literal> table and <application>repmgrd</application> will
+      (unless otherwise configured) resume monitoring of the node in "degraded" mode until it reappears.
+    </para>
+    <para>
+      <application>repmgrd</application> logfile output during a failover event will look something like this
+      on one node (usually the node which has failed, here <literal>node2</literal>):
+      <programlisting>
+            ...
+    [2017-07-27 21:08:39] [INFO] starting continuous BDR node monitoring
+    [2017-07-27 21:08:39] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
+    [2017-07-27 21:08:55] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
+    [2017-07-27 21:09:11] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
+    [2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
+    [2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
+    [2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
+    [2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
+    [2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
+    [2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
+    [2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
+    [2017-07-27 21:09:28] [NOTICE] setting node record for node 2 to inactive
+    [2017-07-27 21:09:28] [INFO] executing notification command for event "bdr_failover"
+    [2017-07-27 21:09:28] [DETAIL] command is:
+      /path/to/bdr-pgbouncer.sh 2 bdr_failover 1 "host=node1 dbname=bdrtest user=repmgr connect_timeout=2" "node1"
+    [2017-07-27 21:09:28] [INFO] node 'node2' (ID: 2) detected as failed; next available node is 'node1' (ID: 1)
+    [2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
+    [2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
+    ...</programlisting>
+    </para>
+    <para>
+      Output on the other node (<literal>node1</literal>) during the same event will look like this:
+      <programlisting>
+    ...
+    [2017-07-27 21:08:35] [INFO] starting continuous BDR node monitoring
+    [2017-07-27 21:08:35] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
+    [2017-07-27 21:08:51] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
+    [2017-07-27 21:09:07] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
+    [2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
+    [2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
+    [2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
+    [2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
+    [2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
+    [2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
+    [2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
+    [2017-07-27 21:09:28] [NOTICE] other node's repmgrd is handling failover
+    [2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
+    [2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
+    ...</programlisting>
+    </para>
+    <para>
+      This assumes only the PostgreSQL instance on <literal>node2</literal> has failed. In this case the
+      <application>repmgrd</application> instance running on <literal>node2</literal> has performed the failover. However, if
+      the entire server becomes unavailable, <application>repmgrd</application> on <literal>node1</literal> will perform
+      the failover.
+    </para>
+  </sect1>
+  <sect1 id="bdr-node-recovery" xreflabel="Node recovery">
+    <title>Node recovery</title>
+    <para>
+      Following failure of a BDR node, if the node subsequently becomes available again,
+      a <varname>bdr_recovery</varname> event will be generated. This could potentially be used to
+      reconfigure PgBouncer automatically to bring the node back into the available pool;
+      however, it would be prudent to manually verify the node's status before
+      exposing it to the application.
+    </para>
+    <para>
+      If the failed node comes back up and connects correctly, output similar to this
+      will be visible in the <application>repmgrd</application> log:
+      <programlisting>
+        [2017-07-27 21:25:30] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
+        [2017-07-27 21:25:46] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
+        [2017-07-27 21:25:46] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
+        [2017-07-27 21:25:55] [INFO] active replication slot for node "node1" found after 1 seconds
+        [2017-07-27 21:25:55] [NOTICE] node "node2" (ID: 2) has recovered after 986 seconds</programlisting>
+    </para>
+  </sect1>
+
+  <sect1 id="bdr-complete-shutdown" xreflabel="Shutdown of both nodes">
+    <title>Shutdown of both nodes</title>
+    <para>
+      If both PostgreSQL instances are shut down, <application>repmgrd</application> will try to handle the
+      situation as gracefully as possible, though with no failover candidates available
+      there's not much it can do. Should this case ever occur, we recommend shutting
+      down <application>repmgrd</application> on both nodes and restarting it once the PostgreSQL instances
+      are running properly.
+    </para>
+  </sect1>
+</chapter>
+
--- a/doc/repmgrd-cascading-replication.sgml
+++ b/doc/repmgrd-cascading-replication.sgml
@@ -0,0 +1,22 @@
+<chapter id="repmgrd-cascading-replication">
+ <indexterm>
+   <primary>repmgrd</primary>
+   <secondary>cascading replication</secondary>
+ </indexterm>
+
+ <title>repmgrd and cascading replication</title>
+ <para>
+  Cascading replication - where a standby can connect to an upstream node and not
+  the primary server itself - was introduced in PostgreSQL 9.2. &repmgr; and
+  <application>repmgrd</application> support cascading replication by keeping track of the relationship
+  between standby servers - each node record is stored with the node id of its
+  upstream ("parent") server (except of course the primary server).
+ </para>
+ <para>
+  In a failover situation where the primary node fails and a top-level standby
+  is promoted, a standby connected to another standby will not be affected
+  and will continue working as normal (even if the upstream standby it's connected
+  to becomes the primary node). If however the node's direct upstream fails,
+  the "cascaded standby" will attempt to reconnect to that node's parent.
+ </para>
+</chapter>
--- a/doc/repmgrd-configuration.sgml
+++ b/doc/repmgrd-configuration.sgml
@@ -0,0 +1,282 @@
+<chapter id="repmgrd-configuration">
+
+  <indexterm>
+    <primary>repmgrd</primary>
+    <secondary>configuration</secondary>
+  </indexterm>
+
+  <title>repmgrd configuration</title>
+
+  <para>
+    <application>repmgrd</application> is a daemon which runs on each PostgreSQL node,
+    monitoring the local node, and (unless it's the primary node) the upstream server
+    (the primary server or, with cascading replication, another standby) which it's
+    connected to.
+  </para>
+  <para>
+    <application>repmgrd</application> can be configured to provide failover
+    capability in case the primary upstream node becomes unreachable, and/or
+    provide monitoring data to the &repmgr; metadatabase.
+  </para>
+
+  <sect1 id="repmgrd-basic-configuration">
+    <title>repmgrd basic configuration</title>
+
+    <para>
+      To use <application>repmgrd</application>, its associated function library <emphasis>must</emphasis> be
+      included in <filename>postgresql.conf</filename> with:
+
+      <programlisting>
+        shared_preload_libraries = 'repmgr'</programlisting>
+    </para>
+    <para>
+      Changing this setting requires a restart of PostgreSQL; for more details see
+      the <ulink url="https://www.postgresql.org/docs/current/static/runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES">PostgreSQL documentation</ulink>.
+    </para>
+
+    <sect2 id="repmgrd-automatic-failover-configuration">
+      <title>Automatic failover configuration</title>
+      <para>
+        If using automatic failover, the following <application>repmgrd</application> options <emphasis>must</emphasis> be set in
+        <filename>repmgr.conf</filename>:
+        <programlisting>
+          failover=automatic
+          promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
+          follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'</programlisting>
+      </para>
+      <para>
+        Adjust file paths as appropriate; we recommend specifying the full path to the &repmgr; binary.
+      </para>
+      <para>
+        Note that the <literal>--log-to-file</literal> option will cause
+        output generated by the &repmgr; command, when executed by <application>repmgrd</application>,
+        to be logged to the same destination configured to receive log output for <application>repmgrd</application>.
+        See <filename><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink></filename>
+        for further <application>repmgrd</application>-specific settings.
+      </para>
+      <para>
+        When <varname>failover</varname> is set to <literal>automatic</literal>, upon detecting failure
+        of the current primary, <application>repmgrd</application> will execute one of:
+      </para>
+      <itemizedlist spacing="compact" mark="bullet">
+        <listitem>
+          <simpara>
+            <varname>promote_command</varname> (if the current server is to become the new primary)
+          </simpara>
+        </listitem>
+        <listitem>
+          <simpara>
+            <varname>follow_command</varname> (if the current server needs to follow another server which has
+            become the new primary)
+          </simpara>
+        </listitem>
+      </itemizedlist>
+      <note>
+        <para>
+          These commands can be any valid shell script which results in one of these
+          two actions happening, but if &repmgr;'s <command>standby follow</command> or
+          <command>standby promote</command>
+          commands are not executed (either directly as shown here, or from a script which
+          performs other actions), the &repmgr; metadata will not be updated and
+          &repmgr; will no longer function reliably.
+        </para>
+      </note>
+
+      <para>
+        The <varname>follow_command</varname> should provide the <literal>--upstream-node-id=%n</literal>
+        option to <command>repmgr standby follow</command>; the <literal>%n</literal> will be replaced by
+        <application>repmgrd</application> with the ID of the new primary node. If this is not provided, &repmgr;
+        will attempt to determine the new primary by itself, but if the
+        original primary comes back online after the new primary is promoted, there is a risk that
+        <command>repmgr standby follow</command> will result in the node continuing to follow
+        the original primary.
+      </para>
+    </sect2>
+
+    <sect2 id="repmgrd-service-configuration">
+      <indexterm>
+        <primary>repmgrd</primary>
+        <secondary>PostgreSQL service configuration</secondary>
+      </indexterm>
+      <title>PostgreSQL service configuration</title>
+      <para>
+        If using automatic failover, currently <application>repmgrd</application> will need to execute
+        <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>
+        to restart PostgreSQL on standbys to have them follow a new primary.
+      </para>
+      <para>
+        To ensure this happens smoothly, it's essential to provide the system/service restart
+        command appropriate to your operating system via <varname>service_restart_command</varname>
+        in <filename>repmgr.conf</filename>. If you don't do this, <application>repmgrd</application>
+        will default to using <command>pg_ctl</command>, which can result in unexpected problems,
+        particularly on <application>systemd</application>-based systems.
+      </para>
+      <para>
+        For more details, see <xref linkend="configuration-service-commands">.
+      </para>
+    </sect2>
+
+    <sect2 id="repmgrd-monitoring-configuration">
+      <indexterm>
+        <primary>repmgrd</primary>
+        <secondary>monitoring configuration</secondary>
+      </indexterm>
+      <title>Monitoring configuration</title>
+      <para>
+        To enable monitoring, set:
+        <programlisting>
+          monitoring_history=yes</programlisting>
+        in <filename>repmgr.conf</filename>.
+      </para>
+      <para>
+        The default monitoring interval is 2 seconds; this value can be explicitly set using:
+        <programlisting>
+          monitor_interval_secs=&lt;seconds&gt;</programlisting>
+        in <filename>repmgr.conf</filename>.
+      </para>
+      <para>
+        For more details on monitoring, see <xref linkend="repmgrd-monitoring">.
+      </para>
+    </sect2>
+
+  </sect1>
+
+  <sect1 id="repmgrd-daemon">
+    <indexterm>
+      <primary>repmgrd</primary>
+      <secondary>starting and stopping</secondary>
+    </indexterm>
+    <title>repmgrd daemon</title>
+    <para>
+      If installed from a package, <application>repmgrd</application> can be started
+      via the operating system's service command, e.g. in <application>systemd</application>
+      using <command>systemctl</command>.
+    </para>
+    <para>
+      See appendix <xref linkend="appendix-packages"> for details of service commands
+      for different distributions.
+    </para>
+    <para>
+      <application>repmgrd</application> can be started manually like this:
+      <programlisting>
+        repmgrd -f /etc/repmgr.conf --pid-file /tmp/repmgrd.pid --daemonize</programlisting>
+      and stopped with <command>kill `cat /tmp/repmgrd.pid`</command>. Adjust paths as appropriate.
+    </para>
+    <para>
+      To apply configuration file changes to a running <application>repmgrd</application>
+      daemon, execute the operating system's service reload command (for manually started
+      instances, execute <command>kill -HUP `cat /tmp/repmgrd.pid`</command>).
+      Note that only a subset of configuration file parameters can be changed on a
+      running <application>repmgrd</application> daemon.
+    </para>
+
+    <sect2 id="repmgrd-configuration-debian-ubuntu">
+      <indexterm>
+        <primary>repmgrd</primary>
+        <secondary>Debian/Ubuntu and daemon configuration</secondary>
+      </indexterm>
+      <indexterm>
+        <primary>Debian/Ubuntu</primary>
+        <secondary>repmgrd daemon configuration</secondary>
+      </indexterm>
+
+      <title>repmgrd daemon configuration on Debian/Ubuntu</title>
+
+      <para>
+        If &repmgr; was installed from Debian/Ubuntu packages, additional configuration
+        is required before <application>repmgrd</application> is started as a daemon.
+      </para>
+      <para>
+        This is done via the file <filename>/etc/default/repmgrd</filename>, which by default
+        looks like this:
+        <programlisting>
+# default settings for repmgrd. This file is source by /bin/sh from
+# /etc/init.d/repmgrd
+
+# disable repmgrd by default so it won't get started upon installation
+# valid values: yes/no
+REPMGRD_ENABLED=no
+
+# configuration file (required)
+#REPMGRD_CONF="/path/to/repmgr.conf"
+
+# additional options
+#REPMGRD_OPTS=""
+
+# user to run repmgrd as
+#REPMGRD_USER=postgres
+
+# repmgrd binary
+#REPMGRD_BIN=/usr/bin/repmgrd
+
+# pid file
+#REPMGRD_PIDFILE=/var/run/repmgrd.pid</programlisting>
+      </para>
+      <para>
+        Set <varname>REPMGRD_ENABLED</varname> to <literal>yes</literal>, and <varname>REPMGRD_CONF</varname>
+        to the <filename>repmgr.conf</filename> file you are using.
+      </para>
+      <para>
+        If using <application>systemd</application>, you may need to execute <command>systemctl daemon-reload</command>.
+        Also, if you attempted to start <application>repmgrd</application> using <command>systemctl start repmgrd</command>,
+        you'll need to execute <command>systemctl stop repmgrd</command>. Because that's how <application>systemd</application>
+        rolls.
+      </para>
+
+    </sect2>
+  </sect1>
+
+  <sect1 id="repmgrd-connection-settings">
+    <title>repmgrd connection settings</title>
+ <para>
+  In addition to the &repmgr; configuration settings, parameters in the
+  <varname>conninfo</varname> string influence how &repmgr; makes a network connection to
+  PostgreSQL. In particular, if another server in the replication cluster
+  is unreachable at network level, system network settings will influence
+  the length of time it takes to determine that the connection is not possible.
+ </para>
+ <para>
+  In particular, explicitly setting a parameter for <literal>connect_timeout</literal>
+  should be considered; the effective minimum value of <literal>2</literal>
+  (seconds) will ensure that a connection failure at network level is reported
+  as soon as possible, otherwise depending on the system settings (e.g.
+  <varname>tcp_syn_retries</varname> in Linux) a delay of a minute or more
+  is possible.
+ </para>
+ <para>
+  For further details on <varname>conninfo</varname> network connection
+  parameters, see the
+  <ulink url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS">PostgreSQL documentation</ulink>.
+ </para>
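As a minimal sketch of the recommendation above, the following hypothetical helper (`ensure_connect_timeout` is not part of &repmgr;) appends `connect_timeout` to a conninfo string only when it is not already set. It assumes a plain space-separated `keyword=value` conninfo string and defaults to the value of 2 seconds discussed above.

```shell
#!/bin/sh
# Hypothetical helper; illustrative only.
# Usage: ensure_connect_timeout CONNINFO [TIMEOUT_SECONDS]
ensure_connect_timeout() {
    conninfo=$1
    timeout=${2:-2}
    # Pad with spaces so the keyword only matches at a word boundary
    case " $conninfo " in
        *" connect_timeout="*)
            # Already set explicitly; leave the string unchanged
            echo "$conninfo"
            ;;
        *)
            echo "$conninfo connect_timeout=$timeout"
            ;;
    esac
}
```

An existing explicit setting is left untouched, so a deliberately larger timeout chosen for a slow network link is not overridden.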
+ </sect1>
+
+
+
+ <sect1 id="repmgrd-log-rotation">
+   <indexterm>
+     <primary>log rotation</primary>
+     <secondary>repmgrd</secondary>
+   </indexterm>
+
+  <title>repmgrd log rotation</title>
+  <para>
+   To ensure the current <application>repmgrd</application> logfile
+   (specified in <filename>repmgr.conf</filename> with the parameter
+   <option>log_file</option>) does not grow indefinitely, configure your
+   system's <command>logrotate</command> to regularly rotate it.
+  </para>
+  <para>
+   Sample configuration to rotate logfiles weekly with retention for
+   up to 52 weeks and rotation forced if a file grows beyond 100MB:
+   <programlisting>
+    /var/log/postgresql/repmgr-9.6.log {
+        missingok
+        compress
+        rotate 52
+        maxsize 100M
+        weekly
+        create 0600 postgres postgres
+    }</programlisting>
+  </para>
+ </sect1>
+</chapter>
--- a/doc/repmgrd-degraded-monitoring.sgml
+++ b/doc/repmgrd-degraded-monitoring.sgml
@@ -0,0 +1,83 @@
+<chapter id="repmgrd-degraded-monitoring">
+ <indexterm>
+   <primary>repmgrd</primary>
+   <secondary>degraded monitoring</secondary>
+ </indexterm>
+
+ <title>"degraded monitoring" mode</title>
+ <para>
+  In certain circumstances, <application>repmgrd</application> is not able to fulfill its primary mission
+  of monitoring the node's upstream server. In these cases it enters "degraded
+  monitoring" mode, where <application>repmgrd</application> remains active but is waiting for the situation
+  to be resolved.
+ </para>
+ <para>
+  Situations where this happens are:
+  <itemizedlist spacing="compact" mark="bullet">
+
+   <listitem>
+    <simpara>a failover situation has occurred, but no nodes in the primary node's location are visible</simpara>
+   </listitem>
+
+   <listitem>
+    <simpara>a failover situation has occurred, but no promotion candidate is available</simpara>
+   </listitem>
+
+   <listitem>
+    <simpara>a failover situation has occurred, but the promotion candidate could not be promoted</simpara>
+   </listitem>
+
+   <listitem>
+    <simpara>a failover situation has occurred, but the node was unable to follow the new primary</simpara>
+   </listitem>
+
+   <listitem>
+    <simpara>a failover situation has occurred, but no primary has become available</simpara>
+   </listitem>
+
+   <listitem>
+    <simpara>a failover situation has occurred, but automatic failover is not enabled for the node</simpara>
+   </listitem>
+
+   <listitem>
+    <simpara>repmgrd is monitoring the primary node, but it is not available (and no other node has been promoted as primary)</simpara>
+   </listitem>
+  </itemizedlist>
+ </para>
+
+ <para>
+  Example output in a situation where there is only one standby with <literal>failover=manual</literal>,
+  and the primary node is unavailable (but is later restarted):
+  <programlisting>
+    [2017-08-29 10:59:19] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in normal state (automatic failover disabled)
+    [2017-08-29 10:59:33] [WARNING] unable to connect to upstream node "node1" (node ID: 1)
+    [2017-08-29 10:59:33] [INFO] checking state of node 1, 1 of 5 attempts
+    [2017-08-29 10:59:33] [INFO] sleeping 1 seconds until next reconnection attempt
+    (...)
+    [2017-08-29 10:59:37] [INFO] checking state of node 1, 5 of 5 attempts
+    [2017-08-29 10:59:37] [WARNING] unable to reconnect to node 1 after 5 attempts
+    [2017-08-29 10:59:37] [NOTICE] this node is not configured for automatic failover so will not be considered as promotion candidate
+    [2017-08-29 10:59:37] [NOTICE] no other nodes are available as promotion candidate
+    [2017-08-29 10:59:37] [HINT] use "repmgr standby promote" to manually promote this node
+    [2017-08-29 10:59:37] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in degraded state (automatic failover disabled)
+    [2017-08-29 10:59:53] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in degraded state (automatic failover disabled)
+    [2017-08-29 11:00:45] [NOTICE] reconnected to upstream node 1 after 68 seconds, resuming monitoring
+    [2017-08-29 11:00:57] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in normal state (automatic failover disabled)</programlisting>
+
+ </para>
+ <para>
+  By default, <application>repmgrd</application> will continue in degraded monitoring mode indefinitely.
+  However a timeout (in seconds) can be set with <varname>degraded_monitoring_timeout</varname>,
+  after which <application>repmgrd</application> will terminate.
+ </para>
+
+ <note>
+   <para>
+     If <application>repmgrd</application> is monitoring a primary node which has been stopped
+     and manually restarted as a standby attached to a new primary, it will automatically detect
+     the status change and update the node record to reflect the node's new status
+     as an active standby. It will then resume monitoring the node as a standby.
+   </para>
+ </note>
+
+</chapter>
--- a/doc/repmgrd-demonstration.sgml
+++ b/doc/repmgrd-demonstration.sgml
@@ -0,0 +1,96 @@
+<chapter id="repmgrd-demonstration">
+ <title>repmgrd demonstration</title>
+ <para>
+  To demonstrate automatic failover, set up a 3-node replication cluster (one primary
+  and two standbys streaming directly from the primary) so that the cluster looks
+  something like this:
+  <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Location | Connection string
+    ----+-------+---------+-----------+----------+----------+--------------------------------------
+     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
+     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr
+     3  | node3 | standby |   running | node1    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
+ </para>
+ <para>
+  Start <application>repmgrd</application> on each standby and verify that it's running by examining the
+  log output, which at log level <literal>INFO</literal> will look like this:
+  <programlisting>
+    [2017-08-24 17:31:00] [NOTICE] using configuration file "/etc/repmgr.conf"
+    [2017-08-24 17:31:00] [INFO] connecting to database "host=node2 dbname=repmgr user=repmgr"
+    [2017-08-24 17:31:00] [NOTICE] starting monitoring of node "node2" (ID: 2)
+    [2017-08-24 17:31:00] [INFO] monitoring connection to upstream node "node1" (node ID: 1)</programlisting>
+ </para>
+ <para>
+  Each <application>repmgrd</application> should also have recorded its successful startup as an event:
+  <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster event --event=repmgrd_start
+     Node ID | Name  | Event         | OK | Timestamp           | Details
+    ---------+-------+---------------+----+---------------------+-------------------------------------------------------------
+     3       | node3 | repmgrd_start | t  | 2017-08-24 17:35:54 | monitoring connection to upstream node "node1" (node ID: 1)
+     2       | node2 | repmgrd_start | t  | 2017-08-24 17:35:50 | monitoring connection to upstream node "node1" (node ID: 1)
+     1       | node1 | repmgrd_start | t  | 2017-08-24 17:35:46 | monitoring cluster primary "node1" (node ID: 1)  </programlisting>
+ </para>
+ <para>
+  Now stop the current primary server with e.g.:
+  <programlisting>
+    pg_ctl -D /var/lib/postgresql/data -m immediate stop</programlisting>
+ </para>
+ <para>
+  This will force the primary to shut down straight away, aborting all processes
+  and transactions.  This will cause a flurry of activity in the <application>repmgrd</application> log
+  files as each <application>repmgrd</application> detects the failure of the primary and a failover
+  decision is made. This is an extract from the log of a standby server (<literal>node2</literal>)
+  which has been promoted to new primary after failure of the original primary (<literal>node1</literal>).
+  <programlisting>
+    [2017-08-24 23:32:01] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in normal state
+    [2017-08-24 23:32:08] [WARNING] unable to connect to upstream node "node1" (node ID: 1)
+    [2017-08-24 23:32:08] [INFO] checking state of node 1, 1 of 5 attempts
+    [2017-08-24 23:32:08] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-08-24 23:32:09] [INFO] checking state of node 1, 2 of 5 attempts
+    [2017-08-24 23:32:09] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-08-24 23:32:10] [INFO] checking state of node 1, 3 of 5 attempts
+    [2017-08-24 23:32:10] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-08-24 23:32:11] [INFO] checking state of node 1, 4 of 5 attempts
+    [2017-08-24 23:32:11] [INFO] sleeping 1 seconds until next reconnection attempt
+    [2017-08-24 23:32:12] [INFO] checking state of node 1, 5 of 5 attempts
+    [2017-08-24 23:32:12] [WARNING] unable to reconnect to node 1 after 5 attempts
+    INFO:  setting voting term to 1
+    INFO:  node 2 is candidate
+    INFO:  node 3 has received request from node 2 for electoral term 1 (our term: 0)
+    [2017-08-24 23:32:12] [NOTICE] this node is the winner, will now promote self and inform other nodes
+    INFO: connecting to standby database
+    NOTICE: promoting standby
+    DETAIL: promoting server using 'pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' promote'
+    INFO: reconnecting to promoted server
+    NOTICE: STANDBY PROMOTE successful
+    DETAIL: node 2 was successfully promoted to primary
+    INFO:  node 3 received notification to follow node 2
+    [2017-08-24 23:32:13] [INFO] switching to primary monitoring mode</programlisting>
+ </para>
+ <para>
+  The cluster status will now look like this, with the original primary (<literal>node1</literal>)
+  marked as inactive, and standby <literal>node3</literal> now following the new primary
+  (<literal>node2</literal>):
+  <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Location | Connection string
+    ----+-------+---------+-----------+----------+----------+----------------------------------------------------
+     1  | node1 | primary | - failed  |          | default  | host=node1 dbname=repmgr user=repmgr
+     2  | node2 | primary | * running |          | default  | host=node2 dbname=repmgr user=repmgr
+     3  | node3 | standby |   running | node2    | default  | host=node3 dbname=repmgr user=repmgr</programlisting>
+
+ </para>
+ <para>
+  <command>repmgr cluster event</command> will display a summary of what happened to each server
+  during the failover:
+  <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster event
+     Node ID | Name  | Event                    | OK | Timestamp           | Details
+    ---------+-------+--------------------------+----+---------------------+-----------------------------------------------------------------------------------
+     3       | node3 | repmgrd_failover_follow  | t  | 2017-08-24 23:32:16 | node 3 now following new upstream node 2
+     3       | node3 | standby_follow           | t  | 2017-08-24 23:32:16 | node 3 is now attached to node 2
+     2       | node2 | repmgrd_failover_promote | t  | 2017-08-24 23:32:13 | node 2 promoted to primary; old primary 1 marked as failed
+     2       | node2 | standby_promote          | t  | 2017-08-24 23:32:13 | node 2 was successfully promoted to primary</programlisting>
+ </para>
+</chapter>
--- a/doc/repmgrd-monitoring.sgml
+++ b/doc/repmgrd-monitoring.sgml
@@ -0,0 +1,80 @@
+<chapter id="repmgrd-monitoring">
+ <indexterm>
+   <primary>repmgrd</primary>
+   <secondary>monitoring</secondary>
+ </indexterm>
+ <indexterm>
+   <primary>monitoring</primary>
+   <secondary>with repmgrd</secondary>
+ </indexterm>
+
+ <title>Monitoring with repmgrd</title>
+ <para>
+  When <application>repmgrd</application> is running with the option <literal>monitoring_history=true</literal>,
+  it will continuously write standby node status information to the
+  <varname>monitoring_history</varname> table, providing a near-real-time
+  overview of replication status on all nodes
+  in the cluster.
+ </para>
+ <para>
+   The view <literal>replication_status</literal> shows the most recent state
+   for each node, e.g.:
+  <programlisting>
+    repmgr=# select * from repmgr.replication_status;
+    -[ RECORD 1 ]-------------+------------------------------
+    primary_node_id           | 1
+    standby_node_id           | 2
+    standby_name              | node2
+    node_type                 | standby
+    active                    | t
+    last_monitor_time         | 2017-08-24 16:28:41.260478+09
+    last_wal_primary_location | 0/6D57A00
+    last_wal_standby_location | 0/5000000
+    replication_lag           | 29 MB
+    replication_time_lag      | 00:00:11.736163
+    apply_lag                 | 15 MB
+    communication_time_lag    | 00:00:01.365643</programlisting>
+ </para>
+ <para>
+  The interval at which monitoring history is written is controlled by the
+  configuration parameter <varname>monitor_interval_secs</varname>;
+  the default is 2 seconds.
+ </para>
+ <para>
+  As this can generate a large amount of monitoring data in the table
+  <literal>repmgr.monitoring_history</literal>, it's advisable to regularly
+  purge historical data using the <xref linkend="repmgr-cluster-cleanup">
+  command; use the <literal>-k/--keep-history</literal> option to
+  specify how many days' worth of data should be retained.
+ </para>
+ <para>
+  It's possible to run <application>repmgrd</application> in monitoring
+  mode only (without automatic failover capability) for some or all
+  nodes by setting <literal>failover=manual</literal> in the node's
+  <filename>repmgr.conf</filename> file. In the event of the node's upstream failing,
+  no failover action will be taken and the node will require manual intervention to
+  be reattached to replication. If this occurs, an
+  <link linkend="event-notifications">event notification</link>
+  <varname>standby_disconnect_manual</varname> will be created.
+ </para>
+ <para>
+  Note that when a standby node is not streaming directly from its upstream
+  node, e.g. when recovering WAL from an archive, <varname>apply_lag</varname> will always appear as
+  <literal>0 bytes</literal>.
+ </para>
+ <tip>
+  <para>
+   If monitoring history is enabled, the contents of the <literal>repmgr.monitoring_history</literal>
+   table will be replicated to attached standbys. This means there will be a small but
+   constant stream of replication activity which may not be desirable. To prevent
+   this, convert the table to an <literal>UNLOGGED</literal> one with:
+   <programlisting>
+     ALTER TABLE repmgr.monitoring_history SET UNLOGGED;</programlisting>
+  </para>
+  <para>
+   This will however mean that monitoring history will not be available on
+   another node following a failover, and the view <literal>repmgr.replication_status</literal>
+   will not work on standbys.
+  </para>
+ </tip>
+</chapter>
--- a/doc/repmgrd-network-split.sgml
+++ b/doc/repmgrd-network-split.sgml
@@ -0,0 +1,48 @@
+<chapter id="repmgrd-network-split" xreflabel="Handling network splits with repmgrd">
+ <indexterm>
+   <primary>repmgrd</primary>
+   <secondary>network splits</secondary>
+ </indexterm>
+
+ <title>Handling network splits with repmgrd</title>
+ <para>
+  A common pattern for replication cluster setups is to spread servers over
+  more than one datacentre. This can provide benefits such as geographically
+  distributed read replicas and DR (disaster recovery) capability. However,
+  this also means there is a risk of disconnection at network level between
+  datacentre locations, which would result in a split-brain scenario if
+  servers in a secondary datacentre were no longer able to see the primary
+  in the main datacentre and promoted a standby among themselves.
+ </para>
+ <para>
+  &repmgr; enables provision of &quot;<xref linkend="witness-server">&quot; to
+  artificially create a quorum of servers in a particular location, ensuring
+  that nodes in another location will not elect a new primary if they
+  are unable to see the majority of nodes. However, this approach does not
+  scale well, particularly with more complex replication setups, e.g.
+  where the majority of nodes are located outside of the primary datacentre.
+  It also means the <literal>witness</literal> node needs to be managed as an
+  extra PostgreSQL instance outside of the main replication cluster, which
+  adds administrative and programming complexity.
+ </para>
+ <para>
+  <literal>repmgr4</literal> introduces the concept of <literal>location</literal>:
+  each node is associated with an arbitrary location string (default is
+  <literal>default</literal>); this is set in <filename>repmgr.conf</filename>, e.g.:
+  <programlisting>
+    node_id=1
+    node_name=node1
+    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
+    data_directory='/var/lib/postgresql/data'
+    location='dc1'</programlisting>
+ </para>
+ <para>
+  In a failover situation, <application>repmgrd</application> will check if any servers in the
+  same location as the current primary node are visible.  If not, <application>repmgrd</application>
+  will assume a network interruption and not promote any node in any
+  other location (it will however enter <xref linkend="repmgrd-degraded-monitoring"> mode until
+  a primary becomes visible).
+ </para>
+
+</chapter>
+
--- a/doc/repmgrd-witness-server.sgml
+++ b/doc/repmgrd-witness-server.sgml
@@ -0,0 +1,31 @@
+<chapter id="repmgrd-witness-server" xreflabel="Using a witness server with repmgrd">
+ <indexterm>
+   <primary>repmgrd</primary>
+   <secondary>witness server</secondary>
+ </indexterm>
+
+ <title>Using a witness server with repmgrd</title>
+ <para>
+   In a situation caused e.g. by a network interruption between two
+   data centres, it's important to avoid a "split-brain" situation where
+   both sides of the network assume they are the active segment and the
+   side without an active primary unilaterally promotes one of its standbys.
+ </para>
+ <para>
+   To prevent this situation happening, it's essential to ensure that one
+   network segment has a "voting majority", so other segments will know
+   they're in the minority and not attempt to promote a new primary. Where
+   an odd number of servers exists, this is not an issue. However, if each
+   network has an even number of nodes, it's necessary to provide some way
+   of ensuring a majority, which is where the witness server becomes useful.
+ </para>
+ <para>
+   The witness server is not a fully-fledged standby node and is not integrated into
+   replication, but it effectively represents the "casting vote" when
+   deciding which network segment has a majority. A witness server can
+   be set up using <xref linkend="repmgr-witness-register">. Note that it only
+   makes sense to create a witness server in conjunction with running
+   <application>repmgrd</application>; the witness server will require its own
+   <application>repmgrd</application> instance.
+ </para>
+</chapter>
--- a/doc/stylesheet.css
+++ b/doc/stylesheet.css
@@ -0,0 +1,96 @@
+/* doc/src/sgml/stylesheet.css */
+
+/* color scheme similar to www.postgresql.org */
+
+BODY {
+	color: #000000;
+	background: #FFFFFF;
+	font-family: verdana, sans-serif;
+}
+
+A:link		{ color:#0066A2; }
+A:visited	{ color:#004E66; }
+A:active	{ color:#0066A2; }
+A:hover		{ color:#000000; }
+
+H1 {
+	font-size: 1.4em;
+	font-weight: bold;
+	margin-top: 0em;
+	margin-bottom: 0em;
+	color: #EC5800;
+}
+
+H2 {
+	font-size: 1.2em;
+	margin: 1.2em 0em 1.2em 0em;
+	font-weight: bold;
+	color: #666;
+}
+
+H3 {
+	font-size: 1.1em;
+	margin: 1.2em 0em 1.2em 0em;
+	font-weight: bold;
+	color: #666;
+}
+
+H4 {
+	font-size: 0.95em;
+	margin: 1.2em 0em 1.2em 0em;
+	font-weight: normal;
+	color: #666;
+}
+
+H5 {
+	font-size: 0.9em;
+	margin: 1.2em 0em 1.2em 0em;
+	font-weight: normal;
+}
+
+H6 {
+	font-size: 0.85em;
+	margin: 1.2em 0em 1.2em 0em;
+	font-weight: normal;
+}
+
+/* center some titles */
+
+.BOOK .TITLE, .BOOK .CORPAUTHOR, .BOOK .COPYRIGHT {
+	text-align: center;
+}
+
+/* decoration for formal examples */
+
+DIV.EXAMPLE {
+	padding-left: 15px;
+	border-style: solid;
+	border-width: 0px;
+	border-left-width: 2px;
+	border-color: black;
+	margin: 0.5ex;
+}
+
+/* less dense spacing of TOC */
+
+.BOOK .TOC DL DT {
+	padding-top: 1.5ex;
+	padding-bottom: 1.5ex;
+}
+
+.BOOK .TOC DL DL DT {
+	padding-top: 0ex;
+	padding-bottom: 0ex;
+}
+
+/* miscellaneous */
+
+PRE.LITERALLAYOUT, .SCREEN, .SYNOPSIS, .PROGRAMLISTING {
+	margin-left: 4ex;
+}
+
+.COMMENT	{ color: red; }
+
+VAR		{ font-family: monospace; font-style: italic; }
+/* Konqueror's standard style for ACRONYM is italic. */
+ACRONYM		{ font-style: inherit; }
--- a/doc/stylesheet.dsl
+++ b/doc/stylesheet.dsl
@@ -0,0 +1,851 @@
+<!-- doc/src/sgml/stylesheet.dsl -->
+<!DOCTYPE style-sheet PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN" [
+
+<!-- must turn on one of these with -i on the jade command line -->
+<!ENTITY % output-html          "IGNORE">
+<!ENTITY % output-print         "IGNORE">
+<!ENTITY % output-text          "IGNORE">
+
+<![ %output-html; [
+<!ENTITY dbstyle PUBLIC "-//Norman Walsh//DOCUMENT DocBook HTML Stylesheet//EN" CDATA DSSSL>
+]]>
+
+<![ %output-print; [
+<!ENTITY dbstyle PUBLIC "-//Norman Walsh//DOCUMENT DocBook Print Stylesheet//EN" CDATA DSSSL>
+]]>
+
+<![ %output-text; [
+<!ENTITY dbstyle PUBLIC "-//Norman Walsh//DOCUMENT DocBook HTML Stylesheet//EN" CDATA DSSSL>
+]]>
+
+]>
+
+<style-sheet>
+ <style-specification use="docbook">
+  <style-specification-body>
+
+<!-- general customization ......................................... -->
+
+<!-- (applicable to all output formats) -->
+
+(define draft-mode              #f)
+
+;; Don't show manpage volume numbers
+(define %refentry-xref-manvolnum% #f)
+
+;; Don't use graphics for callouts.  (We could probably do that, but
+;; it needs extra work.)
+(define %callout-graphics%      #f)
+
+;; Show comments during the development stage.
+(define %show-comments%         draft-mode)
+
+;; Force a chapter TOC even if it includes only a single entry
+(define %force-chapter-toc% #t)
+
+;; Don't append period if run-in title ends with any of these
+;; characters.  We had to add the colon here.  This is fixed in
+;; stylesheets version 1.71, so it can be removed sometime.
+(define %content-title-end-punct%
+  '(#\. #\! #\? #\:))
+
+;; No automatic punctuation after honorific name parts
+(define %honorific-punctuation% "")
+
+;; Change display of some elements
+(element command ($mono-seq$))
+(element envar ($mono-seq$))
+(element lineannotation ($italic-seq$))
+(element literal ($mono-seq$))
+(element option ($mono-seq$))
+(element parameter ($mono-seq$))
+(element structfield ($mono-seq$))
+(element structname ($mono-seq$))
+(element symbol ($mono-seq$))
+(element token ($mono-seq$))
+(element type ($mono-seq$))
+(element varname ($mono-seq$))
+(element (programlisting emphasis) ($bold-seq$)) ;; to highlight sections of code
+
+;; Special support for Tcl synopses
+(element optional
+  (if (equal? (attribute-string (normalize "role")) "tcl")
+      (make sequence
+        (literal "?")
+        ($charseq$)
+        (literal "?"))
+      (make sequence
+        (literal %arg-choice-opt-open-str%)
+        ($charseq$)
+        (literal %arg-choice-opt-close-str%))))
+
+;; Avoid excessive cross-reference labels
+(define (auto-xref-indirect? target ancestor)
+  (cond
+;   ;; Always add indirect references to another book
+;   ((member (gi ancestor) (book-element-list))
+;    #t)
+   ;; Add indirect references to the section or component a block
+   ;; is in iff chapters aren't autolabelled.  (Otherwise "Figure 1-3"
+   ;; is sufficient)
+   ((and (member (gi target) (block-element-list))
+         (not %chapter-autolabel%))
+    #t)
+   ;; Add indirect references to the component a section is in if
+   ;; the sections are not autolabelled
+   ((and (member (gi target) (section-element-list))
+         (member (gi ancestor) (component-element-list))
+         (not %section-autolabel%))
+    #t)
+   (else #f)))
+
+
+;; Bibliography things
+
+;; Use the titles of bibliography entries in cross-references
+(define biblio-xref-title       #t)
+
+;; Process bibliography entry components in the order shown below, not
+;; in the order they appear in the document.  (I suppose this should
+;; be made to fit some publishing standard.)
+(define %biblioentry-in-entry-order% #f)
+
+(define (biblioentry-inline-elements)
+  (list
+   (normalize "author")
+   (normalize "authorgroup")
+   (normalize "title")
+   (normalize "subtitle")
+   (normalize "volumenum")
+   (normalize "edition")
+   (normalize "othercredit")
+   (normalize "contrib")
+   (normalize "editor")
+   (normalize "publishername")
+   (normalize "confgroup")
+   (normalize "publisher")
+   (normalize "isbn")
+   (normalize "issn")
+   (normalize "pubsnumber")
+   (normalize "date")
+   (normalize "pubdate")
+   (normalize "pagenums")
+   (normalize "bibliomisc")))
+
+(mode biblioentry-inline-mode
+
+  (element confgroup
+    (make sequence
+      (literal "Proc. ")
+      (next-match)))
+
+  (element isbn
+    (make sequence
+      (literal "ISBN ")
+      (process-children)))
+
+  (element issn
+    (make sequence
+      (literal "ISSN ")
+      (process-children))))
+
+
+;; The rules in the default stylesheet for productname format it as a
+;; paragraph.  This may be suitable for productname directly within
+;; *info, but it's nonsense when productname is used inline, as we do.
+(mode book-titlepage-recto-mode
+  (element (para productname) ($charseq$)))
+(mode book-titlepage-verso-mode
+  (element (para productname) ($charseq$)))
+;; Add more here if needed...
+
+
+;; Replace a sequence of whitespace in a string by a single space
+(define (normalize-whitespace str #!optional (whitespace '(#\space #\U-000D)))
+  (let loop ((characters (string->list str))
+             (result '())
+             (prev-was-space #f))
+    (if (null? characters)
+        (list->string (reverse result))
+        (let ((c (car characters))
+              (rest (cdr characters)))
+          (if (member c whitespace)
+              (if prev-was-space
+                  (loop rest result #t)
+                  (loop rest (cons #\space result) #t))
+              (loop rest (cons c result) #f))))))
+
+
+<!-- HTML output customization ..................................... -->
+
+<![ %output-html; [
+
+(define %section-autolabel%     #t)
+(define %label-preface-sections% #f)
+(define %generate-legalnotice-link% #t)
+(define %html-ext%              ".html")
+(define %root-filename%         "index")
+(define %link-mailto-url%       (string-append "mailto:repmgr-list@2ndquadrant.com"))
+(define %use-id-as-filename%    #t)
+(define website-build           #f)
+(define %stylesheet%            (if website-build "/resources/docs.css" "website-docs.css"))
+(define %graphic-default-extension% "gif")
+(define %body-attr%             '())
+(define ($generate-book-lot-list$) '())
+(define use-output-dir          #t)
+(define %output-dir%            "html")
+(define html-index-filename     "../HTML.index")
+
+
+;; Only build HTML.index or the actual HTML output, not both.  Saves a
+;; *lot* of time.  (overrides docbook.dsl)
+(root
+   (if (not html-index)
+       (make sequence
+         (process-children)
+         (with-mode manifest
+           (process-children)))
+       (with-mode htmlindex
+         (process-children))))
+
+
+;; Do not combine first section into chapter chunk.
+(define (chunk-skip-first-element-list) '())
+
+;; Returns the depth of auto TOC that should be made at the nd-level
+(define (toc-depth nd)
+  (cond ((string=? (gi nd) (normalize "book")) 2)
+        ((string=? (gi nd) (normalize "part")) 2)
+        ((string=? (gi nd) (normalize "chapter")) 2)
+        (else 1)))
+
+;; Add character encoding and time of creation into HTML header
+(define %html-header-tags%
+  (list (list "META" '("HTTP-EQUIV" "Content-Type") '("CONTENT" "text/html; charset=ISO-8859-1"))
+        (list "META" '("NAME" "creation") (list "CONTENT" (time->string (time) #t)))))
+
+
+;; Block elements are allowed in PARA in DocBook, but not in P in
+;; HTML.  With %fix-para-wrappers% turned on, the stylesheets attempt
+;; to avoid putting block elements in HTML P tags by outputting
+;; additional end/begin P pairs around them.
+(define %fix-para-wrappers% #t)
+
+;; ...but we need to do some extra work to make the above apply to PRE
+;; as well.  (mostly pasted from dbverb.dsl)
+(define ($verbatim-display$ indent line-numbers?)
+  (let ((content (make element gi: "PRE"
+                       attributes: (list
+                                    (list "CLASS" (gi)))
+                       (if (or indent line-numbers?)
+                           ($verbatim-line-by-line$ indent line-numbers?)
+                           (process-children)))))
+    (if %shade-verbatim%
+        (make element gi: "TABLE"
+              attributes: ($shade-verbatim-attr$)
+              (make element gi: "TR"
+                    (make element gi: "TD"
+                          content)))
+        (make sequence
+          (para-check)
+          content
+          (para-check 'restart)))))
+
+;; ...and for notes.
+(element note
+  (make sequence
+    (para-check)
+    ($admonition$)
+    (para-check 'restart)))
+
+;;; XXX The above is very ugly.  It might be better to run 'tidy' on
+;;; the resulting *.html files.
+
+
+;; Format multiple terms in varlistentry vertically, instead
+;; of comma-separated.
+(element (varlistentry term)
+  (make sequence
+    (process-children-trim)
+    (if (not (last-sibling?))
+        (make empty-element gi: "BR")
+        (empty-sosofo))))
+
+
+;; Customization of header
+;; - make title a link to the home page
+;; - add tool tips to Prev/Next links
+;; - add Up link
+;; (overrides dbnavig.dsl)
+(define (default-header-nav-tbl-noff elemnode prev next prevsib nextsib)
+  (let* ((r1? (nav-banner? elemnode))
+         (r1-sosofo (make element gi: "TR"
+                          (make element gi: "TH"
+                                attributes: (list
+                                             (list "COLSPAN" "4")
+                                             (list "ALIGN" "center")
+                                             (list "VALIGN" "bottom"))
+                                (make element gi: "A"
+                                      attributes: (list
+                                                   (list "HREF" (href-to (nav-home elemnode))))
+                                      (nav-banner elemnode)))))
+         (r2? (or (not (node-list-empty? prev))
+                  (not (node-list-empty? next))
+                  (nav-context? elemnode)))
+         (r2-sosofo (make element gi: "TR"
+                          (make element gi: "TD"
+                                attributes: (list
+                                             (list "WIDTH" "10%")
+                                             (list "ALIGN" "left")
+                                             (list "VALIGN" "top"))
+                                (if (node-list-empty? prev)
+                                    (make entity-ref name: "nbsp")
+                                    (make element gi: "A"
+                                          attributes: (list
+                                                       (list "TITLE" (element-title-string prev))
+                                                       (list "HREF"
+                                                             (href-to
+                                                              prev))
+                                                       (list "ACCESSKEY"
+                                                             "P"))
+                                          (gentext-nav-prev prev))))
+                          (make element gi: "TD"
+                                attributes: (list
+                                             (list "WIDTH" "10%")
+                                             (list "ALIGN" "left")
+                                             (list "VALIGN" "top"))
+                                (if (nav-up? elemnode)
+                                    (nav-up elemnode)
+                                    (nav-home-link elemnode)))
+                          (make element gi: "TD"
+                                attributes: (list
+                                             (list "WIDTH" "60%")
+                                             (list "ALIGN" "center")
+                                             (list "VALIGN" "bottom"))
+                                (nav-context elemnode))
+                          (make element gi: "TD"
+                                attributes: (list
+                                             (list "WIDTH" "20%")
+                                             (list "ALIGN" "right")
+                                             (list "VALIGN" "top"))
+                                (if (node-list-empty? next)
+                                    (make entity-ref name: "nbsp")
+                                    (make element gi: "A"
+                                          attributes: (list
+                                                       (list "TITLE" (element-title-string next))
+                                                       (list "HREF"
+                                                             (href-to
+                                                              next))
+                                                       (list "ACCESSKEY"
+                                                             "N"))
+                                          (gentext-nav-next next)))))))
+    (if (or r1? r2?)
+        (make element gi: "DIV"
+              attributes: '(("CLASS" "NAVHEADER"))
+          (make element gi: "TABLE"
+                attributes: (list
+                             (list "SUMMARY" "Header navigation table")
+                             (list "WIDTH" %gentext-nav-tblwidth%)
+                             (list "BORDER" "0")
+                             (list "CELLPADDING" "0")
+                             (list "CELLSPACING" "0"))
+                (if r1? r1-sosofo (empty-sosofo))
+                (if r2? r2-sosofo (empty-sosofo)))
+          (make empty-element gi: "HR"
+                attributes: (list
+                             (list "ALIGN" "LEFT")
+                             (list "WIDTH" %gentext-nav-tblwidth%))))
+        (empty-sosofo))))
+
+
+;; Put index "quicklinks" (A | B | C | ...) at the top of the bookindex page.
+
+(element index
+  (let ((preamble (node-list-filter-by-not-gi
+                   (children (current-node))
+                   (list (normalize "indexentry"))))
+        (indexdivs  (node-list-filter-by-gi
+                     (children (current-node))
+                     (list (normalize "indexdiv"))))
+        (entries  (node-list-filter-by-gi
+                   (children (current-node))
+                   (list (normalize "indexentry")))))
+    (html-document
+     (with-mode head-title-mode
+       (literal (element-title-string (current-node))))
+     (make element gi: "DIV"
+           attributes: (list (list "CLASS" (gi)))
+           ($component-separator$)
+           ($component-title$)
+           (if (node-list-empty? indexdivs)
+               (empty-sosofo)
+               (make element gi: "P"
+                     attributes: (list (list "CLASS" "INDEXDIV-QUICKLINKS"))
+                     (with-mode indexdiv-quicklinks-mode
+                       (process-node-list indexdivs))))
+           (process-node-list preamble)
+           (if (node-list-empty? entries)
+               (empty-sosofo)
+               (make element gi: "DL"
+                     (process-node-list entries)))))))
+
+
+(mode indexdiv-quicklinks-mode
+  (element indexdiv
+    (make sequence
+      (make element gi: "A"
+            attributes: (list (list "HREF" (href-to (current-node))))
+            (element-title-sosofo))
+      (if (not (last-sibling?))
+          (literal " | ")
+          (literal "")))))
+
+
+;; Changed to strip and normalize index term content (overrides
+;; dbindex.dsl)
+(define (htmlindexterm)
+  (let* ((attr    (gi (current-node)))
+         (content (data (current-node)))
+         (string  (strip (normalize-whitespace content))) ;; changed
+         (sortas  (attribute-string (normalize "sortas"))))
+    (make sequence
+      (make formatting-instruction data: attr)
+      (if sortas
+          (make sequence
+            (make formatting-instruction data: "[")
+            (make formatting-instruction data: sortas)
+            (make formatting-instruction data: "]"))
+          (empty-sosofo))
+      (make formatting-instruction data: " ")
+      (make formatting-instruction data: string)
+      (htmlnewline))))
+
+(define ($html-body-start$)
+        (if website-build
+            (make empty-element gi: "!--#include virtual=\"/resources/docs-header.html\"--")
+            (empty-sosofo)))
+
+(define ($html-body-end$)
+        (if website-build
+            (make empty-element gi: "!--#include virtual=\"/resources/docs-footer.html\"--")
+            (empty-sosofo)))
+
+]]> <!-- %output-html -->
+
+
+<!-- Print output customization .................................... -->
+
+<![ %output-print; [
+
+(define %section-autolabel%     #t)
+(define %default-quadding%      'justify)
+
+;; Don't know how well hyphenation works with other backends.  Might
+;; turn this on if desired.
+(define %hyphenation%
+  (if tex-backend #t #f))
+
+;; Put footnotes at the bottom of the page (rather than end of
+;; section), and put the URLs of links into footnotes.
+;;
+;; bop-footnotes only works with TeX, otherwise it's ignored.  But
+;; when both of these are #t and TeX is used, you need at least
+;; stylesheets 1.73 because otherwise you don't get any footnotes at
+;; all for the links.
+(define bop-footnotes           #t)
+(define %footnote-ulinks%       #t)
+
+(define %refentry-new-page%     #t)
+(define %refentry-keep%         #f)
+
+;; Disabled because of TeX problems
+;; (http://archives.postgresql.org/pgsql-docs/2007-12/msg00056.php)
+(define ($generate-book-lot-list$) '())
+
+;; Indentation of verbatim environments.  (This should really be done
+;; with start-indent in DSSSL.)
+;; Use of indentation in this area exposes a bug in openjade,
+;; http://archives.postgresql.org/pgsql-docs/2006-12/msg00064.php
+;; (define %indent-programlisting-lines% "    ")
+;; (define %indent-screen-lines% "    ")
+;; (define %indent-synopsis-lines% "    ")
+
+
+;; Default graphic format: Jadetex wants eps, pdfjadetex wants pdf.
+;; (Note that pdfjadetex will not accept eps, that's why we need to
+;; create a different .tex file for each.)  What works with RTF?
+
+(define texpdf-output #f) ;; override from command line
+
+(define %graphic-default-extension%
+  (cond (tex-backend (if texpdf-output "pdf" "eps"))
+        (rtf-backend "gif")
+        (else "XXX")))
+
+;; Need to add pdf here so that the above works.  Default setup
+;; doesn't know about PDF.
+(define preferred-mediaobject-extensions
+  (list "eps" "ps" "jpg" "jpeg" "pdf" "png"))
+
+
+;; Don't show links when citing a bibliography entry.  This fouls up
+;; the footnumber counting.  To get the link, one can still look into
+;; the bibliography itself.
+(mode xref-title-mode
+  (element ulink
+    (process-children)))
+
+
+;; Format legalnotice justified and with space between paragraphs.
+(mode book-titlepage-verso-mode
+  (element (legalnotice para)
+    (make paragraph
+      use: book-titlepage-verso-style   ;; alter this if ever it needs to appear elsewhere
+      quadding: %default-quadding%
+      line-spacing: (* 0.8 (inherited-line-spacing))
+      font-size: (* 0.8 (inherited-font-size))
+      space-before: (* 0.8 %para-sep%)
+      space-after: (* 0.8 %para-sep%)
+      first-line-start-indent: (if (is-first-para)
+                                   (* 0.8 %para-indent-firstpara%)
+                                   (* 0.8 %para-indent%))
+      (process-children))))
+
+
+;; Fix spacing problems in variablelists
+
+(element (varlistentry term)
+  (make paragraph
+    space-before: (if (first-sibling?)
+                      %para-sep%
+                      0pt)
+    keep-with-next?: #t
+    (process-children)))
+
+(define %varlistentry-indent% 2em)
+
+(element (varlistentry listitem)
+  (make sequence
+    start-indent: (+ (inherited-start-indent) %varlistentry-indent%)
+    (process-children)))
+
+
+;; Whitespace fixes for itemizedlists and orderedlists
+
+(define (process-listitem-content)
+  (if (absolute-first-sibling?)
+      (make sequence
+        (process-children-trim))
+      (next-match)))
+
+
+;; Default stylesheets format simplelists as tables.  This spells
+;; trouble for Jade.  So we just format them as plain lines.
+
+(define %simplelist-indent% 1em)
+
+(define (my-simplelist-vert members)
+  (make display-group
+    space-before: %para-sep%
+    space-after: %para-sep%
+    start-indent: (+ %simplelist-indent% (inherited-start-indent))
+    (process-children)))
+
+(element simplelist
+  (let ((type (attribute-string (normalize "type")))
+        (cols (if (attribute-string (normalize "columns"))
+                  (if (> (string->number (attribute-string (normalize "columns"))) 0)
+                      (string->number (attribute-string (normalize "columns")))
+                      1)
+                  1))
+        (members (select-elements (children (current-node)) (normalize "member"))))
+    (cond
+       ((equal? type (normalize "inline"))
+        (if (equal? (gi (parent (current-node)))
+                    (normalize "para"))
+            (process-children)
+            (make paragraph
+              space-before: %para-sep%
+              space-after: %para-sep%
+              start-indent: (inherited-start-indent))))
+       ((equal? type (normalize "vert"))
+        (my-simplelist-vert members))
+       ((equal? type (normalize "horiz"))
+        (simplelist-table 'row    cols members)))))
+
+(element member
+  (let ((type (inherited-attribute-string (normalize "type"))))
+    (cond
+     ((equal? type (normalize "inline"))
+      (make sequence
+        (process-children)
+        (if (not (last-sibling?))
+            (literal ", ")
+            (literal ""))))
+      ((equal? type (normalize "vert"))
+       (make paragraph
+         space-before: 0pt
+         space-after: 0pt))
+      ((equal? type (normalize "horiz"))
+       (make paragraph
+         quadding: 'start
+         (process-children))))))
+
+
+;; Jadetex doesn't handle links to the content of tables, so
+;; indexterms that point to table entries will go nowhere.  We fix
+;; this by pointing the index entry to the table itself instead, which
+;; should be equally useful in practice.
+
+(define (find-parent-table nd)
+  (let ((table (ancestor-member nd ($table-element-list$))))
+    (if (node-list-empty? table)
+        nd
+        table)))
+
+;; (The function below overrides the one in print/dbindex.dsl.)
+
+(define (indexentry-link nd)
+  (let* ((id        (attribute-string (normalize "role") nd))
+         (prelim-target (find-indexterm id))
+         (target    (find-parent-table prelim-target))
+         (preferred (not (node-list-empty?
+                          (select-elements (children (current-node))
+                                           (normalize "emphasis")))))
+         (sosofo    (if (node-list-empty? target)
+                        (literal "?")
+                        (make link
+                          destination: (node-list-address target)
+                          (with-mode toc-page-number-mode
+                            (process-node-list target))))))
+    (if preferred
+        (make sequence
+          font-weight: 'bold
+          sosofo)
+        sosofo)))
+
+
+;; By default, the part and reference title pages get wrong page
+;; numbers: The first title page gets roman numerals carried over from
+;; preface/toc -- we want Arabic numerals.  We also need to make sure
+;; that page-number-restart is set to #f explicitly, because otherwise
+;; it will carry over from the previous component, which is not good.
+;;
+;; (This looks worse than it is.  It's copied from print/dbttlpg.dsl
+;; and common/dbcommon.dsl and modified in minor detail.)
+
+(define (first-part?)
+  (let* ((book (ancestor (normalize "book")))
+         (nd   (ancestor-member (current-node)
+                                (append
+                                 (component-element-list)
+                                 (division-element-list))))
+         (bookch (children book)))
+    (let loop ((nl bookch))
+      (if (node-list-empty? nl)
+          #f
+          (if (equal? (gi (node-list-first nl)) (normalize "part"))
+              (if (node-list=? (node-list-first nl) nd)
+                  #t
+                  #f)
+              (loop (node-list-rest nl)))))))
+
+(define (first-reference?)
+  (let* ((book (ancestor (normalize "book")))
+         (nd   (ancestor-member (current-node)
+                                (append
+                                 (component-element-list)
+                                 (division-element-list))))
+         (bookch (children book)))
+    (let loop ((nl bookch))
+      (if (node-list-empty? nl)
+          #f
+          (if (equal? (gi (node-list-first nl)) (normalize "reference"))
+              (if (node-list=? (node-list-first nl) nd)
+                  #t
+                  #f)
+              (loop (node-list-rest nl)))))))
+
+
+(define (part-titlepage elements #!optional (side 'recto))
+  (let ((nodelist (titlepage-nodelist
+                   (if (equal? side 'recto)
+                       (part-titlepage-recto-elements)
+                       (part-titlepage-verso-elements))
+                   elements))
+        ;; partintro is a special case...
+        (partintro (node-list-first
+                    (node-list-filter-by-gi elements (list (normalize "partintro"))))))
+    (if (part-titlepage-content? elements side)
+        (make simple-page-sequence
+          page-n-columns: %titlepage-n-columns%
+          ;; Make sure that page number format is correct.
+          page-number-format: ($page-number-format$)
+          ;; Make sure that the page number is set to 1 if this is the
+          ;; first part in the book
+          page-number-restart?: (first-part?)
+          input-whitespace-treatment: 'collapse
+          use: default-text-style
+
+          ;; This hack is required for the RTF backend. If an external-graphic
+          ;; is the first thing on the page, RTF doesn't seem to do the right
+          ;; thing (the graphic winds up on the baseline of the first line
+          ;; of the page, left justified).  This "one point rule" fixes
+          ;; that problem.
+          (make paragraph
+            line-spacing: 1pt
+            (literal ""))
+
+          (let loop ((nl nodelist) (lastnode (empty-node-list)))
+            (if (node-list-empty? nl)
+                (empty-sosofo)
+                (make sequence
+                  (if (or (node-list-empty? lastnode)
+                          (not (equal? (gi (node-list-first nl))
+                                       (gi lastnode))))
+                      (part-titlepage-before (node-list-first nl) side)
+                      (empty-sosofo))
+                  (cond
+                   ((equal? (gi (node-list-first nl)) (normalize "subtitle"))
+                    (part-titlepage-subtitle (node-list-first nl) side))
+                   ((equal? (gi (node-list-first nl)) (normalize "title"))
+                    (part-titlepage-title (node-list-first nl) side))
+                   (else
+                    (part-titlepage-default (node-list-first nl) side)))
+                  (loop (node-list-rest nl) (node-list-first nl)))))
+
+          (if (and %generate-part-toc%
+                   %generate-part-toc-on-titlepage%
+                   (equal? side 'recto))
+              (make display-group
+                (build-toc (current-node)
+                           (toc-depth (current-node))))
+              (empty-sosofo))
+
+          ;; PartIntro is a special case
+          (if (and (equal? side 'recto)
+                   (not (node-list-empty? partintro))
+                   %generate-partintro-on-titlepage%)
+              ($process-partintro$ partintro #f)
+              (empty-sosofo)))
+
+        (empty-sosofo))))
+
+
+(define (reference-titlepage elements #!optional (side 'recto))
+  (let ((nodelist (titlepage-nodelist
+                   (if (equal? side 'recto)
+                       (reference-titlepage-recto-elements)
+                       (reference-titlepage-verso-elements))
+                   elements))
+        ;; partintro is a special case...
+        (partintro (node-list-first
+                    (node-list-filter-by-gi elements (list (normalize "partintro"))))))
+    (if (reference-titlepage-content? elements side)
+        (make simple-page-sequence
+          page-n-columns: %titlepage-n-columns%
+          ;; Make sure that page number format is correct.
+          page-number-format: ($page-number-format$)
+          ;; Make sure that the page number is set to 1 if this is the
+          ;; first reference in the book
+          page-number-restart?: (first-reference?)
+          input-whitespace-treatment: 'collapse
+          use: default-text-style
+
+          ;; This hack is required for the RTF backend. If an external-graphic
+          ;; is the first thing on the page, RTF doesn't seem to do the right
+          ;; thing (the graphic winds up on the baseline of the first line
+          ;; of the page, left justified).  This "one point rule" fixes
+          ;; that problem.
+          (make paragraph
+            line-spacing: 1pt
+            (literal ""))
+
+          (let loop ((nl nodelist) (lastnode (empty-node-list)))
+            (if (node-list-empty? nl)
+                (empty-sosofo)
+                (make sequence
+                  (if (or (node-list-empty? lastnode)
+                          (not (equal? (gi (node-list-first nl))
+                                       (gi lastnode))))
+                      (reference-titlepage-before (node-list-first nl) side)
+                      (empty-sosofo))
+                  (cond
+                   ((equal? (gi (node-list-first nl)) (normalize "author"))
+                    (reference-titlepage-author (node-list-first nl) side))
+                   ((equal? (gi (node-list-first nl)) (normalize "authorgroup"))
+                    (reference-titlepage-authorgroup (node-list-first nl) side))
+                   ((equal? (gi (node-list-first nl)) (normalize "corpauthor"))
+                    (reference-titlepage-corpauthor (node-list-first nl) side))
+                   ((equal? (gi (node-list-first nl)) (normalize "editor"))
+                    (reference-titlepage-editor (node-list-first nl) side))
+                   ((equal? (gi (node-list-first nl)) (normalize "subtitle"))
+                    (reference-titlepage-subtitle (node-list-first nl) side))
+                   ((equal? (gi (node-list-first nl)) (normalize "title"))
+                    (reference-titlepage-title (node-list-first nl) side))
+                   (else
+                    (reference-titlepage-default (node-list-first nl) side)))
+                  (loop (node-list-rest nl) (node-list-first nl)))))
+
+          (if (and %generate-reference-toc%
+                   %generate-reference-toc-on-titlepage%
+                   (equal? side 'recto))
+              (make display-group
+                (build-toc (current-node)
+                           (toc-depth (current-node))))
+              (empty-sosofo))
+
+          ;; PartIntro is a special case
+          (if (and (equal? side 'recto)
+                   (not (node-list-empty? partintro))
+                   %generate-partintro-on-titlepage%)
+              ($process-partintro$ partintro #f)
+              (empty-sosofo)))
+
+        (empty-sosofo))))
+
+]]> <!-- %output-print -->
+
+
+<!-- Plain text output customization ............................... -->
+
+<!--
+This is used for making the INSTALL file and others.  We customize the
+HTML stylesheets to be suitable for dumping plain text (via Netscape,
+Lynx, or similar).
+-->
+
+<![ %output-text; [
+
+(define %section-autolabel% #f)
+(define %chapter-autolabel% #f)
+(define $generate-chapter-toc$ (lambda () #f))
+
+;; For text output, produce "ASCII markup" for emphasis and such.
+
+(define ($asterix-seq$ #!optional (sosofo (process-children)))
+  (make sequence
+    (literal "*")
+    sosofo
+    (literal "*")))
+
+(define ($dquote-seq$ #!optional (sosofo (process-children)))
+  (make sequence
+    (literal (gentext-start-quote))
+    sosofo
+    (literal (gentext-end-quote))))
+
+(element (para command) ($dquote-seq$))
+(element (para emphasis) ($asterix-seq$))
+(element (para filename) ($dquote-seq$))
+(element (para option) ($dquote-seq$))
+(element (para replaceable) ($dquote-seq$))
+(element (para userinput) ($dquote-seq$))
+
+]]> <!-- %output-text -->
+
+  </style-specification-body>
+ </style-specification>
+
+ <external-specification id="docbook" document="dbstyle">
+</style-sheet>
--- a/doc/switchover.sgml
+++ b/doc/switchover.sgml
@@ -0,0 +1,329 @@
+<chapter id="performing-switchover" xreflabel="Performing a switchover with repmgr">
+
+ <indexterm>
+  <primary>switchover</primary>
+ </indexterm>
+
+ <title>Performing a switchover with repmgr</title>
+ <para>
+  A typical use-case for replication is a combination of primary and standby
+  server, with the standby serving as a backup which can easily be activated
+  in case of a problem with the primary. Such an unplanned failover would
+  normally be handled by promoting the standby, after which an appropriate
+  action must be taken to restore the old primary.
+ </para>
+ <para>
+  In some cases, however, it's desirable to promote the standby in a planned
+  way, e.g. so maintenance can be performed on the primary; this kind of switchover
+  is supported by the <xref linkend="repmgr-standby-switchover"> command.
+ </para>
+ <para>
+  <command>repmgr standby switchover</command> differs from other &repmgr;
+  actions in that it also performs actions on another server (the demotion
+  candidate), which means passwordless SSH access is required to that server
+  from the one where <command>repmgr standby switchover</command> is executed.
+ </para>
+ <note>
+  <simpara>
+   <command>repmgr standby switchover</command> performs a relatively complex
+   series of operations on two servers, and should therefore be performed after
+   careful preparation and with adequate attention. In particular you should
+   be confident that your network environment is stable and reliable.
+  </simpara>
+  <simpara>
+   Additionally you should be sure that the current primary can be shut down
+   quickly and cleanly. In particular, access from applications should be
+   minimized or preferably blocked completely. Also be aware that if there
+   is a backlog of files waiting to be archived, PostgreSQL will not shut
+   down until archiving completes.
+  </simpara>
+  <simpara>
+    We recommend running <command>repmgr standby switchover</command> at the
+    most verbose logging level (<literal>--log-level=DEBUG --verbose</literal>)
+    and capturing all output to assist troubleshooting any problems.
+  </simpara>
+  <simpara>
+   Please also read carefully the sections <xref linkend="preparing-for-switchover"> and
+   <xref linkend="switchover-caveats"> below.
+  </simpara>
+ </note>
+
+ <sect1 id="preparing-for-switchover" xreflabel="Preparing for switchover">
+   <indexterm>
+     <primary>switchover</primary>
+     <secondary>preparation</secondary>
+   </indexterm>
+   <title>Preparing for switchover</title>
+
+   <para>
+    As mentioned in the previous section, success of the switchover operation depends on
+    &repmgr; being able to shut down the current primary server quickly and cleanly.
+   </para>
+
+   <para>
+     Ensure that a passwordless SSH connection is possible from the promotion candidate
+     (standby) to the demotion candidate (current primary). If <literal>--siblings-follow</literal>
+     will be used, ensure that passwordless SSH connections are possible from the
+     promotion candidate to all standbys attached to the demotion candidate.
+   </para>
+
+   <note>
+     <simpara>
+       &repmgr; expects to find the &repmgr; binary in the same path on the remote
+       server as on the local server.
+     </simpara>
+   </note>
+
+   <para>
+    Double-check which commands will be used to stop/start/restart the current
+    primary; on the current primary execute:
+    <programlisting>
+     repmgr -f /etc/repmgr.conf node service --list --action=stop
+     repmgr -f /etc/repmgr.conf node service --list --action=start
+     repmgr -f /etc/repmgr.conf node service --list --action=restart</programlisting>
+
+   </para>
+
+   <para>
+     These commands can be defined in <filename>repmgr.conf</filename> with
+     <option>service_start_command</option>, <option>service_stop_command</option>
+     and <option>service_restart_command</option>.
+   </para>
+
+   <important>
+     <para>
+       If &repmgr; is installed from a package, you should set these commands
+       to use the appropriate service commands defined by the package/operating
+       system as these will ensure PostgreSQL is stopped/started properly
+       taking into account configuration and log file locations etc.
+     </para>
+     <para>
+       If the <option>service_*_command</option> options aren't defined, &repmgr; will
+       fall back to using <application>pg_ctl</application> to stop/start/restart
+       PostgreSQL, which may not work properly, particularly when executed on a remote
+       server.
+     </para>
+     <para>
+       For more details, see <xref linkend="configuration-service-commands">.
+     </para>
+   </important>
+
+   <note>
+    <simpara>
+     On <literal>systemd</literal> systems we strongly recommend using the appropriate
+     <command>systemctl</command> commands (typically run via <command>sudo</command>) to ensure
+     <literal>systemd</literal> is informed about the status of the PostgreSQL service.
+    </simpara>
+    <simpara>
+     If using <command>sudo</command> for the <command>systemctl</command> calls, make sure the
+     <command>sudo</command> specification doesn't require a real tty for the user. If not set
+     this way, <command>repmgr</command> will fail to stop the primary.
+    </simpara>
+   </note>
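+
+   <para>
+    As a minimal sketch, assuming a <literal>systemd</literal> unit named
+    <literal>postgresql-9.6</literal> (an assumption; the unit name differs between
+    packages and PostgreSQL versions, so substitute your own), the service commands
+    in <filename>repmgr.conf</filename> might look like this:
+    <programlisting>
+     service_start_command   = 'sudo systemctl start postgresql-9.6'
+     service_stop_command    = 'sudo systemctl stop postgresql-9.6'
+     service_restart_command = 'sudo systemctl restart postgresql-9.6'</programlisting>
+   </para>
+   <para>
+    A matching <command>sudo</command> configuration fragment could then look like
+    the following. It assumes that &repmgr; runs as the operating system user
+    <literal>postgres</literal> and that <command>systemctl</command> is located in
+    <filename>/usr/bin</filename>; adjust both to your environment:
+    <programlisting>
+     Defaults:postgres !requiretty
+     postgres ALL = NOPASSWD: /usr/bin/systemctl start postgresql-9.6, \
+       /usr/bin/systemctl stop postgresql-9.6, \
+       /usr/bin/systemctl restart postgresql-9.6</programlisting>
+   </para>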
+
+
+   <para>
+    Check that access from applications is minimized or preferably blocked
+    completely, so applications are not unexpectedly interrupted.
+   </para>
+
+   <para>
+    Check there is no significant replication lag on standbys attached to the
+    current primary.
+   </para>
+
+   <para>
+    If WAL file archiving is set up, check that there is no backlog of files waiting
+    to be archived, as PostgreSQL will not finally shut down until all of these have been
+    archived. If there is a backlog exceeding <varname>archive_ready_warning</varname> WAL files,
+    &repmgr; will emit a warning before attempting to perform a switchover; you can also check
+    manually with <command>repmgr node check --archive-ready</command>.
+   </para>
+
+   <para>
+    Ensure that <application>repmgrd</application> is <emphasis>not</emphasis> running anywhere, to prevent it
+    from unintentionally promoting a node.
+   </para>
+
+   <para>
+    Finally, consider executing <command>repmgr standby switchover</command> with the
+    <literal>--dry-run</literal> option; this will perform any necessary checks and inform you about
+    success/failure, and stop before the first actual command is run (which would be the shutdown of the
+    current primary). Example output:
+    <programlisting>
+      $ repmgr standby switchover -f /etc/repmgr.conf --siblings-follow --dry-run
+      NOTICE: checking switchover on node "node2" (ID: 2) in --dry-run mode
+      INFO: SSH connection to host "node1" succeeded
+      INFO: archive mode is "off"
+      INFO: replication lag on this standby is 0 seconds
+      INFO: all sibling nodes are reachable via SSH
+      NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
+      INFO: following shutdown command would be run on node "node1":
+        "pg_ctl -l /var/log/postgresql/startup.log -D '/var/lib/postgresql/data' -m fast -W stop"
+    </programlisting>
+   </para>
+
+   <important>
+     <para>
+       Be aware that <option>--dry-run</option> checks the prerequisites
+       for performing the switchover and some basic sanity checks on the
+       state of the database which might affect the switchover operation
+       (e.g. replication lag); it cannot however guarantee the switchover
+       operation will succeed. In particular, if the current primary
+       does not shut down cleanly, &repmgr; will not be able to reliably
+       execute the switchover (as there would be a danger of divergence
+       between the former and new primary nodes).
+     </para>
+   </important>
+
+
+   <note>
+     <simpara>
+       See <xref linkend="repmgr-standby-switchover"> for a full list of available
+       command line options and <filename>repmgr.conf</filename> settings relevant
+       to performing a switchover.
+     </simpara>
+   </note>
+
+  <sect2 id="switchover-pg-rewind" xreflabel="Switchover and pg_rewind">
+    <indexterm>
+      <primary>pg_rewind</primary>
+      <secondary>using with "repmgr standby switchover"</secondary>
+    </indexterm>
+    <title>Switchover and pg_rewind</title>
+    <para>
+      If the demotion candidate does not shut down smoothly or cleanly, there's a risk it
+      will have a slightly divergent timeline and will not be able to attach to the new
+      primary. To fix this situation without needing to reclone the old primary, it's
+      possible to use the <application>pg_rewind</application> utility, which will usually be
+      able to resync the two servers.
+    </para>
+    <para>
+      To have &repmgr; execute <application>pg_rewind</application> if it detects this
+      situation after promoting the new primary, add the <option>--force-rewind</option>
+      option.
+    </para>
+    <note>
+      <simpara>
+        If &repmgr; detects a situation where it needs to execute <application>pg_rewind</application>,
+        it will execute a <literal>CHECKPOINT</literal> on the new primary before executing
+        <application>pg_rewind</application>.
+      </simpara>
+    </note>
+    <para>
+      For more details on <application>pg_rewind</application>, see:
+      <ulink url="https://www.postgresql.org/docs/current/static/app-pgrewind.html">https://www.postgresql.org/docs/current/static/app-pgrewind.html</ulink>.
+    </para>
+    <para>
+      <application>pg_rewind</application> has been part of the core PostgreSQL distribution since
+      version 9.5. Users of versions 9.3 and 9.4 will need to manually install it; the source code is available here:
+      <ulink url="https://github.com/vmware/pg_rewind">https://github.com/vmware/pg_rewind</ulink>.
+      If the <application>pg_rewind</application>
+      binary is not installed in the PostgreSQL <filename>bin</filename> directory, provide
+      its full path on the demotion candidate with <option>--force-rewind</option>.
+    </para>
+    <para>
+      Note that building the 9.3/9.4 version of <application>pg_rewind</application> requires the PostgreSQL
+      source code. Also, PostgreSQL 9.3 does not provide <varname>wal_log_hints</varname>,
+      meaning data checksums must have been enabled when the database was initialized.
+    </para>
+  </sect2>
+
+
+ </sect1>
+
+ <sect1 id="switchover-execution" xreflabel="Executing the switchover command">
+  <indexterm>
+   <primary>switchover</primary>
+    <secondary>execution</secondary>
+  </indexterm>
+  <title>Executing the switchover command</title>
+  <para>
+   To demonstrate switchover, we will assume a replication cluster with a
+   primary (<literal>node1</literal>) and one standby (<literal>node2</literal>);
+   after the switchover <literal>node2</literal> should become the primary with
+   <literal>node1</literal> following it.
+  </para>
+  <para>
+   The switchover command must be run from the standby which is to be promoted,
+   and in its simplest form looks like this:
+   <programlisting>
+    $ repmgr -f /etc/repmgr.conf standby switchover
+    NOTICE: executing switchover on node "node2" (ID: 2)
+    INFO: searching for primary node
+    INFO: checking if node 1 is primary
+    INFO: current primary node is 1
+    INFO: SSH connection to host "node1" succeeded
+    INFO: archive mode is "off"
+    INFO: replication lag on this standby is 0 seconds
+    NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
+    NOTICE: stopping current primary node "node1" (ID: 1)
+    NOTICE: issuing CHECKPOINT
+    DETAIL: executing server command "pg_ctl -l /var/log/postgres/startup.log -D '/var/lib/pgsql/data' -m fast -W stop"
+    INFO: checking primary status; 1 of 6 attempts
+    NOTICE: current primary has been cleanly shut down at location 0/3001460
+    NOTICE: promoting standby to primary
+    DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' promote"
+    server promoting
+    NOTICE: STANDBY PROMOTE successful
+    DETAIL: server "node2" (ID: 2) was successfully promoted to primary
+    INFO: setting node 1's primary to node 2
+    NOTICE: starting server using  "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' restart"
+    NOTICE: NODE REJOIN successful
+    DETAIL: node 1 is now attached to node 2
+    NOTICE: switchover was successful
+    DETAIL: node "node2" is now primary
+    NOTICE: STANDBY SWITCHOVER is complete
+   </programlisting>
+  </para>
+  <para>
+   The old primary is now replicating as a standby from the new primary, and the
+   cluster status will now look like this:
+   <programlisting>
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Location | Connection string
+    ----+-------+---------+-----------+----------+----------+--------------------------------------
+     1  | node1 | standby |   running | node2    | default  | host=node1 dbname=repmgr user=repmgr
+     2  | node2 | primary | * running |          | default  | host=node2 dbname=repmgr user=repmgr
+   </programlisting>
+  </para>
+ </sect1>
+ <sect1 id="switchover-caveats" xreflabel="Caveats">
+  <indexterm>
+   <primary>switchover</primary>
+    <secondary>caveats</secondary>
+  </indexterm>
+  <title>Caveats</title>
+  <para>
+   <itemizedlist spacing="compact" mark="bullet">
+    <listitem>
+     <simpara>
+      If using PostgreSQL 9.3 or 9.4, you should ensure that the shutdown command
+      is configured to use PostgreSQL's <varname>fast</varname> shutdown mode (the default in 9.5
+      and later). If relying on <command>pg_ctl</command> to perform database server operations,
+      you should include <literal>-m fast</literal> in <varname>pg_ctl_options</varname>
+      in <filename>repmgr.conf</filename>.
+     </simpara>
+    </listitem>
+    <listitem>
+     <simpara>
+      <command>pg_rewind</command> <emphasis>requires</emphasis> that either <varname>wal_log_hints</varname> is enabled, or that
+      data checksums were enabled when the cluster was initialized. See the
+      <ulink url="https://www.postgresql.org/docs/current/static/app-pgrewind.html">pg_rewind documentation</ulink>
+      for details.
+     </simpara>
+    </listitem>
+    <listitem>
+     <simpara>
+      <application>repmgrd</application> should not be running with setting <varname>failover=automatic</varname>
+      in <filename>repmgr.conf</filename> when a switchover is carried out, otherwise the
+      <application>repmgrd</application> daemon may try to promote a standby by itself.
+     </simpara>
+    </listitem>
+   </itemizedlist>
+  </para>
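+  <para>
+   As an illustrative sketch of the first two caveats, the relevant settings might
+   look like this. It assumes no other <varname>pg_ctl_options</varname> are
+   already configured (if they are, merge <literal>-m fast</literal> into the
+   existing value); note that changing <varname>wal_log_hints</varname> only takes
+   effect after a PostgreSQL restart:
+   <programlisting>
+    # repmgr.conf (PostgreSQL 9.3/9.4, when pg_ctl performs server operations)
+    pg_ctl_options='-m fast'
+
+    # postgresql.conf (PostgreSQL 9.4 and later)
+    wal_log_hints = on</programlisting>
+  </para>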
+  <para>
+   We hope to remove some of these restrictions in future versions of &repmgr;.
+  </para>
+ </sect1>
+</chapter>
--- a/doc/upgrading-from-repmgr3.md
+++ b/doc/upgrading-from-repmgr3.md
@@ -1,121 +1,9 @@
 Upgrading from repmgr 3
 =======================

-The upgrade process consists of two steps:
+This document has been integrated into the main `repmgr` documentation
+and is now located here:

-    1) converting the repmgr.conf configuration files
-    2) upgrading the repmgr schema.
-
-Scripts are provided to assist both with converting repmgr.conf
-and upgrading the schema.
-
-Converting repmgr.conf configuration files
------------------------------------------
-
-With a completely new repmgr version, we've taken the opportunity
-to rename some configuration items have had their names changed for
-clarity and consistency, both between the configuration file and
-the column names in `repmgr.nodes` (e.g. `node` → `node_id`), and
-also for consistency with PostgreSQL naming conventions
-(e.g. `loglevel` → `log_level`).
-
-Other configuration items have been changed to command line options,
-and vice-versa, e.g. to avoid hard-coding items such as a a node's
-upstream ID, which might change over time.
-
-`repmgr` will issue a warning about deprecated/altered options.
+> [Upgrading from repmgr 3.x](https://repmgr.org/docs/4.0/upgrading-from-repmgr-3.html)


-### Changed parameters
-
-Following parameters have been added:
-
-    - `data_directory`: this is mandatory and must contain the path
-        to the node's data directory
-    - `monitoring_history`: this replaces the `repmgrd` command line
-        option `--monitoring-history`
-
-Following parameters have been renamed:
-
-    - `node` → `node_id`
-    - `loglevel` → `log_level`
-    - `logfacility` → `log_facility`
-    - `logfile` → `log_file`
-    - `master_reponse_timeout` → `async_query_timeout`
-
-Following parameters have been removed:
-
-    - `cluster` is no longer required and will be ignored.
-    - `upstream_node_id` is replaced by the command-line parameter
-         `--upstream-node-id`
-
-### Conversion script
-
-To assist with conversion of `repmgr.conf` files, a Perl script
-is provided in `contrib/convert-config.pl`. Use like this:
-
-    $ ./convert-config.pl /etc/repmgr.conf
-    node_id=2
-    node_name=node2
-    conninfo=host=localhost dbname=repmgr user=repmgr port=5602
-    pg_ctl_options='-l /tmp/postgres.5602.log'
-    pg_bindir=/home/barwick/devel/builds/HEAD/bin
-    rsync_options=--exclude=postgresql.local.conf --archive
-    log_level=DEBUG
-    pg_basebackup_options=--no-slot
-    data_directory=
-
-The converted file is printed to `STDOUT` and the original file is not
-changed.
-
-Please note that the parameter `data_directory` *must* be provided;
-if not already present, the conversion script will add an empty
-placeholder parameter.
-
-
-Upgrading the repmgr schema
---------------------------
-
-Ensure `repmgrd` is not running, or any cron jobs which execute the
-`repmgr` binary.
-
-Install `repmgr4`; any `repmgr3` packages should be uninstalled
-(if not automatically installed already).
-
-### Manually create the repmgr extension
-
-In the database used by the existing `repmgr` configuration, execute:
-
-    CREATE EXTENSION repmgr FROM unpackaged;
-
-This will move and convert all objects from the existing schema
-into the new, standard `repmgr` schema.
-
-> *NOTE* there must be only one schema matching 'repmgr_%' in the
-> database, otherwise this step may not work.
-
-### Re-register each node
-
-This is necessary to update the `repmgr` metadata with some additional items.
-
-On the primary node, execute e.g.
-
-    repmgr primary register -f /etc/repmgr.conf --force
-
-On each standby node, execute e.g.
-
-    repmgr standby register -f /etc/repmgr.conf --force
-
-Check the data is updated as expected by examining the `repmgr.nodes` table;
-restart `repmgrd` if required.
-
-The original `repmgr_$cluster` schema can be dropped at any time.
-
-* * *
-
-> *TIP* If you don't care about any data from the existing `repmgr` installation,
-> (e.g. the contents of the `events` and `monitoring` tables), the manual
-> "CREATE EXTENSION" step can be skipped; just re-register each node, starting
-> with the primary node, and the `repmgr` extension will be automatically created.
-
-* * *
--- a/doc/upgrading-repmgr.sgml
+++ b/doc/upgrading-repmgr.sgml
@@ -0,0 +1,349 @@
+<chapter id="upgrading-repmgr" xreflabel="Upgrading repmgr">
+
+ <indexterm>
+  <primary>upgrading</primary>
+ </indexterm>
+
+ <title>Upgrading repmgr</title>
+
+ <para>
+  &repmgr; is updated regularly with point releases (e.g. 4.0.1 to 4.0.2)
+  containing bugfixes and other minor improvements. Any substantial new
+  functionality will be included in a feature release (e.g. 4.0.x to 4.1.x).
+ </para>
+
+ <sect1 id="upgrading-repmgr-extension" xreflabel="Upgrading repmgr 4.x and later">
+  <indexterm>
+   <primary>upgrading</primary>
+   <secondary>repmgr 4.x and later</secondary>
+  </indexterm>
+  <title>Upgrading repmgr 4.x and later</title>
+  <para>
+    &repmgr; 4.x is implemented as a PostgreSQL extension; normally the upgrade consists
+    of the following two steps:
+    <orderedlist>
+      <listitem>
+        <simpara>
+          Install the updated package (or compile the updated source)
+        </simpara>
+      </listitem>
+      <listitem>
+        <simpara>
+          In the database where the &repmgr; extension is installed, execute
+          <command>ALTER EXTENSION repmgr UPDATE</command>.
+        </simpara>
+      </listitem>
+    </orderedlist>
+  </para>
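+  <para>
+   For example, the second step might be carried out as follows on the primary,
+   connected to the &repmgr; database as a superuser. The second statement is
+   not part of the upgrade itself; it is an optional sanity check which reads
+   the installed extension version from the <literal>pg_extension</literal>
+   system catalog:
+   <programlisting>
+    ALTER EXTENSION repmgr UPDATE;
+    SELECT extversion FROM pg_extension WHERE extname = 'repmgr';</programlisting>
+  </para>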
+
+  <para>
+    Always check the <link linkend="appendix-release-notes">release notes</link> for every
+    release as they may contain upgrade instructions particular to individual versions.
+  </para>
+
+  <para>
+    If the <application>repmgrd</application> daemon is in use, we recommend stopping it
+    before upgrading &repmgr;.
+  </para>
+  <para>
+    Note that it may be necessary to restart the PostgreSQL server if the upgrade contains
+    changes to the shared object file used by <application>repmgrd</application>; check the
+    release notes for details.
+  </para>
+ </sect1>
+
+ <sect1 id="upgrading-and-pg-upgrade" xreflabel="pg_upgrade and repmgr">
+  <indexterm>
+   <primary>upgrading</primary>
+   <secondary>pg_upgrade</secondary>
+  </indexterm>
+  <indexterm>
+    <primary>pg_upgrade</primary>
+  </indexterm>
+  <title>pg_upgrade and repmgr</title>
+
+  <para>
+    <application>pg_upgrade</application> requires that if any functions are
+    dependent on a shared library, this library must be present in both
+    the old and new installations before <application>pg_upgrade</application>
+    can be executed.
+  </para>
+  <para>
+    To minimize the risk of any upgrade issues (particularly if an upgrade to
+    a new major &repmgr; version is involved), we recommend upgrading
+    &repmgr; on the old server <emphasis>before</emphasis> running
+    <application>pg_upgrade</application> to ensure that old and new
+    versions are the same.
+  </para>
+  <note>
+    <simpara>
+      This issue applies to any PostgreSQL extension which has
+      dependencies on a shared library.
+    </simpara>
+  </note>
+  <para>
+    For further details please see the <ulink url="https://www.postgresql.org/docs/current/static/pgupgrade.html">pg_upgrade documentation</ulink>.
+  </para>
+  <para>
+    If replication slots are in use, bear in mind these will <emphasis>not</emphasis>
+    be recreated by <application>pg_upgrade</application>. These will need to
+    be recreated manually.
+  </para>
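+  <para>
+    As a rough sketch only, existing slots could be listed on the old server before running
+    <application>pg_upgrade</application>, then recreated on the upgraded server with
+    PostgreSQL's standard slot functions. The slot name <literal>repmgr_slot_2</literal>
+    below is a hypothetical example; use the names reported by the first query.
+    <programlisting>
+    -- on the old server: note the existing slots
+    SELECT slot_name, slot_type FROM pg_catalog.pg_replication_slots;
+
+    -- on the upgraded server: recreate each physical slot (example name)
+    SELECT pg_catalog.pg_create_physical_replication_slot('repmgr_slot_2');</programlisting>
+  </para>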
+ </sect1>
+
+
+ <sect1 id="upgrading-from-repmgr-3" xreflabel="Upgrading from repmgr 3.x">
+  <indexterm>
+   <primary>upgrading</primary>
+   <secondary>from repmgr 3.x</secondary>
+  </indexterm>
+
+  <title>Upgrading from repmgr 3.x</title>
+  <para>
+   The upgrade process consists of two steps:
+   <orderedlist>
+    <listitem>
+     <simpara>
+       converting the repmgr.conf configuration files
+     </simpara>
+    </listitem>
+    <listitem>
+     <simpara>
+       upgrading the repmgr schema using <command>CREATE EXTENSION</command>
+     </simpara>
+    </listitem>
+   </orderedlist>
+  </para>
+  <para>
+   A script is provided to assist with converting <filename>repmgr.conf</filename>.
+  </para>
+  <para>
+   The schema upgrade (which converts the &repmgr; metadata into
+   a packaged PostgreSQL extension) is normally carried out
+   automatically when the &repmgr; extension is created.
+  </para>
+  <para>
+   The shared library has been renamed from <literal>repmgr_funcs</literal> to
+   <literal>repmgr</literal> - if it's set in <varname>shared_preload_libraries</varname>
+   in <filename>postgresql.conf</filename> it will need to be updated to the new name:
+   <programlisting>
+    shared_preload_libraries = 'repmgr'</programlisting>
+  </para>
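+  <para>
+    The current setting can be checked from <application>psql</application>, for example
+    as sketched here; note that a change to <varname>shared_preload_libraries</varname>
+    only takes effect after PostgreSQL has been restarted.
+    <programlisting>
+    SHOW shared_preload_libraries;</programlisting>
+  </para>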
+
+  <sect2 id="converting-repmgr-conf">
+   <title>Converting repmgr.conf configuration files</title>
+   <para>
+    With a completely new repmgr version, we've taken the opportunity
+    to rename some configuration items for
+    clarity and consistency, both between the configuration file and
+    the column names in <structname>repmgr.nodes</structname>
+    (e.g. <varname>node</varname> to <varname>node_id</varname>), and
+    also for consistency with PostgreSQL naming conventions
+    (e.g. <varname>loglevel</varname> to <varname>log_level</varname>).
+   </para>
+   <para>
+    Other configuration items have been changed to command line options,
+    and vice-versa, e.g. to avoid hard-coding items such as a node's
+    upstream ID, which might change over time.
+   </para>
+   <para>
+    &repmgr; will issue a warning about deprecated/altered options.
+   </para>
+   <sect3>
+    <title>Changed parameters in "repmgr.conf"</title>
+    <para>
+     The following parameters have been added:
+     <itemizedlist spacing="compact" mark="bullet">
+      <listitem>
+        <simpara><varname>data_directory</varname>: this is mandatory and must
+         contain the path to the node's data directory</simpara>
+      </listitem>
+      <listitem>
+        <simpara><varname>monitoring_history</varname>: this replaces the
+          <application>repmgrd</application> command line option
+          <literal>--monitoring-history</literal></simpara>
+      </listitem>
+     </itemizedlist>
+    </para>
+    <para>
+     The following parameters have been renamed:
+    </para>
+    <table tocentry="1" id="repmgr3-repmgr4-renamed-parameters">
+     <title>Parameters renamed in repmgr4</title>
+     <tgroup cols="2">
+      <thead>
+       <row>
+        <entry>repmgr3</entry>
+        <entry>repmgr4</entry>
+       </row>
+      </thead>
+      <tbody>
+       <row>
+        <entry><varname>node</varname></entry>
+        <entry><varname>node_id</varname></entry>
+       </row>
+       <row>
+        <entry><varname>loglevel</varname></entry>
+        <entry><varname>log_level</varname></entry>
+       </row>
+       <row>
+        <entry><varname>logfacility</varname></entry>
+        <entry><varname>log_facility</varname></entry>
+       </row>
+       <row>
+        <entry><varname>logfile</varname></entry>
+        <entry><varname>log_file</varname></entry>
+       </row>
+       <row>
+        <entry><varname>barman_server</varname></entry>
+        <entry><varname>barman_host</varname></entry>
+       </row>
+       <row>
+        <entry><varname>master_response_timeout</varname></entry>
+        <entry><varname>async_query_timeout</varname></entry>
+       </row>
+      </tbody>
+     </tgroup>
+    </table>
+    <note>
+      <para>
+        From &repmgr; 4, <literal>barman_server</literal> refers
+        to the server configured in Barman (in &repmgr; 3, the deprecated
+        <literal>cluster</literal> parameter was used for this);
+        the physical Barman hostname is configured with
+        <literal>barman_host</literal> (see <xref linkend="cloning-from-barman-prerequisites">
+          for details).
+      </para>
+    </note>
+    <para>
+     The following parameters have been removed:
+     <itemizedlist spacing="compact" mark="bullet">
+      <listitem>
+        <simpara><varname>cluster</varname>: no longer required and will
+        be ignored.</simpara>
+      </listitem>
+      <listitem>
+        <simpara><varname>upstream_node</varname>: replaced by the
+        command-line parameter <literal>--upstream-node-id</literal></simpara>
+      </listitem>
+     </itemizedlist>
+    </para>
+   </sect3>
+   <sect3>
+    <title>Conversion script</title>
+    <para>
+     To assist with conversion of <filename>repmgr.conf</filename> files, a Perl script
+     is provided in <filename>contrib/convert-config.pl</filename>.
+     Use it as follows:
+     <programlisting>
+    $ ./convert-config.pl /etc/repmgr.conf
+    node_id=2
+    node_name=node2
+    conninfo=host=node2 dbname=repmgr user=repmgr connect_timeout=2
+    pg_ctl_options='-l /var/log/postgres/startup.log'
+    rsync_options=--exclude=postgresql.local.conf --archive
+    log_level=INFO
+    pg_basebackup_options=--no-slot
+    data_directory=</programlisting>
+    </para>
+    <para>
+      The converted file is printed to <literal>STDOUT</literal> and the original file is not
+      changed.
+    </para>
+    <para>
+      Please note that the conversion script will add an empty
+      placeholder parameter for <varname>data_directory</varname>, which
+      is a required parameter in repmgr4 and which <emphasis>must</emphasis>
+      be provided.
+    </para>
+   </sect3>
+  </sect2>
+  <sect2>
+   <title>Upgrading the repmgr schema</title>
+   <para>
+    Ensure that <application>repmgrd</application> is not running, and that any cron jobs
+    which execute the <command>repmgr</command> binary are disabled.
+   </para>
+   <para>
+    Install <literal>repmgr 4</literal> packages; any <literal>repmgr 3.x</literal> packages
+    should be uninstalled (if not automatically uninstalled already by your packaging system).
+   </para>
+   <sect3>
+    <title>Upgrading from repmgr 3.1.1 or earlier</title>
+    <para>
+     If your repmgr version is 3.1.1 or earlier, you will need to update
+     the schema to the latest version in the 3.x series (3.3.2) before
+     converting the installation to repmgr 4.
+    </para>
+    <para>
+      To do this, apply the following upgrade scripts as appropriate for
+      your current version:
+      <itemizedlist spacing="compact" mark="bullet">
+      <listitem>
+        <simpara>
+          <ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/REL3_3_STABLE/sql/repmgr3.0_repmgr3.1.sql">repmgr3.0_repmgr3.1.sql</ulink></simpara>
+      </listitem>
+      <listitem>
+        <simpara><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/REL3_3_STABLE/sql/repmgr3.1.1_repmgr3.1.2.sql">repmgr3.1.1_repmgr3.1.2.sql</ulink></simpara>
+      </listitem>
+      </itemizedlist>
+    </para>
+    <para>
+      For more details see the
+      <ulink url="https://repmgr.org/release-notes-3.3.2.html#upgrading">repmgr 3 upgrade notes</ulink>.
+    </para>
+   </sect3>
+   <sect3>
+    <title>Manually create the repmgr extension</title>
+    <para>
+     In the database used by the existing &repmgr; installation, execute:
+     <programlisting>
+      CREATE EXTENSION repmgr FROM unpackaged;</programlisting>
+    </para>
+    <para>
+     This will move and convert all objects from the existing schema
+     into the new, standard <literal>repmgr</literal> schema.
+    </para>
+    <note>
+      <simpara>There must be only one schema matching <literal>repmgr_%</literal> in the
+        database, otherwise this step may not work.
+      </simpara>
+    </note>
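+    <para>
+     One way to check this before executing the command above (a sketch, not part of
+     the upgrade scripts) is to list the matching schemas from the standard
+     <structname>pg_namespace</structname> catalog; exactly one row should be returned.
+     The backslash makes the underscore match literally rather than as a wildcard.
+     <programlisting>
+      SELECT nspname
+        FROM pg_catalog.pg_namespace
+       WHERE nspname LIKE 'repmgr\_%';</programlisting>
+    </para>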
+   </sect3>
+   <sect3>
+    <title>Re-register each node</title>
+    <para>
+     This is necessary to update the <literal>repmgr</literal> metadata with some additional items.
+    </para>
+    <para>
+     On the primary node, execute e.g.
+     <programlisting>
+      repmgr primary register -f /etc/repmgr.conf --force</programlisting>
+    </para>
+    <para>
+     On each standby node, execute e.g.
+     <programlisting>
+      repmgr standby register -f /etc/repmgr.conf --force</programlisting>
+    </para>
+    <para>
+     Check the data is updated as expected by examining the <structname>repmgr.nodes</structname>
+     table; restart <application>repmgrd</application> if required.
+    </para>
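+    <para>
+     For example, a query along the following lines (an illustrative sketch; adjust the
+     column list as needed) shows the registered nodes and their upstream relationships:
+     <programlisting>
+      SELECT node_id, node_name, type, upstream_node_id, active
+        FROM repmgr.nodes
+       ORDER BY node_id;</programlisting>
+    </para>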
+    <para>
+     The original <literal>repmgr_$cluster</literal> schema can be dropped at any time.
+    </para>
+    <tip>
+     <simpara>
+      If you don't care about any data from the existing &repmgr; installation
+      (e.g. the contents of the <structname>events</structname> and <structname>monitoring</structname>
+      tables), the manual <command>CREATE EXTENSION</command> step can be skipped; just re-register
+      each node, starting with the primary node, and the <literal>repmgr</literal> extension will be
+      automatically created.
+     </simpara>
+    </tip>
+   </sect3>
+  </sect2>
+
+ </sect1>
+
+</chapter>
--- a/doc/version.sgml
+++ b/doc/version.sgml
@@ -0,0 +1 @@
+<!ENTITY repmgrversion "4.0.5">
--- a/doc/website-docs.css
+++ b/doc/website-docs.css
@@ -0,0 +1,469 @@
+/* PostgreSQL.org Documentation Style */
+
+/* requires global.css, table.css and text.css to be loaded before this file! */
+body {
+  font-family: verdana, sans-serif;
+  font-size: 76%;
+  background: url("/resources/background.png") repeat-x scroll left top transparent;
+  padding: 15px 4%;
+  margin: 0;
+}
+
+/* monospace font size fix */
+pre, code, kbd, samp, tt {
+  font-family: monospace,monospace;
+  font-size: 1em;
+}
+
+div.NAVHEADER table {
+  margin-left: 0;
+}
+
+/* Container Definitions */
+
+#docContainerWrap {
+  text-align: center; /* Win IE5 */
+}
+
+#docContainer {
+  margin: 0 auto;
+  width: 90%;
+  padding-bottom: 2em;
+  display: block;
+  text-align: left; /* Win IE5 */
+}
+
+#docHeader {
+  background-image: url("/media/img/docs/bg_hdr.png");
+  height: 83px;
+  margin: 0px;
+  padding: 0px;
+  display: block;
+}
+
+#docHeaderLogo {
+  position: relative;
+  width: 206px;
+  height: 83px;
+  border: 0px;
+  padding: 0px;
+  margin: 0 0 0 20px;
+}
+
+#docHeaderLogo img {
+  border: 0px;
+}
+
+#docNavSearchContainer {
+  padding-bottom: 2px;
+}
+
+#docNav, #docVersions {
+  position: relative;
+  text-align: left;
+  margin-left: 10px;
+  margin-top: 5px;
+  color: #666;
+  font-size: 0.95em;
+}
+
+#docSearch {
+  position: relative;
+  text-align: right;
+  padding: 0;
+  margin: 0;
+  color: #666;
+}
+
+#docTextSize {
+  text-align: right;
+  white-space: nowrap;
+  margin-top: 7px;
+  font-size: 0.95em;
+}
+
+#docSearch form {
+  position: relative;
+  top: 5px;
+  right: 0;
+  margin: 0; /* need for IE 5.5 OSX */
+  text-align: right; /* need for IE 5.5 OSX */
+  white-space: nowrap; /* for Opera */
+}
+
+#docSearch form label {
+  color: #666;
+  font-size: 0.95em;
+}
+
+#docSearch form input {
+  font-size: 0.95em;
+}
+  
+#docSearch form #submit {
+  font-size: 0.95em;
+  background: #7A7A7A;
+  color: #fff;
+  border: 1px solid #7A7A7A;
+  padding: 1px 4px;
+}
+  
+#docSearch form #q {
+  width: 170px;
+  font-size: 0.95em;
+  border:  1px solid #7A7A7A;
+  background: #E1E1E1;
+  color: #000000;
+  padding: 2px;
+}
+
+.frmDocSearch {
+  padding: 0;
+  margin: 0;
+  display: inline;
+}
+
+.inpDocSearch {
+  padding: 0;
+  margin: 0;
+  color: #000;
+}
+
+#docContent {
+  position: relative;
+  margin-left: 10px;
+  margin-right: 10px;
+  margin-top: 40px;
+}
+
+#docFooter {
+  position: relative;
+  font-size: 0.9em; 
+  color: #666; 
+  line-height: 1.3em; 
+  margin-left: 10px;
+  margin-right: 10px;
+}
+
+#docComments {
+  margin-top: 10px;
+}
+
+#docClear {
+  clear: both;
+  margin: 0;
+  padding: 0;
+}
+
+/* Heading Definitions */
+
+h1, h2, h3 {
+  font-weight: bold;
+  margin-top: 2ex;
+  color: #444;
+}
+
+h1 {
+  font-size: 1.4em;
+}
+
+h2 {
+  font-size: 1.2em !important;
+}
+
+h3 {
+  font-size: 1.1em;
+}
+
+h1 a:hover,
+h2 a:hover,
+h3 a:hover,
+h4 a:hover {
+  color: #444;
+  text-decoration: none;
+}
+
+/* Text Styles */
+
+div.SECT2 {
+  margin-top: 4ex;
+}
+
+div.SECT3 {
+  margin-top: 3ex;
+  margin-left: 3ex;
+}
+
+.txtCurrentLocation {
+  font-weight: bold;
+}
+
+p, ol, ul, li {
+  line-height: 1.5em;
+}
+
+.txtCommentsWrap {
+  border: 2px solid #F5F5F5; 
+  width: 100%;
+}
+
+.txtCommentsContent {
+  background: #F5F5F5;
+  padding: 3px;
+}
+
+.txtCommentsPoster {
+  float: left;
+}
+
+.txtCommentsDate {
+  float: right;
+}
+
+.txtCommentsComment {
+  padding: 3px;
+}
+
+#docContainer pre code,
+#docContainer pre tt,
+#docContainer pre pre,
+#docContainer tt tt,
+#docContainer tt code,
+#docContainer tt pre {
+  font-size: 1em;
+}
+
+pre.LITERALLAYOUT,
+.SCREEN,
+.SYNOPSIS,
+.PROGRAMLISTING,
+.REFSYNOPSISDIV p,
+table.CAUTION,
+table.WARNING,
+blockquote.NOTE,
+blockquote.TIP,
+table.CALSTABLE {
+  -moz-box-shadow: 3px 3px 5px #DFDFDF;
+  -webkit-box-shadow: 3px 3px 5px #DFDFDF;
+  -khtml-box-shadow: 3px 3px 5px #DFDFDF;
+  -o-box-shadow: 3px 3px 5px #DFDFDF;
+  box-shadow: 3px 3px 5px #DFDFDF;
+}
+
+pre.LITERALLAYOUT,
+.SCREEN,
+.SYNOPSIS,
+.PROGRAMLISTING,
+.REFSYNOPSISDIV p,
+table.CAUTION,
+table.WARNING,
+blockquote.NOTE,
+blockquote.TIP {
+  color: black;
+  border-width: 1px;
+  border-style: solid;
+  padding: 2ex;
+  margin: 2ex 0 2ex 2ex;
+  overflow: auto;
+  -moz-border-radius: 8px;
+  -webkit-border-radius: 8px;
+  -khtml-border-radius: 8px;
+  border-radius: 8px;
+}
+
+pre.LITERALLAYOUT,
+pre.SYNOPSIS,
+pre.PROGRAMLISTING,
+.REFSYNOPSISDIV p,
+.SCREEN {
+  border-color: #CFCFCF;
+  background-color: #F7F7F7;
+}
+
+blockquote.NOTE,
+blockquote.TIP {
+  border-color: #DBDBCC;
+  background-color: #EEEEDD;
+  padding: 14px;
+  width: 572px;
+}
+
+blockquote.NOTE,
+blockquote.TIP,
+table.CAUTION,
+table.WARNING {
+  margin: 4ex auto;
+}
+
+blockquote.NOTE p,
+blockquote.TIP p {
+  margin: 0;
+}
+
+blockquote.NOTE pre,
+blockquote.NOTE code,
+blockquote.TIP pre,
+blockquote.TIP code {
+  margin-left: 0;
+  margin-right: 0;
+  -moz-box-shadow: none;
+  -webkit-box-shadow: none;
+  -khtml-box-shadow: none;
+  -o-box-shadow: none;
+  box-shadow: none;
+}
+
+.emphasis,
+.c2 {
+  font-weight: bold;
+}
+
+.REPLACEABLE {
+  font-style: italic;
+}
+
+/* Table Styles */
+
+table {
+  margin-left: 2ex;
+}
+
+table.CALSTABLE td,
+table.CALSTABLE th,
+table.CAUTION td,
+table.CAUTION th,
+table.WARNING td,
+table.WARNING th {
+  border-style: solid;
+}
+
+table.CALSTABLE,
+table.CAUTION,
+table.WARNING {
+  border-spacing: 0;
+  border-collapse: collapse;
+}
+
+table.CALSTABLE
+{
+  margin: 2ex 0 2ex 2ex;
+  background-color: #E0ECEF;
+  border: 2px solid #A7C6DF;
+}
+
+table.CALSTABLE tr:hover td
+{
+  background-color: #EFEFEF;
+}
+
+table.CALSTABLE td {
+  background-color: #FFF;
+}
+
+table.CALSTABLE td,
+table.CALSTABLE th {
+  border: 1px solid #A7C6DF;
+  padding: 0.5ex 0.5ex;
+}
+
+table.CAUTION,
+table.WARNING {
+  border-collapse: separate;
+  display: block;
+  padding: 0;
+  max-width: 600px;
+}
+
+table.CAUTION {
+  background-color: #F5F5DC;
+  border-color: #DEDFA7;
+}
+
+table.WARNING {
+  background-color: #FFD7D7;
+  border-color: #DF421E;
+}
+
+table.CAUTION td,
+table.CAUTION th,
+table.WARNING td,
+table.WARNING th {
+  border-width: 0;
+  padding-left: 2ex;
+  padding-right: 2ex;
+}
+
+table.CAUTION td,
+table.CAUTION th {
+  border-color: #F3E4D5;
+}
+
+table.WARNING td,
+table.WARNING th {
+  border-color: #FFD7D7;
+}
+
+td.c1,
+td.c2,
+td.c3,
+td.c4,
+td.c5,
+td.c6 {
+  font-size: 1.1em;
+  font-weight: bold;
+  border-bottom: 0px solid #FFEFEF;
+  padding: 1ex 2ex 0;
+}
+
+/* Link Styles */
+
+#docNav a {
+  font-weight: bold;
+}
+
+a:link,
+a:visited,
+a:active,
+a:hover {
+  text-decoration: underline;
+}
+
+a:link,
+a:active {
+  color:#0066A2;
+}
+
+a:visited {
+  color:#004E66;
+}
+
+a:hover {
+  color:#000000;
+}
+
+#docFooter a:link,
+#docFooter a:visited,
+#docFooter a:active {
+  color:#666;
+}
+
+#docContainer code.FUNCTION tt {
+  font-size: 1em;
+}
+
+div.header {
+    color: #444;
+    margin-top: 5px;
+}
+
+div.footer {
+    text-align: center;
+    background-image: url("/resources/footerl.png"), url("/resources/footerr.png"), url("/resources/footerc.png");
+    background-position: left top, right top, center top;
+    background-repeat: no-repeat, no-repeat, repeat-x;
+    padding-top: 45px;
+}
+
+img {
+    border-style: none;
+}
--- a/errcode.h
+++ b/errcode.h
@@ -1,6 +1,6 @@
 /*
 * errcode.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -43,5 +43,8 @@
 #define ERR_BARMAN 19
 #define ERR_REGISTRATION_SYNC 20
 #define ERR_OUT_OF_MEMORY 21
+#define ERR_SWITCHOVER_INCOMPLETE 22
+#define ERR_FOLLOW_FAIL 23
+#define ERR_REJOIN_FAIL 24

 #endif							/* _ERRCODE_H_ */
--- a/expected/repmgr_extension.out
+++ b/expected/repmgr_extension.out
@@ -38,33 +38,27 @@ SELECT repmgr.am_bdr_failover_handler(-1);
 
 (1 row)

+SELECT repmgr.am_bdr_failover_handler(NULL);
+ am_bdr_failover_handler 
+-------------------------
+ 
+(1 row)
+
 SELECT repmgr.get_new_primary();
 get_new_primary 
 -----------------
                
 (1 row)

-SELECT repmgr.get_voting_status();
- get_voting_status 
-------------------
-                  
-(1 row)
-
 SELECT repmgr.notify_follow_primary(-1);
 notify_follow_primary 
 -----------------------
 
 (1 row)

-SELECT repmgr.other_node_is_candidate(-1,-1);
- other_node_is_candidate 
-------------------------
- 
-(1 row)
-
-SELECT repmgr.request_vote(-1,-1);
- request_vote 
--------------
+SELECT repmgr.notify_follow_primary(NULL);
+ notify_follow_primary 
+-----------------------
 
 (1 row)

@@ -80,10 +74,10 @@ SELECT repmgr.set_local_node_id(-1);
 
 (1 row)

-SELECT repmgr.set_voting_status_initiated();
- set_voting_status_initiated 
-----------------------------
-                            
+SELECT repmgr.set_local_node_id(NULL);
+ set_local_node_id 
+-------------------
+ 
 (1 row)

 SELECT repmgr.standby_get_last_updated();
--- a/log.c
+++ b/log.c
@@ -1,6 +1,6 @@
 /*
 * log.c - Logging methods
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
--- a/log.h
+++ b/log.h
@@ -1,6 +1,6 @@
 /*
 * log.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
--- a/repmgr--4.0.sql
+++ b/repmgr--4.0.sql
@@ -6,7 +6,7 @@ CREATE TABLE repmgr.nodes (
  upstream_node_id INTEGER     NULL REFERENCES nodes (node_id) DEFERRABLE,
  active           BOOLEAN     NOT NULL DEFAULT TRUE,
  node_name        TEXT        NOT NULL,
-  type             TEXT        NOT NULL CHECK (type IN('primary','standby','bdr')),
+  type             TEXT        NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
  location         TEXT        NOT NULL DEFAULT 'default',
  priority         INT         NOT NULL DEFAULT 100,
  conninfo         TEXT        NOT NULL,
@@ -79,6 +79,19 @@ LEFT JOIN repmgr.nodes un
       ON un.node_id = n.upstream_node_id;


+/* XXX update upgrade scripts! */
+CREATE TABLE repmgr.voting_term (
+  term INT NOT NULL
+);
+
+CREATE UNIQUE INDEX voting_term_restrict
+ON repmgr.voting_term ((TRUE));
+
+CREATE RULE voting_term_delete AS
+   ON DELETE TO repmgr.voting_term
+   DO INSTEAD NOTHING;
+
+
 /* ================= */
 /* repmgrd functions */
 /* ================= */
@@ -90,6 +103,11 @@ CREATE FUNCTION set_local_node_id(INT)
  AS 'MODULE_PATHNAME', 'set_local_node_id'
  LANGUAGE C STRICT;

+CREATE FUNCTION get_local_node_id()
+  RETURNS INT
+  AS 'MODULE_PATHNAME', 'get_local_node_id'
+  LANGUAGE C STRICT;
+
 CREATE FUNCTION standby_set_last_updated()
  RETURNS TIMESTAMP WITH TIME ZONE
  AS 'MODULE_PATHNAME', 'standby_set_last_updated'
@@ -102,49 +120,6 @@ CREATE FUNCTION standby_get_last_updated()

 /* failover functions */

-
-DO $repmgr$
-DECLARE
-  DECLARE server_version_num INT;
-BEGIN
-  SELECT setting
-    FROM pg_catalog.pg_settings
-   WHERE name = 'server_version_num'
-    INTO server_version_num;
-
-  IF server_version_num >= 90400 THEN
-    EXECUTE $repmgr_func$
-CREATE FUNCTION request_vote(INT,INT)
-  RETURNS pg_lsn
-  AS 'MODULE_PATHNAME', 'request_vote'
-  LANGUAGE C STRICT;
-    $repmgr_func$;
-  ELSE
-    EXECUTE $repmgr_func$
-CREATE FUNCTION request_vote(INT,INT)
-  RETURNS TEXT
-  AS 'MODULE_PATHNAME', 'request_vote'
-  LANGUAGE C STRICT;
-    $repmgr_func$;
-  END IF;
-END$repmgr$;
-
-
-CREATE FUNCTION get_voting_status()
-  RETURNS INT
-  AS 'MODULE_PATHNAME', 'get_voting_status'
-  LANGUAGE C STRICT;
-
-CREATE FUNCTION set_voting_status_initiated()
-  RETURNS INT
-  AS 'MODULE_PATHNAME', 'set_voting_status_initiated'
-  LANGUAGE C STRICT;
-
-CREATE FUNCTION other_node_is_candidate(INT, INT)
-  RETURNS BOOL
-  AS 'MODULE_PATHNAME', 'other_node_is_candidate'
-  LANGUAGE C STRICT;
-
 CREATE FUNCTION notify_follow_primary(INT)
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'notify_follow_primary'
@@ -160,13 +135,11 @@ CREATE FUNCTION reset_voting_status()
  AS 'MODULE_PATHNAME', 'reset_voting_status'
  LANGUAGE C STRICT;

-
 CREATE FUNCTION am_bdr_failover_handler(INT)
  RETURNS BOOL
  AS 'MODULE_PATHNAME', 'am_bdr_failover_handler'
  LANGUAGE C STRICT;

-
 CREATE FUNCTION unset_bdr_failover_handler()
  RETURNS VOID
  AS 'MODULE_PATHNAME', 'unset_bdr_failover_handler'
--- a/repmgr--unpackaged--4.0.sql
+++ b/repmgr--unpackaged--4.0.sql
@@ -32,7 +32,7 @@ CREATE TABLE repmgr.nodes (
  upstream_node_id INTEGER     NULL REFERENCES repmgr.nodes (node_id) DEFERRABLE,
  active           BOOLEAN     NOT NULL DEFAULT TRUE,
  node_name        TEXT        NOT NULL,
-  type             TEXT        NOT NULL CHECK (type IN('primary','standby','bdr')),
+  type             TEXT        NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
  location         TEXT        NOT NULL DEFAULT 'default',
  priority         INT         NOT NULL DEFAULT 100,
  conninfo         TEXT        NOT NULL,
@@ -54,8 +54,34 @@ SELECT id, upstream_node_id, active, name,

 ALTER TABLE repmgr.repl_events RENAME TO events;

+-- create new table "repmgr.voting_term"
+CREATE TABLE repmgr.voting_term (
+  term INT NOT NULL
+);
+
+CREATE UNIQUE INDEX voting_term_restrict
+ON repmgr.voting_term ((TRUE));
+
+CREATE RULE voting_term_delete AS
+   ON DELETE TO repmgr.voting_term
+   DO INSTEAD NOTHING;
+
+INSERT INTO repmgr.voting_term (term) VALUES (1);
+
+
 -- convert "repmgr_$cluster.repl_monitor" to "monitoring_history"

+
+DO $repmgr$
+DECLARE
+  DECLARE server_version_num INT;
+BEGIN
+  SELECT setting
+    FROM pg_catalog.pg_settings
+   WHERE name = 'server_version_num'
+    INTO server_version_num;
+  IF server_version_num >= 90400 THEN
+    EXECUTE $repmgr_func$
 CREATE TABLE repmgr.monitoring_history (
  primary_node_id                INTEGER NOT NULL,
  standby_node_id                INTEGER NOT NULL,
@@ -65,12 +91,32 @@ CREATE TABLE repmgr.monitoring_history (
  last_wal_standby_location      PG_LSN,
  replication_lag                BIGINT NOT NULL,
  apply_lag                      BIGINT NOT NULL
-);
+)
+    $repmgr_func$;
+    INSERT INTO repmgr.monitoring_history
+      (primary_node_id, standby_node_id, last_monitor_time,  last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
+    SELECT primary_node, standby_node, last_monitor_time,  last_apply_time, last_wal_primary_location::pg_lsn, last_wal_standby_location::pg_lsn, replication_lag, apply_lag
+      FROM repmgr.repl_monitor;
+  ELSE
+    EXECUTE $repmgr_func$
+CREATE TABLE repmgr.monitoring_history (
+  primary_node_id                INTEGER NOT NULL,
+  standby_node_id                INTEGER NOT NULL,
+  last_monitor_time              TIMESTAMP WITH TIME ZONE NOT NULL,
+  last_apply_time                TIMESTAMP WITH TIME ZONE,
+  last_wal_primary_location      TEXT NOT NULL,
+  last_wal_standby_location      TEXT,
+  replication_lag                BIGINT NOT NULL,
+  apply_lag                      BIGINT NOT NULL
+)
+    $repmgr_func$;
+    INSERT INTO repmgr.monitoring_history
+      (primary_node_id, standby_node_id, last_monitor_time,  last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
+    SELECT primary_node, standby_node, last_monitor_time,  last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag
+      FROM repmgr.repl_monitor;

-INSERT INTO repmgr.monitoring_history
-  (primary_node_id, standby_node_id, last_monitor_time,  last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
-SELECT primary_node, standby_node, last_monitor_time,  last_apply_time, last_wal_primary_location::pg_lsn, last_wal_standby_location::pg_lsn, replication_lag, apply_lag
-  FROM repmgr.repl_monitor;
+  END IF;
+END$repmgr$;

 CREATE INDEX idx_monitoring_history_time
          ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
@@ -95,6 +141,16 @@ LEFT JOIN repmgr.nodes un

 /* monitoring functions */

+CREATE FUNCTION set_local_node_id(INT)
+  RETURNS VOID
+  AS 'MODULE_PATHNAME', 'set_local_node_id'
+  LANGUAGE C STRICT;
+
+CREATE FUNCTION get_local_node_id()
+  RETURNS INT
+  AS 'MODULE_PATHNAME', 'get_local_node_id'
+  LANGUAGE C STRICT;
+
 CREATE FUNCTION standby_set_last_updated()
  RETURNS TIMESTAMP WITH TIME ZONE
  AS '$libdir/repmgr', 'standby_set_last_updated'
@@ -108,26 +164,6 @@ CREATE FUNCTION standby_get_last_updated()

 /* failover functions */

-CREATE FUNCTION request_vote(INT,INT)
-  RETURNS pg_lsn
-  AS '$libdir/repmgr', 'request_vote'
-  LANGUAGE C STRICT;
-
-CREATE FUNCTION get_voting_status()
-  RETURNS INT
-  AS '$libdir/repmgr', 'get_voting_status'
-  LANGUAGE C STRICT;
-
-CREATE FUNCTION set_voting_status_initiated()
-  RETURNS INT
-  AS '$libdir/repmgr', 'set_voting_status_initiated'
-  LANGUAGE C STRICT;
-
-CREATE FUNCTION other_node_is_candidate(INT, INT)
-  RETURNS BOOL
-  AS '$libdir/repmgr', 'other_node_is_candidate'
-  LANGUAGE C STRICT;
-
 CREATE FUNCTION notify_follow_primary(INT)
  RETURNS VOID
  AS '$libdir/repmgr', 'notify_follow_primary'
--- a/repmgr-action-bdr.c
+++ b/repmgr-action-bdr.c
@@ -1,9 +1,9 @@
 /*
- * repmgr-action-standby.c
+ * repmgr-action-bdr.c
 *
 * Implements BDR-related actions for the repmgr command line utility
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -28,7 +28,7 @@
 /*
 * do_bdr_register()
 *
- * As each BDR node is its own master, registering a BDR node
+ * As each BDR node is its own primary, registering a BDR node
 * will create the repmgr metadata schema if necessary.
 */
 void
@@ -92,7 +92,39 @@ do_bdr_register(void)
 		exit(ERR_BAD_CONFIG);
 	}

-	/* check whether repmgr extension exists, and that any other nodes are BDR */
+	/* check for a matching BDR node */
+	{
+		PQExpBufferData bdr_local_node_name;
+		bool		node_match = false;
+
+		initPQExpBuffer(&bdr_local_node_name);
+		node_match = bdr_node_name_matches(conn, config_file_options.node_name, &bdr_local_node_name);
+
+		if (node_match == false)
+		{
+			if (strlen(bdr_local_node_name.data))
+			{
+				log_error(_("local node BDR node name is \"%s\", expected: \"%s\""),
+						  bdr_local_node_name.data,
+						  config_file_options.node_name);
+				log_hint(_("\"node_name\" in repmgr.conf must match \"node_name\" in bdr.bdr_nodes"));
+			}
+			else
+			{
+				log_error(_("local node does not report BDR node name"));
+				log_hint(_("ensure this is an active BDR node"));
+			}
+
+			PQfinish(conn);
+			pfree(dbname);
+			termPQExpBuffer(&bdr_local_node_name);
+			exit(ERR_BAD_CONFIG);
+		}
+
+		termPQExpBuffer(&bdr_local_node_name);
+	}
+
+	/* check whether repmgr extension exists, and there are no non-BDR nodes registered */
 	extension_status = get_repmgr_extension_status(conn);

 	if (extension_status == REPMGR_UNKNOWN)
@@ -142,17 +174,9 @@ do_bdr_register(void)

 	pfree(dbname);

-	/* check for a matching BDR node */
+	if (bdr_node_has_repmgr_set(conn, config_file_options.node_name) == false)
 	{
-		bool		node_exists = bdr_node_exists(conn, config_file_options.node_name);
-
-		if (node_exists == false)
-		{
-			log_error(_("no BDR node with node_name \"%s\" found"), config_file_options.node_name);
-			log_hint(_("\"node_name\" in repmgr.conf must match \"node_name\" in bdr.bdr_nodes"));
-			PQfinish(conn);
-			exit(ERR_BAD_CONFIG);
-		}
+		bdr_node_set_repmgr_set(conn, config_file_options.node_name);
 	}

 	/*
--- a/repmgr-action-bdr.h
+++ b/repmgr-action-bdr.h
@@ -1,6 +1,6 @@
 /*
 * repmgr-action-bdr.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
--- a/repmgr-action-cluster.c
+++ b/repmgr-action-cluster.c
@@ -3,7 +3,7 @@
 *
 * Implements cluster information actions for the repmgr command line utility
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -82,6 +82,7 @@ do_cluster_show(void)
 	NodeInfoListCell *cell = NULL;
 	int			i = 0;
 	ItemList	warnings = {NULL, NULL};
+	bool		success = false;

 	/* Connect to local database to obtain cluster connection data */
 	log_verbose(LOG_INFO, _("connecting to database"));
@@ -91,11 +92,19 @@ do_cluster_show(void)
 	else
 		conn = establish_db_connection_by_params(&source_conninfo, true);

-	get_all_node_records_with_upstream(conn, &nodes);
+	success = get_all_node_records_with_upstream(conn, &nodes);
+
+	if (success == false)
+	{
+		/* get_all_node_records_with_upstream() will print error message */
+		PQfinish(conn);
+		exit(ERR_BAD_CONFIG);
+	}

 	if (nodes.node_count == 0)
 	{
-		log_error(_("unable to retrieve any node records"));
+		log_error(_("no node records were found"));
+		log_hint(_("ensure at least one node is registered"));
 		PQfinish(conn);
 		exit(ERR_BAD_CONFIG);
 	}
@@ -131,8 +140,14 @@ do_cluster_show(void)
 		}
 		else
 		{
+			char		error[MAXLEN];
+
+			strncpy(error, PQerrorMessage(cell->node_info->conn), MAXLEN);
 			cell->node_info->node_status = NODE_STATUS_DOWN;
 			cell->node_info->recovery_type = RECTYPE_UNKNOWN;
+			item_list_append_format(&warnings,
+									"when attempting to connect to node \"%s\" (ID: %i), the following error was encountered:\n\"%s\"",
+									cell->node_info->node_name, cell->node_info->node_id, trim(error));
 		}

 		initPQExpBuffer(&details);
@@ -158,15 +173,13 @@ do_cluster_show(void)
 									break;
 								case RECTYPE_STANDBY:
 									appendPQExpBuffer(&details, "! running as standby");
-									item_list_append_format(
-															&warnings,
+									item_list_append_format(&warnings,
 															"node \"%s\" (ID: %i) is registered as primary but running as standby",
 															cell->node_info->node_name, cell->node_info->node_id);
 									break;
 								case RECTYPE_UNKNOWN:
 									appendPQExpBuffer(&details, "! unknown");
-									item_list_append_format(
-															&warnings,
+									item_list_append_format(&warnings,
 															"node \"%s\" (ID: %i) has unknown replication status",
 															cell->node_info->node_name, cell->node_info->node_id);
 									break;
@@ -177,16 +190,14 @@ do_cluster_show(void)
 							if (cell->node_info->recovery_type == RECTYPE_PRIMARY)
 							{
 								appendPQExpBuffer(&details, "! running");
-								item_list_append_format(
-														&warnings,
+								item_list_append_format(&warnings,
 														"node \"%s\" (ID: %i) is running but the repmgr node record is inactive",
 														cell->node_info->node_name, cell->node_info->node_id);
 							}
 							else
 							{
 								appendPQExpBuffer(&details, "! running as standby");
-								item_list_append_format(
-														&warnings,
+								item_list_append_format(&warnings,
 														"node \"%s\" (ID: %i) is registered as an inactive primary but running as standby",
 														cell->node_info->node_name, cell->node_info->node_id);
 							}
@@ -199,8 +210,7 @@ do_cluster_show(void)
 						if (cell->node_info->active == true)
 						{
 							appendPQExpBuffer(&details, "? unreachable");
-							item_list_append_format(
-													&warnings,
+							item_list_append_format(&warnings,
 													"node \"%s\" (ID: %i) is registered as an active primary but is unreachable",
 													cell->node_info->node_name, cell->node_info->node_id);
 						}
@@ -226,8 +236,7 @@ do_cluster_show(void)
 									break;
 								case RECTYPE_PRIMARY:
 									appendPQExpBuffer(&details, "! running as primary");
-									item_list_append_format(
-															&warnings,
+									item_list_append_format(&warnings,
 															"node \"%s\" (ID: %i) is registered as standby but running as primary",
 															cell->node_info->node_name, cell->node_info->node_id);
 									break;
@@ -245,16 +254,14 @@ do_cluster_show(void)
 							if (cell->node_info->recovery_type == RECTYPE_STANDBY)
 							{
 								appendPQExpBuffer(&details, "! running");
-								item_list_append_format(
-														&warnings,
+								item_list_append_format(&warnings,
 														"node \"%s\" (ID: %i) is running but the repmgr node record is inactive",
 														cell->node_info->node_name, cell->node_info->node_id);
 							}
 							else
 							{
 								appendPQExpBuffer(&details, "! running as primary");
-								item_list_append_format(
-														&warnings,
+								item_list_append_format(&warnings,
 														"node \"%s\" (ID: %i) is running as primary but the repmgr node record is inactive",
 														cell->node_info->node_name, cell->node_info->node_id);
 							}
@@ -267,8 +274,7 @@ do_cluster_show(void)
 						if (cell->node_info->active == true)
 						{
 							appendPQExpBuffer(&details, "? unreachable");
-							item_list_append_format(
-													&warnings,
+							item_list_append_format(&warnings,
 													"node \"%s\" (ID: %i) is registered as an active standby but is unreachable",
 													cell->node_info->node_name, cell->node_info->node_id);
 						}
@@ -279,6 +285,7 @@ do_cluster_show(void)
 					}
 				}
 				break;
+			case WITNESS:
 			case BDR:
 				{
 					/* node is reachable */
@@ -415,7 +422,7 @@ do_cluster_show(void)
 		printf(_("\nWARNING: following issues were detected\n"));
 		for (cell = warnings.head; cell; cell = cell->next)
 		{
-			printf(_("  %s\n"), cell->string);
+			printf(_("  - %s\n"), cell->string);
 		}
 	}
 }
@@ -435,82 +442,18 @@ void
 do_cluster_event(void)
 {
 	PGconn	   *conn = NULL;
-	PQExpBufferData query;
-	PQExpBufferData where_clause;
 	PGresult   *res;
 	int			i = 0;
+	int			column_count = EVENT_HEADER_COUNT;

 	conn = establish_db_connection(config_file_options.conninfo, true);

-	initPQExpBuffer(&query);
-	initPQExpBuffer(&where_clause);
-
-	/* LEFT JOIN used here as a node record may have been removed */
-	appendPQExpBuffer(
-					  &query,
-					  "   SELECT e.node_id, n.node_name, e.event, e.successful, \n"
-					  "          TO_CHAR(e.event_timestamp, 'YYYY-MM-DD HH24:MI:SS') AS timestamp, \n"
-					  "          e.details \n"
-					  "     FROM repmgr.events e \n"
-					  "LEFT JOIN repmgr.nodes n ON e.node_id = n.node_id ");
-
-	if (runtime_options.node_id != UNKNOWN_NODE_ID)
-	{
-
-		append_where_clause(&where_clause,
-							"n.node_id=%i", runtime_options.node_id);
-	}
-	else if (runtime_options.node_name[0] != '\0')
-	{
-		char	   *escaped = escape_string(conn, runtime_options.node_name);
-
-		if (escaped == NULL)
-		{
-			log_error(_("unable to escape value provided for node name"));
-		}
-		else
-		{
-			append_where_clause(&where_clause,
-								"n.node_name='%s'",
-								escaped);
-			pfree(escaped);
-		}
-	}
-
-	if (runtime_options.event[0] != '\0')
-	{
-		char	   *escaped = escape_string(conn, runtime_options.event);
-
-		if (escaped == NULL)
-		{
-			log_error(_("unable to escape value provided for event"));
-		}
-		else
-		{
-			append_where_clause(&where_clause,
-								"e.event='%s'",
-								escaped);
-			pfree(escaped);
-		}
-	}
-
-	appendPQExpBuffer(&query, "\n%s\n",
-					  where_clause.data);
-
-	appendPQExpBuffer(&query,
-					  " ORDER BY e.event_timestamp DESC");
-
-	if (runtime_options.all == false && runtime_options.limit > 0)
-	{
-		appendPQExpBuffer(&query, " LIMIT %i",
-						  runtime_options.limit);
-	}
-
-	log_debug("do_cluster_event():\n%s", query.data);
-	res = PQexec(conn, query.data);
-
-	termPQExpBuffer(&query);
-	termPQExpBuffer(&where_clause);
+	res = get_event_records(conn,
+							runtime_options.node_id,
+							runtime_options.node_name,
+							runtime_options.event,
+							runtime_options.all,
+							runtime_options.limit);

 	if (PQresultStatus(res) != PGRES_TUPLES_OK)
 	{
@@ -537,7 +480,11 @@ do_cluster_event(void)
 	strncpy(headers_event[EV_TIMESTAMP].title, _("Timestamp"), MAXLEN);
 	strncpy(headers_event[EV_DETAILS].title, _("Details"), MAXLEN);

-	for (i = 0; i < EVENT_HEADER_COUNT; i++)
+	/* if --terse provided, simply omit the "Details" column */
+	if (runtime_options.terse == true)
+		column_count--;
+
+	for (i = 0; i < column_count; i++)
 	{
 		headers_event[i].max_length = strlen(headers_event[i].title);
 	}
@@ -546,7 +493,7 @@ do_cluster_event(void)
 	{
 		int			j;

-		for (j = 0; j < EVENT_HEADER_COUNT; j++)
+		for (j = 0; j < column_count; j++)
 		{
 			headers_event[j].cur_length = strlen(PQgetvalue(res, i, j));
 			if (headers_event[j].cur_length > headers_event[j].max_length)
@@ -557,7 +504,7 @@ do_cluster_event(void)

 	}

-	for (i = 0; i < EVENT_HEADER_COUNT; i++)
+	for (i = 0; i < column_count; i++)
 	{
 		if (i == 0)
 			printf(" ");
@@ -570,14 +517,14 @@ do_cluster_event(void)
 	}
 	printf("\n");
 	printf("-");
-	for (i = 0; i < EVENT_HEADER_COUNT; i++)
+	for (i = 0; i < column_count; i++)
 	{
 		int			j;

 		for (j = 0; j < headers_event[i].max_length; j++)
 			printf("-");

-		if (i < (EVENT_HEADER_COUNT - 1))
+		if (i < (column_count - 1))
 			printf("-+-");
 		else
 			printf("-");
@@ -590,13 +537,13 @@ do_cluster_event(void)
 		int			j;

 		printf(" ");
-		for (j = 0; j < EVENT_HEADER_COUNT; j++)
+		for (j = 0; j < column_count; j++)
 		{
 			printf("%-*s",
 				   headers_event[j].max_length,
 				   PQgetvalue(res, i, j));

-			if (j < (EVENT_HEADER_COUNT - 1))
+			if (j < (column_count - 1))
 				printf(" | ");
 		}

@@ -1017,8 +964,7 @@ build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, int *name_length)

 		initPQExpBuffer(&command_output);

-		(void) remote_command(
-							  host,
+		(void) remote_command(host,
 							  runtime_options.remote_user,
 							  command.data,
 							  &command_output);
@@ -1197,13 +1143,12 @@ build_cluster_crosscheck(t_node_status_cube ***dest_cube, int *name_length)
 		/* fix to work with --node-id */
 		if (cube[i]->node_id == config_file_options.node_id)
 		{
-			(void) local_command(
-								 command.data,
-								 &command_output);
+			(void) local_command_simple(command.data,
+										&command_output);
 		}
 		else
 		{
-			t_conninfo_param_list remote_conninfo;
+			t_conninfo_param_list remote_conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER;
 			char	   *host = NULL;
 			PQExpBufferData quoted_command;

@@ -1223,8 +1168,7 @@ build_cluster_crosscheck(t_node_status_cube ***dest_cube, int *name_length)

 			log_verbose(LOG_DEBUG, "build_cluster_crosscheck(): executing\n  %s", quoted_command.data);

-			(void) remote_command(
-								  host,
+			(void) remote_command(host,
 								  runtime_options.remote_user,
 								  quoted_command.data,
 								  &command_output);
@@ -1323,7 +1267,7 @@ do_cluster_cleanup(void)

 	conn = establish_db_connection(config_file_options.conninfo, true);

-	/* check if there is a master in this cluster */
+	/* check if there is a primary in this cluster */
 	log_info(_("connecting to primary server"));
 	primary_conn = establish_primary_db_connection(conn, true);

--- a/repmgr-action-cluster.h
+++ b/repmgr-action-cluster.h
@@ -1,6 +1,6 @@
 /*
 * repmgr-action-cluster.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
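The `--terse` handling added to `do_cluster_event()` in `repmgr-action-cluster.c` above works because "Details" is the last column: decrementing `column_count` once is enough to drop it from the header, separator and row loops. A minimal, self-contained sketch of that idea follows. The helper names `visible_column_count()` and `format_row()` are hypothetical and not part of repmgr, and the value 6 for `EVENT_HEADER_COUNT` is an assumption based on the six columns selected by the removed query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumption: six columns, as selected by the query removed in the diff */
#define EVENT_HEADER_COUNT 6

/* Hypothetical helper: with --terse, omit the final ("Details") column */
static int
visible_column_count(bool terse)
{
	int			column_count = EVENT_HEADER_COUNT;

	if (terse == true)
		column_count--;

	return column_count;
}

/*
 * Hypothetical helper: write the first column_count cells into buf,
 * left-justified to the given widths and separated by " | ".
 */
static void
format_row(char *buf, size_t buflen, const char *cells[],
		   const int widths[], int column_count)
{
	size_t		used = 0;
	int			j;

	assert(buf != NULL && buflen > 0);
	buf[0] = '\0';

	for (j = 0; j < column_count; j++)
	{
		/* same "%-*s" padding as the diff; no separator after the last cell */
		int			n = snprintf(buf + used, buflen - used, "%-*s%s",
								 widths[j], cells[j],
								 (j < column_count - 1) ? " | " : "");

		if (n < 0 || (size_t) n >= buflen - used)
			return;				/* output truncated; stop here */

		used += (size_t) n;
	}
}
```

Because every loop is bounded by the same `column_count`, the width calculation never inspects the omitted column, so a long "Details" value cannot widen the terse output.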
--- a/repmgr-action-node.c
+++ b/repmgr-action-node.c
--- a/repmgr-action-node.h
+++ b/repmgr-action-node.h
@@ -1,6 +1,6 @@
 /*
 * repmgr-action-node.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -22,7 +22,6 @@
 extern void do_node_status(void);
 extern void do_node_check(void);

-
 extern void do_node_rejoin(void);
 extern void do_node_service(void);

--- a/repmgr-action-primary.c
+++ b/repmgr-action-primary.c
@@ -3,7 +3,7 @@
 *
 * Implements primary actions for the repmgr command line utility
 *
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -74,7 +74,11 @@ do_primary_register(void)

 	log_verbose(LOG_INFO, _("server is not in recovery"));

-	/* create the repmgr extension if it doesn't already exist */
+	/*
+	 * create the repmgr extension if it doesn't already exist;
+	 * note that create_repmgr_extension() will take into account
+	 * the --dry-run option
+	 */
 	if (!create_repmgr_extension(conn))
 	{
 		PQfinish(conn);
@@ -92,6 +96,7 @@ do_primary_register(void)
 		return;
 	}

+	initialize_voting_term(conn);

 	/* Ensure there isn't another registered node which is primary */
 	primary_conn = get_primary_connection(conn, &current_primary_id, NULL);
@@ -543,7 +548,8 @@ do_primary_help(void)
 	printf(_("  \"primary unregister\" unregisters an inactive primary node.\n"));
 	puts("");
 	printf(_("  --dry-run                           check what would happen, but don't actually unregister the primary\n"));
-	printf(_("  -F, --force                         force removal of the record\n"));
+	printf(_("  --node-id                           ID of the inactive primary node to unregister.\n"));
+	printf(_("  -F, --force                         force removal of an active record\n"));

 	puts("");

--- a/repmgr-action-primary.h
+++ b/repmgr-action-primary.h
@@ -1,6 +1,6 @@
 /*
 * repmgr-action-primary.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
--- a/repmgr-action-standby.c
+++ b/repmgr-action-standby.c
--- a/repmgr-action-standby.h
+++ b/repmgr-action-standby.h
@@ -1,6 +1,6 @@
 /*
 * repmgr-action-standby.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -28,7 +28,7 @@ extern void do_standby_switchover(void);

 extern void do_standby_help(void);

-extern bool do_standby_follow_internal(PGconn *primary_conn, t_node_info *primary_node_record, PQExpBufferData *output);
+extern bool do_standby_follow_internal(PGconn *primary_conn, t_node_info *primary_node_record, PQExpBufferData *output, int *error_code);



--- a/repmgr-action-witness.c
+++ b/repmgr-action-witness.c
@@ -0,0 +1,465 @@
+/*
+ * repmgr-action-witness.c
+ *
+ * Implements witness actions for the repmgr command line utility
+ *
+ * Copyright (c) 2ndQuadrant, 2010-2018
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <sys/stat.h>
+
+#include "repmgr.h"
+#include "dirutil.h"
+#include "compat.h"
+#include "controldata.h"
+
+#include "repmgr-client-global.h"
+#include "repmgr-action-witness.h"
+
+static char		repmgr_user[MAXLEN];
+static char		repmgr_db[MAXLEN];
+
+void
+do_witness_register(void)
+{
+	PGconn	   *witness_conn = NULL;
+	PGconn	   *primary_conn = NULL;
+	RecoveryType recovery_type = RECTYPE_UNKNOWN;
+	NodeInfoList nodes = T_NODE_INFO_LIST_INITIALIZER;
+	t_node_info node_record = T_NODE_INFO_INITIALIZER;
+	RecordStatus record_status = RECORD_NOT_FOUND;
+	bool		record_created = false;
+
+	log_info(_("connecting to witness node \"%s\" (ID: %i)"),
+			 config_file_options.node_name,
+			 config_file_options.node_id);
+
+	witness_conn = establish_db_connection_quiet(config_file_options.conninfo);
+
+	if (PQstatus(witness_conn) != CONNECTION_OK)
+	{
+		log_error(_("unable to connect to witness node \"%s\" (ID: %i)"),
+				  config_file_options.node_name,
+				  config_file_options.node_id);
+		log_detail("%s",
+				   PQerrorMessage(witness_conn));
+		log_hint(_("the witness node must be running before it can be registered"));
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/* check witness node's recovery type */
+	recovery_type = get_recovery_type(witness_conn);
+
+	if (recovery_type == RECTYPE_STANDBY)
+	{
+		log_error(_("provided node is a standby"));
+		log_hint(_("a witness node must run on an independent primary server"));
+
+		PQfinish(witness_conn);
+
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/* check that witness node is not a BDR node */
+	if (is_bdr_db_quiet(witness_conn) == true)
+	{
+		log_error(_("witness node is a BDR node"));
+		log_hint(_("a witness node cannot be configured for a BDR cluster"));
+
+		PQfinish(witness_conn);
+
+		exit(ERR_BAD_CONFIG);
+	}
+
+
+	/* connect to primary with provided parameters */
+	log_info(_("connecting to primary node"));
+
+	/*
+	 * Extract the repmgr user and database names from the conninfo string
+	 * provided in repmgr.conf
+	 */
+	get_conninfo_value(config_file_options.conninfo, "user", repmgr_user);
+	get_conninfo_value(config_file_options.conninfo, "dbname", repmgr_db);
+
+	param_set_ine(&source_conninfo, "user", repmgr_user);
+	param_set_ine(&source_conninfo, "dbname", repmgr_db);
+
+	/* We need to connect to check configuration and copy it */
+	primary_conn = establish_db_connection_by_params(&source_conninfo, false);
+
+	if (PQstatus(primary_conn) != CONNECTION_OK)
+	{
+		log_error(_("unable to connect to the primary node"));
+		log_hint(_("a primary node must be configured before registering a witness node"));
+
+		PQfinish(witness_conn);
+
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/* check primary node's recovery type */
+	recovery_type = get_recovery_type(primary_conn);
+
+	if (recovery_type == RECTYPE_STANDBY)
+	{
+		log_error(_("provided primary node is a standby"));
+		log_hint(_("provide the connection details of the cluster's primary server"));
+
+		PQfinish(witness_conn);
+		PQfinish(primary_conn);
+
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/* check that primary node is not a BDR node */
+	if (is_bdr_db_quiet(primary_conn) == true)
+	{
+		log_error(_("primary node is a BDR node"));
+		log_hint(_("a witness node cannot be configured for a BDR cluster"));
+
+		PQfinish(witness_conn);
+		PQfinish(primary_conn);
+
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/*
+	 * TODO: sanity check witness node is not part of main cluster; we could
+	 * add a random application_name to the respective connections,
+	 * and do a simple check of pg_stat_activity
+	 */
+
+	/* create repmgr extension, if it does not exist */
+	if (runtime_options.dry_run == false && !create_repmgr_extension(witness_conn))
+	{
+		PQfinish(witness_conn);
+		PQfinish(primary_conn);
+
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/*
+	 * check if node record exists on primary, overwrite if -F/--force provided,
+	 * otherwise exit with error
+	 */
+
+	record_status = get_node_record(primary_conn,
+									config_file_options.node_id,
+									&node_record);
+
+	if (record_status == RECORD_FOUND)
+	{
+		/*
+		 * If node is not a witness, cowardly refuse to do anything, let the
+		 * user work out what's the correct thing to do.
+		 */
+		if (node_record.type != WITNESS)
+		{
+			log_error(_("node \"%s\" (ID: %i) is already registered as a %s node"),
+					  config_file_options.node_name,
+					  config_file_options.node_id,
+					  get_node_type_string(node_record.type));
+			log_hint(_("use \"repmgr %s unregister\" to remove a non-witness node record"),
+					 get_node_type_string(node_record.type));
+
+			PQfinish(witness_conn);
+			PQfinish(primary_conn);
+
+			exit(ERR_BAD_CONFIG);
+		}
+
+		if (!runtime_options.force)
+		{
+			log_error(_("witness node is already registered"));
+			log_hint(_("use option -F/--force to reregister the node"));
+
+			PQfinish(witness_conn);
+			PQfinish(primary_conn);
+
+			exit(ERR_BAD_CONFIG);
+		}
+	}
+
+
+	/* XXX check that no other node with the same name exists */
+
+	/*
+	 * if repmgr.nodes contains entries, delete if -F/--force provided,
+	 * otherwise exit with error
+	 */
+	get_all_node_records(witness_conn, &nodes);
+
+	log_verbose(LOG_DEBUG, "%i node records found", nodes.node_count);
+
+	if (nodes.node_count > 0)
+	{
+		if (!runtime_options.force)
+		{
+			log_error(_("witness node is already initialised and contains node records"));
+			log_hint(_("use option -F/--force to reinitialise the node"));
+			PQfinish(primary_conn);
+			PQfinish(witness_conn);
+			exit(ERR_BAD_CONFIG);
+		}
+	}
+
+	clear_node_info_list(&nodes);
+
+	if (runtime_options.dry_run == true)
+	{
+		log_info(_("prerequisites for registering the witness node are met"));
+		PQfinish(primary_conn);
+		PQfinish(witness_conn);
+		exit(SUCCESS);
+	}
+
+	/*
+	 * create the node record on the primary, or update it if one already
+	 * exists (in which case we have already established that -F/--force
+	 * is in use)
+	 */
+
+	init_node_record(&node_record);
+
+	/* these values are mandatory, setting them to anything else has no point */
+	node_record.type = WITNESS;
+	node_record.priority = 0;
+	node_record.upstream_node_id = get_primary_node_id(primary_conn);
+
+	if (record_status == RECORD_FOUND)
+	{
+		record_created = update_node_record(primary_conn,
+											"witness register",
+											&node_record);
+	}
+	else
+	{
+		record_created = create_node_record(primary_conn,
+											"witness register",
+											&node_record);
+	}
+
+	if (record_created == false)
+	{
+		log_error(_("unable to create or update node record on primary"));
+		PQfinish(primary_conn);
+		PQfinish(witness_conn);
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/* sync records from primary */
+	if (witness_copy_node_records(primary_conn, witness_conn) == false)
+	{
+		log_error(_("unable to copy repmgr node records from primary"));
+		PQfinish(primary_conn);
+		PQfinish(witness_conn);
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/* create event */
+	create_event_record(primary_conn,
+						&config_file_options,
+						config_file_options.node_id,
+						"witness_register",
+						true,
+						NULL);
+
+	PQfinish(primary_conn);
+	PQfinish(witness_conn);
+
+	log_info(_("witness registration complete"));
+	log_notice(_("witness node \"%s\" (ID: %i) successfully registered"),
+			   config_file_options.node_name, config_file_options.node_id);
+
+	return;
+}
+
+
+void
+do_witness_unregister(void)
+{
+	PGconn	   *witness_conn = NULL;
+	PGconn	   *primary_conn = NULL;
+	t_node_info node_record = T_NODE_INFO_INITIALIZER;
+	RecordStatus record_status = RECORD_NOT_FOUND;
+	bool		node_record_deleted = false;
+	bool		witness_available = true;
+
+	log_info(_("connecting to witness node \"%s\" (ID: %i)"),
+			 config_file_options.node_name,
+			 config_file_options.node_id);
+
+	witness_conn = establish_db_connection_quiet(config_file_options.conninfo);
+
+	if (PQstatus(witness_conn) != CONNECTION_OK)
+	{
+		if (!runtime_options.force)
+		{
+			log_error(_("unable to connect to witness node \"%s\" (ID: %i)"),
+					  config_file_options.node_name,
+					  config_file_options.node_id);
+			log_detail("%s", PQerrorMessage(witness_conn));
+			log_hint(_("provide -F/--force to remove the witness record if the server is not running"));
+			exit(ERR_BAD_CONFIG);
+		}
+
+		log_notice(_("unable to connect to witness node \"%s\" (ID: %i), removing node record on cluster primary only"),
+				   config_file_options.node_name,
+				   config_file_options.node_id);
+		witness_available = false;
+	}
+
+	if (witness_available == true)
+	{
+		primary_conn = get_primary_connection_quiet(witness_conn, NULL, NULL);
+	}
+	else
+	{
+		/*
+		 * Extract the repmgr user and database names from the conninfo string
+		 * provided in repmgr.conf
+		 */
+		get_conninfo_value(config_file_options.conninfo, "user", repmgr_user);
+		get_conninfo_value(config_file_options.conninfo, "dbname", repmgr_db);
+
+		param_set_ine(&source_conninfo, "user", repmgr_user);
+		param_set_ine(&source_conninfo, "dbname", repmgr_db);
+
+		primary_conn = establish_db_connection_by_params(&source_conninfo, false);
+
+	}
+
+	if (PQstatus(primary_conn) != CONNECTION_OK)
+	{
+		log_error(_("unable to connect to primary"));
+		log_detail("%s", PQerrorMessage(primary_conn));
+
+		if (witness_available == true)
+		{
+			PQfinish(witness_conn);
+		}
+		else
+		{
+			log_hint(_("provide connection details to primary server"));
+		}
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/* Check node exists and is really a witness */
+	record_status = get_node_record(primary_conn, config_file_options.node_id, &node_record);
+
+	if (record_status != RECORD_FOUND)
+	{
+		log_error(_("no record found for node %i"), config_file_options.node_id);
+
+		if (witness_available == true)
+			PQfinish(witness_conn);
+		PQfinish(primary_conn);
+
+		exit(ERR_BAD_CONFIG);
+	}
+
+	if (node_record.type != WITNESS)
+	{
+		log_error(_("node %i is not a witness node"), config_file_options.node_id);
+		log_detail(_("node %i is a %s node"), config_file_options.node_id, get_node_type_string(node_record.type));
+
+		if (witness_available == true)
+			PQfinish(witness_conn);
+		PQfinish(primary_conn);
+
+		exit(ERR_BAD_CONFIG);
+	}
+
+	if (runtime_options.dry_run == true)
+	{
+		log_info(_("prerequisites for unregistering the witness node are met"));
+		if (witness_available == true)
+			PQfinish(witness_conn);
+		PQfinish(primary_conn);
+
+		exit(SUCCESS);
+	}
+
+	log_info(_("unregistering witness node %i"), config_file_options.node_id);
+	node_record_deleted = delete_node_record(primary_conn,
+										     config_file_options.node_id);
+
+	if (node_record_deleted == false)
+	{
+		PQfinish(primary_conn);
+		PQfinish(witness_conn);
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/* sync records from primary */
+	if (witness_available == true && witness_copy_node_records(primary_conn, witness_conn) == false)
+	{
+		log_error(_("unable to copy repmgr node records from primary"));
+		PQfinish(primary_conn);
+		PQfinish(witness_conn);
+		exit(ERR_BAD_CONFIG);
+	}
+
+	/* Log the event */
+	create_event_record(primary_conn,
+						&config_file_options,
+						config_file_options.node_id,
+						"witness_unregister",
+						true,
+						NULL);
+
+	PQfinish(primary_conn);
+
+	if (witness_available == true)
+		PQfinish(witness_conn);
+
+	log_info(_("witness unregistration complete"));
+	log_detail(_("witness node with id %i (conninfo: %s) successfully unregistered"),
+			    config_file_options.node_id, config_file_options.conninfo);
+
+	return;
+}
+
+
+void do_witness_help(void)
+{
+	print_help_header();
+
+	printf(_("Usage:\n"));
+	printf(_("    %s [OPTIONS] witness register\n"), progname());
+	printf(_("    %s [OPTIONS] witness unregister\n"), progname());
+
+	printf(_("WITNESS REGISTER\n"));
+	puts("");
+	printf(_("  \"witness register\" registers a witness node.\n"));
+	puts("");
+	printf(_("  Requires provision of connection information for the primary\n"));
+	puts("");
+	printf(_("  --dry-run                           check prerequisites but don't make any changes\n"));
+	printf(_("  -F, --force                         overwrite an existing node record\n"));
+	puts("");
+
+	printf(_("WITNESS UNREGISTER\n"));
+	puts("");
+	printf(_("  \"witness unregister\" unregisters a witness node.\n"));
+	puts("");
+	printf(_("  --dry-run                           check prerequisites but don't make any changes\n"));
+	printf(_("  -F, --force                         unregister when witness node not running\n"));
+	puts("");
+
+	return;
+}
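The record checks in `do_witness_register()` above can be read as a small decision table: an existing non-witness record is always refused, an existing witness record or a non-empty `repmgr.nodes` table on the witness requires `-F/--force`, and otherwise the record is created or updated. The sketch below restates that logic as a pure function so the cases are easier to see. It is illustrative only: `decide_witness_register()` and the `sketch_`/`WITNESS_REG_` names are hypothetical and do not exist in repmgr.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for repmgr's node type enum */
typedef enum
{
	SKETCH_PRIMARY,
	SKETCH_STANDBY,
	SKETCH_WITNESS
} sketch_node_type;

/* Hypothetical outcome of the registration checks */
typedef enum
{
	WITNESS_REG_CREATE,
	WITNESS_REG_UPDATE,
	WITNESS_REG_REFUSE_OTHER_TYPE,
	WITNESS_REG_REFUSE_NEEDS_FORCE
} witness_reg_decision;

/*
 * record_found:         a record for this node ID exists on the primary
 * existing_type:        type of that record (ignored if record_found is false)
 * witness_record_count: rows already present in the witness's node table
 * force:                -F/--force was provided
 */
static witness_reg_decision
decide_witness_register(bool record_found, sketch_node_type existing_type,
						int witness_record_count, bool force)
{
	assert(witness_record_count >= 0);

	/* --force never overrides a record belonging to a non-witness node */
	if (record_found && existing_type != SKETCH_WITNESS)
		return WITNESS_REG_REFUSE_OTHER_TYPE;

	/* re-registering an existing witness requires --force */
	if (record_found && !force)
		return WITNESS_REG_REFUSE_NEEDS_FORCE;

	/* so does reinitialising a witness which already holds node records */
	if (witness_record_count > 0 && !force)
		return WITNESS_REG_REFUSE_NEEDS_FORCE;

	return record_found ? WITNESS_REG_UPDATE : WITNESS_REG_CREATE;
}
```

In the diff these checks run before the `--dry-run` exit, so a dry run reports the same refusals as a real registration without changing any records.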
--- a/repmgr-action-witness.h
+++ b/repmgr-action-witness.h
@@ -0,0 +1,27 @@
+/*
+ * repmgr-action-witness.h
+ * Copyright (c) 2ndQuadrant, 2010-2018
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _REPMGR_ACTION_WITNESS_H_
+#define _REPMGR_ACTION_WITNESS_H_
+
+extern void do_witness_register(void);
+extern void do_witness_unregister(void);
+
+extern void do_witness_help(void);
+
+#endif							/* _REPMGR_ACTION_WITNESS_H_ */
--- a/repmgr-client-global.h
+++ b/repmgr-client-global.h
@@ -1,6 +1,6 @@
 /*
 * repmgr-client-global.h
- * Copyright (c) 2ndQuadrant, 2010-2017
+ * Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -42,6 +42,7 @@ typedef struct
 	bool		force;
 	char		pg_bindir[MAXLEN];	/* overrides setting in repmgr.conf */
 	bool		wait;
+	bool		no_wait;

 	/* logging options */
 	char		log_level[MAXLEN];	/* overrides setting in repmgr.conf */
@@ -68,6 +69,7 @@ typedef struct
 	int			node_id;
 	char		node_name[MAXLEN];
 	char		data_dir[MAXPGPATH];
+	int			remote_node_id;

 	/* "standby clone" options */
 	bool		copy_external_config_files;
@@ -79,6 +81,7 @@ typedef struct
 	char		replication_user[MAXLEN];
 	char		upstream_conninfo[MAXLEN];
 	bool		without_barman;
+	bool		recovery_conf_only;

 	/* "standby clone"/"standby follow" options */
 	int			upstream_node_id;
@@ -86,10 +89,12 @@ typedef struct
 	/* "standby register" options */
 	bool		wait_register_sync;
 	int			wait_register_sync_seconds;
+	int			wait_start;

 	/* "standby switchover" options */
 	bool		always_promote;
-	bool		force_rewind;
+	bool		force_rewind_used;
+	char		force_rewind_path[MAXPGPATH];
 	bool		siblings_follow;

 	/* "node status" options */
@@ -101,6 +106,8 @@ typedef struct
 	bool		replication_lag;
 	bool		role;
 	bool		slots;
+	bool		has_passfile;
+	bool		replication_connection;

 	/* "node join" options */
 	char		config_files[MAXLEN];
@@ -128,30 +135,30 @@ typedef struct
 		/* configuration metadata */ \
 		false, false, false, false,	\
 		/* general configuration options */	\
-		"", false, false, "", false,	\
+		"", false, false, "", false, false,	\
 		/* logging options */ \
 		"", false, false, false, \
 		/* output options */ \
 		false, false, false,  \
 		/* database connection options */ \
-		"", "", "",	"",				  \
+		"", "", "",	"", \
 		/* other connection options */ \
-		"",	"",  \
-		/* node options */ \
-		UNKNOWN_NODE_ID, "", "", \
+		"",	"", \
+		/* general node options */ \
+		UNKNOWN_NODE_ID, "", "", UNKNOWN_NODE_ID, \
 		/* "standby clone" options */ \
 		false, CONFIG_FILE_SAMEPATH, false, false, false, "", "", "", \
-		false,  \
+		false, false, \
 		/* "standby clone"/"standby follow" options */ \
 		NO_UPSTREAM_NODE, \
 		/* "standby register" options */ \
-		false, 0, \
+		false, 0, DEFAULT_WAIT_START,   \
 		/* "standby switchover" options */ \
-		false, false, false, \
+		false, false, "", false,		   \
 		/* "node status" options */ \
 		false, \
 		/* "node check" options */ \
-		false, false, false, false, false, \
+		false, false, false, false, false, false, false, \
 		/* "node join" options */ \
 		"", \
 		/* "node service" options */ \
@@ -160,7 +167,7 @@ typedef struct
 		false, "", CLUSTER_EVENT_LIMIT,	\
 		/* "cluster cleanup" options */ \
 		0, \
-		/* Following options for internal use */ \
+		/* following options for internal use */ \
 		"/tmp", OM_TEXT	\
 }

@@ -177,6 +184,7 @@ typedef enum
 	ACTION_NONE,
 	ACTION_START,
 	ACTION_STOP,
+	ACTION_STOP_WAIT,
 	ACTION_RESTART,
 	ACTION_RELOAD,
 	ACTION_PROMOTE
@@ -202,6 +210,7 @@ extern void check_93_config(void);
 extern bool create_repmgr_extension(PGconn *conn);
 extern int	test_ssh_connection(char *host, char *remote_user);
 extern bool local_command(const char *command, PQExpBufferData *outputbuf);
+extern bool local_command_simple(const char *command, PQExpBufferData *outputbuf);

 extern standy_clone_mode get_standby_clone_mode(void);

@@ -222,7 +231,9 @@ extern void print_help_header(void);
 /* server control functions */
 extern void get_server_action(t_server_action action, char *script, char *data_dir);
 extern bool data_dir_required_for_action(t_server_action action);
+extern void get_node_config_directory(char *config_dir_buf);
 extern void get_node_data_directory(char *data_dir_buf);
 extern void init_node_record(t_node_info *node_record);
+extern bool can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *reason);

 #endif							/* _REPMGR_CLIENT_GLOBAL_H_ */
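The initializer macro changed in `repmgr-client-global.h` above is positional, so each new struct member (`no_wait`, `remote_node_id`, `force_rewind_path` and so on) needs a matching value inserted at exactly the right position. As a point of comparison only, the sketch below shows how C99 designated initializers express the same defaults for a cut-down struct. This is not how repmgr does it; every `sketch_`/`SKETCH_` name is hypothetical, and the value `-1` for the unknown node ID is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAXPGPATH 1024
#define SKETCH_UNKNOWN_NODE_ID (-1)	/* assumption: sentinel for "no node ID" */

/* Cut-down, hypothetical version of the runtime options struct */
typedef struct
{
	bool		force;
	bool		wait;
	bool		no_wait;
	int			node_id;
	int			remote_node_id;
	bool		force_rewind_used;
	char		force_rewind_path[SKETCH_MAXPGPATH];
} sketch_runtime_options;

/* Only non-zero defaults are named; omitted members are zero-initialised */
#define SKETCH_RUNTIME_OPTIONS_INITIALIZER { \
	.node_id = SKETCH_UNKNOWN_NODE_ID, \
	.remote_node_id = SKETCH_UNKNOWN_NODE_ID, \
	.force_rewind_path = "" \
}

static sketch_runtime_options
sketch_default_options(void)
{
	sketch_runtime_options opts = SKETCH_RUNTIME_OPTIONS_INITIALIZER;

	/* members not named in the initializer default to false/0 */
	assert(opts.no_wait == false);

	return opts;
}
```

With this form, adding a member in the middle of the struct does not shift the meaning of any existing initializer value; the trade-off is that it requires a C99 compiler.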