Update HISTORY

Patch Makefile from downstream
Per GitHub #282 Ref: https://anonscm.debian.org/cgit/pkg-postgresql/repmgr.git/tree/debian/patches/makefile-no-libs.patch
2026-03-23 15:16:29 +00:00 · 2017-05-29 11:43:30 +09:00 · 2017-05-22 22:26:27 +09:00 · 2017-05-22 22:26:21 +09:00 · 2017-05-22 22:26:15 +09:00 · 2017-05-22 22:25:53 +09:00
37 changed files with 6198 additions and 1750 deletions
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -2,7 +2,7 @@ License and Contributions
 =========================
 `repmgr` is licensed under the GPL v3.  All of its code and documentation is
-Copyright 2010-2016, 2ndQuadrant Limited.  See the files COPYRIGHT and LICENSE for
+Copyright 2010-2017, 2ndQuadrant Limited.  See the files COPYRIGHT and LICENSE for
 details.
 The development of repmgr has primarily been sponsored by 2ndQuadrant customers.
--- a/COPYRIGHT
+++ b/COPYRIGHT
@@ -1,4 +1,4 @@
-Copyright (c) 2010-2016, 2ndQuadrant Limited
+Copyright (c) 2010-2017, 2ndQuadrant Limited
 All rights reserved.
 This program is free software: you can redistribute it and/or modify
--- a/FAQ.md
+++ b/FAQ.md
@@ -137,6 +137,7 @@ General
  of events which includes servers removed from the replication cluster
  which no longer have an entry in the `repl_nodes` table.
 `repmgrd`
 ---------
@@ -151,6 +152,9 @@ General
  In `repmgr.conf`, set its priority to a value of 0 or less.
  Additionally, if `failover` is set to `manual`, the node will never
  be considered as a promotion candidate.
 - Does `repmgrd` support delayed standbys?
  `repmgrd` can monitor delayed standbys - those set up with
@@ -169,3 +173,11 @@ General
  Configure your system's `logrotate` service to do this; see example
  in README.md
 - I've recloned a failed master as a standby, but `repmgrd` refuses to start?
  Check you registered the standby after recloning. If unregistered, the standby
  cannot be considered as a promotion candidate even if `failover` is set to
  `automatic`, which is probably not what you want. `repmgrd` will start if
  `failover` is set to `manual` so the node's replication status can still
  be monitored, if desired.
--- a/HISTORY
+++ b/HISTORY
@@ -1,3 +1,77 @@
 3.3.2   2017-06-01
        Add support for PostgreSQL 10 (Ian)
        repmgr: ensure --replication-user option is honoured when passing database
          connection parameters as a conninfo string (Ian)
        repmgr: improve detection of pg_rewind on remote server (Ian)
        repmgr: add DETAIL log output for additional clarification of error messages (Ian)
        repmgr: suppress various spurious error messages in `standby follow` and
          `standby switchover` (Ian)
        repmgr: add missing `-P` option (Ian)
        repmgrd: monitoring statistic reporting fixes (Ian)
 3.3.1   2017-03-13
        repmgrd: prevent invalid apply lag value being written to the
          monitoring table (Ian)
        repmgrd: fix error in XLogRecPtr conversion when calculating
          monitoring statistics (Ian)
        repmgr: if replication slots in use, where possible delete slot on old
          upstream node after following new upstream (Ian)
        repmgr: improve logging of rsync actions (Ian)
        repmgr: improve `standby clone` when synchronous replication in use (Ian)
        repmgr: stricter checking of allowed node id values
        repmgr: enable `master register --force` when there is a foreign key
          dependency from a standby node (Ian)
 3.3     2016-12-27
        repmgr: always log to STDERR even if log facility defined (Ian)
        repmgr: add --log-to-file to log repmgr output to the defined
          log facility (Ian)
        repmgr: improve handling of command line parameter errors (Ian)
        repmgr: add option --upstream-conninfo to explicitly set
          'primary_conninfo' in recovery.conf (Ian)
        repmgr: enable a standby to be registered which isn't running (Ian)
        repmgr: enable `standby register --force` to update a node record
          with cascaded downstream node records (Ian)
        repmgr: add option `--no-conninfo-password` (Abhijit, Ian)
        repmgr: add initial support for PostgreSQL 10.0 (Ian)
        repmgr: escape values in primary_conninfo if needed (Ian)
 3.2.1   2016-10-24
        repmgr: require a valid repmgr cluster name unless -F/--force
          supplied (Ian)
        repmgr: check master server is registered with repmgr before
          cloning (Ian)
        repmgr: ensure data directory defaults to that of the source node (Ian)
        repmgr: various fixes to Barman cloning mode (Gianni, Ian)
        repmgr: fix `repmgr cluster crosscheck` output (Ian)
 3.2     2016-10-05
        repmgr: add support for cloning from a Barman backup (Gianni)
        repmgr: add commands `standby matrix` and `standby crosscheck` (Gianni)
        repmgr: suppress connection error display in `repmgr cluster show`
          unless `--verbose` supplied (Ian)
        repmgr: add commands `witness register` and `witness unregister` (Ian)
        repmgr: enable `standby unregister` / `witness unregister` to be
          executed for a node which is not running (Ian)
        repmgr: remove deprecated command line options --initdb-no-pwprompt and
           -l/--local-port (Ian)
        repmgr: before cloning with pg_basebackup, check that sufficient free
           walsenders are available (Ian)
        repmgr: add option `--wait-sync` for `standby register` which causes
           repmgr to wait for the registered node record to synchronise to
           the standby (Ian)
        repmgr: add option `--copy-external-config-files` for files outside
           of the data directory (Ian)
        repmgr: only require `wal_keep_segments` to be set in certain corner
           cases (Ian)
        repmgr: better support cloning from a node other than the one to
           stream from (Ian)
        repmgrd: add configuration options to override the default pg_ctl
           commands (Jarkko Oranen, Ian)
        repmgrd: don't start if node is inactive and failover=automatic (Ian)
        packaging: improve "repmgr-auto" Debian package (Gianni)
 3.1.5   2016-08-15
        repmgrd: in a failover situation, prevent endless looping when
          attempting to establish the status of a node with
--- a/Makefile
+++ b/Makefile
@@ -1,15 +1,16 @@
 #
 # Makefile
-# Copyright (c) 2ndQuadrant, 2010-2016
+# Copyright (c) 2ndQuadrant, 2010-2017
 HEADERS = $(wildcard *.h)
 repmgrd_OBJS = dbutils.o config.o repmgrd.o log.o strutil.o
-repmgr_OBJS = dbutils.o check_dir.o config.o repmgr.o log.o strutil.o dirmod.o
+repmgr_OBJS = dbutils.o check_dir.o config.o repmgr.o log.o strutil.o dirmod.o compat.o
 DATA = repmgr.sql uninstall_repmgr.sql
 REGRESS = repmgr_funcs repmgr_test
-PG_CPPFLAGS = -I$(libpq_srcdir)
+PG_CPPFLAGS = -I$(includedir_internal) -I$(libpq_srcdir)
 PG_LIBS     = $(libpq_pgport)
@@ -17,11 +18,11 @@ all: repmgrd repmgr
 	$(MAKE) -C sql
 repmgrd: $(repmgrd_OBJS)
-	$(CC) -o repmgrd $(CFLAGS) $(repmgrd_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS)
+	$(CC) -o repmgrd $(CFLAGS) $(repmgrd_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX)
 	$(MAKE) -C sql
 repmgr: $(repmgr_OBJS)
-	$(CC) -o repmgr $(CFLAGS) $(repmgr_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS)
+	$(CC) -o repmgr $(CFLAGS) $(repmgr_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX)
 # Make all objects depend on all include files. This is a bit of a
 # shotgun approach, but the codebase is small enough that a complete rebuild
@@ -87,10 +88,12 @@ PG_VERSION = $(shell pg_config --version | cut -d ' ' -f 2 | cut -d '.' -f 1,2)
 REPMGR_VERSION = $(shell grep REPMGR_VERSION version.h | cut -d ' ' -f 3 | cut -d '"' -f 2)
 PKGLIBDIR = $(shell pg_config --pkglibdir)
 SHAREDIR = $(shell pg_config --sharedir)
 PGBINDIR = /usr/lib/postgresql/$(PG_VERSION)/bin
 deb: repmgrd repmgr
-	mkdir -p ./debian/usr/bin
+	mkdir -p ./debian/usr/bin ./debian$(PGBINDIR)
-	cp repmgrd repmgr ./debian/usr/bin/
+	cp repmgrd repmgr ./debian$(PGBINDIR)
 	ln -s ../..$(PGBINDIR)/repmgr ./debian/usr/bin/repmgr
 	mkdir -p ./debian$(SHAREDIR)/contrib/
 	cp sql/repmgr_funcs.sql ./debian$(SHAREDIR)/contrib/
 	cp sql/uninstall_repmgr_funcs.sql ./debian$(SHAREDIR)/contrib/
--- a/README.md
+++ b/README.md
@@ -7,6 +7,8 @@ replication capabilities with utilities to set up standby servers, monitor
 replication, and perform administrative tasks such as failover or switchover
 operations.
 The current `repmgr` version (3.3) supports all PostgreSQL versions from
 9.3 to 9.6.
 Overview
 --------
@@ -119,7 +121,8 @@ views:
    status for each node
 The `repmgr` metadata schema can be stored in an existing database or in its own
-dedicated database.
+dedicated database. Note that the `repmgr` metadata schema cannot reside on a database
 server which is not part of the replication cluster managed by `repmgr`.
 A dedicated database superuser is required to own the meta-database as well as carry
 out administrative actions.
@@ -143,10 +146,27 @@ The `repmgr` tools must be installed on each server in the replication cluster.
 A dedicated system user for `repmgr` is *not* required; as many `repmgr` and
 `repmgrd` actions require direct access to the PostgreSQL data directory,
-it should be executed by the `postgres` user.
+these commands should be executed by the `postgres` user.
-Additionally, we recommend installing `rsync` and enabling passwordless
+Passwordless `ssh` connectivity between all servers in the replication cluster
-`ssh` connectivity between all servers in the replication cluster.
+is not required, but is necessary in the following cases:
 * if you need `repmgr` to copy configuration files from outside the PostgreSQL
  data directory
 * when using `rsync` to clone a standby
 * to perform switchover operations
 * when executing `repmgr cluster matrix` and `repmgr cluster crosscheck`
 In these cases `rsync` is required on all servers too.
 * * *
 > *TIP*: We recommend using a session multiplexer utility such as `screen` or
 > `tmux` when performing long-running actions (such as cloning a database)
 > on a remote server - this will ensure the `repmgr` action won't be prematurely
 > terminated if your `ssh` session to the server is interrupted or closed.
 * * *
 ### Packages
@@ -210,15 +230,29 @@ The configuration file will be searched for in the following locations:
 Note that if a file is explicitly specified with `-f/--config-file`, an error will
 be raised if it is not found or not readable and no attempt will be made to check
-default locations; this is to prevent `repmgr` reading the wrong file.
+default locations; this is to prevent `repmgr` unexpectedly reading the wrong file.
 For a full list of annotated configuration items, see the file `repmgr.conf.sample`.
 The following parameters in the configuration file can be overridden with
 command line options:
- `-L/--log-level`
+- `log_level` with `-L/--log-level`
- `-b/--pg_bindir`
+- `pg_bindir` with `-b/--pg_bindir`
 ### Logging
 By default `repmgr` and `repmgrd` will log directly to `STDERR`. For `repmgrd`
 we recommend capturing output in a logfile or using your system's log facility;
 see `repmgr.conf.sample` for details.
 As a command line utility, `repmgr` will log directly to the console by default
 (this is a change in behaviour from versions before 3.3, where it would always
 log to the same location as `repmgrd`). However in some circumstances, such as
 when `repmgr` is executed by `repmgrd` during a failover event, it makes sense to
 capture `repmgr`'s log output - this can be done by supplying the command-line
 option `--log-to-file` to `repmgr`.
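
 As a sketch, assuming `repmgrd` invokes `repmgr` through the `promote_command`
 and `follow_command` settings in `repmgr.conf` (these settings and the file
 path are not shown in this section and are used here for illustration only),
 the option could be added like this:

    promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'
    follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file'
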
 ### Command line options and environment variables
@@ -255,21 +289,21 @@ Setting up a simple replication cluster with repmgr
 The following section will describe how to set up a basic replication cluster
 with a master and a standby server using the `repmgr` command line tool.
 It is assumed PostgreSQL is installed on both servers in the cluster,
-`rsync` is available and password-less SSH connections are possible between
+`rsync` is available and passwordless SSH connections are possible between
 both servers.
 * * *
 > *TIP*: for testing `repmgr`, it's possible to use multiple PostgreSQL
 > instances running on different ports on the same computer, with
-> password-less SSH access to `localhost` enabled.
+> passwordless SSH access to `localhost` enabled.
 * * *
 ### PostgreSQL configuration
 On the master server, a PostgreSQL instance must be initialised and running.
-The following replication settings must be included in `postgresql.conf`:
+The following replication settings may need to be adjusted:
    # Enable replication connections; set this figure to at least one more
@@ -284,13 +318,6 @@ The following replication settings must be included in `postgresql.conf`:
    wal_level = 'hot_standby'
    # How much WAL to retain on the master to allow a temporarily
    # disconnected standby to catch up again. The larger this is, the
    # longer the standby can be disconnected. This is needed only in
    # 9.3; from 9.4, replication slots can be used instead (see below).
    wal_keep_segments = 5000
    # Enable read-only queries on a standby
    # (Note: this will be ignored on a master but we recommend including
    # it anyway)
@@ -305,6 +332,15 @@ The following replication settings must be included in `postgresql.conf`:
    # ignores archiving. Use something more sensible.
    archive_command = '/bin/true'
    # If cloning using rsync, or you have configured `pg_basebackup_options`
    # in `repmgr.conf` to include the setting `--xlog-method=fetch` (from
    # PostgreSQL 10 `--wal-method=fetch`), *and* you have not set
    # `restore_command` in `repmgr.conf` to fetch WAL files from another
    # source such as Barman, you'll need to set `wal_keep_segments` to a
    # high enough value to ensure that all WAL files generated while
    # the standby is being cloned are retained until the standby starts up.
    # wal_keep_segments = 5000
 * * *
@@ -355,7 +391,8 @@ least the following parameters:
 - `cluster`: an arbitrary name for the replication cluster; this must be identical
     on all nodes
- `node`: a unique integer identifying the node
+- `node`: a unique integer identifying the node; note this must be a positive
     32 bit signed integer between 1 and 2147483647
 - `node_name`: a unique string identifying the node; we recommend a name
     specific to the server (e.g. 'server_1'); avoid names indicating the
     current replication role like 'master' or 'standby' as the server's
@@ -375,6 +412,16 @@ to include this schema name, e.g.
    ALTER USER repmgr SET search_path TO repmgr_test, "$user", public;
 * * *
 > *TIP*: for Debian-based distributions we recommend explicitly setting
 > `pg_bindir` to the directory where `pg_ctl` and other binaries not in
 > the standard path are located. For PostgreSQL 9.5 this would be
 > `/usr/lib/postgresql/9.5/bin/`.
 * * *
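
 For instance, a `repmgr.conf` entry for a Debian system running PostgreSQL 9.5
 might look like the following (adjust the version in the path to match
 your installation):

    pg_bindir=/usr/lib/postgresql/9.5/bin/
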
 ### Initialise the master server
 To enable `repmgr` to support a replication cluster, the master node must
@@ -382,7 +429,7 @@ be registered with `repmgr`, which creates the `repmgr` database and adds
 a metadata record for the server:
    $ repmgr -f repmgr.conf master register
-    [2016-01-07 16:56:46] [NOTICE] master node correctly registered for cluster test with id 1 (conninfo: host=repmgr_node1 user=repmgr dbname=repmgr)
+    NOTICE: master node correctly registered for cluster test with id 1 (conninfo: host=repmgr_node1 user=repmgr dbname=repmgr)
 The metadata record looks like this:
@@ -409,19 +456,46 @@ the values `node`, `node_name` and `conninfo` adjusted accordingly, e.g.:
 Clone the standby with:
    $ repmgr -h repmgr_node1 -U repmgr -d repmgr -D /path/to/node2/data/ -f /etc/repmgr.conf standby clone
-    [2016-01-07 17:21:26] [NOTICE] destination directory '/path/to/node2/data/' provided
+    NOTICE: destination directory '/path/to/node2/data/' provided
-    [2016-01-07 17:21:26] [NOTICE] starting backup...
+    NOTICE: starting backup...
-    [2016-01-07 17:21:26] [HINT] this may take some time; consider using the -c/--fast-checkpoint option
+    HINT: this may take some time; consider using the -c/--fast-checkpoint option
    NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
-    [2016-01-07 17:21:28] [NOTICE] standby clone (using pg_basebackup) complete
+    NOTICE: standby clone (using pg_basebackup) complete
-    [2016-01-07 17:21:28] [NOTICE] you can now start your PostgreSQL server
+    NOTICE: you can now start your PostgreSQL server
-    [2016-01-07 17:21:28] [HINT] for example : pg_ctl -D /path/to/node2/data/ start
+    HINT: for example : pg_ctl -D /path/to/node2/data/ start
 This will clone the PostgreSQL data directory files from the master at `repmgr_node1`
 using PostgreSQL's `pg_basebackup` utility. A `recovery.conf` file containing the
 correct parameters to start streaming from this master server will be created
-automatically, and unless otherwise specified, the `postgresql.conf` and `pg_hba.conf`
+automatically.
-files will be copied from the master.
+
 Note that by default, any configuration files in the master's data directory will be
 copied to the standby. Typically these will be `postgresql.conf`, `postgresql.auto.conf`,
 `pg_hba.conf` and `pg_ident.conf`. These may require modification before the standby
 is started so it functions as desired.
 In some cases (e.g. on Debian or Ubuntu Linux installations), PostgreSQL's
 configuration files are located outside of the data directory and will
 not be copied by default. `repmgr` can copy these files, either to the same
 location on the standby server (provided appropriate directory and file permissions
 are available), or into the standby's data directory. This requires passwordless
 SSH access to the master server. Add the option `--copy-external-config-files`
 to the `repmgr standby clone` command; by default files will be copied to
 the same path as on the upstream server. To have them placed in the standby's
 data directory, specify `--copy-external-config-files=pgdata`, but note that
 any include directives in the copied files may need to be updated.
 *Caveat*: when copying external configuration files, `repmgr` will only be able
 to detect files which contain active settings. If a file is referenced by
 an include directive but is empty, only contains comments or contains
 settings which have not been activated, the file will not be copied.
 * * *
 > *TIP*: for reliable configuration file management we recommend using a
 > configuration management tool such as Ansible, Chef, Puppet or Salt.
 * * *
 Be aware that when initially cloning a standby, you will need to ensure
 that all required WAL files remain available while the cloning is taking
@@ -429,7 +503,8 @@ place. To ensure this happens when using the default `pg_basebackup` method,
 `repmgr` will set `pg_basebackup`'s `--xlog-method` parameter to `stream`,
 which will ensure all WAL files generated during the cloning process are
 streamed in parallel with the main backup. Note that this requires two
-replication connections to be available.
+replication connections to be available (`repmgr` will verify sufficient
 connections are available before attempting to clone).
 To override this behaviour, in `repmgr.conf` set `pg_basebackup`'s
 `--xlog-method` parameter to `fetch`:
@@ -441,6 +516,9 @@ See the `pg_basebackup` documentation for details:
    https://www.postgresql.org/docs/current/static/app-pgbasebackup.html
 > *NOTE*: From PostgreSQL 10, `pg_basebackup`'s `--xlog-method` parameter
 > has been renamed to `--wal-method`.
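
 A minimal sketch of the equivalent `repmgr.conf` entry for PostgreSQL 10 or
 later, assuming the option is passed via the `pg_basebackup_options` setting:

    pg_basebackup_options='--wal-method=fetch'
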
 Make any adjustments to the standby's PostgreSQL configuration files now,
 then start the server.
@@ -483,8 +561,8 @@ Connect to the master server and execute:
 Register the standby server with:
-    repmgr -f /etc/repmgr.conf standby register
+    $ repmgr -f /etc/repmgr.conf standby register
-    [2016-01-08 11:13:16] [NOTICE] standby node correctly registered for cluster test with id 2 (conninfo: host=repmgr_node2 user=repmgr dbname=repmgr)
+    NOTICE: standby node correctly registered for cluster test with id 2 (conninfo: host=repmgr_node2 user=repmgr dbname=repmgr)
 Connect to the standby server's `repmgr` database and check the `repl_nodes`
 table:
@@ -503,13 +581,117 @@ standby's upstream server is the replication cluster master. While of limited
 use in a simple master/standby replication cluster, this information is required
 to effectively manage cascading replication (see below).
 * * *
 > *TIP*: depending on your environment and workload, it may take some time for
 > the standby's node record to propagate from the master to the standby. Some
 > actions (such as starting `repmgrd`) require that the standby's node record
 > is present and up-to-date to function correctly - by providing the option
 > `--wait-sync` to the `repmgr standby register` command, `repmgr` will wait
 > until the record is synchronised before exiting. An optional timeout (in
 > seconds) can be added to this option (e.g. `--wait-sync=60`).
 * * *
 Under some circumstances you may wish to register a standby which is not
 yet running; this can be the case when using provisioning tools to create
 a complex replication cluster. In this case, by using the `-F/--force`
 option and providing the connection parameters to the master server,
 the standby can be registered.
 Similarly, with cascading replication it may be necessary to register
 a standby whose upstream node has not yet been registered - in this case,
 using `-F/--force` will result in the creation of an inactive placeholder
 record for the upstream node, which will however later need to be registered
 with the `-F/--force` option too.
 When used with `standby register`, care should be taken that use of the
 `-F/--force` option does not result in an incorrectly configured cluster.
 ### Using Barman to clone a standby
 `repmgr standby clone` also supports Barman, the Backup and
 Replication manager (http://www.pgbarman.org/), as a provider of both
 base backups and WAL files.
 Barman support provides the following advantages:
 - the master node does not need to perform a new backup every time a
  new standby is cloned;
 - a standby node can be disconnected for longer periods without losing
  the ability to catch up, and without causing accumulation of WAL
  files on the master node;
 - therefore, `repmgr` does not need to use replication slots, and the
  master node does not need to set `wal_keep_segments`.
 > *NOTE*: In view of the above, Barman support is incompatible with
 > the `use_replication_slots` setting in `repmgr.conf`.
 In order to enable Barman support for `repmgr standby clone`, you must
 ensure that:
 - the name of the server configured in Barman is equal to the
  `cluster_name` setting in `repmgr.conf`;
 - the `barman_server` setting in `repmgr.conf` is set to the SSH
  hostname of the Barman server;
 - the `restore_command` setting in `repmgr.conf` is configured to
  use a copy of the `barman-wal-restore` script shipped with the
  `barman-cli` package (see below);
 - the Barman catalogue includes at least one valid backup for this
  server.
 > *NOTE*: Barman support is automatically enabled if `barman_server`
 > is set. Normally it is a good practice to use Barman, for instance
 > when fetching a base backup while cloning a standby; in any case,
 > Barman mode can be disabled using the `--without-barman` command
 > line option.
 > *NOTE*: if you have a non-default SSH configuration on the Barman
 > server, e.g. using a port other than 22, then you can set those
 > parameters in a dedicated Host section in `~/.ssh/config`
 > corresponding to the value of `barman_server` in `repmgr.conf`. See
 > the "Host" section in `man 5 ssh_config` for more details.
 `barman-wal-restore` is a Python script provided by the Barman
 development team as part of the `barman-cli` package (Barman 2.0
 and later; for Barman 1.x the script is provided separately as
 `barman-wal-restore.py`).
 `restore_command` must then be set in `repmgr.conf` as follows:
    <script> <Barman hostname> <cluster_name> %f %p
 For instance, suppose that we have installed Barman on the `barmansrv`
 host, and that `barman-wal-restore` is located as an executable at
 `/usr/bin/barman-wal-restore`;  `repmgr.conf` should include the following
 lines:
    barman_server=barmansrv
    restore_command=/usr/bin/barman-wal-restore barmansrv test %f %p
 NOTE: to use a non-default Barman configuration file on the Barman server,
 specify this in `repmgr.conf` with `barman_config`:
    barman_config=/path/to/barman.conf
 Now we can clone a standby using the Barman server:
    $ repmgr -h node1 -d repmgr -D 9.5/main -f /etc/repmgr.conf standby clone
    NOTICE: destination directory '9.5/main' provided
    NOTICE: getting backup from Barman...
    NOTICE: standby clone (from Barman) complete
    NOTICE: you can now start your PostgreSQL server
    HINT: for example : pg_ctl -D 9.5/data start
    HINT: After starting the server, you need to register this standby with "repmgr standby register"
 Advanced options for cloning a standby
 --------------------------------------
-The above section demonstrates the simplest possible way to cloneb a standby
+The above section demonstrates the simplest possible way to clone a standby
-server. Depending on your circumstances, finer-grained controlover the cloning
+server. Depending on your circumstances, finer-grained control over the
-process may be necessary.
+cloning process may be necessary.
 ### pg_basebackup options when cloning a standby
@@ -522,11 +704,7 @@ so should be used with care.
 Further options can be passed to the `pg_basebackup` utility via
 the setting `pg_basebackup_options` in `repmgr.conf`. See the PostgreSQL
 documentation for more details of available options:
-<<<<<<< HEAD
- http://www.postgresql.org/docs/current/static/app-pgbasebackup.html
-=======
  https://www.postgresql.org/docs/current/static/app-pgbasebackup.html
->>>>>>> 72f9b0145afab1060dd1202c8f8937653c8b2e39
 ### Using rsync to clone a standby
@@ -544,34 +722,41 @@ and destination server as the contents of files existing on both servers need
 to be compared, meaning this method is not necessarily faster than making a
 fresh clone with `pg_basebackup`.
-### Dealing with PostgreSQL configuration files
+> *NOTE*: `barman-wal-restore` supports command line switches to
-
+> control parallelism (`--parallel=N`) and compression (`--bzip2`,
-By default, `repmgr` will attempt to copy the standard configuration files
+> `--gzip`).
 (`postgresql.conf`, `pg_hba.conf` and `pg_ident.conf`) even if they are located
 outside of the data directory (though currently they will be copied
 into the standby's data directory). To prevent this happening, when executing
 `repmgr standby clone` provide the `--ignore-external-config-files` option.
 If using `rsync` to clone a standby, additional control over which files
 not to transfer is possible by configuring `rsync_options` in `repmgr.conf`,
 which enables any valid `rsync` options to be passed to that command, e.g.:
    rsync_options='--exclude=postgresql.local.conf'
 ### Controlling `primary_conninfo` in `recovery.conf`
-`repmgr` will create the `primary_conninfo` setting in `recovery.conf` based
+The `primary_conninfo` setting in `recovery.conf` generated by `repmgr`
-on the connection parameters provided to `repmgr standby clone` and PostgreSQL's
+is generated from the following sources, in order of highest to lowest priority:
-standard connection defaults, including any environment variables set on the
+
-local node.
+- the upstream node's `conninfo` setting (as defined in the `repl_nodes` table)
 - the connection parameters provided to `repmgr standby clone`
 - PostgreSQL's standard connection defaults, including any environment variables
  set on the local node.
 To include specific connection parameters other than the standard host, port,
 username and database values (e.g. `sslmode`), include these in a `conninfo`-style
-tring passed to `repmgr` with `-d/--dbname` (see above for details), and/or set
+string passed to `repmgr` with `-d/--dbname` (see above for details), and/or set
 appropriate environment variables.
 Note that PostgreSQL will always set explicit defaults for `sslmode` and
-`sslcompression`.
+`sslcompression` (and from PostgreSQL 10.0 also `target_session_attrs`).
 If `application_name` is set in the standby's `conninfo` parameter in
 `repmgr.conf`, this value will be appended to `primary_conninfo`, otherwise
 `repmgr` will set `application_name` to the same value as the `node_name`
 parameter.
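
 As an illustration, a standby's `conninfo` entry in `repmgr.conf` with an
 explicit `application_name` might look like this (the host, user, database
 and application name values are placeholders):

    conninfo='host=repmgr_node2 user=repmgr dbname=repmgr application_name=node2'
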
 By default `repmgr` assumes the user who owns the `repmgr` metadatabase will
 also be the replication user; a different replication user can be specified
 with `--replication-user`.
 If the upstream server requires a password, and this was provided via
 `PGPASSWORD`, `.pgpass` etc., by default `repmgr` will include this in
 `primary_conninfo`. Use the command line option `--no-conninfo-password` to
 suppress this.
 Setting up cascading replication with repmgr
@@ -606,15 +791,15 @@ created standby. Clone this standby (using the connection parameters
 for the existing standby) and register it:
    $ repmgr -h repmgr_node2 -U repmgr -d repmgr -D /path/to/node3/data/ -f /etc/repmgr.conf standby clone
-    [2016-01-08 13:44:52] [NOTICE] destination directory 'node_3/data/' provided
+    NOTICE: destination directory 'node_3/data/' provided
-    [2016-01-08 13:44:52] [NOTICE] starting backup (using pg_basebackup)...
+    NOTICE: starting backup (using pg_basebackup)...
-    [2016-01-08 13:44:52] [HINT] this may take some time; consider using the -c/--fast-checkpoint option
+    HINT: this may take some time; consider using the -c/--fast-checkpoint option
-    [2016-01-08 13:44:52] [NOTICE] standby clone (using pg_basebackup) complete
+    NOTICE: standby clone (using pg_basebackup) complete
-    [2016-01-08 13:44:52] [NOTICE] you can now start your PostgreSQL server
+    NOTICE: you can now start your PostgreSQL server
-    [2016-01-08 13:44:52] [HINT] for example : pg_ctl -D /path/to/node_3/data start
+    HINT: for example : pg_ctl -D /path/to/node_3/data start
    $ repmgr -f /etc/repmgr.conf standby register
-    [2016-01-08 14:04:32] [NOTICE] standby node correctly registered for cluster test with id 3 (conninfo: host=repmgr_node3 dbname=repmgr user=repmgr)
+    NOTICE: standby node correctly registered for cluster test with id 3 (conninfo: host=repmgr_node3 dbname=repmgr user=repmgr)
 After starting the standby, the `repl_nodes` table will look like this:
@@ -626,6 +811,15 @@ After starting the standby, the `repl_nodes` table will look like this:
      3 | standby |                2 | test    | node3 | host=repmgr_node3 dbname=repmgr user=repmgr |           |      100 | t
    (3 rows)
 * * *
 > *TIP*: under some circumstances when setting up a cascading replication
 > cluster, you may wish to clone a downstream standby whose upstream node
 > does not yet exist. In this case you can clone from the master (or
 > another upstream node) and provide the parameter `--upstream-conninfo`
 > to explicitly set the upstream's `primary_conninfo` string in `recovery.conf`.
 * * *
 Using replication slots with repmgr
 -----------------------------------
@@ -711,19 +905,19 @@ Promote the first standby with:
 This will produce output similar to the following:
-    [2016-01-08 16:07:31] [ERROR] connection to database failed: could not connect to server: Connection refused
+    ERROR: connection to database failed: could not connect to server: Connection refused
            Is the server running on host "repmgr_node1" (192.161.2.1) and accepting
            TCP/IP connections on port 5432?
    could not connect to server: Connection refused
            Is the server running on host "repmgr_node1" (192.161.2.1) and accepting
            TCP/IP connections on port 5432?
-    [2016-01-08 16:07:31] [NOTICE] promoting standby
+    NOTICE: promoting standby
-    [2016-01-08 16:07:31] [NOTICE] promoting server using '/usr/bin/postgres/pg_ctl -D /path/to/node_2/data promote'
+    NOTICE: promoting server using '/usr/bin/postgres/pg_ctl -D /path/to/node_2/data promote'
    server promoting
-    [2016-01-08 16:07:33] [NOTICE] STANDBY PROMOTE successful
+    NOTICE: STANDBY PROMOTE successful
-Note: the first `[ERROR]` is `repmgr` attempting to connect to the current
+Note: the first `ERROR` is `repmgr` attempting to connect to the current
 master to verify that it has failed. If a valid master is found, `repmgr`
 will refuse to promote a standby.
@@ -755,7 +949,7 @@ end of the preceding section ("Promoting a standby server with repmgr"),
 execute this:
    $ repmgr -f /etc/repmgr.conf -D /path/to/node_3/data/ -h repmgr_node2 -U repmgr -d repmgr standby follow
-    [2016-01-08 16:57:06] [NOTICE] restarting server using '/usr/bin/postgres/pg_ctl -D /path/to/node_3/data/ -w -m fast restart'
+    NOTICE: restarting server using '/usr/bin/postgres/pg_ctl -D /path/to/node_3/data/ -w -m fast restart'
    waiting for server to shut down.... done
    server stopped
    waiting for server to start.... done
@@ -827,26 +1021,26 @@ local server, as well as the normal default locations. `repmgr` will check
 this file can be found before performing any further actions.
    $ repmgr -f /etc/repmgr.conf -C /etc/repmgr.conf standby switchover -v
-    [2016-01-27 16:38:33] [NOTICE] using configuration file "/etc/repmgr.conf"
+    NOTICE: using configuration file "/etc/repmgr.conf"
-    [2016-01-27 16:38:33] [NOTICE] switching current node 2 to master server and demoting current master to standby...
+    NOTICE: switching current node 2 to master server and demoting current master to standby...
-    [2016-01-27 16:38:34] [NOTICE] 5 files copied to /tmp/repmgr-node1-archive
+    NOTICE: 5 files copied to /tmp/repmgr-node1-archive
-    [2016-01-27 16:38:34] [NOTICE] connection to database failed: FATAL:  the database system is shutting down
+    NOTICE: connection to database failed: FATAL:  the database system is shutting down
-    [2016-01-27 16:38:34] [NOTICE] current master has been stopped
+    NOTICE: current master has been stopped
-    [2016-01-27 16:38:34] [ERROR] connection to database failed: FATAL:  the database system is shutting down
+    ERROR: connection to database failed: FATAL:  the database system is shutting down
-    [2016-01-27 16:38:34] [NOTICE] promoting standby
+    NOTICE: promoting standby
-    [2016-01-27 16:38:34] [NOTICE] promoting server using '/usr/local/bin/pg_ctl -D /var/lib/postgresql/9.5/node_2/data promote'
+    NOTICE: promoting server using '/usr/local/bin/pg_ctl -D /var/lib/postgresql/9.5/node_2/data promote'
    server promoting
-    [2016-01-27 16:38:36] [NOTICE] STANDBY PROMOTE successful
+    NOTICE: STANDBY PROMOTE successful
-    [2016-01-27 16:38:36] [NOTICE] Executing pg_rewind on old master server
+    NOTICE: Executing pg_rewind on old master server
-    [2016-01-27 16:38:36] [NOTICE] 5 files copied to /var/lib/postgresql/9.5/data
+    NOTICE: 5 files copied to /var/lib/postgresql/9.5/data
-    [2016-01-27 16:38:36] [NOTICE] restarting server using '/usr/local/bin/pg_ctl -w -D /var/lib/postgresql/9.5/node_1/data -m fast restart'
+    NOTICE: restarting server using '/usr/local/bin/pg_ctl -w -D /var/lib/postgresql/9.5/node_1/data -m fast restart'
    pg_ctl: PID file "/var/lib/postgresql/9.5/node_1/data/postmaster.pid" does not exist
    Is server running?
    starting server anyway
-    [2016-01-27 16:38:37] [NOTICE] node 1 is replicating in state "streaming"
+    NOTICE: node 1 is replicating in state "streaming"
-    [2016-01-27 16:38:37] [NOTICE] switchover was successful
+    NOTICE: switchover was successful
 Messages containing the line `connection to database failed: FATAL: the database
 system is shutting down` are not errors - `repmgr` is polling the old master database
@@ -872,7 +1066,7 @@ should have been updated to reflect this:
  at a two-server master/standby replication cluster and currently does
  not support additional standbys.
 - `repmgr standby switchover` is designed to use the `pg_rewind` utility,
-  standard in 9.5 and later and available for seperately in 9.3 and 9.4
+  standard in 9.5 and later and available separately in 9.3 and 9.4
  (see note below)
 - `pg_rewind` *requires* that either `wal_log_hints` is enabled, or that
   data checksums were enabled when the cluster was initialized. See the
@@ -884,7 +1078,7 @@ should have been updated to reflect this:
  instructed to point to the new master (e.g. with `repmgr standby follow`).
 - You must ensure that following a server start using `pg_ctl`, log output
  is not sent to STDERR (the default behaviour). If logging is not configured,
-  We recommend setting `logging_collector=on` in `postgresql.conf` and
+  we recommend setting `logging_collector=on` in `postgresql.conf` and
  providing an explicit `-l/--log` setting in `repmgr.conf`'s `pg_ctl_options`
  parameter.
@@ -900,7 +1094,7 @@ will have diverged slightly following the shutdown of the old master.
 The utility `pg_rewind` provides an efficient way of doing this, but it
 is not included in the core PostgreSQL distribution for versions 9.3 and 9.4.
-Hoever, `pg_rewind` is available separately for these versions and we
+However, `pg_rewind` is available separately for these versions and we
 strongly recommend its installation. To use it with versions 9.3 and 9.4,
 provide the command line option `--pg_rewind`, optionally with the
 path to the `pg_rewind` binary location if not installed in the PostgreSQL
@@ -909,6 +1103,10 @@ path to the `pg_rewind` binary location if not installed in the PostgreSQL
 `pg_rewind` for versions 9.3 and 9.4 can be obtained from:
  https://github.com/vmware/pg_rewind
 Note that building this version of `pg_rewind` requires the PostgreSQL source
 code. Also, PostgreSQL 9.3 does not provide `wal_log_hints`, meaning data
 checksums must have been enabled when the database was initialized.
 If `pg_rewind` is not available, as a fallback `repmgr` will use `repmgr
 standby clone` to resynchronise the old master's data directory using
 `rsync`. However, in order to ensure all files are synchronised, the
@@ -928,16 +1126,17 @@ This will remove the standby record from `repmgr`'s internal metadata
 table (`repl_nodes`). A `standby_unregister` event notification will be
 recorded in the `repl_events` table.
-Note that this command will not stop the server itself or remove
+Note that this command will not stop the server itself or remove it from
-it from the replication cluster.
+the replication cluster. Note that if the standby was using a replication
 slot, this will not be removed.
-If the standby is not running, the standby record must be manually
+If the standby is not running, the command can be executed on another
-removed from the `repl_nodes` table with e.g.:
+node by providing the id of the node to be unregistered using
 the command line parameter `--node`, e.g. executing the following
 command on the master server will unregister the standby with
 id 3:
-    DELETE FROM repmgr_test.repl_nodes WHERE id = 3;
+    repmgr standby unregister -f /etc/repmgr.conf --node=3
 Adjust schema and node ID accordingly. A future `repmgr` release
 will make it possible to unregister failed standbys.
 Automatic failover with `repmgrd`
@@ -947,19 +1146,23 @@ Automatic failover with `repmgrd`
 and which can automate actions such as failover and updating standbys to
 follow the new master.
-To use `repmgrd` for automatic failover, the following `repmgrd` options must
+To use `repmgrd` for automatic failover, `postgresql.conf` must contain the
-be set in `repmgr.conf`:
+following line:
    failover=automatic
    promote_command='repmgr standby promote -f /etc/repmgr/repmgr.conf'
    follow_command='repmgr standby follow -f /etc/repmgr/repmgr.conf'
 (See `repmgr.conf.sample` for further `repmgrd`-specific settings).
 Additionally, `postgresql.conf` must contain the following line:
    shared_preload_libraries = 'repmgr_funcs'
 (changing this setting requires a restart of PostgreSQL).
 Additionally the following `repmgrd` options must be set in `repmgr.conf`:
    failover=automatic
    promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'
    follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file'
 Note that the `--log-to-file` option will cause `repmgr`'s output to be logged to
 the destination configured to receive log output for `repmgrd`.
 See `repmgr.conf.sample` for further `repmgrd`-specific settings.
 When `failover` is set to `automatic`, upon detecting failure of the current
 master, `repmgrd` will execute one of `promote_command` or `follow_command`,
 depending on whether the current server is becoming the new master or
@@ -1240,7 +1443,8 @@ The following event types are available:
  * `standby_switchover`
  * `standby_disconnect_manual`
  * `witness_create`
-  * `witness_create`
+  * `witness_register`
  * `witness_unregister`
  * `repmgrd_start`
  * `repmgrd_shutdown`
  * `repmgrd_failover_promote`
@@ -1260,7 +1464,45 @@ functionality will be included in a feature release (e.g. 3.0.x to 3.1.x).
 In general `repmgr` can be upgraded as-is without any further action required,
 however feature releases may require the `repmgr` database to be upgraded.
-An SQL script will be provided - please check the release notes for details.
+An SQL script will be provided - please check the release notes for details:
 * http://repmgr.org/release-notes-3.3.html#UPGRADING
 Distribution-specific configuration
 -----------------------------------
 `repmgr` is largely OS-agnostic and can be run on any UNIX-like environment
 including various Linux distributions, Solaris, macOS and the various BSDs.
 However, often OS-specific configuration is required, particularly when
 dealing with system service management (e.g. stopping and starting the
 PostgreSQL server), file paths and configuration file locations.
 ### PostgreSQL server control
 By default, `repmgr` will use PostgreSQL's standard `pg_ctl` utility to control
 a running PostgreSQL server. However it may be better to use the operating
 system's service management system, e.g. `systemd`. To specify which service
 control commands are used, the following `repmgr.conf` configuration settings
 are available:
    service_start_command
    service_stop_command
    service_restart_command
    service_reload_command
    service_promote_command
 See `repmgr.conf.sample` for further details.
 ### Binary directory
 Some PostgreSQL system packages, such as those provided for Debian/Ubuntu, like
 to hide some PostgreSQL utility programs outside of the default path. To ensure
 `repmgr` finds all required executables, explicitly set `pg_bindir` to the
 appropriate location, e.g. for PostgreSQL 9.6 on Debian/Ubuntu this would be
 `/usr/lib/postgresql/9.6/bin/`.
 Reference
 ---------
@@ -1327,7 +1569,7 @@ which contains connection details for the local database.
    bootstrapping new installations. To update an existing but 'stale'
    data directory (for example belonging to a failed master), `rsync`
    must be used by specifying `--rsync-only`. In this case,
-    password-less SSH connections between servers are required.
+    passwordless SSH connections between servers are required.
 * `standby promote`
@@ -1341,13 +1583,13 @@ which contains connection details for the local database.
    by using `standby follow` (see below); if `repmgrd` is active, it will
    handle this.
-    This command will not function if the current master is still running.
+    This command will fail with an error if the current master is still running.
 * `standby switchover`
    Promotes a standby to master and demotes the existing master to a standby.
    This command must be run on the standby to be promoted, and requires a
-    password-less SSH connection to the current master. Additionally the
+    passwordless SSH connection to the current master. Additionally the
    location of the master's `repmgr.conf` file must be provided with
    `-C/--remote-config-file`.
@@ -1385,17 +1627,31 @@ which contains connection details for the local database.
    This command also requires the location of the witness server's data
    directory to be provided (`-D/--datadir`) as well as valid connection
-    parameters for the master server.
+    parameters for the master server. If not explicitly provided,
    database and user names will be extracted from the `conninfo` string in
    `repmgr.conf`.
    By default this command will create a superuser and a repmgr user.
    The `repmgr` user name will be extracted from the `conninfo` string
    in `repmgr.conf`.
 * `witness register`
    This will set up the witness server configuration, including the witness
    server's copy of the `repmgr` meta database, on a running PostgreSQL
    instance and register the witness server with the master. It requires
    the same command line options as `witness create`.
 * `witness unregister`
    Removes the entry for a witness server from the `repl_nodes` table. This
    command will not shut down the witness server or remove its data directory.
 * `cluster show`
    Displays information about each active node in the replication cluster. This
-    command polls each registered server and shows its role (master / standby /
+    command polls each registered server and shows its role (`master` / `standby` /
-    witness) or `FAILED` if the node doesn't respond. It polls each server
+    `witness`) or `FAILED` if the node doesn't respond. It polls each server
    directly and can be run on any node in the cluster; this is also useful
    when analyzing connectivity from a particular node.
@@ -1425,7 +1681,102 @@ which contains connection details for the local database.
        3,1
    The first column is the node's ID, and the second column represents the
-    node's status (0 = master, 1 = standby, -1 = failed).
+    node's status (0 = available, -1 = failed).
 * `cluster matrix` and `cluster crosscheck`
    These commands display connection information for each pair of
    nodes in the replication cluster.
    - `cluster matrix` runs a `cluster show` on each node and arranges
      the results in a matrix, recording success or failure;
    - `cluster crosscheck` runs a `cluster matrix` on each node and
      combines the results in a single matrix, providing a full
      overview of connections between all databases in the cluster.
    These commands require a valid `repmgr.conf` file on each node.
    Additionally passwordless `ssh` connections are required between
    all nodes.
    Example 1 (all nodes up):
        $ repmgr -f /etc/repmgr.conf cluster matrix
        Name   | Id |  1 |  2 |  3
        -------+----+----+----+----
         node1 |  1 |  * |  * |  *
         node2 |  2 |  * |  * |  *
         node3 |  3 |  * |  * |  *
    Here `cluster matrix` is sufficient to establish the state of each
    possible connection.
    Example 2 (`node1` and `node2` up, `node3` down):
        $ repmgr -f /etc/repmgr.conf cluster matrix
        Name   | Id |  1 |  2 |  3
        -------+----+----+----+----
         node1 |  1 |  * |  * |  x
         node2 |  2 |  * |  * |  x
         node3 |  3 |  ? |  ? |  ?
    Each row corresponds to one server, and indicates the result of
    testing an outbound connection from that server.
    Since `node3` is down, all the entries in its row are filled with
    "?", meaning that we cannot test outbound connections.
    The other two nodes are up; the corresponding rows have "x" in the
    column corresponding to `node3`, meaning that inbound connections to
    that node have failed, and "*" in the columns corresponding to
    `node1` and `node2`, meaning that inbound connections to these nodes
    have succeeded.
    In this case, `cluster crosscheck` gives the same result as `cluster
    matrix`, because from any functioning node we can observe the same
    state: `node1` and `node2` are up, `node3` is down.
    Example 3 (all nodes up, firewall dropping packets originating
               from `node1` and directed to port 5432 on `node3`):
    Running `cluster matrix` from `node1` gives the following output:
        $ repmgr -f /etc/repmgr.conf cluster matrix
        Name   | Id |  1 |  2 |  3
        -------+----+----+----+----
         node1 |  1 |  * |  * |  x
         node2 |  2 |  * |  * |  *
         node3 |  3 |  ? |  ? |  ?
    (Note this may take some time depending on the `connect_timeout`
    setting in the registered node `conninfo` strings; default is 1
    minute which means without modification the above command would
    take around 2 minutes to run; see comment elsewhere about setting
    `connect_timeout`)
    The matrix tells us that we cannot connect from `node1` to `node3`,
    and that (therefore) we don't know the state of any outbound
    connection from `node3`.
    In this case, the `cluster crosscheck` command is more informative:
        $ repmgr -f /etc/repmgr.conf cluster crosscheck
        Name   | Id |  1 |  2 |  3
        -------+----+----+----+----
         node1 |  1 |  * |  * |  x
         node2 |  2 |  * |  * |  *
         node3 |  3 |  * |  * |  *
    What happened is that `cluster crosscheck` merged its own `cluster
    matrix` with the `cluster matrix` output from `node2`; the latter is
    able to connect to `node3` and therefore determine the state of
    outbound connections from that node.
 * `cluster cleanup`
@@ -1439,27 +1790,45 @@ which contains connection details for the local database.
    the current working directory; no additional arguments are required.
 ### Further documentation
 As well as this README, the `repmgr` source contains the following additional
 documentation files:
 * FAQ.md - frequently asked questions
 * CONTRIBUTING.md - how to contribute to `repmgr`
 * PACKAGES.md - details on building packages
 * SSH-RSYNC.md - how to set up passwordless SSH between nodes
 * docs/repmgrd-failover-mechanism.md - how repmgrd picks which node to promote
 * docs/repmgrd-node-fencing.md - how to "fence" a failed master node
 ### Error codes
 `repmgr` or `repmgrd` will return one of the following error codes on program
 exit:
-* SUCCESS (0)               Program ran successfully.
+* SUCCESS (0)                Program ran successfully.
-* ERR_BAD_CONFIG (1)        Configuration file could not be parsed or was invalid
+* ERR_BAD_CONFIG (1)         Configuration file could not be parsed or was invalid
-* ERR_BAD_RSYNC (2)         An rsync call made by the program returned an error (repmgr only)
+* ERR_BAD_RSYNC (2)          An rsync call made by the program returned an error
-* ERR_NO_RESTART (4)        An attempt to restart a PostgreSQL instance failed
+                               (repmgr only)
-* ERR_DB_CON (6)            Error when trying to connect to a database
+* ERR_NO_RESTART (4)         An attempt to restart a PostgreSQL instance failed
-* ERR_DB_QUERY (7)          Error while executing a database query
+* ERR_DB_CON (6)             Error when trying to connect to a database
-* ERR_PROMOTED (8)          Exiting program because the node has been promoted to master
+* ERR_DB_QUERY (7)           Error while executing a database query
-* ERR_STR_OVERFLOW (10)     String overflow error
+* ERR_PROMOTED (8)           Exiting program because the node has been promoted to master
-* ERR_FAILOVER_FAIL (11)    Error encountered during failover (repmgrd only)
+* ERR_STR_OVERFLOW (10)      String overflow error
-* ERR_BAD_SSH (12)          Error when connecting to remote host via SSH (repmgr only)
+* ERR_FAILOVER_FAIL (11)     Error encountered during failover (repmgrd only)
-* ERR_SYS_FAILURE (13)      Error when forking (repmgrd only)
+* ERR_BAD_SSH (12)           Error when connecting to remote host via SSH (repmgr only)
-* ERR_BAD_BASEBACKUP (14)   Error when executing pg_basebackup (repmgr only)
+* ERR_SYS_FAILURE (13)       Error when forking (repmgrd only)
-* ERR_MONITORING_FAIL (16)  Unrecoverable error encountered during monitoring (repmgrd only)
+* ERR_BAD_BASEBACKUP (14)    Error when executing pg_basebackup (repmgr only)
-* ERR_BAD_BACKUP_LABEL (17) Corrupt or unreadable backup label encountered (repmgr only)
+* ERR_MONITORING_FAIL (16)   Unrecoverable error encountered during monitoring (repmgrd only)
-* ERR_SWITCHOVER_FAIL (18)  Error encountered during switchover (repmgr only)
+* ERR_BAD_BACKUP_LABEL (17)  Corrupt or unreadable backup label encountered (repmgr only)
-
+* ERR_SWITCHOVER_FAIL (18)   Error encountered during switchover (repmgr only)
 * ERR_BARMAN (19)            Unrecoverable error while accessing the barman server (repmgr only)
 * ERR_REGISTRATION_SYNC (20) After registering a standby, local node record was not
                                synchronised (repmgr only, with --wait option)
 Support and Assistance
 ----------------------
@@ -1505,6 +1874,7 @@ Thanks from the repmgr core team.
 Further reading
 ---------------
 * http://blog.2ndquadrant.com/repmgr-3-2-is-here-barman-support-brand-new-high-availability-features/
 * http://blog.2ndquadrant.com/improvements-in-repmgr-3-1-4/
 * http://blog.2ndquadrant.com/managing-useful-clusters-repmgr/
 * http://blog.2ndquadrant.com/easier_postgresql_90_clusters/
--- a/check_dir.c
+++ b/check_dir.c
@@ -1,6 +1,6 @@
 /*
 * check_dir.c - Directories management functions
- * Copyright (C) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
--- a/check_dir.h
+++ b/check_dir.h
@@ -1,6 +1,6 @@
 /*
 * check_dir.h
- * Copyright (c) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
--- a/compat.c
+++ b/compat.c
@@ -0,0 +1,107 @@
 /*
 *
 * compat.c
 *	  Provides a couple of useful string utility functions adapted
 *	  from the backend code, which are not publicly exposed. They're
 *	  unlikely to change but it would be worth keeping an eye on them
 *	  for any fixes/improvements
 *
 * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 *
 */
 #include "repmgr.h"
 #include "compat.h"
 /*
 * Append the given string to the buffer, with suitable quoting for passing
 * the string as a value, in a keyword/value pair in a libpq connection
 * string
 *
 * This function is adapted from src/fe_utils/string_utils.c (before 9.6
 * located in: src/bin/pg_dump/dumputils.c)
 */
 void
 appendConnStrVal(PQExpBuffer buf, const char *str)
 {
 	const char *s;
 	bool		needquotes;
 	/*
 	 * If the string is one or more plain ASCII characters, no need to quote
 	 * it. This is quite conservative, but better safe than sorry.
 	 */
 	needquotes = true;
 	for (s = str; *s; s++)
 	{
 		if (!((*s >= 'a' && *s <= 'z') || (*s >= 'A' && *s <= 'Z') ||
 			  (*s >= '0' && *s <= '9') || *s == '_' || *s == '.'))
 		{
 			needquotes = true;
 			break;
 		}
 		needquotes = false;
 	}
 	if (needquotes)
 	{
 		appendPQExpBufferChar(buf, '\'');
 		while (*str)
 		{
 			/* ' and \ must be escaped to \' and \\ */
 			if (*str == '\'' || *str == '\\')
 				appendPQExpBufferChar(buf, '\\');
 			appendPQExpBufferChar(buf, *str);
 			str++;
 		}
 		appendPQExpBufferChar(buf, '\'');
 	}
 	else
 		appendPQExpBufferStr(buf, str);
 }
 /*
 * Adapted from: src/fe_utils/string_utils.c
 */
 void
 appendShellString(PQExpBuffer buf, const char *str)
 {
 	const char *p;
 	appendPQExpBufferChar(buf, '\'');
 	for (p = str; *p; p++)
 	{
 		if (*p == '\n' || *p == '\r')
 		{
 			fprintf(stderr,
 					_("shell command argument contains a newline or carriage return: \"%s\"\n"),
 					str);
 			exit(ERR_BAD_CONFIG);
 		}
 		if (*p == '\'')
 			appendPQExpBufferStr(buf, "'\"'\"'");
 		else
 			appendPQExpBufferChar(buf, *p);
 	}
 	appendPQExpBufferChar(buf, '\'');
 }
--- a/compat.h
+++ b/compat.h
@@ -0,0 +1,29 @@
 /*
 * compat.h
 * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 *
 */
 #ifndef _COMPAT_H_
 #define _COMPAT_H_
 extern void
 appendConnStrVal(PQExpBuffer buf, const char *str);
 extern void
 appendShellString(PQExpBuffer buf, const char *str);
 #endif
--- a/config.c
+++ b/config.c
@@ -1,6 +1,7 @@
 /*
 * config.c - Functions to parse the config file
- * Copyright (C) 2ndQuadrant, 2010-2016
+ *
 * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -9,11 +10,11 @@
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ * along with this program.	 If not, see <http://www.gnu.org/licenses/>.
 *
 */
@@ -26,7 +27,7 @@
 static void parse_event_notifications_list(t_configuration_options *options, const char *arg);
 static void tablespace_list_append(t_configuration_options *options, const char *arg);
-static void exit_with_errors(ErrorList *config_errors);
+static void exit_with_errors(ItemList *config_errors);
 const static char *_progname = NULL;
 static char config_file_path[MAXPGPATH];
@@ -54,8 +55,8 @@ progname(void)
 *
 * Returns true if a configuration file could be parsed, otherwise false.
 *
- * Any configuration options changed in this function must also be changed in
+ * Any *repmgrd-specific* configuration options added/changed in this function must also be
- * reload_config()
+ * added/changed in reload_config()
 *
 * NOTE: this function is called before the logger is set up, so we need
 * to handle the verbose option ourselves; also the default log level is NOTICE,
@@ -98,9 +99,9 @@ load_config(const char *config_file, bool verbose, t_configuration_options *opti
 	/*
 	 * If no configuration file was provided, attempt to find a default file
 	 * in this order:
-	 *  - current directory
+	 *	- current directory
-	 *  - /etc/repmgr.conf
+	 *	- /etc/repmgr.conf
-	 *  - default sysconfdir
+	 *	- default sysconfdir
 	 *
 	 * here we just check for the existence of the file; parse_config()
 	 * will handle read errors etc.
@@ -180,6 +181,23 @@ load_config(const char *config_file, bool verbose, t_configuration_options *opti
 }
 bool
 parse_config(t_configuration_options *options)
 {
 	/* Collate configuration file errors here for friendlier reporting */
 	static ItemList config_errors = { NULL, NULL };
 	_parse_config(options, &config_errors);
 	if (config_errors.head != NULL)
 	{
 		exit_with_errors(&config_errors);
 	}
 	return true;
 }
 /*
 * Parse configuration file; if any errors are encountered,
 * list them and exit.
@@ -187,8 +205,8 @@ load_config(const char *config_file, bool verbose, t_configuration_options *opti
 * Ensure any default values set here are synced with repmgr.conf.sample
 * and any other documentation.
 */
-bool
+void
-parse_config(t_configuration_options *options)
+_parse_config(t_configuration_options *options, ItemList *error_list)
 {
 	FILE	   *fp;
 	char	   *s,
@@ -200,9 +218,6 @@ parse_config(t_configuration_options *options)
 	PQconninfoOption *conninfo_options;
 	char	   *conninfo_errmsg = NULL;
 	/* Collate configuration file errors here for friendlier reporting */
 	static ErrorList config_errors = { NULL, NULL };
 	bool		node_found = false;
 	/* Initialize configuration options with sensible defaults
@@ -210,18 +225,22 @@ parse_config(t_configuration_options *options)
 	 * to be initialised here
 	 */
 	memset(options->cluster_name, 0, sizeof(options->cluster_name));
-	options->node = -1;
+	options->node = UNKNOWN_NODE_ID;
 	options->upstream_node = NO_UPSTREAM_NODE;
 	options->use_replication_slots = 0;
 	memset(options->conninfo, 0, sizeof(options->conninfo));
 	memset(options->barman_server, 0, sizeof(options->barman_server));
 	memset(options->barman_config, 0, sizeof(options->barman_config));
 	options->failover = MANUAL_FAILOVER;
 	options->priority = DEFAULT_PRIORITY;
 	memset(options->node_name, 0, sizeof(options->node_name));
 	memset(options->promote_command, 0, sizeof(options->promote_command));
 	memset(options->follow_command, 0, sizeof(options->follow_command));
-	memset(options->stop_command, 0, sizeof(options->stop_command));
+	memset(options->service_stop_command, 0, sizeof(options->service_stop_command));
-	memset(options->start_command, 0, sizeof(options->start_command));
+	memset(options->service_start_command, 0, sizeof(options->service_start_command));
-	memset(options->restart_command, 0, sizeof(options->restart_command));
+	memset(options->service_restart_command, 0, sizeof(options->service_restart_command));
 	memset(options->service_reload_command, 0, sizeof(options->service_reload_command));
 	memset(options->service_promote_command, 0, sizeof(options->service_promote_command));
 	memset(options->rsync_options, 0, sizeof(options->rsync_options));
 	memset(options->ssh_options, 0, sizeof(options->ssh_options));
 	memset(options->pg_bindir, 0, sizeof(options->pg_bindir));
@@ -257,7 +276,7 @@ parse_config(t_configuration_options *options)
 	{
 		log_verbose(LOG_NOTICE, _("no configuration file provided and no default file found - "
 					 "continuing with default values\n"));
-		return true;
+		return;
 	}
 	fp = fopen(config_file_path, "r");
@@ -302,13 +321,17 @@ parse_config(t_configuration_options *options)
 			strncpy(options->cluster_name, value, MAXLEN);
 		else if (strcmp(name, "node") == 0)
 		{
-			options->node = repmgr_atoi(value, "node", &config_errors, false);
+			options->node = repmgr_atoi(value, "node", error_list, false);
 			node_found = true;
 		}
 		else if (strcmp(name, "upstream_node") == 0)
-			options->upstream_node = repmgr_atoi(value, "upstream_node", &config_errors, false);
+			options->upstream_node = repmgr_atoi(value, "upstream_node", error_list, false);
 		else if (strcmp(name, "conninfo") == 0)
 			strncpy(options->conninfo, value, MAXLEN);
 		else if (strcmp(name, "barman_server") == 0)
 			strncpy(options->barman_server, value, MAXLEN);
 		else if (strcmp(name, "barman_config") == 0)
 			strncpy(options->barman_config, value, MAXLEN);
 		else if (strcmp(name, "rsync_options") == 0)
 			strncpy(options->rsync_options, value, QUERY_STR_LEN);
 		else if (strcmp(name, "ssh_options") == 0)
@@ -333,35 +356,39 @@ parse_config(t_configuration_options *options)
 			}
 			else
 			{
-				error_list_append(&config_errors,_("value for 'failover' must be 'automatic' or 'manual'\n"));
+				item_list_append(error_list, _("value for 'failover' must be 'automatic' or 'manual'\n"));
 			}
 		}
 		else if (strcmp(name, "priority") == 0)
-			options->priority = repmgr_atoi(value, "priority", &config_errors, true);
+			options->priority = repmgr_atoi(value, "priority", error_list, true);
 		else if (strcmp(name, "node_name") == 0)
 			strncpy(options->node_name, value, MAXLEN);
 		else if (strcmp(name, "promote_command") == 0)
 			strncpy(options->promote_command, value, MAXLEN);
 		else if (strcmp(name, "follow_command") == 0)
 			strncpy(options->follow_command, value, MAXLEN);
-		else if (strcmp(name, "stop_command") == 0)
+		else if (strcmp(name, "service_stop_command") == 0)
-			strncpy(options->stop_command, value, MAXLEN);
+			strncpy(options->service_stop_command, value, MAXLEN);
-		else if (strcmp(name, "start_command") == 0)
+		else if (strcmp(name, "service_start_command") == 0)
-			strncpy(options->start_command, value, MAXLEN);
+			strncpy(options->service_start_command, value, MAXLEN);
-		else if (strcmp(name, "restart_command") == 0)
+		else if (strcmp(name, "service_restart_command") == 0)
-			strncpy(options->restart_command, value, MAXLEN);
+			strncpy(options->service_restart_command, value, MAXLEN);
 		else if (strcmp(name, "service_reload_command") == 0)
 			strncpy(options->service_reload_command, value, MAXLEN);
 		else if (strcmp(name, "service_promote_command") == 0)
 			strncpy(options->service_promote_command, value, MAXLEN);
 		else if (strcmp(name, "master_response_timeout") == 0)
-			options->master_response_timeout = repmgr_atoi(value, "master_response_timeout", &config_errors, false);
+			options->master_response_timeout = repmgr_atoi(value, "master_response_timeout", error_list, false);
 		/*
 		 * 'primary_response_timeout' as synonym for 'master_response_timeout' -
 		 * we'll switch terminology in a future release (3.1?)
 		 */
 		else if (strcmp(name, "primary_response_timeout") == 0)
-			options->master_response_timeout = repmgr_atoi(value, "primary_response_timeout", &config_errors, false);
+			options->master_response_timeout = repmgr_atoi(value, "primary_response_timeout", error_list, false);
 		else if (strcmp(name, "reconnect_attempts") == 0)
-			options->reconnect_attempts = repmgr_atoi(value, "reconnect_attempts", &config_errors, false);
+			options->reconnect_attempts = repmgr_atoi(value, "reconnect_attempts", error_list, false);
 		else if (strcmp(name, "reconnect_interval") == 0)
-			options->reconnect_interval = repmgr_atoi(value, "reconnect_interval", &config_errors, false);
+			options->reconnect_interval = repmgr_atoi(value, "reconnect_interval", error_list, false);
 		else if (strcmp(name, "pg_bindir") == 0)
 			strncpy(options->pg_bindir, value, MAXLEN);
 		else if (strcmp(name, "pg_ctl_options") == 0)
@@ -371,14 +398,14 @@ parse_config(t_configuration_options *options)
 		else if (strcmp(name, "logfile") == 0)
 			strncpy(options->logfile, value, MAXLEN);
 		else if (strcmp(name, "monitor_interval_secs") == 0)
-			options->monitor_interval_secs = repmgr_atoi(value, "monitor_interval_secs", &config_errors, false);
+			options->monitor_interval_secs = repmgr_atoi(value, "monitor_interval_secs", error_list, false);
 		else if (strcmp(name, "retry_promote_interval_secs") == 0)
-			options->retry_promote_interval_secs = repmgr_atoi(value, "retry_promote_interval_secs", &config_errors, false);
+			options->retry_promote_interval_secs = repmgr_atoi(value, "retry_promote_interval_secs", error_list, false);
 		else if (strcmp(name, "witness_repl_nodes_sync_interval_secs") == 0)
-			options->witness_repl_nodes_sync_interval_secs = repmgr_atoi(value, "witness_repl_nodes_sync_interval_secs", &config_errors, false);
+			options->witness_repl_nodes_sync_interval_secs = repmgr_atoi(value, "witness_repl_nodes_sync_interval_secs", error_list, false);
 		else if (strcmp(name, "use_replication_slots") == 0)
 			/* XXX we should have a dedicated boolean argument format */
-			options->use_replication_slots = repmgr_atoi(value, "use_replication_slots", &config_errors, false);
+			options->use_replication_slots = repmgr_atoi(value, "use_replication_slots", error_list, false);
 		else if (strcmp(name, "event_notification_command") == 0)
 			strncpy(options->event_notification_command, value, MAXLEN);
 		else if (strcmp(name, "event_notifications") == 0)
@@ -406,7 +433,7 @@ parse_config(t_configuration_options *options)
 					 _("no value provided for parameter \"%s\""),
 					 name);
-			error_list_append(&config_errors, error_message_buf);
+			item_list_append(error_list, error_message_buf);
 		}
 	}
@@ -415,11 +442,15 @@ parse_config(t_configuration_options *options)
 	if (node_found == false)
 	{
-		error_list_append(&config_errors, _("\"node\": parameter was not found"));
+		item_list_append(error_list, _("\"node\": parameter was not found"));
 	}
 	else if (options->node == 0)
 	{
-		error_list_append(&config_errors, _("\"node\": must be greater than zero"));
+		item_list_append(error_list, _("\"node\": must be greater than zero"));
 	}
+	else if (options->node < 0)
+	{
+		item_list_append(error_list, _("\"node\": must be a positive signed 32 bit integer, i.e. 2147483647 or less"));
+	}
 	if (strlen(options->conninfo))
@@ -439,18 +470,11 @@ parse_config(t_configuration_options *options)
 					 _("\"conninfo\": %s"),
 					 conninfo_errmsg);
-			error_list_append(&config_errors, error_message_buf);
+			item_list_append(error_list, error_message_buf);
 		}
 		PQconninfoFree(conninfo_options);
 	}
-	if (config_errors.head != NULL)
-	{
-		exit_with_errors(&config_errors);
-	}
-	return true;
 }
@@ -540,70 +564,85 @@ parse_line(char *buf, char *name, char *value)
 	trim(value);
 }
 /*
 * reload_config()
 *
 * This is only called by repmgrd after receiving a SIGHUP or when a monitoring
 * loop is started up; it therefore only needs to reload options required
 * by repmgrd, which are as follows:
 *
 * changeable options:
 * - failover
 * - follow_command
 * - logfacility
 * - logfile
 * - loglevel
 * - master_response_timeout
 * - monitor_interval_secs
 * - priority
 * - promote_command
 * - reconnect_attempts
 * - reconnect_interval
 * - retry_promote_interval_secs
 * - witness_repl_nodes_sync_interval_secs
 *
 * non-changeable options:
 * - cluster_name
 * - conninfo
 * - node
 * - node_name
 *
 * extract with something like:
 *	 grep local_options\\. repmgrd.c | perl -n -e '/local_options\.([\w_]+)/ && print qq|$1\n|;' | sort | uniq
 */
 bool
 reload_config(t_configuration_options *orig_options)
 {
 	PGconn	   *conn;
-	t_configuration_options new_options;
+	t_configuration_options new_options = T_CONFIGURATION_OPTIONS_INITIALIZER;
 	bool	  config_changed = false;
 	bool	  log_config_changed = false;
 	static ItemList config_errors = { NULL, NULL };
 	/*
 	 * Re-read the configuration file: repmgr.conf
 	 */
-	log_info(_("reloading configuration file and updating repmgr tables\n"));
+	log_info(_("reloading configuration file\n"));
-	parse_config(&new_options);
-	if (new_options.node == -1)
+	_parse_config(&new_options, &config_errors);
+
+	if (config_errors.head != NULL)
 	{
 		/* XXX dump errors to log */
 		log_warning(_("unable to parse new configuration, retaining current configuration\n"));
 		return false;
 	}
 	/* The following options cannot be changed */
 	if (strcmp(new_options.cluster_name, orig_options->cluster_name) != 0)
 	{
-		log_warning(_("unable to change cluster name, retaining current configuration\n"));
+		log_warning(_("cluster_name cannot be changed, retaining current configuration\n"));
 		return false;
 	}
 	if (new_options.node != orig_options->node)
 	{
-		log_warning(_("unable to change node ID, retaining current configuration\n"));
+		log_warning(_("node ID cannot be changed, retaining current configuration\n"));
 		return false;
 	}
 	if (strcmp(new_options.node_name, orig_options->node_name) != 0)
 	{
-		log_warning(_("unable to change standby name, keeping current configuration\n"));
+		log_warning(_("node_name cannot be changed, keeping current configuration\n"));
 		return false;
 	}
 	if (new_options.failover != MANUAL_FAILOVER && new_options.failover != AUTOMATIC_FAILOVER)
 	{
 		log_warning(_("new value for 'failover' must be 'automatic' or 'manual'\n"));
 		return false;
 	}
 	if (new_options.master_response_timeout <= 0)
 	{
 		log_warning(_("new value for 'master_response_timeout' must be greater than zero\n"));
 		return false;
 	}
 	if (new_options.reconnect_attempts < 0)
 	{
 		log_warning(_("new value for 'reconnect_attempts' must be zero or greater\n"));
 		return false;
 	}
 	if (new_options.reconnect_interval < 0)
 	{
 		log_warning(_("new value for 'reconnect_interval' must be zero or greater\n"));
 		return false;
 	}
 	if (strcmp(orig_options->conninfo, new_options.conninfo) != 0)
 	{
-		/* Test conninfo string */
+		/* Test conninfo string works */
 		conn = establish_db_connection(new_options.conninfo, false);
 		if (!conn || (PQstatus(conn) != CONNECTION_OK))
 		{
@@ -620,27 +659,6 @@ reload_config(t_configuration_options *orig_options)
 	 * to manage them
 	 */
-	/* cluster_name */
-	if (strcmp(orig_options->cluster_name, new_options.cluster_name) != 0)
-	{
-		strcpy(orig_options->cluster_name, new_options.cluster_name);
-		config_changed = true;
-	}
-	/* conninfo */
-	if (strcmp(orig_options->conninfo, new_options.conninfo) != 0)
-	{
-		strcpy(orig_options->conninfo, new_options.conninfo);
-		config_changed = true;
-	}
-	/* node */
-	if (orig_options->node != new_options.node)
-	{
-		orig_options->node = new_options.node;
-		config_changed = true;
-	}
 	/* failover */
 	if (orig_options->failover != new_options.failover)
 	{
@@ -648,27 +666,6 @@ reload_config(t_configuration_options *orig_options)
 		config_changed = true;
 	}
-	/* priority */
-	if (orig_options->priority != new_options.priority)
-	{
-		orig_options->priority = new_options.priority;
-		config_changed = true;
-	}
-	/* node_name */
-	if (strcmp(orig_options->node_name, new_options.node_name) != 0)
-	{
-		strcpy(orig_options->node_name, new_options.node_name);
-		config_changed = true;
-	}
-	/* promote_command */
-	if (strcmp(orig_options->promote_command, new_options.promote_command) != 0)
-	{
-		strcpy(orig_options->promote_command, new_options.promote_command);
-		config_changed = true;
-	}
 	/* follow_command */
 	if (strcmp(orig_options->follow_command, new_options.follow_command) != 0)
 	{
@@ -676,30 +673,6 @@ reload_config(t_configuration_options *orig_options)
 		config_changed = true;
 	}
-	/*
-	 * XXX These ones can change with a simple SIGHUP?
-	 *
-	 * strcpy (orig_options->loglevel, new_options.loglevel); strcpy
-	 * (orig_options->logfacility, new_options.logfacility);
-	 *
-	 * logger_shutdown(); XXX do we have progname here ? logger_init(progname,
-	 * orig_options.loglevel, orig_options.logfacility);
-	 */
-	/* rsync_options */
-	if (strcmp(orig_options->rsync_options, new_options.rsync_options) != 0)
-	{
-		strcpy(orig_options->rsync_options, new_options.rsync_options);
-		config_changed = true;
-	}
-	/* ssh_options */
-	if (strcmp(orig_options->ssh_options, new_options.ssh_options) != 0)
-	{
-		strcpy(orig_options->ssh_options, new_options.ssh_options);
-		config_changed = true;
-	}
 	/* master_response_timeout */
 	if (orig_options->master_response_timeout != new_options.master_response_timeout)
 	{
@@ -707,6 +680,27 @@ reload_config(t_configuration_options *orig_options)
 		config_changed = true;
 	}
+	/* monitor_interval_secs */
+	if (orig_options->monitor_interval_secs != new_options.monitor_interval_secs)
+	{
+		orig_options->monitor_interval_secs = new_options.monitor_interval_secs;
+		config_changed = true;
+	}
+	/* priority */
+	if (orig_options->priority != new_options.priority)
+	{
+		orig_options->priority = new_options.priority;
+		config_changed = true;
+	}
+	/* promote_command */
+	if (strcmp(orig_options->promote_command, new_options.promote_command) != 0)
+	{
+		strcpy(orig_options->promote_command, new_options.promote_command);
+		config_changed = true;
+	}
 	/* reconnect_attempts */
 	if (orig_options->reconnect_attempts != new_options.reconnect_attempts)
 	{
@@ -721,27 +715,6 @@ reload_config(t_configuration_options *orig_options)
 		config_changed = true;
 	}
-	/* pg_ctl_options */
-	if (strcmp(orig_options->pg_ctl_options, new_options.pg_ctl_options) != 0)
-	{
-		strcpy(orig_options->pg_ctl_options, new_options.pg_ctl_options);
-		config_changed = true;
-	}
-	/* pg_basebackup_options */
-	if (strcmp(orig_options->pg_basebackup_options, new_options.pg_basebackup_options) != 0)
-	{
-		strcpy(orig_options->pg_basebackup_options, new_options.pg_basebackup_options);
-		config_changed = true;
-	}
-	/* monitor_interval_secs */
-	if (orig_options->monitor_interval_secs != new_options.monitor_interval_secs)
-	{
-		orig_options->monitor_interval_secs = new_options.monitor_interval_secs;
-		config_changed = true;
-	}
 	/* retry_promote_interval_secs */
 	if (orig_options->retry_promote_interval_secs != new_options.retry_promote_interval_secs)
 	{
@@ -749,20 +722,54 @@ reload_config(t_configuration_options *orig_options)
 		config_changed = true;
 	}
-	/* use_replication_slots */
-	if (orig_options->use_replication_slots != new_options.use_replication_slots)
+
+	/* witness_repl_nodes_sync_interval_secs */
+	if (orig_options->witness_repl_nodes_sync_interval_secs != new_options.witness_repl_nodes_sync_interval_secs)
 	{
-		orig_options->use_replication_slots = new_options.use_replication_slots;
+		orig_options->witness_repl_nodes_sync_interval_secs = new_options.witness_repl_nodes_sync_interval_secs;
 		config_changed = true;
 	}
+	/*
+	 * Handle changes to logging configuration
+	 */
+	if (strcmp(orig_options->logfacility, new_options.logfacility) != 0)
+	{
+		strcpy(orig_options->logfacility, new_options.logfacility);
+		log_config_changed = true;
+	}
+	if (strcmp(orig_options->logfile, new_options.logfile) != 0)
+	{
+		strcpy(orig_options->logfile, new_options.logfile);
+		log_config_changed = true;
+	}
+	if (strcmp(orig_options->loglevel, new_options.loglevel) != 0)
+	{
+		strcpy(orig_options->loglevel, new_options.loglevel);
+		log_config_changed = true;
+	}
+	if (log_config_changed == true)
+	{
+		log_notice(_("restarting logging with changed parameters\n"));
+		logger_shutdown();
+		logger_init(orig_options, progname());
+	}
 	if (config_changed == true)
 	{
-		log_debug(_("reload_config(): configuration has changed\n"));
+		log_notice(_("configuration file reloaded with changed parameters\n"));
 	}
-	else
+	/*
+	 * if logging configuration changed, don't say the configuration didn't
+	 * change, as it clearly has.
+	 */
+	else if (log_config_changed == false)
 	{
-		log_debug(_("reload_config(): configuration has not changed\n"));
+		log_info(_("configuration has not changed\n"));
 	}
 	return config_changed;
@@ -770,11 +777,11 @@ reload_config(t_configuration_options *orig_options)
 void
-error_list_append(ErrorList *error_list, char *error_message)
+item_list_append(ItemList *item_list, char *error_message)
 {
-	ErrorListCell *cell;
+	ItemListCell *cell;
-	cell = (ErrorListCell *) pg_malloc0(sizeof(ErrorListCell));
+	cell = (ItemListCell *) pg_malloc0(sizeof(ItemListCell));
 	if (cell == NULL)
 	{
@@ -782,19 +789,19 @@ error_list_append(ErrorList *error_list, char *error_message)
 		exit(ERR_BAD_CONFIG);
 	}
-	cell->error_message = pg_malloc0(MAXLEN);
+	cell->string = pg_malloc0(MAXLEN);
-	strncpy(cell->error_message, error_message, MAXLEN);
+	strncpy(cell->string, error_message, MAXLEN);
-	if (error_list->tail)
+	if (item_list->tail)
 	{
-		error_list->tail->next = cell;
+		item_list->tail->next = cell;
 	}
 	else
 	{
-		error_list->head = cell;
+		item_list->head = cell;
 	}
-	error_list->tail = cell;
+	item_list->tail = cell;
 }
@@ -804,7 +811,7 @@ error_list_append(ErrorList *error_list, char *error_message)
 * otherwise exit
 */
 int
-repmgr_atoi(const char *value, const char *config_item, ErrorList *error_list, bool allow_negative)
+repmgr_atoi(const char *value, const char *config_item, ItemList *error_list, bool allow_negative)
 {
 	char	  *endptr;
 	long	   longval = 0;
@@ -853,7 +860,7 @@ repmgr_atoi(const char *value, const char *config_item, ErrorList *error_list, b
 			exit(ERR_BAD_CONFIG);
 		}
-		error_list_append(error_list, error_message_buf);
+		item_list_append(error_list, error_message_buf);
 	}
 	return (int32) longval;
@@ -936,7 +943,7 @@ static void
 parse_event_notifications_list(t_configuration_options *options, const char *arg)
 {
 	const char *arg_ptr;
-	char	    event_type_buf[MAXLEN] = "";
+	char		event_type_buf[MAXLEN] = "";
 	char	   *dst_ptr = event_type_buf;
@@ -995,15 +1002,15 @@ parse_event_notifications_list(t_configuration_options *options, const char *arg
 static void
-exit_with_errors(ErrorList *config_errors)
+exit_with_errors(ItemList *config_errors)
 {
-	ErrorListCell *cell;
+	ItemListCell *cell;
 	log_err(_("%s: following errors were found in the configuration file.\n"), progname());
 	for (cell = config_errors->head; cell; cell = cell->next)
 	{
-		log_err("%s\n", cell->error_message);
+		log_err("%s\n", cell->string);
 	}
 	exit(ERR_BAD_CONFIG);
--- a/config.h
+++ b/config.h
@@ -1,6 +1,7 @@
 /*
 * config.h
- * Copyright (c) 2ndQuadrant, 2010-2016
+ *
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -57,14 +58,20 @@ typedef struct
 	int			node;
 	int         upstream_node;
 	char		conninfo[MAXLEN];
+	char		barman_server[MAXLEN];
+	char		barman_config[MAXLEN];
 	int			failover;
 	int			priority;
 	char		node_name[MAXLEN];
 	/* commands executed by repmgrd */
 	char		promote_command[MAXLEN];
 	char		follow_command[MAXLEN];
-	char		stop_command[MAXLEN];
-	char		start_command[MAXLEN];
-	char		restart_command[MAXLEN];
+	/* Overrides for pg_ctl commands */
+	char		service_stop_command[MAXLEN];
+	char		service_start_command[MAXLEN];
+	char		service_restart_command[MAXLEN];
 	char		service_reload_command[MAXLEN];
 	char		service_promote_command[MAXLEN];
 	char		loglevel[MAXLEN];
 	char		logfacility[MAXLEN];
 	char		rsync_options[QUERY_STR_LEN];
@@ -90,32 +97,51 @@ typedef struct
 * The following will initialize the structure with a minimal set of options;
 * actual defaults are set in parse_config() before parsing the configuration file
 */
-#define T_CONFIGURATION_OPTIONS_INITIALIZER { "", -1, NO_UPSTREAM_NODE, "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", "", 0, 0, 0, 0, "", { NULL, NULL }, {NULL, NULL} }
+#define T_CONFIGURATION_OPTIONS_INITIALIZER { "", UNKNOWN_NODE_ID, NO_UPSTREAM_NODE, "", "", "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", "", 0, 0, 0, 0, "", { NULL, NULL }, { NULL, NULL } }
-typedef struct ErrorListCell
+typedef struct ItemListCell
 {
-	struct ErrorListCell *next;
+	struct ItemListCell *next;
-	char			     *error_message;
+	char			    *string;
-} ErrorListCell;
+} ItemListCell;
-typedef struct ErrorList
+typedef struct ItemList
 {
-	ErrorListCell *head;
+	ItemListCell *head;
-	ErrorListCell *tail;
+	ItemListCell *tail;
-} ErrorList;
+} ItemList;
+typedef struct TablespaceDataListCell
+{
+	struct TablespaceDataListCell *next;
+	char	   *name;
+	char	   *oid;
+	char	   *location;
+	/* optional payload */
+	FILE       *f;
+} TablespaceDataListCell;
+typedef struct TablespaceDataList
+{
+	TablespaceDataListCell *head;
+	TablespaceDataListCell *tail;
+} TablespaceDataList;
 void set_progname(const char *argv0);
 const char * progname(void);
 bool		load_config(const char *config_file, bool verbose, t_configuration_options *options, char *argv0);
-bool		reload_config(t_configuration_options *orig_options);
+
 void		_parse_config(t_configuration_options *options, ItemList *error_list);
 bool		parse_config(t_configuration_options *options);
 bool		reload_config(t_configuration_options *orig_options);
 void		parse_line(char *buff, char *name, char *value);
 char	   *trim(char *s);
-void		error_list_append(ErrorList *error_list, char *error_message);
+void		item_list_append(ItemList *item_list, char *error_message);
 int			repmgr_atoi(const char *s,
 						const char *config_item,
-						ErrorList *error_list,
+						ItemList *error_list,
 						bool allow_negative);
-
+extern bool		config_file_found;
 #endif
--- a/dbutils.c
+++ b/dbutils.c
@@ -1,6 +1,7 @@
 /*
 * dbutils.c - Database connection/management functions
- * Copyright (C) 2ndQuadrant, 2010-2016
+ *
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -32,6 +33,15 @@ char repmgr_schema[MAXLEN] = "";
 char repmgr_schema_quoted[MAXLEN] = "";
 static int _get_node_record(PGconn *conn, char *cluster, char *sqlquery, t_node_info *node_info);
+static bool _set_config(PGconn *conn, const char *config_param, const char *sqlquery);
+/*
+ * _establish_db_connection()
+ *
+ * Connect to a database using a conninfo string.
+ *
+ * NOTE: *do not* use this for replication connections; use establish_db_connection_by_params() instead.
+ */
 PGconn *
 _establish_db_connection(const char *conninfo, const bool exit_on_error, const bool log_notice, const bool verbose_only)
@@ -76,6 +86,19 @@ _establish_db_connection(const char *conninfo, const bool exit_on_error, const b
 		}
 	}
+	/*
+	 * set "synchronous_commit" to "local" in case synchronous replication is in use
+	 */
+	else if (set_config(conn, "synchronous_commit", "local") == false)
+	{
+		if (exit_on_error)
+		{
+			PQfinish(conn);
+			exit(ERR_DB_CON);
+		}
+	}
 	return conn;
 }
@@ -115,8 +138,12 @@ PGconn *
 establish_db_connection_by_params(const char *keywords[], const char *values[],
 								  const bool exit_on_error)
 {
-	/* Make a connection to the database */
-	PGconn	   *conn = PQconnectdbParams(keywords, values, true);
+	PGconn	   *conn;
+	bool	    replication_connection = false;
+	int	   	    i;
+	/* Connect to the database using the provided parameters */
+	conn = PQconnectdbParams(keywords, values, true);
 	/* Check to see that the backend connection was successfully made */
 	if ((PQstatus(conn) != CONNECTION_OK))
@@ -129,6 +156,28 @@ establish_db_connection_by_params(const char *keywords[], const char *values[],
 			exit(ERR_DB_CON);
 		}
 	}
+	else
+	{
+		/*
+		 * set "synchronous_commit" to "local" in case synchronous replication is in
+		 * use (provided this is not a replication connection)
+		 */
+		for (i = 0; keywords[i]; i++)
+		{
+			if (strcmp(keywords[i], "replication") == 0)
+				replication_connection = true;
+		}
+		if (replication_connection == false && set_config(conn, "synchronous_commit", "local") == false)
+		{
+			if (exit_on_error)
+			{
+				PQfinish(conn);
+				exit(ERR_DB_CON);
+			}
+		}
+	}
 	return conn;
 }
@@ -213,7 +262,7 @@ check_cluster_schema(PGconn *conn)
 	char		sqlquery[QUERY_STR_LEN];
 	sqlquery_snprintf(sqlquery,
-					  "SELECT 1 FROM pg_namespace WHERE nspname = '%s'",
+					  "SELECT 1 FROM pg_catalog.pg_namespace WHERE nspname = '%s'",
 					  get_repmgr_schema());
 	log_verbose(LOG_DEBUG, "check_cluster_schema(): %s\n", sqlquery);
@@ -278,7 +327,6 @@ is_pgup(PGconn *conn, int timeout)
 	/* Check the connection status twice in case it changes after reset */
 	bool		twice = false;
 	/* Check the connection status twice in case it changes after reset */
 	for (;;)
 	{
 		if (PQstatus(conn) != CONNECTION_OK)
@@ -408,7 +456,7 @@ guc_set(PGconn *conn, const char *parameter, const char *op,
 	int			retval = 1;
 	sqlquery_snprintf(sqlquery,
-					  "SELECT true FROM pg_settings "
+					  "SELECT true FROM pg_catalog.pg_settings "
 					  " WHERE name = '%s' AND setting %s '%s'",
 					  parameter, op, value);
@@ -444,7 +492,7 @@ guc_set_typed(PGconn *conn, const char *parameter, const char *op,
 	int			retval = 1;
 	sqlquery_snprintf(sqlquery,
-					  "SELECT true FROM pg_settings "
+					  "SELECT true FROM pg_catalog.pg_settings "
 					  " WHERE name = '%s' AND setting::%s %s '%s'::%s",
 					  parameter, datatype, op, value, datatype);
@@ -476,7 +524,7 @@ get_cluster_size(PGconn *conn, char *size)
 	sqlquery_snprintf(sqlquery,
 					  "SELECT pg_catalog.pg_size_pretty(SUM(pg_catalog.pg_database_size(oid))::bigint) "
-					  "	 FROM pg_database ");
+					  "	 FROM pg_catalog.pg_database ");
 	log_verbose(LOG_DEBUG, "get_cluster_size():\n%s\n", sqlquery);
@@ -503,11 +551,11 @@ get_pg_setting(PGconn *conn, const char *setting, char *output)
 	char		sqlquery[QUERY_STR_LEN];
 	PGresult   *res;
 	int			i;
-	bool        success = true;
+	bool        success = false;
 	sqlquery_snprintf(sqlquery,
 					  "SELECT name, setting "
-					  " FROM pg_settings WHERE name = '%s'",
+					  "  FROM pg_catalog.pg_settings WHERE name = '%s'",
 					  setting);
 	log_verbose(LOG_DEBUG, "get_pg_setting(): %s\n", sqlquery);
@@ -944,7 +992,7 @@ get_repmgr_schema_quoted(PGconn *conn)
 bool
-create_replication_slot(PGconn *conn, char *slot_name, int server_version_num)
+create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg)
 {
 	char				sqlquery[QUERY_STR_LEN];
 	int					query_res;
@@ -963,8 +1011,9 @@ create_replication_slot(PGconn *conn, char *slot_name, int server_version_num)
 	{
 		if (strcmp(slot_info.slot_type, "physical") != 0)
 		{
-			log_err(_("Slot '%s' exists and is not a physical slot\n"),
-					slot_name);
+			appendPQExpBuffer(error_msg,
+							  _("Slot '%s' exists and is not a physical slot\n"),
+							  slot_name);
 			return false;
 		}
@@ -976,8 +1025,9 @@ create_replication_slot(PGconn *conn, char *slot_name, int server_version_num)
 			return true;
 		}
-		log_err(_("Slot '%s' already exists as an active slot\n"),
-				slot_name);
+		appendPQExpBuffer(error_msg,
+						  _("Slot '%s' already exists as an active slot\n"),
+						  slot_name);
 		return false;
 	}
@@ -985,25 +1035,26 @@ create_replication_slot(PGconn *conn, char *slot_name, int server_version_num)
 	if (server_version_num >= 90600)
 	{
 		sqlquery_snprintf(sqlquery,
-						  "SELECT * FROM pg_create_physical_replication_slot('%s', TRUE)",
+						  "SELECT * FROM pg_catalog.pg_create_physical_replication_slot('%s', TRUE)",
 						  slot_name);
 	}
 	else
 	{
 		sqlquery_snprintf(sqlquery,
-						  "SELECT * FROM pg_create_physical_replication_slot('%s')",
+						  "SELECT * FROM pg_catalog.pg_create_physical_replication_slot('%s')",
 						  slot_name);
 	}
-	log_debug(_("create_replication_slot(): Creating slot '%s' on primary\n"), slot_name);
+	log_debug(_("create_replication_slot(): Creating slot '%s' on master\n"), slot_name);
 	log_verbose(LOG_DEBUG, "create_replication_slot():\n%s\n", sqlquery);
 	res = PQexec(conn, sqlquery);
 	if (!res || PQresultStatus(res) != PGRES_TUPLES_OK)
 	{
-		log_err(_("unable to create slot '%s' on the primary node: %s\n"),
-				slot_name,
-				PQerrorMessage(conn));
+		appendPQExpBuffer(error_msg,
+						  _("unable to create slot '%s' on the master node: %s\n"),
+						  slot_name,
+						  PQerrorMessage(conn));
 		PQclear(res);
 		return false;
 	}
@@ -1021,7 +1072,7 @@ get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record)
 	sqlquery_snprintf(sqlquery,
 					  "SELECT slot_name, slot_type, active "
-                      "  FROM pg_replication_slots "
+                      "  FROM pg_catalog.pg_replication_slots "
 					  " WHERE slot_name = '%s' ",
 					  slot_name);
@@ -1081,15 +1132,25 @@ drop_replication_slot(PGconn *conn, char *slot_name)
 bool
-start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint)
+start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint, int server_version_num)
 {
 	char		sqlquery[QUERY_STR_LEN];
 	PGresult   *res;
-	sqlquery_snprintf(sqlquery,
-					  "SELECT pg_catalog.pg_xlogfile_name(pg_catalog.pg_start_backup('repmgr_standby_clone_%ld', %s))",
-					  time(NULL),
-					  fast_checkpoint ? "TRUE" : "FALSE");
+	if (server_version_num >= 100000)
+	{
+		sqlquery_snprintf(sqlquery,
+						  "SELECT pg_catalog.pg_walfile_name(pg_catalog.pg_start_backup('repmgr_standby_clone_%ld', %s))",
+						  time(NULL),
+						  fast_checkpoint ? "TRUE" : "FALSE");
+	}
+	else
+	{
+		sqlquery_snprintf(sqlquery,
+						  "SELECT pg_catalog.pg_xlogfile_name(pg_catalog.pg_start_backup('repmgr_standby_clone_%ld', %s))",
+						  time(NULL),
+						  fast_checkpoint ? "TRUE" : "FALSE");
+	}
 	log_verbose(LOG_DEBUG, "start_backup():\n%s\n", sqlquery);
@@ -1117,12 +1178,19 @@ start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint)
 bool
-stop_backup(PGconn *conn, char *last_wal_segment)
+stop_backup(PGconn *conn, char *last_wal_segment, int server_version_num)
 {
 	char		sqlquery[QUERY_STR_LEN];
 	PGresult   *res;
-	sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_xlogfile_name(pg_catalog.pg_stop_backup())");
+	if (server_version_num >= 100000)
+	{
+		sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_walfile_name(pg_catalog.pg_stop_backup())");
+	}
+	else
+	{
+		sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_xlogfile_name(pg_catalog.pg_stop_backup())");
+	}
 	res = PQexec(conn, sqlquery);
 	if (PQresultStatus(res) != PGRES_TUPLES_OK)
@@ -1147,19 +1215,12 @@ stop_backup(PGconn *conn, char *last_wal_segment)
 }
 bool
-set_config_bool(PGconn *conn, const char *config_param, bool state)
+_set_config(PGconn *conn, const char *config_param, const char *sqlquery)
 {
-	char		sqlquery[QUERY_STR_LEN];
 	PGresult   *res;
-	sqlquery_snprintf(sqlquery,
-					  "SET %s TO %s",
-					  config_param,
-					  state ? "TRUE" : "FALSE");
-	log_verbose(LOG_DEBUG, "set_config_bool():\n%s\n", sqlquery);
 	res = PQexec(conn, sqlquery);
 	if (PQresultStatus(res) != PGRES_COMMAND_OK)
@@ -1174,6 +1235,36 @@ set_config_bool(PGconn *conn, const char *config_param, bool state)
 	return true;
 }
+bool
+set_config(PGconn *conn, const char *config_param,  const char *config_value)
+{
+	char		sqlquery[QUERY_STR_LEN];
+	sqlquery_snprintf(sqlquery,
+					  "SET %s TO '%s'",
+					  config_param,
+					  config_value);
+	log_verbose(LOG_DEBUG, "set_config():\n%s\n", sqlquery);
+	return _set_config(conn, config_param, sqlquery);
+}
+bool
+set_config_bool(PGconn *conn, const char *config_param, bool state)
+{
+	char		sqlquery[QUERY_STR_LEN];
+	sqlquery_snprintf(sqlquery,
+					  "SET %s TO %s",
+					  config_param,
+					  state ? "TRUE" : "FALSE");
+	log_verbose(LOG_DEBUG, "set_config_bool():\n%s\n", sqlquery);
+	return _set_config(conn, config_param, sqlquery);
+}
 /*
 * witness_copy_node_records()
@@ -1223,7 +1314,8 @@ witness_copy_node_records(PGconn *masterconn, PGconn *witnessconn, char *cluster
 	/* Get current records from primary */
 	sqlquery_snprintf(sqlquery,
-					  "SELECT id, type, upstream_node_id, name, conninfo, priority, slot_name, active FROM %s.repl_nodes",
+					  "SELECT id, type, upstream_node_id, name, conninfo, priority, slot_name, active "
+					  "  FROM %s.repl_nodes",
 					  get_repmgr_schema_quoted(masterconn));
 	log_verbose(LOG_DEBUG, "witness_copy_node_records():\n%s\n", sqlquery);
@@ -1337,7 +1429,8 @@ create_node_record(PGconn *conn, char *action, int node, char *type, int upstrea
 	sqlquery_snprintf(sqlquery,
 					  "INSERT INTO %s.repl_nodes "
 					  "       (id, type, upstream_node_id, cluster, "
-					  "        name, conninfo, slot_name, priority, active) "
+					  "        name, conninfo, slot_name, "
+					  "        priority, active) "
 					  "VALUES (%i, '%s', %s, '%s', '%s', '%s', %s, %i, %s) ",
 					  get_repmgr_schema_quoted(conn),
 					  node,
@@ -1431,10 +1524,11 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
 	bool		success = true;
 	struct tm	ts;
-	/* Only attempt to write a record if a connection handle was provided.
+	/*
-	   Also check that the repmgr schema has been properly intialised - if
+	 * Only attempt to write a record if a connection handle was provided.
-	   not it means no configuration file was provided, which can happen with
+	 * Also check that the repmgr schema has been properly initialised - if
-	   e.g. `repmgr standby clone`, and we won't know which schema to write to.
+	 * not it means no configuration file was provided, which can happen with
 	 * e.g. `repmgr standby clone`, and we won't know which schema to write to.
 	 */
 	if (conn != NULL && strcmp(repmgr_schema, DEFAULT_REPMGR_SCHEMA_PREFIX) != 0)
 	{
@@ -1483,7 +1577,6 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
 						PQerrorMessage(conn));
 			success = false;
 		}
 		else
 		{
@@ -1624,6 +1717,89 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
 }
 bool
 update_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name, bool active)
 {
 	char		sqlquery[QUERY_STR_LEN];
 	char		upstream_node_id[MAXLEN];
 	char		slot_name_buf[MAXLEN];
 	PGresult   *res;
 	/* XXX this segment copied from create_node_record() */
 	if (upstream_node == NO_UPSTREAM_NODE)
 	{
 		/*
 		 * No explicit upstream node id provided for standby - attempt to
 		 * get primary node id
 		 */
 		if (strcmp(type, "standby") == 0)
 		{
 			int primary_node_id = get_master_node_id(conn, cluster_name);
 			maxlen_snprintf(upstream_node_id, "%i", primary_node_id);
 		}
 		else
 		{
 			maxlen_snprintf(upstream_node_id, "%s", "NULL");
 		}
 	}
 	else
 	{
 		maxlen_snprintf(upstream_node_id, "%i", upstream_node);
 	}
 	if (slot_name != NULL && slot_name[0])
 	{
 		maxlen_snprintf(slot_name_buf, "'%s'", slot_name);
 	}
 	else
 	{
 		maxlen_snprintf(slot_name_buf, "%s", "NULL");
 	}
 	/* XXX convert to placeholder query */
 	sqlquery_snprintf(sqlquery,
 					  "UPDATE %s.repl_nodes SET "
 					  "       type = '%s', "
 					  "       upstream_node_id = %s, "
 					  "       cluster = '%s', "
 					  "       name = '%s', "
 					  "       conninfo = '%s', "
 					  "       slot_name = %s, "
 					  "       priority = %i, "
 					  "       active = %s "
 					  " WHERE id = %i ",
 					  get_repmgr_schema_quoted(conn),
 					  type,
 					  upstream_node_id,
 					  cluster_name,
 					  node_name,
 					  conninfo,
 					  slot_name_buf,
 					  priority,
 					  active == true ? "TRUE" : "FALSE",
 					  node);
 	log_verbose(LOG_DEBUG, "update_node_record(): %s\n", sqlquery);
 	if (action != NULL)
 	{
 		log_verbose(LOG_DEBUG, "update_node_record(): action is \"%s\"\n", action);
 	}
 	res = PQexec(conn, sqlquery);
 	if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
 	{
 		log_err(_("Unable to update node record\n%s\n"),
 				PQerrorMessage(conn));
 		PQclear(res);
 		return false;
 	}
 	PQclear(res);
 	return true;
 }
 /*
 * Update node record following change of status
 * (e.g. inactive primary converted to standby)
@@ -1713,7 +1889,8 @@ get_node_record(PGconn *conn, char *cluster, int node_id, t_node_info *node_info
 	sqlquery_snprintf(
 		sqlquery,
-		"SELECT id, type, upstream_node_id, name, conninfo, slot_name, priority, active"
+		"SELECT id, type, upstream_node_id, name, conninfo, "
 		"       slot_name, priority, active"
 		"  FROM %s.repl_nodes "
 		" WHERE cluster = '%s' "
 		"   AND id = %i",
@@ -1783,7 +1960,16 @@ _get_node_record(PGconn *conn, char *cluster, char *sqlquery, t_node_info *node_
 	node_info->node_id = atoi(PQgetvalue(res, 0, 0));
 	node_info->type = parse_node_type(PQgetvalue(res, 0, 1));
-	node_info->upstream_node_id = atoi(PQgetvalue(res, 0, 2));
+
 	if (PQgetisnull(res, 0, 2))
 	{
 		node_info->upstream_node_id = NO_UPSTREAM_NODE;
 	}
 	else
 	{
 		node_info->upstream_node_id = atoi(PQgetvalue(res, 0, 2));
 	}
 	strncpy(node_info->name, PQgetvalue(res, 0, 3), MAXLEN);
 	strncpy(node_info->conninfo_str, PQgetvalue(res, 0, 4), MAXLEN);
 	strncpy(node_info->slot_name, PQgetvalue(res, 0, 5), MAXLEN);
--- a/dbutils.h
+++ b/dbutils.h
@@ -1,6 +1,7 @@
 /*
 * dbutils.h
- * Copyright (c) 2ndQuadrant, 2010-2016
+ *
 * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -21,6 +22,7 @@
 #define _REPMGR_DBUTILS_H_
 #include "access/xlogdefs.h"
 #include "pqexpbuffer.h"
 #include "config.h"
 #include "strutil.h"
@@ -77,7 +79,7 @@ typedef struct s_replication_slot
 	bool active;
 }   t_replication_slot;
-
+extern char		repmgr_schema[MAXLEN];
 PGconn *_establish_db_connection(const char *conninfo,
 								 const bool exit_on_error,
@@ -117,17 +119,19 @@ int			wait_connection_availability(PGconn *conn, long long timeout);
 bool		cancel_query(PGconn *conn, int timeout);
 char       *get_repmgr_schema(void);
 char       *get_repmgr_schema_quoted(PGconn *conn);
-bool		create_replication_slot(PGconn *conn, char *slot_name, int server_version_num);
+bool		create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg);
 int			get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record);
 bool		drop_replication_slot(PGconn *conn, char *slot_name);
-bool		start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint);
+bool		start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint, int server_version_num);
-bool		stop_backup(PGconn *conn, char *last_wal_segment);
+bool		stop_backup(PGconn *conn, char *last_wal_segment, int server_version_num);
 bool		set_config(PGconn *conn, const char *config_param,  const char *config_value);
 bool		set_config_bool(PGconn *conn, const char *config_param, bool state);
 bool		witness_copy_node_records(PGconn *masterconn, PGconn *witnessconn, char *cluster_name);
 bool		create_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name, bool active);
 bool		delete_node_record(PGconn *conn, int node, char *action);
 int			get_node_record(PGconn *conn, char *cluster, int node_id, t_node_info *node_info);
 int			get_node_record_by_name(PGconn *conn, char *cluster, const char *node_name, t_node_info *node_info);
 bool        update_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name, bool active);
 bool        update_node_record_status(PGconn *conn, char *cluster_name, int this_node_id, char *type, int upstream_node_id, bool active);
 bool        update_node_record_set_upstream(PGconn *conn, char *cluster_name, int this_node_id, int new_upstream_node_id);
 bool        create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
--- a/debian/DEBIAN/control
+++ b/debian/DEBIAN/control
@@ -1,5 +1,5 @@
 Package: repmgr-auto
-Version: 3.1.3
+Version: 3.2dev
 Section: database
 Priority: optional
 Architecture: all
--- a/dirmod.c
+++ b/dirmod.c
@@ -3,7 +3,7 @@
 * dirmod.c
 *	  directory handling functions
 *
- * Copyright (C) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
--- a/dirmod.h
+++ b/dirmod.h
@@ -1,6 +1,6 @@
 /*
 * dirmod.h
- * Copyright (c) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
--- a/docs/repmgrd-failover-mechanism.md
+++ b/docs/repmgrd-failover-mechanism.md
@@ -0,0 +1,75 @@
 repmgrd's failover algorithm
 ============================
 When implementing automatic failover, there are two factors which are critical in
 ensuring the desired result is achieved:
  - has the master node genuinely failed?
  - which is the best node to promote to the new master?
 This document outlines repmgrd's decision-making process during automatic failover
 for standbys directly connected to the master node.
 Master node failure detection
 -----------------------------
 If a `repmgrd` instance running on a PostgreSQL standby node is unable to connect to
 the master node, this doesn't necessarily mean that the master is down and a
 failover is required. Factors such as network connectivity issues could mean that
 even though the standby node is isolated, the replication cluster as a whole
 is functioning correctly, and promoting the standby without further verification
 could result in a "split-brain" situation.
 In the event that `repmgrd` is unable to connect to the master node, it will attempt
 to reconnect to the master server several times (as defined by the `reconnect_attempts`
 parameter in `repmgr.conf`), with reconnection attempts occurring at the interval
 specified by `reconnect_interval`. This happens to verify that the master is definitively
 not accessible (e.g. that connection was not lost due to a brief network glitch).
 Appropriate values for these settings will depend very much on the replication
 cluster environment. There will necessarily be a trade-off between the time it
 takes to assume the master is not reachable, and the reliability of that conclusion.
 A standby in a different physical location to the master will probably need a longer
 check interval to rule out possible network issues, whereas one located in the same
 rack with a direct connection between servers could perform the check very quickly.
 Note that it's possible the master comes back online after this point is reached,
 but before a new master has been selected; in this case it will be noticed
 during the selection of a new master and no actual failover will take place.
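
The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not `repmgrd`'s actual code: `probe_fn`, `master_reachable()` and `probe_countdown()` are hypothetical names, and the real connection check is replaced by a callback so the loop can be exercised without a database.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical callback type: returns true if the master answered. */
typedef bool (*probe_fn)(void *ctx);

/*
 * Try to reach the master up to reconnect_attempts times, sleeping
 * reconnect_interval seconds between failed attempts. Returns true as
 * soon as one probe succeeds, false if every attempt fails.
 */
bool
master_reachable(probe_fn probe, void *ctx,
                 int reconnect_attempts, int reconnect_interval)
{
    int attempt;

    for (attempt = 0; attempt < reconnect_attempts; attempt++)
    {
        if (probe(ctx))
            return true;

        /* no point in sleeping after the final failed attempt */
        if (attempt + 1 < reconnect_attempts && reconnect_interval > 0)
            sleep((unsigned int) reconnect_interval);
    }

    return false;
}

/*
 * Illustrative probe (an assumption for demonstration only): fails until
 * the counter pointed to by ctx reaches zero, then succeeds.
 */
bool
probe_countdown(void *ctx)
{
    int *remaining_failures = ctx;

    if (*remaining_failures > 0)
    {
        (*remaining_failures)--;
        return false;
    }

    return true;
}
```

With this shape, the worst-case time before the master is declared unreachable is roughly `(reconnect_attempts - 1) * reconnect_interval` seconds plus the time spent in the probes themselves, which is the trade-off the paragraph above refers to.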
 Promotion candidate selection
 -----------------------------
 Once `repmgrd` has decided the master is definitively unreachable, the following
 checks will be carried out:
 * `repmgrd` attempts to connect to all other nodes in the cluster (including the witness
  node, if defined) to establish the state of the cluster, including their
  current LSN
 * if fewer than half of the nodes are visible (from the viewpoint
  of this node), `repmgrd` will not take any further action. This is to ensure that
  e.g. if a replication cluster is spread over multiple data centres, a split-brain
  situation does not occur if there is a network failure between data centres. Note
  that if nodes are split evenly between data centres, a witness server can be
  used to establish the "majority" data centre.
 * `repmgrd` polls all visible servers and waits for each node to return a valid LSN;
  it updates the LSN previously stored for this node if it has increased since
  the initial check
 * once all LSNs have been retrieved, `repmgrd` will check for the highest LSN; if
  its own node has the highest LSN, it will attempt to promote itself (using the
  command defined in `promote_command` in `repmgr.conf`). Note that if using
  `repmgr standby promote` as the promotion command, and the original master becomes available
  before the promotion takes effect, `repmgr` will return an error and no promotion
  will take place, and `repmgrd` will resume monitoring as usual.
 * if the node is not the promotion candidate, `repmgrd` will execute the
  `follow_command` defined in `repmgr.conf`. If using `repmgr standby follow` here,
  `repmgr` will attempt to detect the new master node and attach to that.
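
The selection steps above can be sketched as a single decision function. This is a simplified illustration under stated assumptions, not the code `repmgrd` runs: `node_state`, `failover_action` and `decide_failover_action()` are hypothetical names, LSNs are reduced to plain 64-bit integers, the caller's own node is assumed to be marked visible, and `node_count` is assumed to be the total the "half of the nodes" rule is measured against. The text does not say how ties on LSN are resolved; this sketch simply keeps the first node seen.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum
{
    ACTION_NONE,     /* fewer than half the nodes visible: do nothing */
    ACTION_PROMOTE,  /* this node has the highest LSN: run promote_command */
    ACTION_FOLLOW    /* another node is the candidate: run follow_command */
} failover_action;

typedef struct
{
    int      node_id;
    bool     visible;  /* could we connect to this node? */
    uint64_t lsn;      /* last LSN reported by the node, if visible */
} node_state;

failover_action
decide_failover_action(const node_state *nodes, int node_count, int self_index)
{
    int i;
    int visible = 0;
    int best = -1;

    for (i = 0; i < node_count; i++)
    {
        if (nodes[i].visible)
            visible++;
    }

    /* Split-brain guard: take no action if fewer than half are visible */
    if (visible * 2 < node_count)
        return ACTION_NONE;

    /* Find the visible node with the highest LSN (first one wins a tie) */
    for (i = 0; i < node_count; i++)
    {
        if (nodes[i].visible && (best < 0 || nodes[i].lsn > nodes[best].lsn))
            best = i;
    }

    return (best == self_index) ? ACTION_PROMOTE : ACTION_FOLLOW;
}
```

For example, with three nodes of which two are visible at LSNs 100 and 200, the node at LSN 200 gets `ACTION_PROMOTE` and the other visible node gets `ACTION_FOLLOW`; with only one of three nodes visible, the result is `ACTION_NONE`.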
--- a/docs/repmgrd-node-fencing.md
+++ b/docs/repmgrd-node-fencing.md
@@ -0,0 +1,152 @@
 Fencing a failed master node with repmgrd and pgbouncer
 =======================================================
 With automatic failover, it's essential to ensure that a failed master
 remains inaccessible to your application, even if it comes back online
 again, to avoid a split-brain situation.
 By using `pgbouncer` together with `repmgrd`, it's possible to combine
 automatic failover with a process to isolate the failed master from
 your application and ensure that all connections which should go to
 the master are directed there smoothly without having to reconfigure
 your application. (Note that as a connection pooler, `pgbouncer` can
 benefit your application in other ways, but those are beyond the scope
 of this document).
 * * *
 > *WARNING*: automatic failover is tricky to get right. This document
 > demonstrates one possible implementation method, however you should
 > carefully configure and test any setup to suit the needs of your own
 > replication cluster/application.
 * * *
 In a failover situation, `repmgrd` promotes a standby to master by executing
 the command defined in `promote_command`. Normally this would be something like:
    repmgr standby promote -f /etc/repmgr.conf
 By wrapping this in a custom script which adjusts the `pgbouncer` configuration
 on all nodes, it's possible to fence the failed master and redirect write
 connections to the new master.
 The script consists of three sections:
 * commands to pause `pgbouncer` on all nodes
 * the promotion command itself
 * commands to reconfigure and restart `pgbouncer` on all nodes
 Note that it requires password-less SSH access between all nodes to be able to
 update the `pgbouncer` configuration files.
 For the purposes of this demonstration, we'll assume there are 3 nodes (master
 and two standbys), with `pgbouncer` listening on port 6432 handling connections
 to a database called `appdb`.  The `postgres` system user must have write
 access to the `pgbouncer` configuration files on all nodes. We'll assume
 there's a main `pgbouncer` configuration file, `/etc/pgbouncer.ini`, which uses
 the `%include` directive (available from PgBouncer 1.6) to include a separate
 configuration file, `/etc/pgbouncer.database.ini`, which will be modified by
 `repmgr`.
 `/etc/pgbouncer.ini` should look something like this:
    [pgbouncer]
    logfile = /var/log/pgbouncer/pgbouncer.log
    pidfile = /var/run/pgbouncer/pgbouncer.pid
    listen_addr = *
    listen_port = 6432
    unix_socket_dir = /tmp
    auth_type = trust
    auth_file = /etc/pgbouncer.auth
    admin_users = postgres
    stats_users = postgres
    pool_mode = transaction
    max_client_conn = 100
    default_pool_size = 20
    min_pool_size = 5
    reserve_pool_size = 5
    reserve_pool_timeout = 3
    log_connections = 1
    log_disconnections = 1
    log_pooler_errors = 1
    %include /etc/pgbouncer.database.ini
 The actual script is as follows; adjust the configurable items as appropriate:
 `/var/lib/postgres/repmgr/promote.sh`
    #!/usr/bin/env bash
    set -u
    set -e
    # Configurable items
    PGBOUNCER_HOSTS="node1 node2 node3"
    PGBOUNCER_DATABASE_INI="/etc/pgbouncer.database.ini"
    PGBOUNCER_DATABASE="appdb"
    PGBOUNCER_PORT=6432
    REPMGR_DB="repmgr"
    REPMGR_USER="repmgr"
    REPMGR_SCHEMA="repmgr_test"
    # 1. Pause running pgbouncer instances
    for HOST in $PGBOUNCER_HOSTS
    do
        psql -t -c "pause" -h $HOST -p $PGBOUNCER_PORT -U postgres pgbouncer
    done
    # 2. Promote this node from standby to master
    repmgr standby promote -f /etc/repmgr.conf
    # 3. Reconfigure pgbouncer instances
    PGBOUNCER_DATABASE_INI_NEW="/tmp/pgbouncer.database.ini"
    for HOST in $PGBOUNCER_HOSTS
    do
        # Recreate the pgbouncer config file
        echo -e "[databases]\n" > $PGBOUNCER_DATABASE_INI_NEW
        psql -d $REPMGR_DB -U $REPMGR_USER -t -A \
          -c "SELECT '${PGBOUNCER_DATABASE}-rw= ' || conninfo || ' application_name=pgbouncer_${HOST}' \
              FROM ${REPMGR_SCHEMA}.repl_nodes \
              WHERE active = TRUE AND type='master'" >> $PGBOUNCER_DATABASE_INI_NEW
        psql -d $REPMGR_DB -U $REPMGR_USER -t -A \
          -c "SELECT '${PGBOUNCER_DATABASE}-ro= ' || conninfo || ' application_name=pgbouncer_${HOST}' \
              FROM $REPMGR_SCHEMA.repl_nodes \
              WHERE name='${HOST}'" >> $PGBOUNCER_DATABASE_INI_NEW
        rsync $PGBOUNCER_DATABASE_INI_NEW $HOST:$PGBOUNCER_DATABASE_INI
        psql -tc "reload" -h $HOST -p $PGBOUNCER_PORT -U postgres pgbouncer
        psql -tc "resume" -h $HOST -p $PGBOUNCER_PORT -U postgres pgbouncer
    done
    # Clean up generated file
    rm $PGBOUNCER_DATABASE_INI_NEW
    echo "Reconfiguration of pgbouncer complete"
 The script should be installed on each node where `repmgrd` is running.
 Finally, set `promote_command` in `repmgr.conf` on each node to
 point to the custom promote script:
    promote_command=/var/lib/postgres/repmgr/promote.sh
 and reload/restart any running `repmgrd` instances for the changes to take
 effect.
--- a/errcode.h
+++ b/errcode.h
@@ -1,6 +1,6 @@
 /*
 * errcode.h
- * Copyright (C) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -38,5 +38,8 @@
 #define ERR_MONITORING_FAIL 16
 #define ERR_BAD_BACKUP_LABEL 17
 #define ERR_SWITCHOVER_FAIL 18
 #define ERR_BARMAN 19
 #define ERR_REGISTRATION_SYNC 20
 #endif   /* _ERRCODE_H_ */
--- a/expected/repmgr_funcs.out
+++ b/expected/repmgr_funcs.out
@@ -0,0 +1,18 @@
 /*
 * repmgr_function.sql
 * Copyright (c) 2ndQuadrant, 2010-2017
 *
 */
 -- SET SEARCH_PATH TO 'repmgr';
 CREATE FUNCTION repmgr_update_standby_location(text) RETURNS boolean
 AS '$libdir/repmgr_funcs', 'repmgr_update_standby_location'
 LANGUAGE C STRICT;
 CREATE FUNCTION repmgr_get_last_standby_location() RETURNS text
 AS '$libdir/repmgr_funcs', 'repmgr_get_last_standby_location'
 LANGUAGE C STRICT;
 CREATE FUNCTION repmgr_update_last_updated() RETURNS TIMESTAMP WITH TIME ZONE
 AS '$libdir/repmgr_funcs', 'repmgr_update_last_updated'
 LANGUAGE C STRICT;
 CREATE FUNCTION repmgr_get_last_updated() RETURNS TIMESTAMP WITH TIME ZONE
 AS '$libdir/repmgr_funcs', 'repmgr_get_last_updated'
 LANGUAGE C STRICT;
--- a/expected/repmgr_test.out
+++ b/expected/repmgr_test.out
@@ -0,0 +1,24 @@
 select * from repmgr_update_standby_location('');
 repmgr_update_standby_location 
 --------------------------------
 f
 (1 row)
 select * from repmgr_get_last_standby_location();
 repmgr_get_last_standby_location 
 ----------------------------------
 (1 row)
 select * from repmgr_update_last_updated();
 repmgr_update_last_updated 
 ----------------------------
 (1 row)
 select * from repmgr_get_last_updated();
 repmgr_get_last_updated 
 -------------------------
 (1 row)
--- a/log.c
+++ b/log.c
@@ -1,6 +1,6 @@
 /*
 * log.c - Logging methods
- * Copyright (C) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This module is a set of methods for logging (currently only syslog)
 *
@@ -48,6 +48,11 @@ int			log_level = LOG_NOTICE;
 int			last_log_level = LOG_NOTICE;
 int			verbose_logging = false;
 int			terse_logging = false;
 /*
 * Global variable to be set by the main application to ensure any log output
 * emitted before logger_init is called, is output in the correct format
 */
 int			logger_output_mode = OM_DAEMON;
 extern void
 stderr_log_with_level(const char *level_name, int level, const char *fmt, ...)
@@ -62,22 +67,31 @@ stderr_log_with_level(const char *level_name, int level, const char *fmt, ...)
 static void
 _stderr_log_with_level(const char *level_name, int level, const char *fmt, va_list ap)
 {
-	time_t		t;
+	char		buf[100];
 	struct tm  *tm;
 	char		buff[100];
 	/*
 	 * Store the requested level so that if there's a subsequent
-	 * log_hint(), we can suppress that if appropriate.
+	 * log_hint() or log_detail(), we can suppress that if appropriate.
 	 */
 	last_log_level = level;
 	if (log_level >= level)
 	{
-		time(&t);
+
-		tm = localtime(&t);
+		/* Format log line prefix with timestamp if in daemon mode */
-		strftime(buff, 100, "[%Y-%m-%d %H:%M:%S]", tm);
+		if (logger_output_mode == OM_DAEMON)
-		fprintf(stderr, "%s [%s] ", buff, level_name);
+		{
 			time_t		t;
 			struct tm  *tm;
 			time(&t);
 			tm = localtime(&t);
 			strftime(buf, 100, "[%Y-%m-%d %H:%M:%S]", tm);
 			fprintf(stderr, "%s [%s] ", buf, level_name);
 		}
 		else
 		{
 			fprintf(stderr, "%s: ", level_name);
 		}
 		vfprintf(stderr, fmt, ap);
@@ -99,6 +113,20 @@ log_hint(const char *fmt, ...)
 }
 void
 log_detail(const char *fmt, ...)
 {
 	va_list		ap;
 	if (terse_logging == false)
 	{
 		va_start(ap, fmt);
 		_stderr_log_with_level("DETAIL", last_log_level, fmt, ap);
 		va_end(ap);
 	}
 }
 void
 log_verbose(int level, const char *fmt, ...)
 {
@@ -176,6 +204,13 @@ logger_init(t_configuration_options *opts, const char *ident)
 			stderr_log_warning(_("Invalid log level \"%s\" (available values: DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG)\n"), level);
 	}
 	/*
 	 * STDERR only logging requested - finish here without setting up any further
 	 * logging facility.
 	 */
 	if (logger_output_mode == OM_COMMAND_LINE)
 		return true;
 	if (facility && *facility)
 	{
@@ -236,9 +271,10 @@ logger_init(t_configuration_options *opts, const char *ident)
 		stderr_log_notice(_("Redirecting logging output to '%s'\n"), opts->logfile);
 		fd = freopen(opts->logfile, "a", stderr);
-		/* It's possible freopen() may still fail due to e.g. a race condition;
+		/*
-		   as it's not feasible to restore stderr after a failed freopen(),
+		 * It's possible freopen() may still fail due to e.g. a race condition;
-		   we'll write to stdout as a last resort.
+		 * as it's not feasible to restore stderr after a failed freopen(),
 		 * we'll write to stdout as a last resort.
 		 */
 		if (fd == NULL)
 		{
--- a/log.h
+++ b/log.h
@@ -1,6 +1,6 @@
 /*
 * log.h
- * Copyright (c) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -25,6 +25,9 @@
 #define REPMGR_SYSLOG 1
 #define REPMGR_STDERR 2
 #define OM_COMMAND_LINE 1
 #define OM_DAEMON       2
 extern void
 stderr_log_with_level(const char *level_name, int level, const char *fmt,...)
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
@@ -123,6 +126,8 @@ bool		logger_shutdown(void);
 void		logger_set_verbose(void);
 void		logger_set_terse(void);
 void		log_detail(const char *fmt, ...)
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
 void		log_hint(const char *fmt, ...)
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
 void		log_verbose(int level, const char *fmt, ...)
@@ -132,5 +137,6 @@ extern int	log_type;
 extern int	log_level;
 extern int	verbose_logging;
 extern int	terse_logging;
 extern int	logger_output_mode;
 #endif /* _REPMGR_LOG_H_ */
--- a/repmgr.c
+++ b/repmgr.c
--- a/repmgr.conf.sample
+++ b/repmgr.conf.sample
@@ -31,7 +31,7 @@
 #
 #   https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING
 #
-#conninfo='host=192.168.204.104 dbname=repmgr_db user=repmgr_usr'
+#conninfo='host=192.168.204.104 dbname=repmgr user=repmgr'
 #
 # If repmgrd is in use, consider explicitly setting `connect_timeout` in the
 # conninfo string to determine the length of time which elapses before
@@ -66,6 +66,12 @@
 # (default: NOTICE)
 #loglevel=NOTICE
 # Note that logging facility settings will only apply to `repmgrd` by default;
 # `repmgr` will always write to STDERR unless the switch `--log-to-file` is
 # supplied, in which case it will log to the same destination as `repmgrd`.
 # This is mainly intended for those cases when `repmgr` is executed directly
 # by `repmgrd`.
 # Logging facility: possible values are STDERR or - for Syslog integration - one of LOCAL0, LOCAL1, ..., LOCAL7, USER
 # (default: STDERR)
 #logfacility=STDERR
@@ -100,11 +106,14 @@
 # path to PostgreSQL binary directory (location of pg_ctl, pg_basebackup etc.)
 # (if not provided, defaults to system $PATH)
 #pg_bindir=/usr/bin/
 #
 # Debian/Ubuntu users: you will probably need to set this to the directory
 # where `pg_ctl` is located, e.g. /usr/lib/postgresql/9.5/bin/
 # service control commands
 #
-# repmgr provides options to to override the default pg_ctl commands
+# repmgr provides options to override the default pg_ctl commands
-# used to stop, start  and restart the PostgreSQL cluster
+# used to stop, start, restart, reload and promote the PostgreSQL cluster
 #
 # NOTE: These commands must be runnable on remote nodes as well for switchover
 # to function correctly.
@@ -120,9 +129,11 @@
 #       /usr/bin/systemctl start postgresql-9.5, \
 #       /usr/bin/systemctl restart postgresql-9.5
 #
-# start_command = systemctl start postgresql-9.5
+# service_start_command = systemctl start postgresql-9.5
-# stop_command = systemctl stop postgresql-9.5
+# service_stop_command = systemctl stop postgresql-9.5
-# restart_command = systemctl restart postgresql-9.5
+# service_restart_command = systemctl restart postgresql-9.5
 # service_reload_command = pg_ctlcluster 9.5 main reload
 # service_promote_command = pg_ctlcluster 9.5 main promote
 # external command options
@@ -132,8 +143,15 @@
 # external command arguments. Values shown are examples.
 #pg_ctl_options='-s'
-#pg_basebackup_options='--xlog-method=s'
+#pg_basebackup_options='--label=repmgr_backup'
 # This is the host name of the barman server, which is used for connecting
 # to the barman server over SSH (passwordless SSH keys should be in place)
 #barman_server='backup_server'
 # If you are placing the barman.conf file in a non-standard path, or using
 # a name other than barman.conf, use this parameter to specify the path and
 # name of the barman configuration file.
 #barman_config='/path/to/barman.conf'
 # Standby clone settings
 # ----------------------
@@ -155,9 +173,11 @@
 # These settings are only applied when repmgrd is running. Values shown
 # are defaults.
-# Number of seconds to wait for a response from the primary server before
+# monitoring interval in seconds; default is 2
-# deciding it has failed.
+#monitor_interval_secs=2
 # Maximum number of seconds to wait for a response from the primary server
 # before deciding it has failed.
 #master_response_timeout=60
 # Number of attempts at what interval (in seconds) to try and
@@ -182,9 +202,6 @@
 #promote_command='repmgr standby promote -f /path/to/repmgr.conf'
 #follow_command='repmgr standby follow -f /path/to/repmgr.conf -W'
 # monitoring interval in seconds; default is 2
 #monitor_interval_secs=2
 # change wait time for primary; before we bail out and exit when the primary
 # disappears, we wait 'reconnect_attempts' * 'retry_promote_interval_secs'
 # seconds; by default this would be half an hour, as 'retry_promote_interval_secs'
--- a/repmgr.h
+++ b/repmgr.h
@@ -1,6 +1,6 @@
 /*
 * repmgr.h
- * Copyright (c) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -23,6 +23,7 @@
 #include <libpq-fe.h>
 #include <postgres_fe.h>
 #include <getopt_long.h>
 #include "pqexpbuffer.h"
 #include "strutil.h"
 #include "dbutils.h"
@@ -35,7 +36,7 @@
 #define ERRBUFF_SIZE	512
-#define DEFAULT_WAL_KEEP_SEGMENTS	"5000"
+#define DEFAULT_WAL_KEEP_SEGMENTS	"0"
 #define DEFAULT_DEST_DIR		"."
 #define DEFAULT_REPMGR_SCHEMA_PREFIX	"repmgr_"
 #define DEFAULT_PRIORITY		100
@@ -47,64 +48,132 @@
 #define NO_UPSTREAM_NODE	-1
 #define UNKNOWN_NODE_ID     -1
 /* command line options without short versions */
 #define OPT_HELP                         1
 #define OPT_CHECK_UPSTREAM_CONFIG        2
 #define OPT_RECOVERY_MIN_APPLY_DELAY     3
-#define OPT_IGNORE_EXTERNAL_CONFIG_FILES 4
+#define OPT_COPY_EXTERNAL_CONFIG_FILES   4
 #define OPT_CONFIG_ARCHIVE_DIR           5
 #define OPT_PG_REWIND                    6
 #define OPT_PWPROMPT                     7
 #define OPT_CSV                          8
-#define OPT_INITDB_NO_PWPROMPT           9
+#define OPT_NODE                         9
 #define OPT_WITHOUT_BARMAN               10
 #define OPT_NO_UPSTREAM_CONNECTION       11
 #define OPT_REGISTER_WAIT                12
 #define OPT_CLUSTER                      13
 #define OPT_LOG_TO_FILE                  14
 #define OPT_UPSTREAM_CONNINFO            15
 #define OPT_NO_CONNINFO_PASSWORD         16
 #define OPT_REPLICATION_USER             17
 /* deprecated command line options */
 #define OPT_INITDB_NO_PWPROMPT           998
 #define OPT_IGNORE_EXTERNAL_CONFIG_FILES 999
 /* values for --copy-external-config-files */
 #define CONFIG_FILE_SAMEPATH 1
 #define CONFIG_FILE_PGDATA 2
 /* Run time options type */
 typedef struct
 {
 	/* general repmgr options */
 	char		config_file[MAXPGPATH];
 	bool		verbose;
 	bool		terse;
 	bool		force;
 	char		pg_bindir[MAXLEN]; /* overrides setting in repmgr.conf */
 	/* logging parameters */
 	char		loglevel[MAXLEN];  /* overrides setting in repmgr.conf */
 	bool		log_to_file;
 	/* connection parameters */
 	char		dbname[MAXLEN];
 	char		host[MAXLEN];
 	char		username[MAXLEN];
 	char		dest_dir[MAXPGPATH];
 	char		config_file[MAXPGPATH];
 	char		remote_user[MAXLEN];
 	char		superuser[MAXLEN];
 	char		masterport[MAXLEN];
 	bool		conninfo_provided;
 	bool		connection_param_provided;
 	bool		host_param_provided;
 	/* standby clone parameters */
 	bool		wal_keep_segments_used;
 	char		wal_keep_segments[MAXLEN];
 	bool		verbose;
 	bool		terse;
 	bool		force;
 	bool		wait_for_master;
 	bool		ignore_rsync_warn;
 	bool		witness_pwprompt;
 	bool		rsync_only;
 	bool		fast_checkpoint;
-	bool		ignore_external_config_files;
+	bool		without_barman;
-	bool		csv_mode;
+	bool		no_upstream_connection;
-	char		masterport[MAXLEN];
+	bool		no_conninfo_password;
-	/*
+	bool		copy_external_config_files;
-	 * configuration file parameters which can be overridden on the
+	int			copy_external_config_files_destination;
-	 * command line
+	char		upstream_conninfo[MAXLEN];
-	 */
+	char		replication_user[MAXLEN];
 	char		loglevel[MAXLEN];
 	/* parameter used by STANDBY SWITCHOVER */
 	char		remote_config_file[MAXLEN];
 	char		pg_rewind[MAXPGPATH];
 	char		pg_ctl_mode[MAXLEN];
 	/* parameter used by STANDBY {ARCHIVE_CONFIG | RESTORE_CONFIG} */
 	char		config_archive_dir[MAXLEN];
 	/* parameter used by CLUSTER CLEANUP */
 	int			keep_history;
 	char		pg_bindir[MAXLEN];
 	char		recovery_min_apply_delay[MAXLEN];
-	/* deprecated command line options */
+	/* standby register parameters */
-	char            localport[MAXLEN];
+	bool		wait_register_sync;
 	int			wait_register_sync_seconds;
 	/* witness create parameters */
 	bool		witness_pwprompt;
 	/* standby follow parameters */
 	bool		wait_for_master;
 	/* cluster {show|matrix|crosscheck} parameters */
 	bool		csv_mode;
 	/* cluster cleanup parameters */
 	int			keep_history;
 	/* standby switchover parameters */
 	char		remote_config_file[MAXLEN];
 	bool		pg_rewind_supplied;
 	char		pg_rewind[MAXPGPATH];
 	char		pg_ctl_mode[MAXLEN];
 	/* standby {archive_config | restore_config} parameters  */
 	char		config_archive_dir[MAXLEN];
 	/* {standby|witness} unregister parameters */
 	int			node;
 }	t_runtime_options;
-#define T_RUNTIME_OPTIONS_INITIALIZER { "", "", "", "", "", "", "", DEFAULT_WAL_KEEP_SEGMENTS, false, false, false, false, false, false, false, false, false, false, "", "", "", "", "fast", "", 0, "", "", ""}
+#define T_RUNTIME_OPTIONS_INITIALIZER { \
 		/* general repmgr options */	\
 		"", false, false, false, "",	\
 		/* logging parameters */ \
 		"", false,                      \
 		/* connection parameters */		\
 		"", "", "", "", "", "", "", 	\
 		false, false, false,		    \
 		/* standby clone parameters */  \
 		false, DEFAULT_WAL_KEEP_SEGMENTS, false, false, false, false, false, false, \
 		false, CONFIG_FILE_SAMEPATH, "", "", "", \
 		/* standby register parameters */ \
 	    false, 0,							 \
 		/* witness create parameters */ \
 		false,                          \
 		/* standby follow parameters */ \
 		false,                          \
 		/* cluster {show|matrix|crosscheck} parameters */ \
 		false,                          \
 		/* cluster cleanup parameters */ \
 		0,                              \
 		/* standby switchover parameters */ \
 		"", false, "", "fast",          \
 		/* standby {archive_config | restore_config} parameters  */ \
 		"",                             \
 		/* {standby|witness} unregister parameters */ \
 		UNKNOWN_NODE_ID }
 struct BackupLabel
 {
@@ -118,7 +187,61 @@ struct BackupLabel
 	XLogRecPtr min_failover_slot_lsn;
 };
-extern char		repmgr_schema[MAXLEN];
+
-extern bool		config_file_found;
+typedef struct
 {
 	char		slot[MAXLEN];
 	char		xlog_method[MAXLEN];
 	bool		no_slot; /* from PostgreSQL 10 */
 } t_basebackup_options;
 #define T_BASEBACKUP_OPTIONS_INITIALIZER { "", "", false }
 typedef struct
 {
 	int    size;
 	char **keywords;
 	char **values;
 } t_conninfo_param_list;
 typedef struct
 {
 	char filepath[MAXPGPATH];
 	char filename[MAXPGPATH];
 	bool in_data_directory;
 } t_configfile_info;
 typedef struct
 {
 	int    size;
 	int    entries;
 	t_configfile_info **files;
 } t_configfile_list;
 #define T_CONFIGFILE_LIST_INITIALIZER { 0, 0, NULL }
 typedef struct
 {
 	int node_id;
 	int node_status;
 } t_node_status_rec;
 typedef struct
 {
 	int node_id;
 	char node_name[MAXLEN];
 	t_node_status_rec **node_status_list;
 } t_node_matrix_rec;
 typedef struct
 {
 	int node_id;
 	char node_name[MAXLEN];
 	t_node_matrix_rec **matrix_list_rec;
 } t_node_status_cube;
 #endif
--- a/repmgr.sql
+++ b/repmgr.sql
@@ -1,7 +1,7 @@
 /*
 * repmgr.sql
 *
- * Copyright (C) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 */
@@ -64,7 +64,7 @@ CREATE INDEX idx_repl_status_sort ON repl_monitor(last_monitor_time, standby_nod
 * This view shows the list of nodes with the information of which one is the upstream
 * in each case (when applicable)
 */
-CREATE VIEW repl_show_nodes AS 
+CREATE VIEW repl_show_nodes AS
 SELECT rn.id, rn.conninfo, rn.type, rn.name, rn.cluster,
 	rn.priority, rn.active, sq.name AS upstream_node_name
 FROM repl_nodes as rn LEFT JOIN repl_nodes AS sq ON sq.id=rn.upstream_node_id;
--- a/repmgrd.c
+++ b/repmgrd.c
@@ -1,6 +1,7 @@
 /*
 * repmgrd.c - Replication manager daemon
- * Copyright (C) 2ndQuadrant, 2010-2016
+ *
 * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This module connects to the nodes of a replication cluster and monitors
 * how far they are from master
@@ -29,18 +30,10 @@
 #include <stdlib.h>
 #include <unistd.h>
 #include "repmgr.h"
 #include "config.h"
 #include "log.h"
 #include "strutil.h"
 #include "version.h"
 /* Required PostgreSQL headers */
 #include "access/xlogdefs.h"
 #include "pqexpbuffer.h"
 /* Message strings passed in repmgrSharedState->location */
 #define PASSIVE_NODE "PASSIVE_NODE"
@@ -70,6 +63,7 @@ bool		failover_done = false;
 bool		manual_mode_upstream_disconnected = false;
 char	   *pid_file = NULL;
 int			server_version_num = 0;
 static void help(void);
 static void usage(void);
@@ -110,17 +104,15 @@ static void check_and_create_pid_file(const char *pid_file);
 static void
 close_connections()
 {
-	if (master_conn != NULL && PQisBusy(master_conn) == 1)
+	if (PQstatus(master_conn) == CONNECTION_OK && PQisBusy(master_conn) == 1)
 		cancel_query(master_conn, local_options.master_response_timeout);
-	if (my_local_conn != NULL)
+
 	if (PQstatus(my_local_conn) == CONNECTION_OK)
 		PQfinish(my_local_conn);
-	if (master_conn != NULL && master_conn != my_local_conn)
+	if (PQstatus(master_conn) == CONNECTION_OK)
 		PQfinish(master_conn);
 	master_conn = NULL;
 	my_local_conn = NULL;
 }
@@ -146,8 +138,6 @@ main(int argc, char **argv)
 	FILE	   *fd;
 	int			server_version_num = 0;
 	set_progname(argv[0]);
 	/* Disallow running as root to prevent directory ownership problems */
@@ -208,6 +198,13 @@ main(int argc, char **argv)
 		}
 	}
 	/*
 	 * Tell the logger we're a daemon - this will ensure any output logged
 	 * before the logger is initialized will be formatted correctly
 	 */
 	logger_output_mode = OM_DAEMON;
 	/*
 	 * Parse the configuration file, if provided. If no configuration file
 	 * was provided, or one was but was incomplete, parse_config() will
@@ -248,6 +245,7 @@ main(int argc, char **argv)
 	}
 	logger_init(&local_options, progname());
 	if (verbose)
 		logger_set_verbose();
@@ -312,10 +310,46 @@ main(int argc, char **argv)
 	log_debug("node id is %i, upstream is %i\n", node_info.node_id, node_info.upstream_node_id);
    /*
     * Check if node record is active - if not, and `failover=automatic`, the node
     * won't be considered as a promotion candidate; this often happens when
     * a failed primary is recloned and the node was not re-registered, giving
     * the impression failover capability is there when it's not. In this case
     * abort with an error and a hint about registering.
     *
     * If `failover=manual`, repmgrd can continue to passively monitor the node, but
     * we should nevertheless issue a warning and the same hint.
     */
    if (node_info.active == false)
    {
        char *hint = "Check that 'repmgr (master|standby) register' was executed for this node";
        switch (local_options.failover)
        {
            case AUTOMATIC_FAILOVER:
                log_err(_("This node is marked as inactive and cannot be used for failover\n"));
                log_hint(_("%s\n"), hint);
                terminate(ERR_BAD_CONFIG);
            case MANUAL_FAILOVER:
                log_warning(_("This node is marked as inactive and will be passively monitored only\n"));
                log_hint(_("%s\n"), hint);
                break;
            default:
                /* This should never happen */
                log_err(_("Unknown failover mode %i\n"), local_options.failover);
                terminate(ERR_BAD_CONFIG);
        }
    }
 	/*
 	 * MAIN LOOP This loop cycles at startup and once per failover and
-	 * Requisites: - my_local_conn needs to be already setted with an active
+	 * Requisites:
-	 * connection - no master connection
+	 *  - my_local_conn must have an active connection to the monitored node
 	 *  - master_conn must not be open
 	 */
 	do
 	{
@@ -427,7 +461,7 @@ main(int argc, char **argv)
 													local_options.cluster_name,
 													&master_options.node, NULL);
-				if (master_conn == NULL)
+				if (PQstatus(master_conn) != CONNECTION_OK)
 				{
 					PQExpBufferData errmsg;
 					initPQExpBuffer(&errmsg);
@@ -612,15 +646,15 @@ witness_monitor(void)
 			}
 			else
 			{
-				log_debug(_("new master found with node ID: %i\n"), master_options.node);
+				log_info(_("new master found with node ID: %i\n"), master_options.node);
 				connection_ok = true;
 				/*
 				 * Update the repl_nodes table from the new master to reflect the changed
 				 * node configuration
 				 *
-				 * XXX it would be neat to be able to handle this with e.g. table-based
+				 * It would be neat to be able to handle this with e.g. table-based
-				 * logical replication
+				 * logical replication if available in core
 				 */
 				witness_copy_node_records(master_conn, my_local_conn, local_options.cluster_name);
@@ -675,26 +709,46 @@ witness_monitor(void)
 		return;
 	}
-	strcpy(monitor_witness_timestamp, PQgetvalue(res, 0, 0));
+	strncpy(monitor_witness_timestamp, PQgetvalue(res, 0, 0), MAXLEN);
 	PQclear(res);
 	/*
 	 * Build the SQL to execute on master
 	 */
-	sqlquery_snprintf(sqlquery,
+	if (server_version_num >= 100000)
-					  "INSERT INTO %s.repl_monitor "
+	{
-					  "           (primary_node, standby_node, "
+		sqlquery_snprintf(sqlquery,
-					  "            last_monitor_time, last_apply_time, "
+						  "INSERT INTO %s.repl_monitor "
-					  "            last_wal_primary_location, last_wal_standby_location, "
+						  "           (primary_node, standby_node, "
-					  "            replication_lag, apply_lag )"
+						  "            last_monitor_time, last_apply_time, "
-					  "      VALUES(%d, %d, "
+						  "            last_wal_primary_location, last_wal_standby_location, "
-					  "             '%s'::TIMESTAMP WITH TIME ZONE, NULL, "
+						  "            replication_lag, apply_lag )"
-					  "             pg_catalog.pg_current_xlog_location(), NULL, "
+						  "      VALUES(%d, %d, "
-					  "             0, 0) ",
+						  "             '%s'::TIMESTAMP WITH TIME ZONE, NULL, "
-					  get_repmgr_schema_quoted(my_local_conn),
+						  "             pg_catalog.pg_current_wal_lsn(), NULL, "
-					  master_options.node,
+						  "             0, 0) ",
-					  local_options.node,
+						  get_repmgr_schema_quoted(my_local_conn),
-					  monitor_witness_timestamp);
+						  master_options.node,
 						  local_options.node,
 						  monitor_witness_timestamp);
 	}
 	else
 	{
 		sqlquery_snprintf(sqlquery,
 						  "INSERT INTO %s.repl_monitor "
 						  "           (primary_node, standby_node, "
 						  "            last_monitor_time, last_apply_time, "
 						  "            last_wal_primary_location, last_wal_standby_location, "
 						  "            replication_lag, apply_lag )"
 						  "      VALUES(%d, %d, "
 						  "             '%s'::TIMESTAMP WITH TIME ZONE, NULL, "
 						  "             pg_catalog.pg_current_xlog_location(), NULL, "
 						  "             0, 0) ",
 						  get_repmgr_schema_quoted(my_local_conn),
 						  master_options.node,
 						  local_options.node,
 						  monitor_witness_timestamp);
 	}
 	/*
 	 * Execute the query asynchronously, but don't check for a result. We will
@@ -716,13 +770,14 @@ static void
 standby_monitor(void)
 {
 	PGresult   *res;
 	char		sqlquery[QUERY_STR_LEN];
 	char		monitor_standby_timestamp[MAXLEN];
 	char		last_wal_primary_location[MAXLEN];
 	char		last_xlog_receive_location[MAXLEN];
 	char		last_xlog_replay_location[MAXLEN];
 	char		last_xact_replay_timestamp[MAXLEN];
-	bool		last_xlog_receive_location_gte_replayed;
+	bool		receiving_streamed_wal = true;
 	char		sqlquery[QUERY_STR_LEN];
 	XLogRecPtr	lsn_master_current_xlog_location;
 	XLogRecPtr	lsn_last_xlog_receive_location;
@@ -738,12 +793,10 @@ standby_monitor(void)
 	PGconn	   *upstream_conn;
 	char		upstream_conninfo[MAXCONNINFO];
 	int			upstream_node_id;
 	t_node_info upstream_node;
 	int			active_master_id;
 	const char *upstream_node_type = NULL;
 	bool		receiving_streamed_wal = true;
 	/*
@@ -797,14 +850,13 @@ standby_monitor(void)
 	/*
 	 * Check that the upstream node is still available
 	 * If not, initiate failover process
-	 */
+	 *
 	check_connection(&upstream_conn, upstream_node_type, upstream_conninfo);
 	/*
 	 * This takes up to local_options.reconnect_attempts *
 	 * local_options.reconnect_interval seconds
 	 */
 	check_connection(&upstream_conn, upstream_node_type, upstream_conninfo);
 	if (PQstatus(upstream_conn) != CONNECTION_OK)
 	{
 		int previous_master_node_id = master_options.node;
@@ -922,6 +974,8 @@ standby_monitor(void)
 			 * Failover handling is handled differently depending on whether
 			 * the failed node is the master or a cascading standby
 			 */
 			t_node_info upstream_node;
 			upstream_node = get_node_info(my_local_conn, local_options.cluster_name, upstream_node_id);
 			if (upstream_node.type == MASTER)
@@ -979,8 +1033,8 @@ standby_monitor(void)
 				 *
 				 * We should log a message so the user knows of the situation at hand.
 				 *
-				 * XXX check if the original master is still active and display a
+				 * XXX check if the original master is still active and display a warning
-				 * warning
+				 * XXX add event notification
 				 */
 				log_err(_("It seems this server was promoted manually (not by repmgr) so you might be in the presence of a split-brain.\n"));
 				log_err(_("Check your cluster and manually fix any anomaly.\n"));
@@ -1025,9 +1079,6 @@ standby_monitor(void)
 	 * from the upstream node to write monitoring information
 	 */
 	/* XXX not used? */
 	upstream_node = get_node_info(my_local_conn, local_options.cluster_name, upstream_node_id);
 	sprintf(sqlquery,
 			"SELECT id "
 			"  FROM %s.repl_nodes "
@@ -1079,19 +1130,48 @@ standby_monitor(void)
 	if (wait_connection_availability(master_conn, local_options.master_response_timeout) != 1)
 		return;
-	/* Get local xlog info */
+	/* Get local xlog info
 	 *
 	 * If receive_location is NULL, we're in archive recovery and not streaming WAL
 	 * If receive_location is less than replay location, we were streaming WAL but are
 	 *   somehow disconnected and evidently in archive recovery
 	 */
-	sqlquery_snprintf(sqlquery,
+	if (server_version_num >= 100000)
-					  " SELECT ts, "
+	{
-					  "        receive_location, "
+		sqlquery_snprintf(sqlquery,
-					  "        replay_location, "
+						  " SELECT ts, "
-					  "        replay_timestamp, "
+						  "        CASE WHEN (receive_location IS NULL OR receive_location < replay_location) "
-					  "        receive_location >= replay_location "
+						  "          THEN replay_location "
-					  "   FROM (SELECT CURRENT_TIMESTAMP AS ts, "
+						  "          ELSE receive_location"
-					  "         pg_catalog.pg_last_xlog_receive_location() AS receive_location, "
+						  "        END AS receive_location,"
-					  "         pg_catalog.pg_last_xlog_replay_location()  AS replay_location, "
+						  "        replay_location, "
-					  "         pg_catalog.pg_last_xact_replay_timestamp() AS replay_timestamp "
+						  "        replay_timestamp, "
-					  "        ) q ");
+						  "        COALESCE(receive_location, '0/0') >= replay_location AS receiving_streamed_wal "
 						  "   FROM (SELECT CURRENT_TIMESTAMP AS ts, "
 						  "         pg_catalog.pg_last_wal_receive_lsn()  AS receive_location, "
 						  "         pg_catalog.pg_last_wal_replay_lsn()   AS replay_location, "
 						  "         pg_catalog.pg_last_xact_replay_timestamp() AS replay_timestamp "
 						  "        ) q ");
 	}
 	else
 	{
 		sqlquery_snprintf(sqlquery,
 						  " SELECT ts, "
 						  "        CASE WHEN (receive_location IS NULL OR receive_location < replay_location) "
 						  "          THEN replay_location "
 						  "          ELSE receive_location"
 						  "        END AS receive_location,"
 						  "        replay_location, "
 						  "        replay_timestamp, "
 						  "        COALESCE(receive_location, '0/0') >= replay_location AS receiving_streamed_wal "
 						  "   FROM (SELECT CURRENT_TIMESTAMP AS ts, "
 						  "         pg_catalog.pg_last_xlog_receive_location() AS receive_location, "
 						  "         pg_catalog.pg_last_xlog_replay_location()  AS replay_location, "
 						  "         pg_catalog.pg_last_xact_replay_timestamp() AS replay_timestamp "
 						  "        ) q ");
 	}
 	res = PQexec(my_local_conn, sqlquery);
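
The two branches above differ only in the names of the WAL functions, which were renamed in PostgreSQL 10. A minimal sketch of one way to factor that out follows; it is not part of this patch. The type `wal_fn_names` and the helpers `wal_functions_for()` and `build_wal_subquery()` are hypothetical names introduced for illustration, and only the inner subquery is built here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the version-dependent function names */
typedef struct
{
	const char *receive_fn;
	const char *replay_fn;
} wal_fn_names;

/* Pick the function names using the same test as the patch (>= 100000) */
static const wal_fn_names *
wal_functions_for(int server_version_num)
{
	static const wal_fn_names pg10 = {
		"pg_catalog.pg_last_wal_receive_lsn()",
		"pg_catalog.pg_last_wal_replay_lsn()"
	};
	static const wal_fn_names pre10 = {
		"pg_catalog.pg_last_xlog_receive_location()",
		"pg_catalog.pg_last_xlog_replay_location()"
	};

	return (server_version_num >= 100000) ? &pg10 : &pre10;
}

/* Build the inner subquery once; returns the snprintf() result */
static int
build_wal_subquery(char *buf, size_t buflen, int server_version_num)
{
	const wal_fn_names *fn = wal_functions_for(server_version_num);

	assert(buf != NULL && buflen > 0);

	return snprintf(buf, buflen,
					"SELECT CURRENT_TIMESTAMP AS ts, "
					"%s AS receive_location, "
					"%s AS replay_location, "
					"pg_catalog.pg_last_xact_replay_timestamp() AS replay_timestamp",
					fn->receive_fn, fn->replay_fn);
}
```

With a helper like this the outer `CASE`/`COALESCE` query text would only need to be written once, at the cost of one more level of string formatting.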
@@ -1103,52 +1183,33 @@ standby_monitor(void)
 		return;
 	}
-	strncpy(monitor_standby_timestamp, PQgetvalue(res, 0, 0), MAXLEN);
+	strncpy(monitor_standby_timestamp,  PQgetvalue(res, 0, 0), MAXLEN);
 	strncpy(last_xlog_receive_location, PQgetvalue(res, 0, 1), MAXLEN);
-	strncpy(last_xlog_replay_location, PQgetvalue(res, 0, 2), MAXLEN);
+	strncpy(last_xlog_replay_location,  PQgetvalue(res, 0, 2), MAXLEN);
 	strncpy(last_xact_replay_timestamp, PQgetvalue(res, 0, 3), MAXLEN);
-	last_xlog_receive_location_gte_replayed = (strcmp(PQgetvalue(res, 0, 4), "t") == 0)
+	receiving_streamed_wal = (strcmp(PQgetvalue(res, 0, 4), "t") == 0)
 		? true
 		: false;
-	/*
+	if (receiving_streamed_wal == false)
 	 * If pg_last_xlog_receive_location is NULL, this means we're in archive
 	 * recovery and will need to calculate lag based on pg_last_xlog_replay_location
 	 */
 	/*
 	 * Replayed WAL is greater than received streamed WAL
 	 */
 	if (PQgetisnull(res, 0, 1))
 	{
-		receiving_streamed_wal = false;
+		log_verbose(LOG_DEBUG, _("standby %i not connected to streaming replication"), local_options.node);
 	}
 	PQclear(res);
 	/*
 	 * In the unusual event of a standby becoming disconnected from the primary,
 	 * while this repmgrd remains connected to the primary,  subtracting
 	 * "last_xlog_replay_location" from "lsn_last_xlog_receive_location" and coercing to
 	 * (long long unsigned int) will result in a meaningless, very large
 	 * value which will overflow a BIGINT column and spew error messages into the
 	 * PostgreSQL log. In the absence of a better strategy, skip attempting
 	 * to insert a monitoring record.
 	 */
 	if (receiving_streamed_wal == true && last_xlog_receive_location_gte_replayed == false)
 	{
 		log_verbose(LOG_WARNING,
 					"Replayed WAL newer than received WAL - is this standby connected to its upstream?\n");
 	}
 	/*
 	 * Get master xlog position
 	 *
 	 * TODO: investigate whether pg_current_xlog_insert_location() would be a better
 	 * choice; see: https://github.com/2ndQuadrant/repmgr/issues/189
 	 */
-	sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_current_xlog_location()");
+
 	if (server_version_num >= 100000)
 		sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_current_wal_lsn()");
 	else
 		sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_current_xlog_location()");
 	res = PQexec(master_conn, sqlquery);
 	if (PQresultStatus(res) != PGRES_TUPLES_OK)
@@ -1162,24 +1223,20 @@ standby_monitor(void)
 	PQclear(res);
 	lsn_master_current_xlog_location = lsn_to_xlogrecptr(last_wal_primary_location, NULL);
 	lsn_last_xlog_receive_location = lsn_to_xlogrecptr(last_xlog_receive_location, NULL);
 	lsn_last_xlog_replay_location = lsn_to_xlogrecptr(last_xlog_replay_location, NULL);
-	/* Calculate apply lag */
+	if (lsn_last_xlog_receive_location >= lsn_last_xlog_replay_location)
 	if (last_xlog_receive_location_gte_replayed == false)
 	{
-		/*
+		apply_lag = (long long unsigned int)lsn_last_xlog_receive_location - lsn_last_xlog_replay_location;
 		 * We're not receiving streaming WAL - in this case the receive location
 		 * equals the last replayed location
 		 */
 		apply_lag = 0;
 		strncpy(last_xlog_receive_location, last_xlog_replay_location, MAXLEN);
 		lsn_last_xlog_receive_location = lsn_to_xlogrecptr(last_xlog_replay_location, NULL);
 	}
 	else
 	{
-		lsn_last_xlog_receive_location = lsn_to_xlogrecptr(last_xlog_receive_location, NULL);
+		/* This should never happen, but in case it does set apply lag to zero */
-
+		log_warning("Standby receive (%s) location appears less than standby replay location (%s)\n",
-		apply_lag = (long long unsigned int)lsn_last_xlog_receive_location - lsn_last_xlog_replay_location;
+					last_xlog_receive_location,
 					last_xlog_replay_location);
 		apply_lag = 0;
 	}
@@ -1190,7 +1247,7 @@ standby_monitor(void)
 	}
 	else
 	{
-		/* This should never happen, but in case it does set lag to zero */
+		/* This should never happen, but in case it does set replication lag to zero */
 		log_warning("Master xlog (%s) location appears less than standby receive location (%s)\n",
 					last_wal_primary_location,
 					last_xlog_receive_location);
@@ -1227,6 +1284,7 @@ standby_monitor(void)
 					  last_xlog_receive_location,
 					  replication_lag,
 					  apply_lag);
 	/*
 	 * Execute the query asynchronously, but don't check for a result. We will
 	 * check the result next time we pause for a monitor step.
@@ -1234,8 +1292,23 @@ standby_monitor(void)
 	log_verbose(LOG_DEBUG, "standby_monitor:() %s\n", sqlquery);
 	if (PQsendQuery(master_conn, sqlquery) == 0)
-		log_warning(_("query could not be sent to master. %s\n"),
+	{
 		log_warning(_("query could not be sent to master: %s\n"),
 					PQerrorMessage(master_conn));
 	}
 	else
 	{
 		sqlquery_snprintf(sqlquery,
 						  "SELECT %s.repmgr_update_last_updated();",
 						  get_repmgr_schema_quoted(my_local_conn));
 		res = PQexec(my_local_conn, sqlquery);
 		/* not critical if the above query fails */
 		if (PQresultStatus(res) != PGRES_TUPLES_OK)
 			log_warning(_("unable to set last_updated: %s\n"), PQerrorMessage(my_local_conn));
 		PQclear(res);
 	}
 }
@@ -1251,7 +1324,7 @@ do_master_failover(void)
 	PGresult   *res;
 	char		sqlquery[QUERY_STR_LEN];
-	int			total_nodes = 0;
+	int			total_active_nodes = 0;
 	int			visible_nodes = 0;
 	int			ready_nodes = 0;
@@ -1282,7 +1355,7 @@ do_master_failover(void)
 			"SELECT id, conninfo, type, upstream_node_id "
 			"  FROM %s.repl_nodes "
 			" WHERE cluster = '%s' "
-		        "   AND active IS TRUE "
+			"   AND active IS TRUE "
 			"   AND priority > 0 "
 			" ORDER BY priority DESC, id "
 			" LIMIT %i ",
@@ -1298,32 +1371,25 @@ do_master_failover(void)
 		terminate(ERR_DB_QUERY);
 	}
-	/*
+	total_active_nodes = PQntuples(res);
-	 * total nodes that are registered
+	log_debug(_("%d active nodes registered\n"), total_active_nodes);
 	 */
 	total_nodes = PQntuples(res);
 	log_debug(_("%d active nodes registered\n"), total_nodes);
 	/*
 	 * Build an array with the nodes and indicate which ones are visible and
 	 * ready
 	 */
-	for (i = 0; i < total_nodes; i++)
+	for (i = 0; i < total_active_nodes; i++)
 	{
 		char node_type[MAXLEN];
 		nodes[i] = (t_node_info) T_NODE_INFO_INITIALIZER;
 		nodes[i].node_id = atoi(PQgetvalue(res, i, 0));
 		strncpy(nodes[i].conninfo_str, PQgetvalue(res, i, 1), MAXCONNINFO);
 		strncpy(node_type, PQgetvalue(res, i, 2), MAXLEN);
-		nodes[i].type = parse_node_type(PQgetvalue(res, i, 2));
+		nodes[i].type = parse_node_type(node_type);
 		/* Copy details of the failed node */
 		/* XXX only node_id is actually used later */
 		if (nodes[i].type == MASTER)
 		{
 			failed_master.node_id = nodes[i].node_id;
 			failed_master.xlog_location = nodes[i].xlog_location;
 			failed_master.is_ready = nodes[i].is_ready;
 		}
 		nodes[i].upstream_node_id = atoi(PQgetvalue(res, i, 3));
@@ -1334,11 +1400,42 @@ do_master_failover(void)
 		nodes[i].is_visible = false;
 		nodes[i].is_ready = false;
-		nodes[i].xlog_location = InvalidXLogRecPtr;
+		log_debug(_("node=%i conninfo=\"%s\" type=%s\n"),
 				  nodes[i].node_id,
 				  nodes[i].conninfo_str,
 				  node_type);
-		log_debug(_("node=%d conninfo=\"%s\" type=%s\n"),
+		/* Copy details of the failed master node */
-				  nodes[i].node_id, nodes[i].conninfo_str,
+		if (nodes[i].type == MASTER)
-				  PQgetvalue(res, i, 2));
+		{
 			/* XXX only node_id is currently used */
 			failed_master.node_id = nodes[i].node_id;
 			/*
 			 * XXX experimental
 			 *
 			 * Currently an attempt is made to connect to the master,
 			 * which is very likely to be a waste of time at this point, as we'll
 			 * have spent the last however many seconds trying to do just that
 			 * in check_connection() before deciding it's gone away.
 			 *
 			 * If the master did come back at this point, the voting algorithm should decide
 			 * it's the "best candidate" anyway and no standby will promote itself or
 			 * attempt to follow another server.
 			 *
 			 * If we don't try and connect to the master here (and the code generally
 			 * assumes it's failed anyway) but it does come back any time from here
 			 * onwards, promotion will fail and the promotion candidate will
 			 * notice the reappearance.
 			 *
 			 * TLDR version: by skipping the master connection attempt (and the chances
 			 * the master would reappear between the last attempt in check_connection()
 			 * and now are minimal) we can remove useless cycles during the failover process;
 			 * if the master does reappear it will be caught later anyway.
 			 */
 			continue;
 		}
 		node_conn = establish_db_connection(nodes[i].conninfo_str, false);
@@ -1359,13 +1456,13 @@ do_master_failover(void)
 	PQclear(res);
 	log_debug(_("total nodes counted: registered=%d, visible=%d\n"),
-			  total_nodes, visible_nodes);
+			  total_active_nodes, visible_nodes);
 	/*
 	 * Am I in the group that should stay alive? If I see fewer than half of
-	 * total_nodes then I should do nothing
+	 * total_active_nodes then I should do nothing
 	 */
-	if (visible_nodes < (total_nodes / 2.0))
+	if (visible_nodes < (total_active_nodes / 2.0))
 	{
 		log_err(_("Unable to reach most of the nodes.\n"
 				  "Let the other standby servers decide which one will be the master.\n"
@@ -1374,7 +1471,7 @@ do_master_failover(void)
 	}
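
The visibility test above compares an integer count against `total_active_nodes / 2.0`. A sketch of the same condition in integer arithmetic is shown below; `has_quorum()` is a hypothetical name, and the function returns true exactly when the check in the patch would let failover continue.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Integer form of !(visible_nodes < total_active_nodes / 2.0).
 * Seeing exactly half of the active nodes still counts as enough,
 * which matches the behaviour of the comparison in the patch.
 */
static bool
has_quorum(int visible_nodes, int total_active_nodes)
{
	assert(visible_nodes >= 0 && visible_nodes <= total_active_nodes);

	return visible_nodes * 2 >= total_active_nodes;
}
```

With four active nodes, two visible nodes are enough to proceed and one is not; with three active nodes, two are enough and one is not.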
 	/* Query all available nodes to determine readiness and LSN */
-	for (i = 0; i < total_nodes; i++)
+	for (i = 0; i < total_active_nodes; i++)
 	{
 		log_debug("checking node %i...\n", nodes[i].node_id);
@@ -1403,7 +1500,11 @@ do_master_failover(void)
 			terminate(ERR_FAILOVER_FAIL);
 		}
-		sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_last_xlog_receive_location()");
+		if (server_version_num >= 100000)
 			sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_last_wal_receive_lsn()");
 		else
 			sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_last_xlog_receive_location()");
 		res = PQexec(node_conn, sqlquery);
 		if (PQresultStatus(res) != PGRES_TUPLES_OK)
 		{
@@ -1435,7 +1536,12 @@ do_master_failover(void)
 	}
 	/* last we get info about this node, and update shared memory */
-	sprintf(sqlquery, "SELECT pg_catalog.pg_last_xlog_receive_location()");
+
 	if (server_version_num >= 100000)
 		sprintf(sqlquery, "SELECT pg_catalog.pg_last_wal_receive_lsn()");
 	else
 		sprintf(sqlquery, "SELECT pg_catalog.pg_last_xlog_receive_location()");
 	res = PQexec(my_local_conn, sqlquery);
 	if (PQresultStatus(res) != PGRES_TUPLES_OK)
 	{
@@ -1452,7 +1558,7 @@ do_master_failover(void)
 	PQclear(res);
 	/* Wait for each node to come up and report a valid LSN */
-	for (i = 0; i < total_nodes; i++)
+	for (i = 0; i < total_active_nodes; i++)
 	{
 		/*
 		 * ensure witness server is marked as ready, and skip
@@ -1485,7 +1591,6 @@ do_master_failover(void)
 		 */
 		if (PQstatus(node_conn) != CONNECTION_OK)
 		{
 			/* XXX */
 			log_info(_("At this point, there may be an acceptable race condition; "
 					   "assume the node is restarting "
 					   "and starting the failover procedure\n"));
@@ -1612,7 +1717,7 @@ do_master_failover(void)
 	/*
 	 * determine which one is the best candidate to promote to master
 	 */
-	for (i = 0; i < total_nodes; i++)
+	for (i = 0; i < total_active_nodes; i++)
 	{
 		/* witness server can never be a candidate */
 		if (nodes[i].type == WITNESS)
@@ -1701,6 +1806,8 @@ do_master_failover(void)
 				{
 					log_notice(_("Original master reappeared before this standby was promoted - no action taken\n"));
 					/* XXX log an event here?  */
 					PQfinish(master_conn);
 					master_conn = NULL;
@@ -1837,8 +1944,10 @@ do_master_failover(void)
 		termPQExpBuffer(&event_details);
 	}
-	/* to force it to re-calculate mode and master node */
+	/*
-	// ^ ZZZ check that behaviour ^
+	 * setting "failover_done" to true will cause the node's monitoring loop
 	 * to restart in the appropriate mode for the node's (possibly new) role
 	 */
 	failover_done = true;
 }
@@ -2014,7 +2123,7 @@ check_connection(PGconn **conn, const char *type, const char *conninfo)
 		{
 			if (conninfo == NULL)
 			{
-				log_err("INTERNAL ERROR: *conn == NULL && conninfo == NULL");
+				log_err("INTERNAL ERROR: *conn == NULL && conninfo == NULL\n");
 				terminate(ERR_INTERNAL);
 			}
 			*conn = establish_db_connection(conninfo, false);
@@ -2054,18 +2163,21 @@ check_connection(PGconn **conn, const char *type, const char *conninfo)
 /*
 * set_local_node_status()
 *
- * If failure of the local node is detected, attempt to connect
+ * Attempt to connect to the current master server (as stored in the global
- * to the current master server (as stored in the global variable
+ * variable `master_conn`) and set the local node's status to the result
- * `master_conn`) and update its record to failed.
+ * of `is_standby(my_local_conn)`. Normally this will be used to mark
 * a node as failed, but in some circumstances we may be marking it
 * as recovered.
 */
 static bool
 set_local_node_status(void)
 {
-	PGresult       *res;
+	PGresult   *res;
 	char		sqlquery[QUERY_STR_LEN];
-	int		active_master_node_id = NODE_NOT_FOUND;
+	int			active_master_node_id = NODE_NOT_FOUND;
 	char		master_conninfo[MAXLEN];
 	bool		local_node_status;
 	if (!check_connection(&master_conn, "master", NULL))
 	{
@@ -2124,24 +2236,29 @@ set_local_node_status(void)
 	/*
 	 * Attempt to set the active record to the correct value.
 	 * First
 	 */
 	local_node_status = (is_standby(my_local_conn) == 1);
 	if (!update_node_record_status(master_conn,
 					    local_options.cluster_name,
 					    node_info.node_id,
 					    "standby",
 					    node_info.upstream_node_id,
-					    is_standby(my_local_conn)==1))
+					    local_node_status))
 	{
-		log_err(_("unable to set local node %i as inactive on master: %s\n"),
+		log_err(_("unable to set local node %i as %s on master: %s\n"),
 				node_info.node_id,
 				local_node_status == false ? "inactive" : "active",
 				PQerrorMessage(master_conn));
 		return false;
 	}
-	log_notice(_("marking this node (%i) as inactive on master\n"), node_info.node_id);
+	log_notice(_("marking this node (%i) as %s on master\n"),
 			   node_info.node_id,
 			   local_node_status == false ? "inactive" : "active");
 	return true;
 }
@@ -2282,13 +2399,13 @@ lsn_to_xlogrecptr(char *lsn, bool *format_ok)
 	if (format_ok != NULL)
 		*format_ok = true;
-	return (((XLogRecPtr) xlogid * 16 * 1024 * 1024 * 255) + xrecoff);
+	return (XLogRecPtr) ((uint64) xlogid) << 32 | (uint64) xrecoff;
 }
 void
 usage(void)
 {
-	log_err(_("%s: Replicator manager daemon \n"), progname());
+	log_err(_("%s: replication management daemon for PostgreSQL\n"), progname());
 	log_err(_("Try \"%s --help\" for more information.\n"), progname());
 }
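
The `lsn_to_xlogrecptr()` change above replaces a multiplier of 16 * 1024 * 1024 * 255 (0xFF000000) with a shift by 32 bits, so the high half of an LSN such as `16/B374D848` lands in the upper 32 bits of the result. The sketch below shows the conversion on its own; `parse_lsn()` is a hypothetical stand-in that uses `uint64_t` in place of `XLogRecPtr`, and it is not the function from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse an LSN of the form "XXXXXXXX/YYYYYYYY" (two hex numbers)
 * into a single 64-bit position. On a malformed string, returns 0
 * and sets *format_ok to false if the pointer is provided.
 */
static uint64_t
parse_lsn(const char *lsn, bool *format_ok)
{
	uint32_t	xlogid = 0;		/* high 32 bits */
	uint32_t	xrecoff = 0;	/* low 32 bits */

	assert(lsn != NULL);

	if (sscanf(lsn, "%" SCNx32 "/%" SCNx32, &xlogid, &xrecoff) != 2)
	{
		if (format_ok != NULL)
			*format_ok = false;
		return 0;
	}

	if (format_ok != NULL)
		*format_ok = true;

	/* Widen to 64 bits before shifting so the high half is preserved */
	return ((uint64_t) xlogid << 32) | (uint64_t) xrecoff;
}
```

For example, `parse_lsn("1/0", NULL)` yields 4294967296 (2^32), whereas the old multiplier would have produced 4278190080 for the same input.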
--- a/sql/Makefile
+++ b/sql/Makefile
@@ -1,7 +1,7 @@
 #
 # Makefile
 #
-# Copyright (c) 2ndQuadrant, 2010-2016
+# Copyright (c) 2ndQuadrant, 2010-2017
 #
 MODULE_big = repmgr_funcs
--- a/sql/repmgr_funcs.sql.in
+++ b/sql/repmgr_funcs.sql.in
@@ -1,6 +1,6 @@
 /*
 * repmgr_function.sql
- * Copyright (c) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 */
--- a/sql/repmgr_test.sql
+++ b/sql/repmgr_test.sql
@@ -0,0 +1,4 @@
 select * from repmgr_update_standby_location('');
 select * from repmgr_get_last_standby_location();
 select * from repmgr_update_last_updated();
 select * from repmgr_get_last_updated();
--- a/sql/uninstall_repmgr_funcs.sql
+++ b/sql/uninstall_repmgr_funcs.sql
@@ -1,6 +1,6 @@
 /*
 * uninstall_repmgr_funcs.sql
- * Copyright (c) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 */
--- a/strutil.c
+++ b/strutil.c
@@ -1,7 +1,7 @@
 /*
 * strutil.c
 *
- * Copyright (C) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -87,3 +87,21 @@ maxlen_snprintf(char *str, const char *format,...)
 	return retval;
 }
 /*
 * Escape a string for use as a parameter in recovery.conf
 * Caller must free returned value
 */
 char *
 escape_recovery_conf_value(const char *src)
 {
 	char	   *result = escape_single_quotes_ascii(src);
 	if (!result)
 	{
 		fprintf(stderr, _("%s: out of memory\n"), progname());
 		exit(ERR_INTERNAL);
 	}
 	return result;
 }
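
`escape_recovery_conf_value()` delegates the actual escaping to `escape_single_quotes_ascii()`. For readers unfamiliar with that routine, the sketch below illustrates the general idea under the assumption that escaping is done by doubling single quotes and backslashes. `escape_quotes_sketch()` is a hypothetical, simplified illustration and not PostgreSQL's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of src in which every single quote
 * and backslash is doubled. Returns NULL on allocation failure.
 * The caller must free the result.
 */
static char *
escape_quotes_sketch(const char *src)
{
	size_t		len;
	size_t		i;
	size_t		j = 0;
	char	   *result;

	assert(src != NULL);

	len = strlen(src);

	/* Worst case: every character is doubled, plus the terminator */
	result = malloc(len * 2 + 1);
	if (result == NULL)
		return NULL;

	for (i = 0; i < len; i++)
	{
		if (src[i] == '\'' || src[i] == '\\')
			result[j++] = src[i];
		result[j++] = src[i];
	}
	result[j] = '\0';

	return result;
}
```

As in the patched function, the caller owns the returned buffer and is expected to check for NULL and release it with `free()`.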
--- a/strutil.h
+++ b/strutil.h
@@ -1,6 +1,6 @@
 /*
 * strutil.h
- * Copyright (C) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 *
 * This program is free software: you can redistribute it and/or modify
@@ -22,6 +22,7 @@
 #define _STRUTIL_H_
 #include <stdlib.h>
 #include "pqexpbuffer.h"
 #include "errcode.h"
@@ -48,4 +49,6 @@ extern int
 maxlen_snprintf(char *str, const char *format,...)
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3)));
 extern char *
 escape_recovery_conf_value(const char *src);
 #endif   /* _STRUTIL_H_ */
--- a/uninstall_repmgr.sql
+++ b/uninstall_repmgr.sql
@@ -1,7 +1,7 @@
 /*
 * uninstall_repmgr.sql
 *
- * Copyright (C) 2ndQuadrant, 2010-2016
+ * Copyright (c) 2ndQuadrant, 2010-2017
 *
 */
--- a/version.h
+++ b/version.h
@@ -1,6 +1,6 @@
 #ifndef _VERSION_H_
 #define _VERSION_H_
-#define REPMGR_VERSION "3.1.5"
+#define REPMGR_VERSION "3.3.2"
 #endif