===================================================
repmgr: Replication Manager for PostgreSQL clusters
===================================================

Introduction to repmgr commands
===============================

Suppose we have 3 nodes: node1 (the initial master), node2 and node3.

To make node2 and node3 be standbys of node1, execute this on both nodes
(node2 and node3)::

    repmgr -D /var/lib/pgsql/9.0 standby clone node1

In order to get full monitoring and easier state transitions,
you register each of the nodes, by creating a ``repmgr.conf`` file
and executing commands like this on the appropriate nodes::

    repmgr -f /var/lib/pgsql/repmgr/repmgr.conf --verbose master register
    repmgr -f /var/lib/pgsql/repmgr/repmgr.conf --verbose standby register

Once everything is registered, you start the repmgrd daemon. It
will maintain a view showing the state of all the nodes in the cluster,
including how far they are lagging behind the master.

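For instance, with repmgrd running and monitoring enabled, you can query
the ``repl_status`` view it maintains (covered in the "Lag monitoring"
section below; the ``repmgr_test`` schema name assumes the cluster is
named "test")::

    psql -d pgbench -c "SELECT * FROM repmgr_test.repl_status"
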
If you lose node1 you can then run this on node2::

    repmgr -f /var/lib/pgsql/repmgr/repmgr.conf standby promote

to make node2 the new master. Then on node3 run::

    repmgr -f /var/lib/pgsql/repmgr/repmgr.conf standby follow

to make node3 follow node2 (rather than node1).

If we now want to add a new node, we can prepare a new server (node4)
and run::

    repmgr -D /var/lib/pgsql/9.0 standby clone node2

And if a previously failed node becomes available again, such as
the lost node1 above, you can get it to resynchronize by only copying
over changes made while it was down. That happens with what's
called a forced clone, which overwrites existing data rather than
assuming it starts with an empty database directory tree::

    repmgr -D /var/lib/pgsql/9.0 --force standby clone node1

This can be much faster than creating a brand new node that must
copy over every file in the database.

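At any point after registration, you can check each node's current role
with the ``cluster show`` command (described in the command reference
below)::

    repmgr -f /var/lib/pgsql/repmgr/repmgr.conf cluster show
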
Installation Outline
====================

To install and use repmgr and repmgrd follow these steps:

1. Build repmgr programs

2. Set up trusted copy between postgres accounts, needed for the
   ``STANDBY CLONE`` step (see the SSH sketch after this list)

3. Check your primary server is correctly configured

4. Write a suitable ``repmgr.conf`` for the node

5. Set up repmgrd to aid in failover transitions

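Step 2 usually means passwordless SSH between the ``postgres`` accounts,
since ``STANDBY CLONE`` copies files with rsync over SSH. A minimal
sketch, assuming the ``postgres`` system user exists on every node::

    # Run as the postgres user on each node in the cluster
    ssh-keygen -t rsa -N ''            # accept the default key location
    ssh-copy-id postgres@node1         # repeat for every other node
    ssh postgres@node1 true            # confirm login needs no password
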
Confirm software was built correctly
------------------------------------

You should now find the repmgr programs available in the subdirectory where
the rest of your PostgreSQL binary files are located. You can confirm the
software is available by checking its version::

    repmgr --version
    repmgrd --version

You may need to include the full path of the binary instead, such as this
RHEL example::

    /usr/pgsql-9.0/bin/repmgr --version
    /usr/pgsql-9.0/bin/repmgrd --version

Or in this Debian example::

    /usr/lib/postgresql/9.0/bin/repmgr --version
    /usr/lib/postgresql/9.0/bin/repmgrd --version

Below, this binary installation base directory is referred to as PGDIR.

Primary server configuration
----------------------------

PostgreSQL should have been previously built and installed on the system. Here
is a sample of changes to the ``postgresql.conf`` file::

    listen_addresses='*'
    wal_level = 'hot_standby'
    archive_mode = on
    archive_command = 'cd .'    # we can also use exit 0, anything that
                                # just does nothing
    max_wal_senders = 10
    wal_keep_segments = 5000    # 80 GB required on pg_xlog
    hot_standby = on

You also need to add the machines that will participate in the cluster to the
``pg_hba.conf`` file. One possibility is to trust all connections from the
replication users from all internal addresses, such as::

    host    all            all    192.168.1.0/24    trust
    host    replication    all    192.168.1.0/24    trust

A more secure setup adds a repmgr user and database, just giving
access to that user::

    host    repmgr         repmgr    192.168.1.0/24    trust
    host    replication    all       192.168.1.0/24    trust

If you give a password to the user, you need to create a ``.pgpass`` file for
them as well to allow automatic login. In this case you might use the
``md5`` authentication method instead of ``trust`` for the repmgr user.

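A sketch of such a ``.pgpass`` entry, assuming the more secure setup above
(the format is hostname:port:database:username:password, and the password
shown is only a placeholder)::

    # ~/.pgpass for the postgres account on each node; chmod 0600 ~/.pgpass
    node1:5432:*:repmgr:SECRET
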
Don't forget to restart the database server after making all these changes.

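For example, using ``pg_ctl`` (assuming $PGDATA points at the data
directory)::

    pg_ctl -D $PGDATA restart
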
Usage walkthrough
=================

This assumes you've already followed the steps in "Installation Outline" to
install repmgr and repmgrd on the system.

A typical production installation of ``repmgr`` might involve two PostgreSQL
instances on separate servers, both running under the ``postgres`` user account
and both using the default port (5432). This walkthrough assumes the following
setup:

* A primary (master) server called "node1," running as the "postgres" user
  who is also the owner of the files. This server is operating on port 5432.
  This server will be known as "node1" in the cluster "test".

* A secondary (standby) server called "node2," running as the "postgres" user
  who is also the owner of the files. This server is operating on port 5432.
  This server will be known as "node2" in the cluster "test".

* Another standby server called "node3" with a similar configuration to "node2".

* The PostgreSQL data directory in each of the above is defined as $PGDATA,
  which is represented here as ``/var/lib/pgsql/9.0/data``

Creating some sample data
-------------------------

If you already have a database with useful data to replicate, you can
skip this step and use it instead. But if you do not already have
data in this cluster to replicate, you can create some like this::

    createdb pgbench
    pgbench -i -s 10 pgbench

Examples below will use the database name ``pgbench`` to match this.
Substitute the name of your database instead. Note that the standby
nodes created here will include information for every database in the
cluster, not just the specified one. Needing the database name is
mainly for user authentication purposes.

Setting up a repmgr user
------------------------

Make sure that the repmgr user has a role in the database, "pgbench" in this
case, and can login. On "node1"::

    createuser --login --superuser repmgr

Alternately you could start ``psql`` on the pgbench database on "node1" and at
the ``pgbench=#`` prompt type::

    CREATE ROLE repmgr SUPERUSER LOGIN;

The main advantage of the latter is that you can do it remotely on any
system you already have superuser access to.

Clearing the PostgreSQL installation on the Standby
---------------------------------------------------

To set up a new streaming replica, start by removing any PostgreSQL
installation on the existing standby nodes.

* Stop any server on "node2" and "node3". You can confirm whether database
  servers are running using a command like this::

      ps -eaf | grep postgres

  Look for the various database server processes: server, logger,
  wal writer, and autovacuum launcher.

* Go to the "node2" and "node3" database directories and remove the PostgreSQL
  installation::

      cd $PGDATA
      rm -rf *

This will delete the entire database installation in ``/var/lib/pgsql/9.0/data``.
Be careful that $PGDATA is defined here; executing ``ls`` to confirm you're
in the right place is always a good idea before executing ``rm``.

Testing remote access to the master
-----------------------------------

On the "node2" server, first test that you can connect to "node1" the
way repmgr will by executing::

    psql -h node1 -U repmgr -d pgbench

Possible sources for a problem here include:

* The login role specified was not created on "node1"

* The database configuration on "node1" is not listening on a TCP/IP port.
  That could be because the ``listen_addresses`` parameter was not updated,
  or because it was but the server wasn't restarted afterwards. You can
  test this on "node1" itself the same way::

      psql -h node1 -U repmgr -d pgbench

  The "-h" parameter forces a connection over TCP/IP, rather
  than the default UNIX socket method.

* There is a firewall setup that prevents incoming access to the
  PostgreSQL port (defaulting to 5432) used to access "node1". In
  this situation you would be able to connect to the "node1" server
  on itself, but not from any other host, and you'd just get a timeout
  when trying rather than a proper error message. (See the quick check
  after this list.)

* The ``pg_hba.conf`` file does not list appropriate statements to allow
  this user to login. In this case you will be able to connect to the server,
  but will see an error message mentioning ``pg_hba.conf``.

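One way to distinguish the firewall case from the others, assuming the
``nc`` (netcat) utility is available on "node2", is to test whether the
port itself is reachable::

    # A firewall typically produces a hang or timeout; an open port
    # reports success and a closed one is refused immediately
    nc -vz node1 5432
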
Cloning the standby
-------------------

With the "node1" server running, we want to use the ``standby clone`` command
in repmgr to copy over the entire PostgreSQL database cluster onto the
"node2" server. Execute the clone process with::

    repmgr -D $PGDATA -d pgbench -p 5432 -U repmgr -R postgres --verbose standby clone node1

Here "-U" specifies the database user to connect to the master as, while
"-R" specifies what user to run the rsync command as. Potentially you
could leave out one or both of these, in situations where the user and/or
role setup is the same on each node.

If this fails with an error message about accessing the master database,
you should return to the previous step and confirm access to "node1"
from "node2" with ``psql``, using the same parameters given to repmgr.

NOTE: you need to have $PGDIR/bin (where the PostgreSQL binaries are installed)
in your path for the above to work. If you don't want that as a permanent
setting, you can temporarily set it before running individual commands like
this::

    PATH=$PGDIR/bin:$PATH repmgr -D $PGDATA ...

Setup repmgr configuration file
-------------------------------

Create a directory to store the repmgr configuration on each node.
In it, there needs to be a ``repmgr.conf`` file for each node in the cluster.
For each node we'll assume this is stored as ``/var/lib/pgsql/repmgr/repmgr.conf``,
following the standard directory structure of a RHEL system. On "node1" it
should contain::

    cluster=test
    node=1
    node_name=earth
    conninfo='host=node1 user=repmgr dbname=pgbench'

On "node2" create the file ``/var/lib/pgsql/repmgr/repmgr.conf`` with::

    cluster=test
    node=2
    node_name=mars
    conninfo='host=node2 user=repmgr dbname=pgbench'

The STANDBY CLONE process should have created a ``recovery.conf`` file on
"node2" in the $PGDATA directory that reads as follows::

    standby_mode = 'on'
    primary_conninfo = 'host=node1 port=5432'

Registering the master and standby
----------------------------------

First, register the master by typing on "node1"::

    repmgr -f /var/lib/pgsql/repmgr/repmgr.conf --verbose master register

Then start the standby server on "node2".

You could now register the standby by typing on "node2"::

    repmgr -f /var/lib/pgsql/repmgr/repmgr.conf --verbose standby register

However, you can instead start repmgrd::

    repmgrd -f /var/lib/pgsql/repmgr/repmgr.conf --daemonize --verbose > /var/lib/pgsql/repmgr/repmgr.log 2>&1

which will automatically register your standby system. And eventually
you need repmgrd running anyway, to save lag monitoring information.
repmgrd will log the daemon activity to the listed file. You can
watch what it is doing with::

    tail -f /var/lib/pgsql/repmgr/repmgr.log

Hit control-C to exit this tail command when you are done.

Monitoring and testing
----------------------

At this point, you have a functioning primary on "node1" and a functioning
standby server running on "node2". You can confirm the master knows
about the standby, and that it is keeping it current, by looking at
``repl_status``::

    postgres@node2 $ psql -x -d pgbench -c "SELECT * FROM repmgr_test.repl_status"
    -[ RECORD 1 ]-------------+------------------------------
    primary_node              | 1
    standby_node              | 2
    last_monitor_time         | 2011-02-23 08:19:39.791974-05
    last_wal_primary_location | 0/1902D5E0
    last_wal_standby_location | 0/1902D5E0
    replication_lag           | 0 bytes
    apply_lag                 | 0 bytes
    time_lag                  | 00:26:13.30293

Some tests you might do at this point include (a quick sketch of these
checks follows the list):

* Insert some records into the primary server here, confirm they appear
  very quickly (within milliseconds) on the standby, and that the
  repl_status view advances accordingly.

* Verify that you can run queries against the standby server, but
  cannot make insertions into the standby database.

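A quick way to run these checks, assuming the pgbench database from
earlier (the table name ``repmgr_smoke_test`` is just an illustration)::

    # On node1 (the primary); this should succeed
    psql -d pgbench -c "CREATE TABLE repmgr_smoke_test (t timestamptz DEFAULT now())"
    psql -d pgbench -c "INSERT INTO repmgr_smoke_test DEFAULT VALUES"

    # On node2 (the standby): the row should appear almost immediately...
    psql -d pgbench -c "SELECT * FROM repmgr_smoke_test"

    # ...while writes should fail with a read-only transaction error
    psql -d pgbench -c "INSERT INTO repmgr_smoke_test DEFAULT VALUES"
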
Simulating the failure of the primary server
--------------------------------------------

To simulate the loss of the primary server, simply stop the "node1" server.
At this point, the standby contains the database as it existed at the time of
the "failure" of the primary server. If you look at ``repl_status`` on
"node2", you should see the time_lag value increase the longer "node1"
is down.

Promoting the Standby to be the Primary
---------------------------------------

Now you can promote the standby server to be the primary, to allow
applications to read and write to the database again, by typing::

    repmgr -f /var/lib/pgsql/repmgr/repmgr.conf --verbose standby promote

The server restarts and now has read/write ability.

Bringing the former Primary up as a Standby
-------------------------------------------

To make the former primary act as a standby, which is necessary before
restoring the original roles, type the following on node1::

    repmgr -D $PGDATA -d pgbench -p 5432 -U repmgr -R postgres --verbose --force standby clone node2

Then start the "node1" server, which is now acting as a standby server.

Check that the record(s) inserted in the earlier step are still available on
the new standby, and confirm the database on "node1" is read-only.

Restoring the original roles of node1 as primary and node2 as standby
----------------------------------------------------------------------

Now restore the original configuration by stopping
"node2" (now acting as a primary), promoting "node1" again to be the
primary server, then bringing up "node2" as a standby with a valid
``recovery.conf`` file.

Stop the "node2" server and type the following on the "node1" server::

    repmgr -f /var/lib/pgsql/repmgr/repmgr.conf standby promote

Now the original primary, "node1", is acting again as primary.

Then re-clone "node2" from the restored primary by typing this on "node2",
and start the "node2" server again::

    repmgr -D $PGDATA -d pgbench -p 5432 -U repmgr -R postgres --verbose --force standby clone node1

Verify the roles have been restored by attempting to insert a record on
"node1" (which should succeed) and on "node2" (which should fail).

The servers are now again acting as primary on "node1" and standby on "node2".

Alternate setup: both servers on one host
==========================================

Another test setup assumes you might be using the default installation of
PostgreSQL on port 5432 for some other purpose, and instead relocates these
instances onto different ports running as different users. In places where
``127.0.0.1`` is used as a host name, a more traditional configuration
would instead use the name of the relevant host for that parameter.
You can usually leave out changes to the port number in this case too.

* A primary (master) server called "prime," running as the "prime" user, who
  is also the owner of the files. This server is operating on port 5433. This
  server will be known as "node1" in the cluster "test"

* A standby server called "standby", with a user of "standby", who is the
  owner of the files. This server is operating on port 5434. This server
  will be known as "node2" in the cluster "test".

* A database exists on "prime" called "testdb."

* The PostgreSQL data directory in each of the above is defined as $PGDATA,
  which is represented here with ``/data/prime`` on the "prime" server and
  ``/data/standby`` on the "standby" server.

You might set up such an installation by adjusting the login script for the
"prime" and "standby" users as in these two examples::

    # prime
    PGDATA=/data/prime
    PGENGINE=/usr/pgsql-9.0/bin
    PGPORT=5433
    export PGDATA PGENGINE PGPORT
    PATH="$PATH:$PGENGINE"

    # standby
    PGDATA=/data/standby
    PGENGINE=/usr/pgsql-9.0/bin
    PGPORT=5434
    export PGDATA PGENGINE PGPORT
    PATH="$PATH:$PGENGINE"

You can then start and stop each installation as needed using the ``pg_ctl``
utility.

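For example, as the "prime" user (the "standby" user works the same way
thanks to the $PGDATA and $PGPORT settings above; the log file path is
just an illustration)::

    $PGENGINE/pg_ctl -D $PGDATA -l /tmp/prime.log start
    $PGENGINE/pg_ctl -D $PGDATA stop
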
Note: naming your nodes based on their starting role is not a recommended
best practice! As you'll see in this example, once there is a failover, names
strongly associated with one particular role (primary or standby) can become
confusing, once that node no longer has that role. Future versions of this
walkthrough are expected to use more generic terminology for these names.

Clearing the PostgreSQL installation on the Standby
---------------------------------------------------

To set up a streaming replica, strip away any PostgreSQL installation on the
existing replica:

* Stop both servers.

* Go to the "standby" database directory and remove the PostgreSQL
  installation::

      cd $PGDATA
      rm -rf *

This will delete the entire database installation in ``/data/standby``.

Building the standby
--------------------

Create a directory to store the repmgr configuration on each node.
In it, there needs to be a ``repmgr.conf`` file for each node in the cluster.
For "prime" we'll assume this is stored in ``/home/prime/repmgr``
and it should contain::

    cluster=test
    node=1
    node_name=earth
    conninfo='host=127.0.0.1 dbname=testdb'

On "standby" create the file ``/home/standby/repmgr/repmgr.conf`` with::

    cluster=test
    node=2
    node_name=mars
    conninfo='host=127.0.0.1 dbname=testdb'

Next, with the "prime" server running, we want to use the ``standby clone``
command in repmgr to copy over the entire PostgreSQL database cluster onto the
"standby" server. On the "standby" server, type::

    repmgr -D $PGDATA -p 5433 -U prime -R prime --verbose standby clone localhost

Next, we need a ``recovery.conf`` file on "standby" in the $PGDATA directory
that reads as follows::

    standby_mode = 'on'
    primary_conninfo = 'host=127.0.0.1 port=5433'

Make sure that standby has a qualifying role in the database, "testdb" in this
case, and can login. Start ``psql`` on the testdb database on "prime" and at
the ``testdb=#`` prompt type::

    CREATE ROLE standby SUPERUSER LOGIN;

Registering the master and standby
----------------------------------

First, register the master by typing on "prime"::

    repmgr -f /home/prime/repmgr/repmgr.conf --verbose master register

On "standby," edit the ``postgresql.conf`` file and change the port to 5434.

Start the "standby" server.

Register the standby by typing on "standby"::

    repmgr -f /home/standby/repmgr/repmgr.conf --verbose standby register

At this point, you have a functioning primary on "prime" and a functioning
standby server running on "standby." You can confirm the master knows
about the standby, and that it is keeping it current, by running the
following on the master::

    psql -x -d testdb -c "SELECT * FROM repmgr_test.repl_status"

Some tests you might do at this point include:

* Insert some records into the primary server here, confirm they appear
  very quickly (within milliseconds) on the standby, and that the
  repl_status view advances accordingly.

* Verify that you can run queries against the standby server, but
  cannot make insertions into the standby database.

Simulating the failure of the primary server
--------------------------------------------

To simulate the loss of the primary server, simply stop the "prime" server.
At this point, the standby contains the database as it existed at the time of
the "failure" of the primary server.

Promoting the Standby to be the Primary
---------------------------------------

Now you can promote the standby server to be the primary, to allow
applications to read and write to the database again, by typing::

    repmgr -f /home/standby/repmgr/repmgr.conf --verbose standby promote

The server restarts and now has read/write ability.

Bringing the former Primary up as a Standby
-------------------------------------------

To make the former primary act as a standby, which is necessary before
restoring the original roles, type the following on "prime" (note that the
clone must connect to the new primary, which is listening on port 5434)::

    repmgr -U standby -R prime -h 127.0.0.1 -p 5434 -d testdb --force --verbose standby clone

Change the port back to 5433 in the ``postgresql.conf`` file on "prime" (the
clone copied the 5434 setting from "standby"), then start the "prime" server,
which is now acting as a standby server.

Make sure the record(s) inserted in the earlier step are still available on
the new standby ("prime"), and confirm the database on "prime" is read-only.

Restoring the original roles of prime to primary and standby to standby
------------------------------------------------------------------------

Now restore the original configuration by stopping
"standby" (now acting as a primary), promoting "prime" again to be the
primary server, then bringing up "standby" as a standby with a valid
``recovery.conf`` file on "standby".

Stop the "standby" server and type the following on "prime"::

    repmgr -f /home/prime/repmgr/repmgr.conf standby promote

Now the original primary, "prime", is acting again as primary.

Then re-clone "standby" from the restored primary by typing this on "standby"::

    repmgr standby clone --force -h 127.0.0.1 -p 5433 -U prime -R prime --verbose

Change the port back to 5434 in the ``postgresql.conf`` file on "standby" (the
clone copied the 5433 setting from "prime"), then start the "standby" server.

Verify the roles have been restored by attempting to insert a record on
"prime" (which should succeed) and on "standby" (which should fail).

The servers are now again acting as primary on "prime" and standby on "standby".

Maintenance of monitor history
------------------------------

Once you have changed roles (with a failover or to restore the original roles)
you would end up with some records saying that node1 is the primary and other
records saying that node2 is the primary, which could be confusing.
Also, if you don't do anything about it, the monitor history will keep growing.
For both of those reasons you will sometimes want to perform some maintenance
on the ``repl_monitor`` table.

If you want to clean the history after a few days you can execute the
CLUSTER CLEANUP command from a cron job. For example, to keep just one day of
history you can put this in your crontab::

    0 1 * * * repmgr cluster cleanup -k 1 -f ~/repmgr.conf

Configuration and command reference
===================================

Configuration File
------------------

``repmgr.conf`` is looked for in the directory repmgrd or repmgr is run from.
The configuration file should have 3 lines:

1. cluster: A string (single quoted) that identifies the cluster we are on

2. node: An integer that identifies our node in the cluster

3. conninfo: A string (single quoted) specifying how we can connect to this node's PostgreSQL service

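A minimal example, matching the first walkthrough above (the ``node_name``
setting shown in the earlier examples is also accepted)::

    cluster=test
    node=1
    node_name=earth
    conninfo='host=node1 user=repmgr dbname=pgbench'
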
repmgr
------

Command line syntax
~~~~~~~~~~~~~~~~~~~

The current supported syntax for the program can be seen using::

    repmgr --help

The output from this program looks like this::

    repmgr: Replicator manager
    Usage:
     repmgr [OPTIONS] master  {register}
     repmgr [OPTIONS] standby {register|clone|promote|follow}

    General options:
      --help                     show this help, then exit
      --version                  output version information, then exit
      --verbose                  output verbose activity information

    Connection options:
      -d, --dbname=DBNAME        database to connect to
      -h, --host=HOSTNAME        database server host or socket directory
      -p, --port=PORT            database server port
      -U, --username=USERNAME    database user name to connect as

    Configuration options:
      -D, --data-dir=DIR             local directory where the files will be copied to
      -f, --config-file=PATH         path to the configuration file
      -R, --remote-user=USERNAME     database server username for rsync
      -w, --wal-keep-segments=VALUE  minimum value for the GUC wal_keep_segments (default: 5000)
      -I, --ignore-rsync-warning     ignore rsync partial transfer warning
      -F, --force                    force potentially dangerous operations to happen

    repmgr performs some tasks like clone a node, promote it or making follow another node and then exits.
    COMMANDS:
     master register       - registers the master in a cluster
     standby register      - registers a standby in a cluster
     standby clone [node]  - allows creation of a new standby
     standby promote       - allows manual promotion of a specific standby into a new master in the event of a failover
     standby follow        - allows the standby to re-point itself to a new master

The ``--verbose`` option can be useful in troubleshooting issues with
the program.

repmgr commands
---------------

Not all of these commands need the ``repmgr.conf`` file, but they all need to
be able to connect to the remote and local databases.

You can specify the remote database either with the -h parameter or as the
last parameter of standby clone and standby follow. If you need to specify
a port different than the default 5432, you can use the -p parameter.
The standby is always assumed to be localhost, and a second -p parameter
indicates its port if it differs from the default one.

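For example, these two invocations name the remote node in the two
equivalent ways just described::

    ./repmgr standby clone node1
    ./repmgr -h node1 -p 5432 standby clone
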
* master register

  * Registers a master in a cluster; it needs to be executed before any
    standby nodes are registered

* standby register

  * Registers a standby in a cluster; it needs to be executed before
    repmgrd will function on the node.

* standby clone [node to be cloned]

  * Does a backup via ``rsync`` of the data directory of the primary, and
    creates the recovery file we need to start a new hot standby server.
    It doesn't need the ``repmgr.conf``, so it can be executed anywhere on
    the new node. You can change to the directory you want the new database
    cluster in and execute::

        ./repmgr standby clone node1

    or run from wherever you are with a full path::

        ./repmgr -D /path/to/new/data/directory standby clone node1

    That will make a backup of the primary; then you only need to start the
    server using a command like::

        pg_ctl -D /your_data_directory_path start

    Note that some installations will also redirect the output log file when
    executing ``pg_ctl``; check the server startup script you are using
    and try to match what it does.

* standby promote

  * Allows manual promotion of a specific standby into a new primary in the
    event of a failover. This needs to be executed in the same directory
    where the ``repmgr.conf`` is on the standby, or you can use the ``-f``
    option to indicate where the ``repmgr.conf`` is at. It doesn't need any
    additional arguments::

        ./repmgr standby promote

    That will restart your standby PostgreSQL service.

* standby follow

  * Makes the standby re-point itself to the new primary. This needs to be
    executed in the same directory where the ``repmgr.conf`` is on the
    standby, or you can use the ``-f`` option to indicate where the
    ``repmgr.conf`` is at. Example::

        ./repmgr standby follow

* cluster show

  * Shows the role (standby/master) and connection string for all nodes
    configured in the cluster, or "FAILED" if a node doesn't respond. This
    lets us know which nodes are alive and which need attention, and gives
    a notion of the structure of clusters we have access to. Example::

        ./repmgr cluster show

* cluster cleanup

  * Cleans the monitor's history from the repmgr tables. This prevents the
    repl_monitor table from growing excessively, which in turn affects
    repl_status view performance; it also keeps the disk space used by
    repmgr under control. This command can be run manually or from a cron
    job to do it periodically.
    There is also a --keep-history (-k) option to indicate how many days of
    history to keep, so the command will clean up history older than
    "keep-history" days. Example::

        ./repmgr cluster cleanup -k 2

repmgrd Daemon
--------------

Command line syntax
~~~~~~~~~~~~~~~~~~~

The current supported syntax for the program can be seen using::

    repmgrd --help

The output from this program looks like this::

    repmgrd: Replicator manager daemon
    Usage:
     repmgrd [OPTIONS]

    Options:
      --help                    show this help, then exit
      --version                 output version information, then exit
      --verbose                 output verbose activity information
      --monitoring-history      track advance or lag of the replication in every standby in repl_monitor
      -f, --config-file=PATH    path to the configuration file
      -d, --daemonize           detach process from foreground
      -p, --pid-file=PATH       write a PID file

    repmgrd monitors a cluster of servers.

The ``--verbose`` option can be useful in troubleshooting issues with
the program.

Usage
-----

repmgrd reads the ``repmgr.conf`` file in the current directory, or as
indicated with the -f parameter. If run on a standby, it checks whether that
standby is in ``repl_nodes`` and adds it if not.

Before you can run repmgrd you need to register a master in a cluster
using the ``MASTER REGISTER`` command. If run on a master,
repmgrd will exit, as it has nothing to do there yet. It is only
targeted at running on standby servers currently. If converting
a former master into a standby, you will need to start repmgrd
in order to make it fully operational in its new role.

The repmgr daemon creates 2 connections: one to the master and another to the
standby.

Lag monitoring
--------------

repmgrd helps monitor a set of master and standby servers. You can
see which node is the current master, as well as how far behind it each
standby is.

To activate the monitoring capabilities of repmgr you must include the
--monitoring-history option when running it::

    repmgrd --monitoring-history --config-file=/path/to/repmgr.conf &

To look at the current lag between the primary and each node listed
in ``repl_node``, consult the ``repl_status`` view::

    psql -d postgres -c "SELECT * FROM repmgr_test.repl_status"

This view shows the latest monitor info from every node.

* replication_lag: in bytes. This is how far the latest xlog record
  we have received is from the master.

* apply_lag: in bytes. This is how far the latest xlog record
  we have applied is from the latest record we have received.

* time_lag: in seconds. How many seconds behind the master this node is.

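For example, to watch just the lag columns for the cluster "test" from the
walkthrough above::

    psql -d postgres -c "SELECT standby_node, replication_lag, apply_lag, time_lag FROM repmgr_test.repl_status"
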
Error codes
-----------

When the repmgr or repmgrd program exits, it will set one of the
following exit codes:

* SUCCESS (0)  Program ran successfully.
* ERR_BAD_CONFIG (1)  One of the configuration checks the program makes failed.
* ERR_BAD_RSYNC (2)  An rsync call made by the program returned an error.
* ERR_NO_RESTART (4)  An attempt to restart a PostgreSQL instance failed.
* ERR_DB_CON (6)  Error when trying to connect to a database.
* ERR_DB_QUERY (7)  Error executing a database query.
* ERR_PROMOTED (8)  Exiting program because the node has been promoted to master.
* ERR_BAD_PASSWORD (9)  Password used to connect to a database was rejected.
* ERR_STR_OVERFLOW (10)  String overflow error.
* ERR_FAILOVER_FAIL (11)  Error encountered during failover (repmgrd only).
* ERR_BAD_SSH (12)  Error when connecting to a remote host via SSH.
* ERR_SYS_FAILURE (13)  Error when forking (repmgrd only).
* ERR_BAD_BASEBACKUP (14)  Error when executing pg_basebackup.

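These exit codes make the programs straightforward to drive from shell
scripts; a minimal sketch::

    #!/bin/sh
    # Promote the standby and report the repmgr exit code on failure
    repmgr -f /var/lib/pgsql/repmgr/repmgr.conf standby promote
    rc=$?
    if [ $rc -ne 0 ]; then
        echo "repmgr exited with code $rc" >&2
        exit $rc
    fi
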
License and Contributions
=========================

repmgr is licensed under the GPL v3. All of its code and documentation is
Copyright 2010-2015, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE
for details.

Main sponsorship of repmgr has been from 2ndQuadrant customers.

Additional work has been sponsored by the 4CaaST project for cloud computing,
which has received funding from the European Union's Seventh Framework Programme
(FP7/2007-2013) under grant agreement 258862.

Contributions to repmgr are welcome, and will be listed in the file CREDITS.
2ndQuadrant Limited requires that any contributions provide a copyright
assignment and a disclaimer of any work-for-hire ownership claims from the
employer of the developer. This lets us make sure that all of the repmgr
distribution remains free code. Please contact info@2ndQuadrant.com for a
copy of the relevant Copyright Assignment Form.

Code style
----------

Code in repmgr is formatted to a consistent style using the following command::

    astyle --style=ansi --indent=tab --suffix=none *.c *.h

Contributors should reformat their code similarly before submitting code to
the project, in order to minimize merge conflicts with other work.

Support and Assistance
======================

2ndQuadrant provides 24x7 production support for repmgr, and can also help
you configure it correctly, verify an installation, and train you in running
a robust replication cluster.

There is a mailing list/forum to discuss contributions or issues:
http://groups.google.com/group/repmgr

The #repmgr channel is registered on freenode IRC.

Further information is available at http://www.repmgr.org/

We'd love to hear from you about how you use repmgr. Case studies and
news are always welcome. Send us an email at info@2ndQuadrant.com, or
send a postcard to

    repmgr
    c/o 2ndQuadrant
    7200 The Quorum
    Oxford Business Park North
    Oxford
    OX4 2JZ

Thanks from the repmgr core team:

    Jaime Casanova
    Simon Riggs
    Greg Smith
    Cedric Villemain