BDR failover with repmgrd
&repmgr; 4.x provides support for monitoring BDR nodes and taking action in
case one of the nodes fails.
Due to the nature of BDR, it's only safe to use this solution for
a two-node scenario. Introducing additional nodes will create an inherent
risk of node desynchronisation if a node goes down without being cleanly
removed from the cluster.
In contrast to streaming replication, there's no concept of "promoting" a new
primary node with BDR. Instead, "failover" involves monitoring both nodes
with `repmgrd` and redirecting queries from the failed node to the remaining
active node. This can be done by using an
event notification script
which is called by repmgrd to dynamically
reconfigure a proxy server/connection pooler such as PgBouncer.
Prerequisites
&repmgr; 4 requires PostgreSQL 9.4 or 9.6 with the BDR 2 extension
enabled and configured for a two-node BDR network. &repmgr; 4 packages
must be installed on each node before attempting to configure
repmgr.
&repmgr; 4 will refuse to install if it detects more than two BDR nodes.
Application database connections *must* be passed through a proxy server/
connection pooler such as PgBouncer, and it must be possible to dynamically
reconfigure that from repmgrd. The example demonstrated in this document
will use PgBouncer.
The proxy server/connection pooler must not
be installed on the database servers.
For this example, it's assumed password-less SSH connections are available
from the PostgreSQL servers to the servers where PgBouncer
runs, and that the user on those servers has permission to alter the
PgBouncer configuration files.
PostgreSQL connections must be possible between each node, and each node
must be able to connect to each PgBouncer instance.
Configuration
A sample configuration for repmgr.conf on each
BDR node would look like this:
# Node information
node_id=1
node_name='node1'
conninfo='host=node1 dbname=bdrtest user=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'
replication_type='bdr'
# Event notification configuration
event_notifications=bdr_failover
event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a" >> /tmp/bdr-failover.log 2>&1'
# repmgrd options
monitor_interval_secs=5
reconnect_attempts=6
reconnect_interval=5
Adjust settings as appropriate; copy and adjust for the second node (particularly
the values node_id, node_name
and conninfo).
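As a rough illustration of the copy-and-adjust step, the per-node settings could be rewritten with sed. This is only a sketch: make_node_conf is a hypothetical helper, not part of repmgr. It assumes each setting sits on its own line as in the sample above, and that each node's host name in conninfo matches its node_name.

```shell
#!/usr/bin/env bash
# Sketch: derive another node's repmgr.conf from an existing one.
# make_node_conf is a hypothetical helper, not a repmgr command.
# Assumption: the host name in conninfo is the same as the node name.
# Reads the source configuration on stdin, writes the adjusted one to stdout.
make_node_conf() {
    local id="$1" name="$2"
    sed -E \
        -e "s/^node_id=.*/node_id=${id}/" \
        -e "s/^node_name=.*/node_name='${name}'/" \
        -e "/^conninfo=/ s/host=[^ ']+/host=${name}/"
}
```

For example, `make_node_conf 2 node2 < repmgr.conf.node1 > repmgr.conf.node2` (both file names are placeholders). Review the result by hand before using it, since any other node-specific settings are copied unchanged.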
Note that the values provided for the conninfo string
must be valid for connections from both nodes in the
replication cluster. The database must be the BDR-enabled database.
If defined, the event_notifications parameter
will restrict execution of event_notification_command
to the specified event(s).
event_notification_command is the script which does the actual "heavy lifting"
of reconfiguring the proxy server/connection pooler. It is fully
user-definable; a reference implementation is documented below.
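To make the script's interface concrete, the following sketch shows how the placeholders in the sample event_notification_command above arrive as positional parameters. It is not the reference implementation: handle_event is a hypothetical function name, and the pooler reconfiguration is reduced to a log message. It assumes %n is the node ID, %e the event type, %s the success flag (1 or 0), and that %c and %a carry the conninfo string and name of the next available node.

```shell
#!/usr/bin/env bash
# Sketch of an event notification handler; handle_event is a hypothetical name.
# Assumed invocation: /path/to/script %n %e %s "%c" "%a"
handle_event() {
    local node_id="$1" event_type="$2" success="$3"
    local next_conninfo="$4" next_node_name="$5"

    # Ignore any event other than a BDR failover.
    if [ "$event_type" != "bdr_failover" ]; then
        echo "ignoring event ${event_type} from node ${node_id}"
        return 0
    fi

    # Do not reconfigure anything if the event was reported as unsuccessful.
    if [ "$success" != "1" ]; then
        echo "event bdr_failover on node ${node_id} reported failure; not reconfiguring"
        return 1
    fi

    # A real script would rewrite the pooler configuration at this point;
    # this sketch only records what it would do.
    echo "node ${node_id} failed; redirect connections to ${next_node_name} (${next_conninfo})"
}

# Act only when called with the five expected arguments.
if [ "$#" -eq 5 ]; then
    handle_event "$@"
fi
```

Quoting "%c" and "%a" in the command definition keeps the conninfo string, which contains spaces, together as a single argument.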
repmgr setup
Register both nodes; example on node1:
$ repmgr -f /etc/repmgr.conf bdr register
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: node record created for node 'node1' (ID: 1)
NOTICE: BDR node 1 registered (conninfo: host=node1 dbname=bdrtest user=repmgr)
and on node2:
$ repmgr -f /etc/repmgr.conf bdr register
NOTICE: node record created for node 'node2' (ID: 2)
NOTICE: BDR node 2 registered (conninfo: host=node2 dbname=bdrtest user=repmgr)
The repmgr extension will be automatically created
when the first node is registered, and will be propagated to the second
node.
Ensure the &repmgr; package is available on both nodes before
attempting to register the first node.
At this point the metadata for both nodes has been created; executing
repmgr cluster show (on either node) should produce output like this:
$ repmgr -f /etc/repmgr.conf cluster show
 ID | Name  | Role | Status    | Upstream | Location | Connection string
----+-------+------+-----------+----------+----------+--------------------------------------------------------
 1  | node1 | bdr  | * running |          | default  | host=node1 dbname=bdrtest user=repmgr connect_timeout=2
 2  | node2 | bdr  | * running |          | default  | host=node2 dbname=bdrtest user=repmgr connect_timeout=2
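A check like this can also be scripted. The sketch below counts the nodes reported as running; count_running_nodes is a hypothetical helper, and it assumes the pipe-separated layout shown above, with the status in the fourth column and "* running" marking a healthy node.

```shell
#!/usr/bin/env bash
# Sketch: count nodes shown as running in "repmgr cluster show" output.
# count_running_nodes is a hypothetical helper, not a repmgr command.
# Assumption: status is the fourth pipe-separated column.
count_running_nodes() {
    awk -F'|' '
        # Data rows begin with a numeric node ID; header and separator do not.
        $1 ~ /^[[:space:]]*[0-9]+[[:space:]]*$/ && $4 ~ /[*] running/ { n++ }
        END { print n + 0 }
    '
}
```

For example, `repmgr -f /etc/repmgr.conf cluster show | count_running_nodes` would be expected to print 2 for a healthy two-node cluster.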
Additionally, it's possible to display a log of significant events; executing
repmgr cluster event (on either node) should produce output like this:
 Node ID | Event        | OK | Timestamp           | Details
---------+--------------+----+---------------------+----------------------------------------------
 2       | bdr_register | t  | 2017-07-27 17:51:48 | node record created for node 'node2' (ID: 2)
 1       | bdr_register | t  | 2017-07-27 17:51:00 | node record created for node 'node1' (ID: 1)
At this point there will only be records for the two node registrations (displayed in reverse
chronological order).