Mirror of https://github.com/EnterpriseDB/repmgr.git, synced 2026-03-23 07:06:30 +00:00.
repmgr: Replication Manager for PostgreSQL
==========================================
`repmgr` is a suite of open-source tools to manage replication and failover
within a cluster of PostgreSQL servers. It enhances PostgreSQL's built-in
replication capabilities with utilities to set up standby servers, monitor
replication, and perform administrative tasks such as failover or switchover
operations.

`repmgr 4` is a complete rewrite of the existing `repmgr` codebase.

It supports PostgreSQL 9.5 and later; support for PostgreSQL 9.3 and 9.4 has
been dropped. To use `repmgr 4` with BDR 2.0, PostgreSQL 9.6 is required.

Building from source
--------------------

Simply:

    ./configure && make install

Ensure `pg_config` for the target PostgreSQL version is in `$PATH`.

Reference
---------

### repmgr commands

The following commands are available:

    repmgr primary register
    repmgr primary unregister

    repmgr standby clone
    repmgr standby register
    repmgr standby unregister
    repmgr standby promote
    repmgr standby follow

    repmgr bdr register
    repmgr bdr unregister

    repmgr node status

    repmgr cluster show
    repmgr cluster event [--all] [--node-id] [--node-name] [--event] [--event-matching]

* `primary register`

    Registers a primary in a streaming replication cluster, and configures
    it for use with repmgr. This command needs to be executed before any
    standby nodes are registered.

    `master register` can be used as an alias for `primary register`.

* `cluster show`

    Displays information about each active node in the replication cluster.
    The command polls each registered server directly, showing its role
    (`primary` / `standby` / `bdr`) and status; it can be run on any node in
    the cluster, which is also useful when analyzing connectivity from a
    particular node.

    This command requires either a valid `repmgr.conf` file or a database
    connection string to one of the registered nodes; no additional arguments
    are needed.

    Example:

        $ repmgr -f /etc/repmgr.conf cluster show

         ID | Name  | Role    | Status    | Upstream | Connection string
        ----+-------+---------+-----------+----------+-----------------------------------------
         1  | node1 | primary | * running |          | host=db_node1 dbname=repmgr user=repmgr
         2  | node2 | standby |   running | node1    | host=db_node2 dbname=repmgr user=repmgr
         3  | node3 | standby |   running | node1    | host=db_node3 dbname=repmgr user=repmgr

    To show database connection errors when polling nodes, run the command in
    `--verbose` mode.

    The `cluster show` command accepts an optional parameter `--csv`, which
    outputs the replication cluster's status in a simple CSV format, suitable
    for parsing by scripts:

        $ repmgr -f /etc/repmgr.conf cluster show --csv
        1,-1,-1
        2,0,0
        3,0,1

    The columns have the following meanings:

    - node ID
    - availability (0 = available, -1 = unavailable)
    - recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown)

    Note that availability is tested by connecting from the node where
    `repmgr cluster show` is executed; a node reported as unavailable is
    therefore not necessarily down.

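The CSV form lends itself to simple scripted checks. As an illustrative sketch (not part of repmgr; the embedded sample data simply mirrors the example output above), unavailable nodes can be flagged with `awk`:

```shell
# Sketch: flag unavailable nodes in `cluster show --csv` output.
# In practice you would pipe the live command into awk, e.g.:
#   repmgr -f /etc/repmgr.conf cluster show --csv | awk -F, '$2 == -1 { ... }'
# Here the sample output from above is inlined so the snippet is self-contained.
printf '1,-1,-1\n2,0,0\n3,0,1\n' |
    awk -F, '$2 == -1 { print "node " $1 " unavailable" }'
```

For the sample data this prints `node 1 unavailable`.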
* `cluster matrix` and `cluster crosscheck`

    These commands display connection information for each pair of
    nodes in the replication cluster.

    - `cluster matrix` runs a `cluster show` on each node and arranges
      the results in a matrix, recording success or failure;

    - `cluster crosscheck` runs a `cluster matrix` on each node and
      combines the results in a single matrix, providing a full
      overview of connections between all databases in the cluster.

    These commands require a valid `repmgr.conf` file on each node.
    Additionally, passwordless `ssh` connections are required between
    all nodes.

    Example 1 (all nodes up):

        $ repmgr -f /etc/repmgr.conf cluster matrix

         Name  | Id |  1 |  2 |  3
        -------+----+----+----+----
         node1 |  1 |  * |  * |  *
         node2 |  2 |  * |  * |  *
         node3 |  3 |  * |  * |  *

    Here `cluster matrix` is sufficient to establish the state of each
    possible connection.

    Example 2 (`node1` and `node2` up, `node3` down):

        $ repmgr -f /etc/repmgr.conf cluster matrix

         Name  | Id |  1 |  2 |  3
        -------+----+----+----+----
         node1 |  1 |  * |  * |  x
         node2 |  2 |  * |  * |  x
         node3 |  3 |  ? |  ? |  ?

    Each row corresponds to one server and indicates the result of
    testing an outbound connection from that server.

    Since `node3` is down, all the entries in its row are filled with
    "?", meaning its outbound connections could not be tested.

    The other two nodes are up; their rows have "x" in the column
    corresponding to `node3`, meaning that inbound connections to that
    node have failed, and "*" in the columns corresponding to `node1`
    and `node2`, meaning that inbound connections to these nodes have
    succeeded.

    In this case, `cluster crosscheck` gives the same result as
    `cluster matrix`, because from any functioning node we can observe
    the same state: `node1` and `node2` are up, `node3` is down.

    Example 3 (all nodes up, firewall dropping packets originating
    from `node1` and directed to port 5432 on `node3`):

    Running `cluster matrix` from `node1` gives the following output:

        $ repmgr -f /etc/repmgr.conf cluster matrix

         Name  | Id |  1 |  2 |  3
        -------+----+----+----+----
         node1 |  1 |  * |  * |  x
         node2 |  2 |  * |  * |  *
         node3 |  3 |  ? |  ? |  ?

    (Note this may take some time depending on the `connect_timeout`
    setting in the registered nodes' `conninfo` strings; the default is
    one minute, which means that without modification the above command
    would take around two minutes to run; see the comment elsewhere
    about setting `connect_timeout`.)

    The matrix tells us that we cannot connect from `node1` to `node3`,
    and that (therefore) we don't know the state of any outbound
    connection from `node3`.

    In this case, the `cluster crosscheck` command is more informative:

        $ repmgr -f /etc/repmgr.conf cluster crosscheck

         Name  | Id |  1 |  2 |  3
        -------+----+----+----+----
         node1 |  1 |  * |  * |  x
         node2 |  2 |  * |  * |  *
         node3 |  3 |  * |  * |  *

    What happened is that `cluster crosscheck` merged its own `cluster
    matrix` with the `cluster matrix` output from `node2`; the latter is
    able to connect to `node3` and can therefore determine the state of
    outbound connections from that node.

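The merging step can be pictured as a simple per-cell rule. The sketch below illustrates the principle only (it is not repmgr's implementation): a definite local observation ("*" or "x") is kept, while a "?" is filled in from the matrix reported by another node:

```shell
# Illustration of the crosscheck merge principle (not repmgr's source code).
merge_cell() {
    if [ "$1" = "?" ]; then
        printf '%s\n' "$2"   # local result unknown: adopt the remote observation
    else
        printf '%s\n' "$1"   # definite local result ('*' or 'x') is kept
    fi
}
merge_cell '?' '*'   # node3's row as seen via node2: '?' becomes '*'
merge_cell 'x' '*'   # a definite local failure remains 'x'
```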
Backwards compatibility
-----------------------

`repmgr` is now implemented as a PostgreSQL extension, and all database
objects used by repmgr are stored in a dedicated `repmgr` schema, rather
than `repmgr_$cluster_name`. Note that there is no need to install the
extension manually; this is done automatically by `repmgr primary register`.

Metadata tables have been revised and are not backwards-compatible with
repmgr 3.x (however, future DDL updates will be easier as they can be
carried out via the `ALTER EXTENSION` mechanism).

An extension upgrade script will be provided for pre-4.0 installations;
note this will require the existing `repmgr_$cluster_name` schema to
be renamed to `repmgr` beforehand.

Some configuration items have had their names changed for consistency
and clarity, e.g. `node` => `node_id`. `repmgr` will issue a warning
about deprecated/altered options.

Some configuration items have been changed to command line options,
and vice versa, e.g. to avoid hard-coding items such as a node's
upstream ID, which might change over time.

See the file `doc/changes-in-repmgr4.md` for more details.

Generating event notifications with repmgr/repmgrd
--------------------------------------------------

Each time `repmgr` or `repmgrd` performs a significant event, a record
of that event is written into the `repmgr.events` table together with
a timestamp, an indication of failure or success, and further details
if appropriate. This is useful for gaining an overview of events
affecting the replication cluster. Note, however, that this table is
advisory in character and should be used in combination with the `repmgr`
and PostgreSQL logs to obtain details of any events.

Example output after a primary was registered and a standby cloned
and registered:

    repmgr=# SELECT * FROM repmgr.events;
     node_id |      event       | successful |        event_timestamp        | details
    ---------+------------------+------------+-------------------------------+-------------------------------------------------------------------------------------
           1 | primary_register | t          | 2016-01-08 15:04:39.781733+09 |
           2 | standby_clone    | t          | 2016-01-08 15:04:49.530001+09 | Cloned from host 'repmgr_node1', port 5432; backup method: pg_basebackup; --force: N
           2 | standby_register | t          | 2016-01-08 15:04:50.621292+09 |
    (3 rows)

Alternatively use `repmgr cluster event` to output a list of events.

Additionally, event notifications can be passed to a user-defined program
or script which can take further action, e.g. send email notifications.
This is done by setting the `event_notification_command` parameter in
`repmgr.conf`.

This parameter accepts the following format placeholders:

    %n - node ID
    %e - event type
    %s - success (1 or 0)
    %t - timestamp
    %d - details

The values provided for `%t` and `%d` will probably contain spaces,
so should be quoted in the provided command configuration, e.g.:

    event_notification_command='/path/to/some/script %n %e %s "%t" "%d"'

Additionally, the following format placeholders are available for the event
type `bdr_failover` and optionally `bdr_recovery`:

    %c - conninfo string of the next available node
    %a - name of the next available node

These should always be quoted.

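A minimal handler might look like the following sketch. The function name, log format, and script path are illustrative assumptions, not part of repmgr; repmgr simply invokes whatever command is configured, with the placeholder values substituted as arguments:

```shell
# Hypothetical event handler sketch; repmgr.conf would reference a script
# containing this logic, e.g. (path is illustrative):
#   event_notification_command='/usr/local/bin/handle_event.sh %n %e %s "%t" "%d"'
# Replace the printf with a mail command, webhook call, etc. as needed.
handle_event() {
    node_id=$1; event=$2; success=$3; ts=$4; details=$5
    printf 'node=%s event=%s ok=%s ts="%s" details="%s"\n' \
        "$node_id" "$event" "$success" "$ts" "$details"
}
# Example invocation, as repmgr would perform it for a standby registration:
handle_event 2 standby_register 1 '2016-01-08 15:04:50' ''
```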
By default, all notification types will be passed to the designated script;
the notification types can be filtered to an explicitly named subset:

    event_notifications=primary_register,standby_register

The following event types are available:

* `master_register`
* `standby_register`
* `standby_unregister`
* `standby_clone`
* `standby_promote`
* `standby_follow`
* `standby_disconnect_manual`
* `repmgrd_start`
* `repmgrd_shutdown`
* `repmgrd_failover_promote`
* `repmgrd_failover_follow`
* `bdr_failover`
* `bdr_reconnect`
* `bdr_recovery`
* `bdr_register`
* `bdr_unregister`

Note that under some circumstances (e.g. no replication cluster master could
be located), it will not be possible to write an entry into the `repmgr.events`
table, in which case executing a script via `event_notification_command` can
serve as a fallback by generating some form of notification.

Diagnostics
-----------

    $ repmgr -f /etc/repmgr.conf node service --list-actions
    Following commands would be executed for each action:

        start: "/usr/bin/pg_ctl -l /var/log/postgresql/startup.log -w -D '/var/lib/pgsql/data' start"
         stop: "/usr/bin/pg_ctl -l /var/log/postgresql/startup.log -D '/var/lib/pgsql/data' -m fast -W stop"
      restart: "/usr/bin/pg_ctl -l /var/log/postgresql/startup.log -w -D '/var/lib/pgsql/data' restart"
       reload: "/usr/bin/pg_ctl -l /var/log/postgresql/startup.log -w -D '/var/lib/pgsql/data' reload"
      promote: "/usr/bin/pg_ctl -l /var/log/postgresql/startup.log -w -D '/var/lib/pgsql/data' promote"