Update README

2026-06-01 03:39:05 +00:00 · 2017-08-24 17:43:01 +09:00
parent 7d2dc0aa89
commit db157ad9bc
2 changed files with 60 additions and 5 deletions
@@ -327,7 +327,7 @@ Automatic failover with `repmgrd`

 `repmgrd` is a management and monitoring daemon which runs on each node in
 a replication cluster and. It can automate actions such as failover and
-updating standbys to follow the new master, as well as providing monitoring
+updating standbys to follow the new primary, as well as providing monitoring
 information about the state of each standby.

 To use `repmgrd`, its associated function library must be included in
@@ -339,13 +339,68 @@ Changing this setting requires a restart of PostgreSQL; for more details see:

  https://www.postgresql.org/docs/current/static/runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES

-Additionally the following `repmgrd` options *must* be set in `repmgr.conf`:
+Additionally the following `repmgrd` options *must* be set in `repmgr.conf`
+(adjust configuration file locations as appropriate):

    failover=automatic
    promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'
    follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file'

-(adjust configuration file locations as appropriate).
+
+Note that the `--log-to-file` option will cause `repmgr`'s output to be logged to
+the destination configured to receive log output for `repmgrd`.
+See `repmgr.conf.sample` for further `repmgrd`-specific settings
+
+When `failover` is set to `automatic`, upon detecting failure of the current
+primary, `repmgrd` will execute one of `promote_command` or `follow_command`,
+depending on whether the current server is to become the new primary, or
+needs to follow another server which has become the new primary. Note that
+these commands can be any valid shell script which results in one of these
+two actions happening, but if `repmgr`'s `standby follow` or `standby promote`
+commands are not executed (either directly as shown here, or from a script which
+performs other actions), the `rempgr` metadata will not be updated and
+monitoring will no longer function reliably.
+
+To demonstrate automatic failover, set up a 3-node replication cluster (one primary
+and two standbys streaming directly from the primary) so that the cluster looks
+something like this:
+
+    $ repmgr -f /etc/repmgr.conf cluster show
+     ID | Name  | Role    | Status    | Upstream | Connection string
+    ----+-------+---------+-----------+----------+--------------------------------------
+     1  | node1 | primary | * running |          | host=node1 dbname=repmgr user=repmgr
+     2  | node2 | standby |   running | node1    | host=node2 dbname=repmgr user=repmgr
+     3  | node3 | standby |   running | node1    | host=node3 dbname=repmgr user=repmgr
+
+Start `repmgrd` on each standby and verify that it's running by examining the
+log output, which at log level `INFO` will look like this:
+
+    [2017-08-24 17:31:00] [NOTICE] using configuration file "/etc/repmgr.conf"
+    [2017-08-24 17:31:00] [INFO] connecting to database "host=node2 dbname=repmgr user=repmgr"
+    [2017-08-24 17:31:00] [NOTICE] starting monitoring of node "node2" (ID: 2)
+    [2017-08-24 17:31:00] [INFO] monitoring connection to upstream node "node1" (node ID: 1)
+
+Each `repmgrd` should also have recorded its successful startup as an event:
+
+    $ repmgr -f /etc/repmgr.conf cluster event --event=repmgrd_start
+    Node ID | Name  | Event         | OK | Timestamp           | Details
+    ---------+-------+---------------+----+---------------------+-------------------------------------------------------------
+     3       | node3 | repmgrd_start | t  | 2017-08-24 17:35:54 | monitoring connection to upstream node "node1" (node ID: 1)
+     2       | node2 | repmgrd_start | t  | 2017-08-24 17:35:50 | monitoring connection to upstream node "node1" (node ID: 1)
+     1       | node1 | repmgrd_start | t  | 2017-08-24 17:35:46 | monitoring cluster primary "node1" (node ID: 1)
+
+Now stop the current master server with e.g.:
+
+    pg_ctl -D /path/to/node1/data -m immediate stop
+
+This will force the master node to shut down straight away, aborting all
+processes and transactions.  This will cause a flurry of activity in
+the `repmgrd` log files as each `repmgrd` detects the failure of the master
+and a failover decision is made. Here extracts from the standby server
+promoted to new master:
+
+
+

 ### Monitoring with `repmgrd`

@@ -473,7 +473,7 @@ monitor_streaming_standby(void)
 		initPQExpBuffer(&event_details);

 		appendPQExpBuffer(&event_details,
-						  _("monitoring upstream node \"%s\" (node ID: %i)"),
+						  _("monitoring connection to upstream node \"%s\" (node ID: %i)"),
 						  upstream_node_info.node_name,
 						  upstream_node_info.node_id);

@@ -486,7 +486,7 @@ monitor_streaming_standby(void)

 		startup_event_logged = true;

-		log_notice("%s", event_details.data);
+		log_info("%s", event_details.data);

 		termPQExpBuffer(&event_details);
 	}