mirror of
https://github.com/EnterpriseDB/repmgr.git
synced 2026-03-24 23:56:29 +00:00
Standardize on "ID: %i" when logging node IDs
Previously there was a mix of "id:", "node id:", "node ID:" and "node_id:".
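For anyone comparing logs captured before and after this change, the older spellings can be rewritten to the new form. The following is a minimal sketch, not part of repmgr: the helper name `normalize_node_id` is hypothetical, and it assumes the ID always appears in parentheses as it does in the documentation examples in this diff.

```python
import re

# Hypothetical helper, not part of repmgr.
# Assumption: node IDs appear as "(<label>: <number>)", where <label> is one of
# the spellings named in the commit message ("id", "node id", "node ID",
# "node_id") or the new standard "ID".
_NODE_ID = re.compile(r'\((?:node[ _])?id: (\d+)\)', re.IGNORECASE)


def normalize_node_id(line: str) -> str:
    """Rewrite any of the older node ID spellings to the standard "(ID: n)"."""
    return _NODE_ID.sub(r'(ID: \1)', line)


if __name__ == "__main__":
    old = 'node "node3" (node ID: 3) has disconnected'
    print(normalize_node_id(old))
```

Lines already in the new format pass through unchanged, so the helper can be applied to mixed logs.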
@@ -98,7 +98,7 @@
 describing <application>repmgrd</application>'s current state, e.g.:
 </para>
 <programlisting>
-[2018-07-12 00:47:32] [INFO] monitoring connection to upstream node "node1" (node ID: 1)</programlisting>
+[2018-07-12 00:47:32] [INFO] monitoring connection to upstream node "node1" (ID: 1)</programlisting>
 </listitem>
 </varlistentry>
@@ -288,13 +288,13 @@
 <application>repmgrd</application> on the primary has started up, while two child
 nodes are being provisioned:
 <programlisting>
-[2019-04-24 15:25:33] [INFO] monitoring primary node "node1" (node ID: 1) in normal state
-[2019-04-24 15:25:35] [NOTICE] new node "node2" (node ID: 2) has connected
+[2019-04-24 15:25:33] [INFO] monitoring primary node "node1" (ID: 1) in normal state
+[2019-04-24 15:25:35] [NOTICE] new node "node2" (ID: 2) has connected
 [2019-04-24 15:25:35] [NOTICE] 1 (of 1) child nodes are connected, but at least 2 child nodes required
 [2019-04-24 15:25:35] [INFO] no child nodes have detached since repmgrd startup
 (...)
-[2019-04-24 15:25:44] [NOTICE] new node "node3" (node ID: 3) has connected
-[2019-04-24 15:25:46] [INFO] monitoring primary node "node1" (node ID: 1) in normal state
+[2019-04-24 15:25:44] [NOTICE] new node "node3" (ID: 3) has connected
+[2019-04-24 15:25:46] [INFO] monitoring primary node "node1" (ID: 1) in normal state
 (...)</programlisting>
 </para>
 <para>
@@ -302,9 +302,9 @@
 is now waiting <varname>child_nodes_disconnect_timeout</varname> seconds
 before executing <varname>child_nodes_disconnect_command</varname>:
 <programlisting>
-[2019-04-24 15:28:11] [INFO] monitoring primary node "node1" (node ID: 1) in normal state
-[2019-04-24 15:28:17] [INFO] monitoring primary node "node1" (node ID: 1) in normal state
-[2019-04-24 15:28:19] [NOTICE] node "node3" (node ID: 3) has disconnected
+[2019-04-24 15:28:11] [INFO] monitoring primary node "node1" (ID: 1) in normal state
+[2019-04-24 15:28:17] [INFO] monitoring primary node "node1" (ID: 1) in normal state
+[2019-04-24 15:28:19] [NOTICE] node "node3" (ID: 3) has disconnected
 [2019-04-24 15:28:19] [NOTICE] 1 (of 2) child nodes are connected, but at least 2 child nodes required
 [2019-04-24 15:28:19] [INFO] most recently detached child node was 3 (ca. 0 seconds ago), not triggering "child_nodes_disconnect_command"
 [2019-04-24 15:28:19] [DETAIL] "child_nodes_disconnect_timeout" set To 30 seconds
@@ -552,7 +552,7 @@
 $ repmgr cluster event --event=child_node_disconnect
 Node ID | Name | Event | OK | Timestamp | Details
 ---------+-------+-----------------------+----+---------------------+--------------------------------------------
-1 | node1 | child_node_disconnect | t | 2019-04-24 12:41:36 | node "node3" (node ID: 3) has disconnected</programlisting>
+1 | node1 | child_node_disconnect | t | 2019-04-24 12:41:36 | node "node3" (ID: 3) has disconnected</programlisting>
 </para>
 </listitem>
 </varlistentry>
@@ -574,7 +574,7 @@ $ repmgr cluster event --event=child_node_disconnect
 $ repmgr cluster event --event=child_node_reconnect
 Node ID | Name | Event | OK | Timestamp | Details
 ---------+-------+----------------------+----+---------------------+------------------------------------------------------------
-1 | node1 | child_node_reconnect | t | 2019-04-24 12:42:19 | node "node3" (node ID: 3) has reconnected after 42 seconds</programlisting>
+1 | node1 | child_node_reconnect | t | 2019-04-24 12:42:19 | node "node3" (ID: 3) has reconnected after 42 seconds</programlisting>
 </para>
 </listitem>
 </varlistentry>
@@ -597,7 +597,7 @@ $ repmgr cluster event --event=child_node_reconnect
 $ repmgr cluster event --event=child_node_new_connect
 Node ID | Name | Event | OK | Timestamp | Details
 ---------+-------+------------------------+----+---------------------+---------------------------------------------
-1 | node1 | child_node_new_connect | t | 2019-04-24 12:41:30 | new node "node3" (node ID: 3) has connected</programlisting>
+1 | node1 | child_node_new_connect | t | 2019-04-24 12:41:30 | new node "node3" (ID: 3) has connected</programlisting>
 </para>
 </listitem>
 </varlistentry>
@@ -725,7 +725,7 @@ Node ID: 2
 [2019-03-13 21:01:30] [NOTICE] failover validation command returned a non-zero value: "1"
 [2019-03-13 21:01:30] [NOTICE] promotion candidate election will be rerun
 [2019-03-13 21:01:30] [INFO] 1 followers to notify
-[2019-03-13 21:01:30] [NOTICE] notifying node "node3" (node ID: 3) to rerun promotion candidate selection
+[2019-03-13 21:01:30] [NOTICE] notifying node "node3" (ID: 3) to rerun promotion candidate selection
 INFO: node 3 received notification to rerun promotion candidate election
 [2019-03-13 21:01:30] [NOTICE] rerunning election after 15 seconds ("election_rerun_interval")</programlisting>
 </para>
@@ -117,7 +117,7 @@ NOTICE: node 3 (node3) paused</programlisting>
 If the primary (in this example, <literal>node1</literal>) is stopped, <application>repmgrd</application>
 running on one of the standbys (here: <literal>node2</literal>) will react like this:
 <programlisting>
-[2018-09-20 12:22:21] [WARNING] unable to connect to upstream node "node1" (node ID: 1)
+[2018-09-20 12:22:21] [WARNING] unable to connect to upstream node "node1" (ID: 1)
 [2018-09-20 12:22:21] [INFO] checking state of node 1, 1 of 5 attempts
 [2018-09-20 12:22:21] [INFO] sleeping 1 seconds until next reconnection attempt
 ...
@@ -125,7 +125,7 @@ NOTICE: node 3 (node3) paused</programlisting>
 [2018-09-20 12:22:25] [INFO] checking state of node 1, 5 of 5 attempts
 [2018-09-20 12:22:25] [WARNING] unable to reconnect to node 1 after 5 attempts
 [2018-09-20 12:22:25] [NOTICE] node is paused
-[2018-09-20 12:22:33] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in degraded state
+[2018-09-20 12:22:33] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in degraded state
 [2018-09-20 12:22:33] [DETAIL] repmgrd paused by administrator
 [2018-09-20 12:22:33] [HINT] execute "repmgr daemon unpause" to resume normal failover mode</programlisting>
 </para>
@@ -268,8 +268,8 @@ NOTICE: node 3 (node3) unpaused</programlisting>
 Example output in a situation where there is only one standby with <literal>failover=manual</literal>,
 and the primary node is unavailable (but is later restarted):
 <programlisting>
-[2017-08-29 10:59:19] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in normal state (automatic failover disabled)
-[2017-08-29 10:59:33] [WARNING] unable to connect to upstream node "node1" (node ID: 1)
+[2017-08-29 10:59:19] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in normal state (automatic failover disabled)
+[2017-08-29 10:59:33] [WARNING] unable to connect to upstream node "node1" (ID: 1)
 [2017-08-29 10:59:33] [INFO] checking state of node 1, 1 of 5 attempts
 [2017-08-29 10:59:33] [INFO] sleeping 1 seconds until next reconnection attempt
 (...)
@@ -278,10 +278,10 @@ NOTICE: node 3 (node3) unpaused</programlisting>
 [2017-08-29 10:59:37] [NOTICE] this node is not configured for automatic failover so will not be considered as promotion candidate
 [2017-08-29 10:59:37] [NOTICE] no other nodes are available as promotion candidate
 [2017-08-29 10:59:37] [HINT] use "repmgr standby promote" to manually promote this node
-[2017-08-29 10:59:37] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in degraded state (automatic failover disabled)
-[2017-08-29 10:59:53] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in degraded state (automatic failover disabled)
+[2017-08-29 10:59:37] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in degraded state (automatic failover disabled)
+[2017-08-29 10:59:53] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in degraded state (automatic failover disabled)
 [2017-08-29 11:00:45] [NOTICE] reconnected to upstream node 1 after 68 seconds, resuming monitoring
-[2017-08-29 11:00:57] [INFO] node "node2" (node ID: 2) monitoring upstream node "node1" (node ID: 1) in normal state (automatic failover disabled)</programlisting>
+[2017-08-29 11:00:57] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in normal state (automatic failover disabled)</programlisting>
 
 </para>
 <para>
@@ -104,17 +104,17 @@
 [2019-03-15 06:32:05] [INFO] connecting to database "host=node2 dbname=repmgr user=repmgr connect_timeout=2"
 INFO: set_repmgrd_pid(): provided pidfile is /var/run/repmgr/repmgrd-11.pid
 [2019-03-15 06:32:05] [NOTICE] starting monitoring of node "node2" (ID: 2)
-[2019-03-15 06:32:05] [INFO] monitoring connection to upstream node "node1" (node ID: 1)</programlisting>
+[2019-03-15 06:32:05] [INFO] monitoring connection to upstream node "node1" (ID: 1)</programlisting>
 </para>
 <para>
 Each <application>repmgrd</application> should also have recorded its successful startup as an event:
 <programlisting>
 $ repmgr -f /etc/repmgr.conf cluster event --event=repmgrd_start
 Node ID | Name | Event | OK | Timestamp | Details
----------+-------+---------------+----+---------------------+-------------------------------------------------------------
-3 | node3 | repmgrd_start | t | 2019-03-14 04:17:30 | monitoring connection to upstream node "node1" (node ID: 1)
-2 | node2 | repmgrd_start | t | 2019-03-14 04:11:47 | monitoring connection to upstream node "node1" (node ID: 1)
-1 | node1 | repmgrd_start | t | 2019-03-14 04:04:31 | monitoring cluster primary "node1" (node ID: 1)</programlisting>
+---------+-------+---------------+----+---------------------+--------------------------------------------------------
+3 | node3 | repmgrd_start | t | 2019-03-14 04:17:30 | monitoring connection to upstream node "node1" (ID: 1)
+2 | node2 | repmgrd_start | t | 2019-03-14 04:11:47 | monitoring connection to upstream node "node1" (ID: 1)
+1 | node1 | repmgrd_start | t | 2019-03-14 04:04:31 | monitoring cluster primary "node1" (ID: 1)</programlisting>
 </para>
 <para>
 Now stop the current primary server with e.g.:
@@ -128,7 +128,7 @@
 decision is made. This is an extract from the log of a standby server (<literal>node2</literal>)
 which has promoted to new primary after failure of the original primary (<literal>node1</literal>).
 <programlisting>
-[2019-03-15 06:37:50] [WARNING] unable to connect to upstream node "node1" (node ID: 1)
+[2019-03-15 06:37:50] [WARNING] unable to connect to upstream node "node1" (ID: 1)
 [2019-03-15 06:37:50] [INFO] checking state of node 1, 1 of 3 attempts
 [2019-03-15 06:37:50] [INFO] sleeping 5 seconds until next reconnection attempt
 [2019-03-15 06:37:55] [INFO] checking state of node 1, 2 of 3 attempts
@@ -151,10 +151,10 @@
 NOTICE: STANDBY PROMOTE successful
 DETAIL: server "node2" (ID: 2) was successfully promoted to primary
 [2019-03-15 06:38:01] [INFO] 3 followers to notify
-[2019-03-15 06:38:01] [NOTICE] notifying node "node3" (node ID: 3) to follow node 2
+[2019-03-15 06:38:01] [NOTICE] notifying node "node3" (ID: 3) to follow node 2
 INFO: node 3 received notification to follow node 2
 [2019-03-15 06:38:01] [INFO] switching to primary monitoring mode
-[2019-03-15 06:38:01] [NOTICE] monitoring cluster primary "node2" (node ID: 2)</programlisting>
+[2019-03-15 06:38:01] [NOTICE] monitoring cluster primary "node2" (ID: 2)</programlisting>
 </para>
 <para>
 The cluster status will now look like this, with the original primary (<literal>node1</literal>)
@@ -177,8 +177,8 @@
 Node ID | Name | Event | OK | Timestamp | Details
 ---------+-------+----------------------------+----+---------------------+-------------------------------------------------------------
 3 | node3 | repmgrd_failover_follow | t | 2019-03-15 06:38:03 | node 3 now following new upstream node 2
-3 | node3 | standby_follow | t | 2019-03-15 06:38:02 | standby attached to upstream node "node2" (node ID: 2)
-2 | node2 | repmgrd_reload | t | 2019-03-15 06:38:01 | monitoring cluster primary "node2" (node ID: 2)
+3 | node3 | standby_follow | t | 2019-03-15 06:38:02 | standby attached to upstream node "node2" (ID: 2)
+2 | node2 | repmgrd_reload | t | 2019-03-15 06:38:01 | monitoring cluster primary "node2" (ID: 2)
 2 | node2 | repmgrd_failover_promote | t | 2019-03-15 06:38:01 | node 2 promoted to primary; old primary 1 marked as failed
 2 | node2 | standby_promote | t | 2019-03-15 06:38:01 | server "node2" (ID: 2) was successfully promoted to primary</programlisting>
 </para>