Commit Graph

76 Commits

Author SHA1 Message Date
Ian Barwick
49ac9cf9ca Add "repmgr cluster show" 2017-07-19 17:36:21 +09:00
Ian Barwick
a7b7d86ecc repmgrd: handle manual failover mode correctly 2017-07-19 14:01:01 +09:00
Ian Barwick
23e6440dfd repmgrd: initiate primary monitoring when local node is promoted manually 2017-07-19 11:15:38 +09:00
Ian Barwick
6e270b2faf repmgrd: catch cases where more than one node has initiated voting
The node(s) with higher ID will "yield", leaving the decision making
up to the node with the lower ID.

This happens very rarely, usually when the random delay is close
enough on two or more nodes that vote initiation is simultaneous.
2017-07-18 17:04:24 +09:00
Ian Barwick
2c8dd49831 repmgrd: additional check to ensure only one node handles failover
It's possible the "failover" is completed by one repmgrd before the
other has a chance to react, in which case the am_bdr_failover_handler()
check will not apply. Instead check if the node record has already been
set to "inactive".
2017-07-17 16:47:42 +09:00
Ian Barwick
a56bb41891 Remove redundant fields from node record struct 2017-07-17 14:11:14 +09:00
Ian Barwick
ec554e5694 Improve connection handling
Set "connect_timeout" and "fallback_application_name" if not present.
2017-07-17 11:10:37 +09:00
Ian Barwick
951c7dbd07 repmgrd: in BDR mode, have each repmgrd monitor each node
This will cover both the case when an entire node including
repmgrd goes down, and when one PostgreSQL instance goes down
but repmgrd is still up (in which case only one of the repmgrds
will handle the failover).
2017-07-14 15:01:18 +09:00
Ian Barwick
e3b3fb65f0 repmgrd: restrict BDR monitoring to two node setup
It's not safe to have more than two nodes with this kind of
"failover", so we don't need to select alternative nodes by
priority.
2017-07-14 12:56:11 +09:00
Ian Barwick
d653888c65 Support pre-10 WAL functions 2017-07-14 10:40:11 +09:00
Ian Barwick
dfcf85a62f repmgrd: further BDR sanity checks 2017-07-14 10:27:28 +09:00
Ian Barwick
0320f409aa Detect BDR capability via presence of extension 2017-07-13 14:13:46 +09:00
Ian Barwick
7eadbf6b17 Various improvements to "repmgr bdr register/unregister" 2017-07-12 22:38:03 +09:00
Ian Barwick
0a1addfdc0 When registering a BDR node, sync repmgr.nodes from another node
If a BDR node is added via bdr_group_join(), repmgr.nodes will
start off empty, so we'll need to sync it ourselves before adding
it to the repmgr replication set.
2017-07-12 10:11:25 +09:00
Ian Barwick
1cccb1dd5a Add "repmgr bdr unregister" 2017-07-12 10:11:21 +09:00
Ian Barwick
71a0871232 Add "repmgr bdr register" 2017-07-11 15:38:58 +09:00
Ian Barwick
2962ffe605 repmgrd: initial BDR monitoring support 2017-07-10 23:58:59 +09:00
Ian Barwick
dddea9814b Add BDR-related database functions 2017-07-10 21:52:39 +09:00
Ian Barwick
5fbcf3e476 Remove witness server references 2017-07-10 09:31:31 +09:00
Ian Barwick
9e3d942917 Handle various (unlikely) failure states 2017-07-10 09:00:18 +09:00
Ian Barwick
5bf7098139 repmgrd: consolidate clear_node_info_list() calls 2017-07-09 11:10:49 +09:00
Ian Barwick
2787994a6e Make repmgrd failover settings configurable 2017-07-07 21:11:22 +09:00
Ian Barwick
0d226867b4 Add "location" column 2017-07-06 01:17:00 +09:00
Ian Barwick
614287548d Fix function get_primary_node_record() 2017-07-05 11:20:32 +09:00
Ian Barwick
617dee6bd6 Add function create_event_record()
For logging an event to the event table without generating an external
event notification.

Rename existing create_event_record*() functions to create_event_notification*()
as this describes their function better.
2017-07-05 09:52:22 +09:00
Ian Barwick
24c6b2c9f1 repmgrd: initial code for cascaded standby failover 2017-07-04 23:14:05 +09:00
Ian Barwick
618a2346e1 repmgrd: various fixes, mainly clearing status after a failover event 2017-07-04 11:55:03 +09:00
Ian Barwick
c12bf01b5a When clearing a node info list, reset the node count to 0 2017-07-03 21:59:02 +09:00
Ian Barwick
890b88d644 More failover fixes 2017-07-03 17:37:32 +09:00
Ian Barwick
debe5a18c5 have new primary communicate to standbys 2017-06-30 21:45:25 +09:00
Ian Barwick
fc4f276844 Improve handling
not sure if we need to store the electoral term...
2017-06-30 13:40:19 +09:00
Ian Barwick
3514e20367 poke it around until it works less badly 2017-06-29 09:35:09 +09:00
Ian Barwick
fa86fe4ad8 Basic voting 2017-06-29 01:11:21 +09:00
Ian Barwick
d6b6255144 interim commit 2017-06-28 18:20:03 +09:00
Ian Barwick
f4e8bf891d interim commit 2017-06-28 17:28:26 +09:00
Ian Barwick
ded8d95e5a interim commit 2017-06-28 16:38:41 +09:00
Ian Barwick
78a16d746d Initial primary node monitoring 2017-06-27 00:15:29 +09:00
Ian Barwick
46c956e61a Use "primary" instead of "master" 2017-06-23 21:33:54 +09:00
Ian Barwick
28808a02ab Fix return value of _get_node_record() 2017-06-23 20:44:40 +09:00
Ian Barwick
1b2652037d Rename enum types for consistency 2017-06-23 16:38:14 +09:00
Ian Barwick
dbaa2e0b44 Add a RecordStatus return type for functions which populate record structures
Unify a bunch of slightly different ways of handling the result.
2017-06-23 16:16:46 +09:00
Ian Barwick
6cdf73b4cb repmgr standby promote: suppress master database connection error message
Otherwise the first line of output is an ERROR, which is confusing,
even though it's expected.
2017-06-21 13:21:44 +09:00
Ian Barwick
94a88326ef repmgrd: further code ported 2017-06-20 09:17:29 +09:00
Ian Barwick
030fdc046b repmgr standby follow: main code 2017-06-16 21:38:53 +09:00
Ian Barwick
7b976ef2df repmgr standby follow: initial code 2017-06-16 00:05:18 +09:00
Ian Barwick
b440b5fcb8 Fix node record update query 2017-06-15 21:54:44 +09:00
Ian Barwick
36b3782009 Store the replication user in repmgr.nodes
When creating recovery.conf outside of "repmgr standby clone",
there was no way of knowing if a replication user had been
explicitly provided with --replication-user, meaning the value
of "primary_conninfo" would be set to the "conninfo" field of the
node's upstream node record.

We'll add an extra column to store the replication user for each
node so it can be referenced at any time.
2017-06-14 23:27:26 +09:00
Ian Barwick
e89c43c5cb Remove unused backup functions
Not needed since removal of rsync functionality
2017-06-13 00:35:01 +09:00
Ian Barwick
f26f1c0428 Minor code tweaks 2017-06-13 00:31:01 +09:00
Ian Barwick
cc1f0a02cd Add missing call to PQconninfoFree() 2017-06-13 00:22:41 +09:00
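
Note on commit 6e270b2faf: the tie-break it describes (when several nodes initiate voting at the same time, the node(s) with the higher ID yield to the node with the lowest ID) can be illustrated with a short sketch. This is not repmgrd's actual code, which is written in C. The function name resolve_vote_initiators and the use of plain integer node IDs are assumptions made purely for illustration.

```python
from typing import Iterable, List, Tuple


def resolve_vote_initiators(initiator_ids: Iterable[int]) -> Tuple[int, List[int]]:
    """Hypothetical helper: pick which vote initiator keeps the decision.

    Returns (deciding_node_id, yielding_node_ids). The lowest node ID
    decides; every other initiator yields, as the commit message describes.
    """
    ids = sorted(set(initiator_ids))  # ignore duplicate reports of the same node
    if not ids:
        raise ValueError("at least one node must have initiated voting")
    return ids[0], ids[1:]


# Nodes 3 and 2 both initiated voting at (nearly) the same moment.
decider, yielders = resolve_vote_initiators([3, 2])
print(f"node {decider} decides; yielding nodes: {yielders}")
```

A deterministic rule such as "lowest ID wins" lets each node reach the same conclusion independently, without a further round of communication.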