Commit Graph

228 Commits

Author SHA1 Message Date
Ian Barwick
172a3d90cf Terminate rather than destroy 2015-03-19 09:55:20 +09:00
Ian Barwick
7f98bb7aec Create event record for repmgrd termination
Also fix a few incorrect exit codes.
2015-03-17 19:08:59 +09:00
Ian Barwick
9e2736be4c Remove superfluous configuration check
Also add note about configuration parsing failure and event logging.
2015-03-17 18:41:17 +09:00
Ian Barwick
922dfd88e5 Add configuration option 'event_notification_command'
Command to be executed each time an event is logged.

The following format sequences will be interpolated:

      %e - event type
      %d - description
      %s - success (1 or 0)
      %t - timestamp
2015-03-16 13:41:13 +09:00
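
The interpolation described in the commit above can be sketched roughly as follows. This is an illustrative Python sketch, not repmgr's C implementation; the helper name `interpolate_event_command` and its arguments are hypothetical, and leaving unknown `%` sequences untouched is an assumption.

```python
def interpolate_event_command(template, event_type, description, success, timestamp):
    """Expand %e, %d, %s and %t in a command template (hypothetical helper)."""
    substitutions = {
        "e": event_type,
        "d": description,
        "s": "1" if success else "0",  # success is reported as 1 or 0
        "t": timestamp,
    }
    out = []
    i = 0
    # Single left-to-right pass, so a substituted value that itself
    # contains "%e" or similar is never expanded a second time.
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template) and template[i + 1] in substitutions:
            out.append(substitutions[template[i + 1]])
            i += 2
        else:
            # Assumption: unrecognised sequences are copied through unchanged.
            out.append(ch)
            i += 1
    return "".join(out)
```

The sketch performs no shell quoting; a real command runner would need to escape the description before passing it to a shell.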
Ian Barwick
0307c51d4b Add initial event logging code 2015-03-16 07:44:54 +09:00
Ian Barwick
96c8cd4148 Update code comments 2015-03-13 16:43:12 +09:00
Ian Barwick
619f95d85c Update code comments 2015-03-13 12:49:06 +09:00
Ian Barwick
97ae6dbf57 Remove superfluous configuration check
This is already done in parse_config()
2015-03-13 12:04:08 +09:00
Ian Barwick
36db199882 Retrieve node's active status too 2015-03-13 11:31:01 +09:00
Abhijit Menon-Sen
2c69119eff Don't say 'standbies', especially when we don't actually do anything 2015-03-12 17:49:27 +05:30
Ian Barwick
c833dd65f9 rempgr -> repmgr
The rampager strikes again.
2015-03-12 11:42:50 +09:00
Ian Barwick
65afc42afa Remove superfluous comment 2015-03-12 10:16:46 +09:00
Ian Barwick
bd96e0ca72 Remove various temporary debugging output, comments 2015-03-10 09:55:16 +09:00
Ian Barwick
606d0afabc primary -> master
For consistency.
2015-03-09 15:48:46 +09:00
Ian Barwick
4e6c250830 Remove experimental event logging code
Needs more bikeshedding.
2015-03-09 14:39:04 +09:00
Ian Barwick
abf92883a8 Clean up log output
No need to prefix each line with the program name; this was pretty
inconsistent anyway. The only place where log output needs to identify
the outputting program is when syslog is being used, which is done
anyway.
2015-03-09 12:00:05 +09:00
Ian Barwick
e603498f43 Parse config file before daemonizing
Daemonizing changes the current working directory to '/',
which breaks configuration file parsing if the file is in
the previous working directory and provided without an
explicit path.

Also it makes general sense to parse the configuration file
before daemonizing.
2015-03-09 08:21:54 +09:00
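
To illustrate why a relative configuration path breaks once the working directory becomes '/', here is a small Python sketch. It is not how repmgr handles this (the commit above simply parses the file earlier); `resolve_config_path` is a hypothetical helper and POSIX-style paths are assumed.

```python
import os


def resolve_config_path(config_path, cwd=None):
    """Return an absolute path for config_path, resolved against cwd.

    Hypothetical helper: cwd defaults to the current working directory.
    """
    base = cwd if cwd is not None else os.getcwd()
    # os.path.join discards `base` when config_path is already absolute.
    return os.path.normpath(os.path.join(base, config_path))


# The same relative name points at different files before and after
# a daemon changes its working directory to '/'.
before = resolve_config_path("repmgr.conf", "/etc/repmgr")
after = resolve_config_path("repmgr.conf", "/")
```

Resolving (or parsing) the file before the directory change avoids the mismatch between `before` and `after`.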
Ian Barwick
491309f4ba Write events of note to a log table
This makes keeping track of events such as failovers
much easier. Note that this is for convenience and is
not a foolproof auditing log.

Sample output:

repmgr_db=# SELECT * from repmgr_test.repl_events ;
 node_id |          event           | successful |        event_timestamp        |                         details
---------+--------------------------+------------+-------------------------------+----------------------------------------------------------
       1 | master_register          | t          | 2015-03-06 14:14:08.196636+09 |
       2 | standby_clone            | t          | 2015-03-06 14:14:17.660768+09 | Backup method: pg_basebackup; --force: N
       2 | standby_register         | t          | 2015-03-06 14:14:18.762222+09 |
       4 | witness_create           | t          | 2015-03-06 14:14:22.072815+09 |
       3 | standby_clone            | t          | 2015-03-06 14:14:23.524673+09 | Backup method: pg_basebackup; --force: N
       3 | standby_register         | t          | 2015-03-06 14:14:24.620161+09 |
       2 | repmgrd_start            | t          | 2015-03-06 14:14:29.639096+09 |
       3 | repmgrd_start            | t          | 2015-03-06 14:14:29.641489+09 |
       4 | repmgrd_start            | t          | 2015-03-06 14:14:29.648002+09 |
       2 | standby_promote          | t          | 2015-03-06 14:15:01.956737+09 | Node 2 was successfully be promoted to master
       2 | repmgrd_failover_promote | t          | 2015-03-06 14:15:01.964771+09 | Node 2 promoted to master; old master 1 marked as failed
       3 | repmgrd_failover_follow  | t          | 2015-03-06 14:15:07.228493+09 | Node 3 now following new upstream node 2
(12 rows)
2015-03-06 14:35:41 +09:00
Ian Barwick
04fe820aff Note where compatibility check for replication slots is carried out
Scanning the source code gives the impression there's no check.
2015-03-05 10:12:36 +09:00
Ian Barwick
46888de77f Improve configuration file handling
Put logic in config.c so it can be shared between repmgr and repmgrd.
2015-03-03 15:39:56 +09:00
Ian Barwick
3d3f082617 Ensure witness server updates its node records following a failover
This involves mainly abstracting the functions which copy
and create records from repmgr.c to dbutils.c, as they need
to be shared between repmgr and repmgrd.

Per issue noted here:

  https://groups.google.com/forum/#!topic/repmgr/v5nu1Xwf6X0
2015-03-03 08:57:20 +09:00
Ian Barwick
dd7193715c Gracefully fail when node has not been registered 2015-03-02 10:38:44 +09:00
Ian Barwick
1803a16c7e Detect changes to configuration file
This will prevent unnecessary reconnects to the upstream and
updates of the node record on the primary.
2015-02-10 12:35:19 +09:00
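
One way to detect configuration changes is to compare the previously parsed settings with the freshly parsed ones and act only when something differs. The Python sketch below assumes settings are held as plain dictionaries; `changed_settings` and the example keys are hypothetical, not taken from repmgr.

```python
def changed_settings(old, new):
    """Return {key: (old_value, new_value)} for every setting that differs."""
    keys = set(old) | set(new)
    return {
        key: (old.get(key), new.get(key))
        for key in keys
        if old.get(key) != new.get(key)
    }


old_conf = {"node": "2", "loglevel": "INFO"}
new_conf = {"node": "2", "loglevel": "DEBUG"}

diff = changed_settings(old_conf, new_conf)
# Reconnecting or updating the node record is only needed when diff is non-empty.
needs_reconnect = bool(diff)
```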
Ian Barwick
4f36b2c085 Probably needed. 2015-02-10 11:07:18 +09:00
Ian Barwick
19aba38327 Handle DB error when updating upstream node 2015-02-10 10:32:20 +09:00
Ian Barwick
49debcdf92 Add version check if replication slot usage requested
Replication slots require 9.4 or greater
2015-02-02 22:16:04 +09:00
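
A version check of this kind might look like the following Python sketch. It assumes the server version is available in the integer form PostgreSQL reports as `server_version_num` (90400 for 9.4.0); the function and parameter names are hypothetical and do not reflect repmgr's actual code.

```python
MIN_SLOT_VERSION_NUM = 90400  # PostgreSQL 9.4.0 in server_version_num form


def slots_supported(server_version_num):
    """Return True if the server is 9.4 or later."""
    return server_version_num >= MIN_SLOT_VERSION_NUM


def check_slot_config(use_replication_slots, server_version_num):
    """Raise ValueError if slots are requested on a server older than 9.4."""
    if use_replication_slots and not slots_supported(server_version_num):
        raise ValueError(
            "replication slots require PostgreSQL 9.4 or greater "
            "(server_version_num is %d)" % server_version_num
        )
```

The check is skipped entirely when slots are not requested, so older servers remain usable without them.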
Ian Barwick
7a760c32ff Store slot name in repl_nodes table 2015-02-02 17:57:15 +09:00
Ian Barwick
2ece014952 Initial support for physical replication slots
Todo:
 - if slots specified in repmgr.conf, verify server version
 - store generated slot name in `repl_nodes` table
2015-02-02 15:53:53 +09:00
Ian Barwick
5c67d47881 Add query result tests 2015-01-28 18:05:19 +09:00
Ian Barwick
f40b3ac48a Consolidate node type parsing 2015-01-28 11:57:34 +09:00
Ian Barwick
109269f7fb When writing monitoring info, ensure standby connects to current primary
If the node is a cascaded standby and the primary fails, `primary_conn`
will not be updated automatically; when writing monitoring info,
ensure we connect to the current primary.
2015-01-27 21:19:25 +09:00
Ian Barwick
23ef305afb After promotion or follow, update internal node metadata record 2015-01-27 16:25:58 +09:00
Ian Barwick
b552710767 Remove global variable my_local_mode
Information now contained in metadata
2015-01-27 13:53:36 +09:00
Ian Barwick
7e4c26b8a0 Clarify upstream role in log messages 2015-01-27 12:39:39 +09:00
Ian Barwick
f8639a7878 Add TODO note 2015-01-26 22:11:54 +09:00
Ian Barwick
f2309bd0a9 Remove redundant comments 2015-01-26 21:25:37 +09:00
Ian Barwick
061e72d7cd Rename connection variable for clarity
Connection will always be the primary
2015-01-26 20:12:34 +09:00
Ian Barwick
0a19bf1e23 Reword notice to make more sense in log output context 2015-01-24 05:03:57 +09:00
Ian Barwick
84a4766f13 Basic failover for cascaded standby nodes
Attempt to attach to the next available upstream node, otherwise
quit monitoring. We'll need to add further options for failover
scenarios, including attempting to attach to another node,
shutting down the server completely etc.
2015-01-24 04:22:40 +09:00
Ian Barwick
3be8bf8e4c Check primary connection
Verify that existing primary connection is valid, and if not
attempt to find and connect to the current primary node.
2015-01-22 12:20:23 +09:00
Ian Barwick
1e6f1a88b0 On local node failure, attempt to update record on primary server 2015-01-22 11:45:01 +09:00
Ian Barwick
4a8912c2b4 Update comments and debugging output 2015-01-22 10:59:33 +09:00
Ian Barwick
3279e9e47e Separate functions for primary and cascading standby failover 2015-01-22 10:12:03 +09:00
Ian Barwick
5c4e77f8e2 do_failover() -> do_primary_failover() 2015-01-16 17:28:51 +09:00
Ian Barwick
b09f987341 Add note 2015-01-16 16:59:44 +09:00
Ian Barwick
fe758eda9f Update repl_nodes following failover
repl_nodes table updated by each node following failover to
show that either it is the primary, or which primary it has
started to follow.
2015-01-16 16:28:09 +09:00
Ian Barwick
c413cff461 Function to update node records 2015-01-16 14:14:04 +09:00
Ian Barwick
609453a848 Handle failover of top-level standby
Cascaded standbys will not go into failover so we need to ignore
these when looking for candidates for promotion.
2015-01-16 12:35:01 +09:00
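
The candidate filtering described in the commit above could be sketched as follows. This Python sketch assumes each node record carries a type, an upstream node ID and an active flag; the `Node` class and `promotion_candidates` function are hypothetical illustrations rather than repmgr's data structures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    node_id: int
    node_type: str                   # "primary", "standby" or "witness"
    upstream_node_id: Optional[int]  # None for the primary
    active: bool


def promotion_candidates(nodes: List[Node], failed_primary_id: int) -> List[Node]:
    """Return active standbys attached directly to the failed primary.

    Cascaded standbys (whose upstream is another standby) are ignored.
    """
    return [
        n for n in nodes
        if n.node_type == "standby"
        and n.active
        and n.upstream_node_id == failed_primary_id
    ]


cluster = [
    Node(1, "primary", None, True),
    Node(2, "standby", 1, True),    # top-level standby: candidate
    Node(3, "standby", 2, True),    # cascaded standby: ignored
    Node(4, "witness", 1, True),    # witness: never promoted
    Node(5, "standby", 1, False),   # inactive: ignored
]
candidates = promotion_candidates(cluster, failed_primary_id=1)
```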
Ian Barwick
a82d37e48a Improve node metadata and upstream connecting mechanism
To handle cascaded replication we're going to have to keep track
of each node's upstream node. Also enumerate the node type
("primary", "standby" or "witness") and mark if active.
2015-01-16 10:28:02 +09:00
Ian Barwick
2ae27521a3 Some infrastructure for supporting cascading replication
Does not fully work yet.
2015-01-15 15:37:09 +09:00