Commit Graph

249 Commits

Author SHA1 Message Date
Martín Marqués
120be2db1c Fix bug which prevents repmgrd from starting when the cluster name has
upper case letters.
2015-10-06 20:54:28 -03:00
Ian Barwick
e115825cd6 Fix comment capitalization 2015-09-30 14:58:43 +09:00
Ian Barwick
c3bd02b83d Standardize if-statement formatting
"if(" -> "if ("
2015-09-24 17:45:08 +09:00
Ian Barwick
8e7d110a22 Check for existing master record before deleting it
Otherwise repmgr implies it's deleting a record which isn't actually
there.
2015-09-24 17:39:39 +09:00
Tomas Vondra
ef6b24551a call update_node_record_set_upstream() for STANDBY FOLLOW
repmgrd correctly updates ID of the upstream node after automatic
failover, but repmgr was not doing that for manual failvers.

This moves the existing function to dbutils and modifies it so that
it does not rely on global variables with configuration (available
just in repmgrd).

This should fix issue #67 (hopefully, haven't done much testing).
2015-09-23 12:32:47 +09:00
Ian Barwick
30fd111cba Rework config file handling
If no configuration file provided, also check default Postgres
sysconfig dir.

It would also be useful to check the configuration directory
provided by the RPM/DEB packages, not sure if that's programmatically
feasible.
2015-09-21 15:55:29 +09:00
Ian Barwick
65e63b062e Generally tidy up help output 2015-09-21 11:49:06 +09:00
Ian Barwick
053f672caa Treat -?/--help and -V/--version as normal options
Currently repmgr/repmgrd will only accept these as valid when
provided as the first command line option, however it's possible
a user will want to get the output of those options by adding
them to the end of a previously inputted command.

Note that after the first of these options is encountered, the
program will terminate and not process any other options. This
is consistent with psql's behaviour

Per GitHub issue #107 from Sébastien Gross.
2015-09-21 09:53:51 +09:00
Ian Barwick
7345ddcf00 Whitespace tweak 2015-09-10 14:27:21 +09:00
Gianni Ciolli
462d446477 Bug #90 fix (autofailover with reconnect_attemps > 1).
The main change is that now check_connection requires a conninfo
parameter, and the connection object has type (PGconn **) so it can be
replaced by check_connection if needed.

The bug was caused by the fact that the first failure resulted in
*conn == NULL, so that subsequent checks of the upstream connection
were failing irrespectively of the actual state of the upstream node.

Now, when *conn == NULL, check_connection will use conninfo to
establish a new connection and place it into *conn. We introduce a new
INTERNAL_ERROR code for the case when they are both NULL.

In passing, we also reworded a confusing error message, distinguishing
a timeout from the actual elapsed time.
2015-08-10 20:58:43 +02:00
Ian Barwick
1e5792f8df Remove unused function 2015-04-14 14:29:47 +09:00
Ian Barwick
a01fefa7d0 After standby promotion, ensure metadata is updated by repmgr
Previously this was handled by repmgrd but if a standby is promoted
directly this will leave the metadata in an incorrect state.
2015-04-14 13:39:48 +09:00
Ian Barwick
07d220cb00 Correct monitoring table column names
It would be more consistent to change the "primary" to "master"
but that would make the table incompatible with the v2.0 table.
2015-03-31 18:14:32 +09:00
Ian Barwick
4dfeffe087 Add constant NODE_NOT_FOUND
Which is what the magic number means in those contexts.
2015-03-31 14:35:16 +09:00
Ian Barwick
18544c82ca Prevent rempgrd from looping infinitely if node was not registered 2015-03-31 14:25:08 +09:00
Ian Barwick
0f86bdcd05 Fixes for event logging
We can't always assume a valid connection to the master
2015-03-31 14:15:29 +09:00
Ian Barwick
3e621f43d1 Use 100 as the default priority; 0 or less means node will never be promoted 2015-03-26 10:38:20 +09:00
Ian Barwick
15a531fed8 Update function description 2015-03-24 19:36:31 +09:00
Ian Barwick
8de0deddf9 Change "primary" to "master"
Personally I prefer "primary", but repmgr uses "master" so let's consolidate
on one version of the terminology for clarity.
2015-03-24 14:06:39 +09:00
Ian Barwick
bd19a2c868 Improve handling of event logging in rempgrd
Provide the master connection if available, and if not enable
create_event_record() to skip trying to write to the database,
but execute the notification program if defined.
2015-03-24 13:40:39 +09:00
Ian Barwick
2cadb3424d Don't try and log events when no master connection available 2015-03-24 12:48:02 +09:00
Ian Barwick
172a3d90cf Terminate rather than destroy 2015-03-19 09:55:20 +09:00
Ian Barwick
7f98bb7aec Create event record for rempgrd termination
Also fix a few incorrect exit codes.
2015-03-17 19:08:59 +09:00
Ian Barwick
9e2736be4c Remove superfluous configuration check
Also add note about configuration parsing failure and event logging.
2015-03-17 18:41:17 +09:00
Ian Barwick
922dfd88e5 Add configuration option 'event_notification_command'
Command to be executed each time an event is logged.

Following formatting sequences will be interpolated:

      %e - event type
      %d - description
      %s - success (1 or 0)
      %t - timestamp
2015-03-16 13:41:13 +09:00
Ian Barwick
0307c51d4b Add initial event logging code 2015-03-16 07:44:54 +09:00
Ian Barwick
96c8cd4148 Update code comments 2015-03-13 16:43:12 +09:00
Ian Barwick
619f95d85c Update code comments 2015-03-13 12:49:06 +09:00
Ian Barwick
97ae6dbf57 Remove superfluous configuration check
This is already done in parse_config()
2015-03-13 12:04:08 +09:00
Ian Barwick
36db199882 Retrieve node's active status too 2015-03-13 11:31:01 +09:00
Abhijit Menon-Sen
2c69119eff Don't say 'standbies', especially when we don't actually do anything 2015-03-12 17:49:27 +05:30
Ian Barwick
c833dd65f9 rempgr -> repmgr
The rampager strikes again.
2015-03-12 11:42:50 +09:00
Ian Barwick
65afc42afa Remove superfluous comment 2015-03-12 10:16:46 +09:00
Ian Barwick
bd96e0ca72 Remove various temporary debugging output, comments 2015-03-10 09:55:16 +09:00
Ian Barwick
606d0afabc primary -> master
For consistency.
2015-03-09 15:48:46 +09:00
Ian Barwick
4e6c250830 Remove experimental event logging code
Needs more bikeshedding.
2015-03-09 14:39:04 +09:00
Ian Barwick
abf92883a8 Clean up log output
No need to prefix each line with the program name; this was pretty
inconsistent anyway. The only place where log output needs to identify
the outputting program is when syslog is being used, which is done
anyway.
2015-03-09 12:00:05 +09:00
Ian Barwick
e603498f43 Parse config file before daemonizing
Daemonizing changes the current working directory to '/',
which breaks configuration file parsing if the file is in
the previous working directory and provided without an
explicit path.

Also it makes general sense to parse the configuration file
before daemonizing.
2015-03-09 08:21:54 +09:00
Ian Barwick
491309f4ba Write events of note to a log table
This makes keeping track of events such as failovers
much easier. Note that this is for convenience and is
not a foolproof auditing log.

Sample output:

repmgr_db=# SELECT * from repmgr_test.repl_events ;
 node_id |          event           | successful |        event_timestamp        |                         details
---------+--------------------------+------------+-------------------------------+----------------------------------------------------------
       1 | master_register          | t          | 2015-03-06 14:14:08.196636+09 |
       2 | standby_clone            | t          | 2015-03-06 14:14:17.660768+09 | Backup method: pg_basebackup; --force: N
       2 | standby_register         | t          | 2015-03-06 14:14:18.762222+09 |
       4 | witness_create           | t          | 2015-03-06 14:14:22.072815+09 |
       3 | standby_clone            | t          | 2015-03-06 14:14:23.524673+09 | Backup method: pg_basebackup; --force: N
       3 | standby_register         | t          | 2015-03-06 14:14:24.620161+09 |
       2 | repmgrd_start            | t          | 2015-03-06 14:14:29.639096+09 |
       3 | repmgrd_start            | t          | 2015-03-06 14:14:29.641489+09 |
       4 | repmgrd_start            | t          | 2015-03-06 14:14:29.648002+09 |
       2 | standby_promote          | t          | 2015-03-06 14:15:01.956737+09 | Node 2 was successfully be promoted to master
       2 | repmgrd_failover_promote | t          | 2015-03-06 14:15:01.964771+09 | Node 2 promoted to master; old master 1 marked as failed
       3 | repmgrd_failover_follow  | t          | 2015-03-06 14:15:07.228493+09 | Node 3 now following new upstream node 2
(12 rows)
2015-03-06 14:35:41 +09:00
Ian Barwick
04fe820aff Note where compatibility check for replication slots is carried out
Scanning the source code gives the impression there's no check.
2015-03-05 10:12:36 +09:00
Ian Barwick
46888de77f Improve configuration file handling
Put logic in config.c so it can be shared between repmgr and repmgrd.
2015-03-03 15:39:56 +09:00
Ian Barwick
3d3f082617 Ensure witness server updates its node records following a failover
This involves mainly abstracting the functions which copy
and create records from repmgr.c to dbutils.c, as they need
to be shared between repmgr and repmgrd.

Per issue noted here:

  https://groups.google.com/forum/#!topic/repmgr/v5nu1Xwf6X0
2015-03-03 08:57:20 +09:00
Ian Barwick
dd7193715c Gracefully fail when node has not been registered 2015-03-02 10:38:44 +09:00
Ian Barwick
1803a16c7e Detect changes to configuration file
This will prevent unnecessary reconnects to the upstream and
updates of the node record on the primary.
2015-02-10 12:35:19 +09:00
Ian Barwick
4f36b2c085 Probably needed. 2015-02-10 11:07:18 +09:00
Ian Barwick
19aba38327 Handle DB error when updating upstream node 2015-02-10 10:32:20 +09:00
Ian Barwick
49debcdf92 Add version check if replication slot usage requested
Replication slots require 9.4 or greater
2015-02-02 22:16:04 +09:00
Ian Barwick
7a760c32ff Store slot name in repl_nodes table 2015-02-02 17:57:15 +09:00
Ian Barwick
2ece014952 Initial support for physical replication slots
Todo:
 - if slots specified in repmgr.conf, verify server version
 - store generated slot name in `repl_nodes` table
2015-02-02 15:53:53 +09:00
Ian Barwick
5c67d47881 Add query result tests 2015-01-28 18:05:19 +09:00