Commit Graph

297 Commits

Author SHA1 Message Date
Ian Barwick
a0bad5fdc0 General code cleanup 2017-08-16 23:09:02 +09:00
Ian Barwick
0ac16f7630 Add more --help output 2017-08-16 17:49:46 +09:00
Ian Barwick
ae30f41de6 Clean up various unhandled memory allocations 2017-08-16 17:09:13 +09:00
Ian Barwick
8ff545f9ae Add --help output for "repmgr cluster" 2017-08-16 16:33:07 +09:00
Ian Barwick
4efc8fb9ce Add placeholder functions for "repmgr $command --help"
There are now too many options to sensibly fit into general --help
output; we'll add separate output for each repmgr command, e.g.
"repmgr node --help".
2017-08-16 13:24:14 +09:00
Ian Barwick
00f983dc15 Update README 2017-08-16 13:23:46 +09:00
Ian Barwick
4c0d719cdb Add replication slot check to "repmgr node check" 2017-08-16 11:17:02 +09:00
Ian Barwick
3e9ce6fe38 Fix "repmgr node check --role" output 2017-08-15 22:07:07 +09:00
Ian Barwick
554673e83e Add "repmgr node check --downstream" 2017-08-15 15:50:46 +09:00
Ian Barwick
10ef30096c "node check": add server role check 2017-08-14 22:57:09 +09:00
Ian Barwick
fa7d60cd51 "node check": initial general output 2017-08-14 17:32:44 +09:00
Ian Barwick
3b2158edbf Initialise variables, where appropriate 2017-08-14 15:11:42 +09:00
Ian Barwick
4260fdf1e7 More code cleanup 2017-08-14 12:19:57 +09:00
Ian Barwick
eabd56f3be "standby follow": check node system identifiers match 2017-08-14 11:45:08 +09:00
Ian Barwick
0f31756733 General code cleanup 2017-08-14 10:04:53 +09:00
Ian Barwick
b95b3e50e3 Return system identification information with appropriate data types 2017-08-14 08:50:54 +09:00
Ian Barwick
50b82f785e Add function to execute "IDENTIFY_SYSTEM" 2017-08-11 22:01:02 +09:00
Ian Barwick
f972aec198 Parse recovery.conf file
This will be useful for various kinds of diagnostics.
2017-08-10 23:58:16 +09:00
Ian Barwick
1292e8991a Improve "standby switchover" --dry-run output 2017-08-10 22:43:05 +09:00
Ian Barwick
8a50a72dc5 Additional "node status" output 2017-08-10 17:18:08 +09:00
Ian Barwick
4f2161bd83 Cleanup various #defines 2017-08-10 15:11:53 +09:00
Ian Barwick
970ed5d959 Bump minimum supported version to 9.5
We assume availability of pg_rewind.
2017-08-10 15:07:32 +09:00
Ian Barwick
cc52227d61 Miscellaneous cleanup 2017-08-10 15:05:01 +09:00
Ian Barwick
7ca68b7cc8 Standardize "primary_conninfo" generation
Previously repmgr would write all the default libpq parameters
into "primary_conninfo" on "standby clone", but not for
"standby follow", which is inconsistent.

For repmgr4 we'll determine that the upstream node's conninfo
must be canonical and contain all required connection parameters,
even if these are available as defaults or environment variables
in the local environment, as those are transient and may not
be available in all environments/situations.

recovery.conf's "primary_conninfo" will be generated using the
upstream's conninfo parameters, except for those specific
to the downstream node. These are:

  - "application_name": this will always be set to the
      "node_name"  of the downstream node
  - "passfile" and "servicefile": these, must of course
    reference files on the downstream node so will be extracted
    from the downstream node's conninfo, if set
2017-08-10 12:37:50 +09:00
Ian Barwick
1cb0adfdcb Finalize switchover process 2017-08-10 09:34:48 +09:00
Ian Barwick
5fb86771b1 Use stored node configuration file path when executing remote commands
Makes life much easier.
2017-08-10 09:12:07 +09:00
Ian Barwick
1d99a07b43 Store configuration file in repmgr.nodes table
When executing repmgr on remote nodes, we otherwise end up jumping
through hoops as we can't make assumptions about where the configuration
file is located, but really need to be able to provide it.

From a support point of view it will also make life easier as it will
be easy to specify exactly which file to provide.
2017-08-10 08:03:24 +09:00
Ian Barwick
a57fb5b50c After switchover, enable sibling standbys to follow new primary 2017-08-10 00:06:16 +09:00
Ian Barwick
4930c95ef7 Consolidate final output of "standby follow" / "node rejoin" 2017-08-09 19:31:42 +09:00
Ian Barwick
bae82318f1 No need to expose configuration file archive functions as repmgr commands 2017-08-09 13:32:15 +09:00
Ian Barwick
df425a38b7 Refactor "standby follow" functionality
"standby follow" was originally co-opted to start up a demoted node;
this functionality is now delegated to "node rejoin", with the core
functionality of "standby follow" implemented as an internal function.
2017-08-09 13:26:27 +09:00
Ian Barwick
b1e544f962 Enable use of pg_rewind during switchover operations
But only if required and --force-rewind required, and pg_rewind
can actually be used.
2017-08-09 12:09:37 +09:00
Ian Barwick
2553839630 Split actual promote functionality of do_standby_promote() into seperate function
No need to do all the sanity checks performed by "repmgr standby promote"
when promoting the standby during a switchover operation.
2017-08-08 10:45:56 +09:00
Ian Barwick
f2cf46bba3 Check replication lag before attempting switchover 2017-08-08 10:16:47 +09:00
Ian Barwick
fd5dfa2ebc Document "archiver_lag_*" configuration settings. 2017-08-08 00:50:12 +09:00
Ian Barwick
2499b42ef8 switchover: check for pending archive files on the demotion candidate
If the current primary (demotion candidate) still has any files to archive,
it will delay the shutdown until all files are archived. If there is a
substantial number of files, and/or the archive command executes slowly,
this will probably lead to an unwelcome delay in the switchover process.
2017-08-08 00:37:20 +09:00
Ian Barwick
068ecc963d Minor log output fix 2017-08-04 23:58:15 +09:00
Ian Barwick
20eeeef884 don't try and drop non-existent slot after switchover 2017-08-04 14:20:38 +09:00
Ian Barwick
972f8394ff Fix slot deletion after switchover 2017-08-04 13:16:46 +09:00
Ian Barwick
82639b6903 Refactor slot name handling
Better to work with the slot name in a node record, rather than
creating a global variable.
2017-08-04 11:56:11 +09:00
Ian Barwick
2c682b31c2 Attempt to delete replication slot on old primary after switchover 2017-08-04 11:55:54 +09:00
Ian Barwick
c34f5c1ed1 Initial switchover code 2017-08-04 09:39:30 +09:00
Ian Barwick
5948cf6cda repmgr standby switchover: add sanity check for pg_rewind useability
pg_rewind will only be executed on a demoted primary if explictly
requested, to prevent transactions on the primary, which
were never replicated, from being automatically overwritten.

If --force-rewind is provided, we'll need to check pg_rewind
is actually useable before we need to use it.
2017-08-04 00:45:55 +09:00
Ian Barwick
0815accdef Formatting fix 2017-08-03 23:58:25 +09:00
Ian Barwick
7d77fd4072 Log successful switchover event 2017-08-03 17:02:30 +09:00
Ian Barwick
112ca6321a Initial switchover implementation
The repmgr3 implementation required the promotion candidate (standby)
to directly work with the demotion candidate's data directory,
directly execute server control commands etc.

Here we delegated a lot more of that work to the repmgr on the
demotion candidate, which reduces the amount of back-and-forth
over SSH and generally makes things cleaner and smoother.

In particular the repmgr on the demotion candidate will carry
out a thorough check that the node is shut down and report
the last checkpoint LSN to the promotion candidate; this
can then be used to determine whether pg_rewind needs to be
executed on the demoted primary before reintegrating it back
into the cluster (todo).

Also implement "--dry-run" for this action, which will sanity-check the
nodes as far as possible without executing the switchover.

Additionally some of the new repmgr node commands (or command options)
introduced for this can be also executed by the user to obtain
additional information about the status of each node.
2017-08-03 16:38:37 +09:00
Ian Barwick
c67aa15581 Make "pgdata" a mandatory configuration file setting
There are some circumstances, e.g. during switchover operations,
where repmgr may need to operate on a data directory while the
server isn't running, in which case there's no way to retrieve
that information.
2017-08-02 23:04:24 +09:00
Ian Barwick
83cda89362 Get data directory for server commands if needed
Also add configuration file option "pgdata" for hard-coding the
node's data directory - if the "repmgr" DB user isn't a superuser
or doesn't have permission to extract the data directory, we'll
need another way of finding out.
2017-08-02 13:16:16 +09:00
Ian Barwick
791640e3b4 repmgrd: never execute "service_promote_command" directly 2017-08-02 12:09:25 +09:00
Ian Barwick
aa528dfdfb Consolidate generation of various server control commands
This is needed for better switchover control, so we can instruct
the remote repmgr to issue the appropriate server command rather
than trying to work out what it should be from the local node.
2017-08-02 12:01:20 +09:00