Commit Graph

162 Commits

Author SHA1 Message Date
Ian Barwick
50894b6124 "standby follow": check for replication slot availability on target node 2018-02-02 15:01:23 +09:00
Ian Barwick
c54045bcd8 "standby follow": initial implementation of --dry-run option
GitHub #363.
2018-02-01 14:18:40 +09:00
Ian Barwick
c0a53471e1 "standby switchover": improve log messages and add new exit code
Previously, if an issue was encountered with the old primary, but the user
provided -F/--force to have repmgr promote the standby anyway, repmgr
would exit with the log message "STANDBY SWITCHOVER is complete"
and exit code 0 (SUCCESS).

To better report this partial completion, repmgr will now emit the message
"STANDBY SWITCHOVER has completed with issues" (and a HINT to check preceding
log messages) and new exit code 22 (ERR_SWITCHOVER_INCOMPLETE).
2018-01-31 10:25:15 +09:00
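
A calling script could tell the three outcomes apart by exit code. The following is a minimal sketch, assuming only what the commit message above states (exit code 0 for success, 22 for ERR_SWITCHOVER_INCOMPLETE); the function name and the status strings are hypothetical and not part of repmgr.

```python
# Exit codes as given in the commit message above. The wrapper itself
# (function name, returned strings) is a hypothetical illustration.
SUCCESS = 0
ERR_SWITCHOVER_INCOMPLETE = 22


def classify_switchover_exit(code: int) -> str:
    """Map a "repmgr standby switchover" exit code to a coarse status."""
    if code == SUCCESS:
        return "complete"
    if code == ERR_SWITCHOVER_INCOMPLETE:
        # The standby was promoted, but preceding log messages
        # should be checked for issues with the old primary.
        return "completed with issues"
    # Any other non-zero code is treated here as an outright failure.
    return "failed"
```

Treating 22 separately lets automation continue after a forced switchover while still flagging the old primary for manual review.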
Ian Barwick
2eec8b5d79 Have do_standby_follow_internal() not abort on error
Pass the error code back to the caller instead, mainly so
"repmgr node rejoin" can better report errors.
2018-01-30 16:53:04 +09:00
Ian Barwick
c11e92cf2a repmgr: improve switchover handling when "pg_ctl" used
If logging output was not explicitly redirected with "-l" in the pg_ctl
options, repmgr would hang waiting for pg_ctl output.

Note that we recommend using the OS-level service commands where
available.
2018-01-30 13:43:37 +09:00
Ian Barwick
f294d09034 "repmgr standby register": improve error output when standby not running
Add explicit HINT
2018-01-26 22:13:11 +09:00
Ian Barwick
3d6437c8f8 repmgr: assume node is actually shutting down if pingable and that's the reported status 2018-01-16 11:17:06 +09:00
Ian Barwick
ae7963dc64 repmgr: automatically create slot name if missing
It's possible that a node was registered with "use_replication_slots=false"
but that was later changed to "use_replication_slots=true". If the node
was not subsequently re-registered, the node record will contain an empty
slot name, which will cause any slot creation operation during
"standby follow" or "node rejoin" to fail.

To prevent this from happening, check for an empty slot name and automatically
set one before proceeding.

Addresses GitHub #343.
2018-01-11 11:13:41 +09:00
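
The check described above can be sketched roughly as follows. This is an illustration only, not repmgr's C implementation: the dictionary-based node record, the function name, and the "repmgr_slot_<node_id>" naming pattern are all assumptions made for the example.

```python
def ensure_slot_name(node_record: dict) -> dict:
    """Return a node record that is guaranteed to carry a slot name.

    Hypothetical sketch: the record layout and the generated name
    pattern are assumptions, not taken from the repmgr source.
    """
    slot_name = node_record.get("slot_name") or ""
    if slot_name.strip():
        # A slot name is already present; nothing to do.
        return node_record

    # Empty slot name (e.g. node registered before replication slots
    # were enabled): derive one from the node ID before proceeding.
    updated = dict(node_record)
    updated["slot_name"] = f"repmgr_slot_{node_record['node_id']}"
    return updated
```

Returning a copy rather than mutating the input keeps the original record available for comparison or logging.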
Ian Barwick
faffb2a6e7 repmgr: catch possible corner case when checking node shutdown status
It's conceivable that PQping is returning "no response" but the
shutdown hasn't quite completed.
2018-01-10 14:56:00 +09:00
Ian Barwick
5d57044118 repmgr: during switchover, correctly detect unclean shutdown status 2018-01-10 12:21:04 +09:00
Ian Barwick
07a88c78a5 repmgr standby switchover: add "%p" event notification parameter
This will contain the node ID of the former primary.
2018-01-10 11:01:00 +09:00
Ian Barwick
20920b3da1 repmgr standby switchover: add event details 2018-01-10 09:55:24 +09:00
Ian Barwick
0c62821ffb Consolidate parsing of output from executing repmgr on a remote server
This should also fix the issue reported in GitHub #349.
2018-01-09 13:33:38 +09:00
Ian Barwick
a608b0bc18 "repmgr standby register": add --wait-start option
Implements GitHub #356.
2018-01-04 12:48:12 +09:00
Ian Barwick
1521657965 Update copyright notices to 2018 2018-01-02 10:20:09 +09:00
Ian Barwick
3fa2bef6f4 repmgr: fix configuration file sanity check
The check was being carried out regardless of whether --copy-external-config-files
was specified, which meant cloning would fail if no SSH connection was available.

Addresses GitHub #342
2017-11-23 22:50:28 +09:00
Ian Barwick
faf297b07f remove spurious "/base" path element in Barman tablespace cloning code.
Addresses GitHub #339
2017-11-20 11:10:30 +09:00
Ian Barwick
0dae8c9f0b repmgr: don't add empty "passfile" parameter in recovery.conf 2017-11-20 10:28:16 +09:00
Ian Barwick
c907b7b33d repmgr: minor fix to "repmgr standby --help" output 2017-11-15 14:04:01 +09:00
Ian Barwick
31b856dd9f Add "witness register" functionality 2017-11-15 14:03:54 +09:00
Ian Barwick
d459b92186 Add configuration file "passfile"
This will enable a custom .pgpass to be included in "primary_conninfo"
(provided it's supported by the libpq version on the standby).
2017-11-15 14:03:37 +09:00
Ian Barwick
11d856a1ec "standby follow": get upstream record before server restart, if required
The standby may not always be available for connections right after it's
restarted, so attempting to connect and get the node's upstream record
after the restart may fail. Record is now retrieved before the restart.

Addresses GitHub #333.
2017-10-27 16:30:25 +09:00
Ian Barwick
4d8176ab60 Fix version number check 2017-10-04 09:35:47 +09:00
Ian Barwick
d6c27f8938 Standardize quoting in log messages 2017-10-04 09:34:59 +09:00
Ian Barwick
e64600d3e3 Prevent compiler truncation warnings 2017-09-19 16:15:47 +09:00
Ian Barwick
bb311892b5 Remove unused code 2017-09-19 14:47:49 +09:00
Ian Barwick
e3defc507e Merge branch 'pg93' 2017-09-18 15:55:32 +09:00
Ian Barwick
1197c11c59 "standby clone": skip tablespace mapping in PostgreSQL 9.3 2017-09-18 13:26:35 +09:00
Ian Barwick
30b11c08e6 Disable any configuration settings not compatible with PostgreSQL 9.3
And emit a warning while we're at it.
2017-09-18 13:12:38 +09:00
Ian Barwick
ea2693bc75 Move create_recovery_file() et al to repmgr-action-standby.c
As they're only ever called from there.
2017-09-18 09:53:08 +09:00
Ian Barwick
f5c9c74a75 Minor log output tweak 2017-09-14 08:57:16 +09:00
Ian Barwick
e040f95aaa "standby clone": fix replication slot generation
Slot on source node was being deleted even if source node is the intended
upstream node.
2017-09-14 08:47:56 +09:00
Ian Barwick
e583e2eb40 "standby switchover": fix error message 2017-09-13 11:30:29 +09:00
Ian Barwick
b6cd816923 Tidy up some log output 2017-09-12 11:08:41 +09:00
Ian Barwick
a9f4a027a7 pgindent run 2017-09-11 11:14:13 +09:00
Ian Barwick
3447257ae4 repmgrd: minor fixes and comment updates 2017-09-08 20:59:21 +09:00
Ian Barwick
e4f7dc8234 Add copyright notices 2017-09-08 13:27:39 +09:00
Ian Barwick
b6a27b975d "standby clone": improve handling when designated upstream doesn't yet exist
This situation can occur in provisioning environments, where a node's
upstream may not exist at the point it's cloned. If replication slots
are in use, we'll need to make sure no attempt is made to create
the replication slot on the designated upstream, as that will end in
tears. We assume the user will be prepared to complete this step manually.
2017-09-07 12:35:27 +09:00
Ian Barwick
3787dd3795 "standby switchover": better handling of remote execution failure 2017-09-07 11:50:56 +09:00
Ian Barwick
edee80cc37 Rename option "node check --is-shutdown" to "--is-shutdown-cleanly"
As that's what we really want to know. Also return "UNCLEAN_SHUTDOWN"
if that's the case, rather than "RUNNING" which is confusing, even
though it's a command for internal use.
2017-09-07 11:15:27 +09:00
Ian Barwick
79531ae9da "standby switchover": fix check for remote repmgr binary
Also add a useful hint about setting "pg_bindir".
2017-09-07 10:26:46 +09:00
Ian Barwick
ee7a5b6e66 Minor bug and log fixes 2017-09-06 17:28:31 +09:00
Ian Barwick
03f400617c "standby register": add --dry-run and check node is attached to upstream 2017-09-06 15:48:23 +09:00
Ian Barwick
66219c3097 README and "repmgr standby --help" updates 2017-09-06 14:13:06 +09:00
Ian Barwick
a28bbd68eb "standby clone": improve replication slots handling
Ensure replication slot is created on the upstream node and deleted from
the source node, if upstream node and source nodes differ.
2017-09-06 12:16:02 +09:00
Ian Barwick
bd07a34472 "standby clone": improve log messages
Make it clearer which nodes are being connected to, and why.
2017-09-06 10:15:52 +09:00
Ian Barwick
5b5b456ecb "standby switchover": improve logging
Also no need to disconnect/reconnect from/to local node while it promotes.
2017-09-05 10:26:27 +09:00
Ian Barwick
d82e936556 "standby promote": improve logging
Specifically state which server is being promoted; this is particularly
important when the promotion occurs as part of a series of other operations,
e.g. "standby switchover".

Also no need to disconnect/reconnect while the server is promoted.
2017-09-05 09:43:16 +09:00
Ian Barwick
78e6bdeebe Have repmgrd parse "standby follow --upstream-node-id=%n" 2017-09-04 13:42:50 +09:00
Ian Barwick
47a4b49890 Add "repmgr standby follow --upstream-node-id"
In an automatic failover situation, after a standby has been promoted
there's a risk the original primary may become available again before
"standby follow" is issued on another standby node, in which case "standby
follow" will reconnect to the original primary.

As the standby's repmgrd will have received a notification from the new
primary, it will know the primary's ID and can therefore explicitly
direct "standby follow" to follow that primary.
2017-09-04 09:11:59 +09:00
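
The "%n" placeholder mentioned in the two commits above can be pictured as a simple template substitution. The sketch below is a hypothetical illustration of that idea, not repmgrd's actual parsing code; the function name is invented for the example, and only the command string and the "%n" placeholder come from the commit messages.

```python
def expand_follow_command(template: str, upstream_node_id: int) -> str:
    """Replace the "%n" placeholder with the new primary's node ID.

    Hypothetical helper: repmgrd's real implementation is in C and
    may handle further placeholders and escaping.
    """
    return template.replace("%n", str(upstream_node_id))


# Template as quoted in the commit message above.
FOLLOW_TEMPLATE = "repmgr standby follow --upstream-node-id=%n"
```

With the node ID filled in explicitly, the standby follows the primary it was notified about, even if the original primary has become reachable again in the meantime.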