get_superuser_connection() was erroneously using the local node record
to connect to as a superuser, which works when registering the primary
but obviously not when cloning a standby.
Addresses GitHub #380.
This will generate "recovery.conf" for an existing standby.
Typical use-case is a standby cloned manually from an external data
source (e.g. Barman), where "recovery.conf" needs to be created
(and if required a replication slot).
The --dry-run option will check the pre-requisites but not actually
create "recovery.conf" or a replication slot.
This requires that the upstream node is running, a replication connection
can be made and if required a replication slot can be created.
Implements GitHub #382.
Check it's actually possible for the demotion candidate to attach to
the promotion candidate before executing the switchover.
As with other checks of this nature, there's a faint possibility the
situation could change between the time the check is carried out and
the demotion candidate is restarted to connect to the promotion candidate,
but there's not a lot we can do about that. The main purpose is to
be able to catch existing misconfigurations before anything gets changed.
Implements GitHub #370.
If logging output not explicitly rediretced with "-l" in the pg_ctl
options, repmgr would hang waiting for pg_ctl output.
Note that we recommend using the OS-level service commands where
available.
This was required for a specific use case during pre-release
development and is no longer needed now the physical streaming
replication handling is implemented.
It's possible that a node was registered with "use_replication_slots=false"
but that was later changed to "use_replication_slots=true". If the node
was not subsequently re-registered, the node record will contain an empty
slot name, which will cause any slot creation operation during
"standby follow" or "node rejoin" to fail.
To prevent this happening, check for an empty slot name and automatically
set before proceeding.
Addresses GitHub #343.
As that's what we really want to know. Also return "UNCLEAN_SHUTDOWN"
if that's the case, rather than "RUNNING" which is confusing, even
though it's a command for internal use.
In an automatic failover situation, after a standby has been promoted
there's a risk the original primary may become available again before
"standby follow" is issued on another standby node, in which case "standby
follow" will reconnect to the original primary.
As the standby's repmgrd will have received a notification from the new
primary, it will know the primary's ID and can therefore explicitly
direct "standby follow" to follow that primary.
It's possible some distribution packages may assign a different name to the
"repmgr" binary (typically appending a version number), so we shouldn't hard-code
into the command string.
However it's reasonable to assume the "repmgr" binary will have the same name
across a replication cluster so we won't engage in any contortions to account
for possible variations.
See: https://github.com/2ndQuadrant/repmgr/issues/323