By checking the PID file in the same way pg_ctl does, we can be pretty
much certain whether the target data directory contains an active
PostgreSQL instance.
"--upstream-node-id", if provided, was not being passed through to
the SQL query executed via the Barman server.
Also modified the query to select the primary node if "--upstream-node-id"
is not provided.
Note: this is a very niche use case.
Attempting to use the contents of pg_control to tell whether the directory
is in use by PostgreSQL can result in false positives; we should use
a check based on the pidfile.
Also change the HINT to indicate a data directory can be overwritten
if -F/--force is provided.
For events generated by these commands, it may be useful to know details
of the primary node. This makes following additional parameters available
to event notification scripts:
- %p: node ID of the primary
- %a: node name of the primary
- %c: conninfo string for the primary
Implements GitHub #375
Previously, if an issue was encountered with the old primary, but user
provided -F/--force to have repmgr promote the standby anyway, repmgr
would exit with the log message "STANDBY SWITCHOVER is complete"
and exit code 0 (SUCCESS).
To better report this partial completion, repmgr will now emit the message
"STANDBY SWITCHOVER has completed with issues" (and a HINT to check preceding
log messages) and new exit code 22 (ERR_SWITCHOVER_INCOMPLETE).
If logging output not explicitly rediretced with "-l" in the pg_ctl
options, repmgr would hang waiting for pg_ctl output.
Note that we recommend using the OS-level service commands where
available.
It's possible that a node was registered with "use_replication_slots=false"
but that was later changed to "use_replication_slots=true". If the node
was not subsequently re-registered, the node record will contain an empty
slot name, which will cause any slot creation operation during
"standby follow" or "node rejoin" to fail.
To prevent this happening, check for an empty slot name and automatically
set before proceeding.
Addresses GitHub #343.
The check was being carried out regardless of whether --copy-external-config-files
was specified, which means cloning will fail if no SSH connection is available.
Addresses GitHub #342
The standby may not always be available for connections right after it's
restarted, so attempting to connect and get the node's upstream record
after the restart may fail. Record is now retrieved before the restart.
Addresses GitHub #333.
This situation can occur in provisioning environments, where a node's
upstream may not exist at the point it's cloned. If replication slots
are in use, we'll need to make sure no attempt is made to create
the replication slot on the designated upstream, as that will end in
tears. We assume the user will be prepared to complete this step manually.
As that's what we really want to know. Also return "UNCLEAN_SHUTDOWN"
if that's the case, rather than "RUNNING" which is confusing, even
though it's a command for internal use.