If a PostgreSQL instance was shut down while repmgrd was running, and
repmgrd was subsequently restarted (this chain of events could occur
during e.g. a server reboot), the node record will have been set to
"inactive". Previously, in this case repmgrd would refuse to start up.
However, as we can determine the node is running, it should normally
be no problem to automatically set the node record to "active".
The old behaviour can be restored by setting the new parameter
"repmgrd_exit_on_inactive_node" to "true".
RM19604.
Previously, repmgr would forcibly change the permissions on a data
directory to 0700. However from PostgreSQL 11, 0750 is also valid,
so that value should not be changed.
If neither the local node nor the upstream are available, and
"standby_disconnect_on_failover" is set, attempting to fetch
the walreceiver PID will result in repmgrd terminating.
Add a check that the connection is valid before attempting to
fetch the walreceiver PID.
Addresses GitHub #675.
From PostgreSQL 13, pg_rewind will automatically handle an unclean
shutdown itself, so as long as --force-rewind was provided, so there
is no need to fail with an error.
Note that pg_rewind handles the unclean shutdown by starting PostgreSQL
in single user mode, which it does before performing any checks as
to whether a rewind is actually necessary.
However pg_rewind doesn't take into account the possible presence
of a standby.signal file, so we remove that and recreate it after
pg_rewind was executed.
By setting --waldir in "pg_basebackup_options", standbys cloned using
pg_basebackup would have their WAL directory set to the specified
location and symlinked from the data directory.
This commit causes repmgr to honour that setting even when cloning
from Barman.
As of PostgreSQL 13, changes to the fundamental replication
configuration can be applied with a simple SIGHUP, no restart
required.
In case the old behaviour is desired, i.e. a full restart to apply
the configuration changes, the new configuration parameter
"standby_follow_restart" can be set. This parameter has no effect
in PostgreSQL 12 and earlier.
In certain corner cases, it's possible repmgrd may end up monitoring
a standby which was a former primary, but the node record has not
yet been updated.
Previously repmgrd would abort the promotion with a cryptic message
about being unable to find a node record for node_id -1 (the
default value for an unknown node id).
This commit addes a new configuration option "always_promote", which
determines whether repmgrd should promote the node in this case.
The default is "false", to effectively maintain the existing behaviour.
Logging output has also been improved to make it clearer what has
happened when this situation occurs.
Commit 0574279 set the file permissions to 0600 rather than the user's
umask, but if initdb was executed with -g/--allow-group-access, the
file is maintained with 0640, so we'll just maintain the existing
permssions.
This is mainly useful for the --data-directory-config option, which
requires permission to read pg_settings to verify that the data
directory configured in "repmgr.conf" matches the data directory
actually in use.
If pg_settings read permission is not available, repmgr will fall
back to a simple check that the data directory configured in
"repmgr.conf" is a valid PostgreSQL directory. This is not entirely
foolproof, as it's possible PostgreSQL could be using a different
data directory.
From PostgreSQL 12, the SQL-level function "pg_promote()" can be used
to promote a PostgreSQL instance, however usage is restricted to
superusers and users to whom explicit execution permission for this
function has been granted.
Therefore, if execution permission is not available, fall back to
"pg_ctl promote".
There were some corner cases where "repmgr standby switchover"
would erroneously report a successful switchover, even if the
demotion candidate had not reattached to the promotion candidate.
Also improve the logging in various places to make it clearer
what is happening on which node.
A more generic option name to cover pre- and post-Pg12 replication
configuration methods.
--recovery-conf-only is retained as an alias for backwards
compatibility.
repmgrd has a check to see if the upstream node has unexpectedly
changed, e.g. if the repmgrd service is paused and the PostgreSQL
instance has been pointed to another node.
However this check was relying on the node record on the local node
being up-to-date, which may not be the case immediately after a
failover, when the node is still replaying records updated prior
to the node's own record being updated. In this case it will
mistakenly assume the node is following the original primary
and attempt to restart monitoring, which will fail as the original
primary is no longer available.
To prevent this, we check against the node's record on the upstream
node.
Addresses issue noted in GitHub #587 and #588.
This makes it possible to return log output when executing repmgr
remotely at a different level to the one defined in the remote
repmgr's repmgr.conf.
This is particularly useful when DEBUG output is required.