If, for whatever reason, repmgrd is not running on a node, but that
node qualifies as the promotion candidate, failover will not take place,
as that node will never promote itself.
We therefore discount nodes where repmgrd is not running as promotion
candidates, which will ensure one node is always promoted.
There is a slight risk here that the node(s) where repmgrd is not running
are further ahead, leading to a timeline fork. It might be possible
to mitigate that by having the "election" leader perform the promote
(or follow) operation.
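A minimal sketch of the resulting candidate filter, assuming a simple
array of candidate records with an illustrative "repmgrd_running" flag
(repmgrd's actual node list structures and field names differ):

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct
    {
        int  node_id;
        bool repmgrd_running;   /* illustrative flag */
    } candidate;

    /* return the first node where repmgrd is running, or -1 if none */
    static int
    pick_promotion_candidate(candidate *nodes, size_t n)
    {
        for (size_t i = 0; i < n; i++)
        {
            /*
             * a node without repmgrd will never promote itself,
             * so discount it as a promotion candidate
             */
            if (nodes[i].repmgrd_running)
                return nodes[i].node_id;
        }
        return -1;
    }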
If WAL replay is paused, and there is WAL pending replay, a promote command
will be queued until replay is resumed.
As it's conceivable that there are corner cases where one standby with
replay paused has actually received the most WAL, we'll forcibly
resume WAL replay so it can be reliably promoted, if needed.
Related to GitHub #540.
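The forced resume can be expressed with stock PostgreSQL functions via
libpq; a sketch, with error handling reduced to the essentials (in
PostgreSQL 9.6 and earlier the equivalent functions are
pg_is_xlog_replay_paused() and pg_xlog_replay_resume()):

    #include <string.h>
    #include <libpq-fe.h>

    /* if WAL replay is paused, resume it so pending WAL is applied */
    static void
    resume_wal_replay_if_paused(PGconn *conn)
    {
        PGresult *res = PQexec(conn, "SELECT pg_is_wal_replay_paused()");

        if (PQresultStatus(res) == PGRES_TUPLES_OK
            && strcmp(PQgetvalue(res, 0, 0), "t") == 0)
        {
            PQclear(res);
            res = PQexec(conn, "SELECT pg_wal_replay_resume()");
        }

        PQclear(res);
    }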
Specifically, if WAL replay is paused *and* WAL is pending replay,
this node cannot be promoted until WAL replay is unpaused. In this
state it is not a suitable promotion candidate in a failover situation.
If replay is paused, we can't rule out that more WAL will be received
between the check and the promote operation, which would risk the promote
operation not taking place during the switchover (it would happen only
once WAL replay is resumed and the pending WAL replayed).
Therefore we simply quit with an informative slew of messages and
leave the user to sort it out.
GitHub #540.
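The underlying condition boils down to a single query against the
standby; a hedged sketch (the function name is illustrative; a NULL
receive position, i.e. the walreceiver never started, yields false
here):

    #include <stdbool.h>
    #include <string.h>
    #include <libpq-fe.h>

    /*
     * true if WAL replay is paused *and* received WAL is still pending
     * replay -- in this state a promote request would be queued
     */
    static bool
    promote_would_be_queued(PGconn *conn)
    {
        bool      queued = false;
        PGresult *res = PQexec(conn,
            "SELECT pg_is_wal_replay_paused() "
            "   AND pg_last_wal_replay_lsn() < pg_last_wal_receive_lsn()");

        if (PQresultStatus(res) == PGRES_TUPLES_OK)
            queued = (strcmp(PQgetvalue(res, 0, 0), "t") == 0);

        PQclear(res);
        return queued;
    }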
If WAL replay is paused but WAL is still pending replay, PostgreSQL will ignore
the promote request until WAL replay is unpaused. This may lead to the standby
being promoted at an unpredictable point in time outside of repmgr's
control. Moreover it may not be obvious that this is happening, or why,
and an apparently successful promotion attempt will turn out not to
have worked.
To prevent this from happening, repmgr will now refuse to promote the
standby if WAL replay is paused *and* WAL is still pending replay.
GitHub #540.
Previously, if the witness server connection details were provided
to "repmgr witness register" rather than those of the primary server,
repmgr would a) write the node record to the witness server rather
than the primary, and b) loop indefinitely trying to copy the
node table to itself.
Addresses GitHub #538.
Eventually we'll want this to include the optional replication info
currently held in the t_node_info struct, which should then contain a
pointer to a ReplInfo struct.
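A rough sketch of the intended shape (field names here are illustrative,
not repmgr's actual definitions):

    typedef struct ReplInfo
    {
        /* replication details, e.g. timeline and LSN positions */
        int timeline_id;
    } ReplInfo;

    typedef struct t_node_info
    {
        int       node_id;
        char      node_name[64];
        ReplInfo *replinfo;       /* optional; NULL until populated */
    } t_node_info;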
Ensure repmgr checks that the standby (promotion candidate) is currently
attached to the primary (demotion candidate).
Addresses issue reported in GitHub #519.
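One way to express the attachment check is to look for the standby in
the primary's pg_stat_replication view; a sketch, assuming the standby's
application_name is known (repmgr's actual check may use different
criteria):

    #include <stdbool.h>
    #include <libpq-fe.h>

    /* does the standby appear in the primary's pg_stat_replication? */
    static bool
    standby_is_attached(PGconn *primary_conn, const char *application_name)
    {
        const char *params[1] = { application_name };
        bool        attached = false;
        PGresult   *res = PQexecParams(primary_conn,
            "SELECT 1 FROM pg_stat_replication WHERE application_name = $1",
            1, NULL, params, NULL, NULL, 0);

        if (PQresultStatus(res) == PGRES_TUPLES_OK)
            attached = (PQntuples(res) == 1);

        PQclear(res);
        return attached;
    }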
Immediately after the demotion candidate (primary) has shut down, we can't
be absolutely sure that the walreceiver has flushed all WAL to disk, so
checking pg_last_wal_receive_lsn() at that point might not reflect
the actual last available WAL location.
To handle this, we'll loop for a while (timeout controlled by configuration
parameter "wal_receive_check_timeout") before finally deciding whether
the standby is still behind the shut-down primary.
Addresses issue raised in GitHub #518.
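One plausible shape for that loop, polling once per second until the
standby's receive position reaches the expected location or the timeout
expires (a plain string comparison of the LSNs is used for brevity):

    #include <stdbool.h>
    #include <string.h>
    #include <unistd.h>
    #include <libpq-fe.h>

    /* wait until pg_last_wal_receive_lsn() reports the expected LSN */
    static bool
    wait_for_wal_receive(PGconn *conn, const char *expected_lsn,
                         int wal_receive_check_timeout)
    {
        for (int i = 0; i < wal_receive_check_timeout; i++)
        {
            bool      caught_up = false;
            PGresult *res = PQexec(conn,
                "SELECT pg_last_wal_receive_lsn()");

            if (PQresultStatus(res) == PGRES_TUPLES_OK)
                caught_up = (strcmp(PQgetvalue(res, 0, 0),
                                    expected_lsn) == 0);

            PQclear(res);

            if (caught_up)
                return true;

            sleep(1);
        }

        return false;   /* still behind the shut-down primary */
    }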
Previously the code would do nothing if an attempt was made to add a
parameter when the array was already full.
As the array is designed to contain all valid libpq connection parameters,
there's no reason it should ever "overflow" like this. If there is, then
it means the caller is attempting to add invalid values. Add an Assert()
so we can easily detect this in the unlikely event it ever occurs.
Noted after examining the issue raised in GitHub #533, which is nonsensical
as it implies we'd be OK with writing beyond the end of the array; however,
it doesn't hurt to make it a bit clearer what is happening and why.
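A condensed sketch of the pattern (array and function names are
illustrative, and the standard assert() stands in for the Assert()
macro):

    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    #define MAX_CONNINFO_PARAMS 64   /* illustrative size */

    static const char *keywords[MAX_CONNINFO_PARAMS];
    static const char *values[MAX_CONNINFO_PARAMS];

    static void
    param_set(const char *keyword, const char *value)
    {
        int i;

        /* overwrite the value if the keyword is already present */
        for (i = 0; i < MAX_CONNINFO_PARAMS && keywords[i] != NULL; i++)
        {
            if (strcmp(keywords[i], keyword) == 0)
            {
                values[i] = value;
                return;
            }
        }

        /*
         * The array is sized to hold every valid libpq parameter, so
         * running out of slots means the caller passed an invalid
         * keyword; make that visible rather than silently ignoring it.
         */
        assert(i < MAX_CONNINFO_PARAMS);

        keywords[i] = keyword;
        values[i] = value;
    }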
If the rejoin target is not in recovery but is not registered as primary
(we detect this by attempting to connect to the registered primary),
we abort and suggest fixing the repmgr metadata first.
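A minimal helper for the underlying recovery check, as might be run
against both the rejoin target and the node registered as primary
(a sketch; the actual detection also has to handle connection failures):

    #include <stdbool.h>
    #include <string.h>
    #include <libpq-fe.h>

    /* true if the server behind "conn" is operating as a primary */
    static bool
    is_primary(PGconn *conn)
    {
        bool      primary = false;
        PGresult *res = PQexec(conn, "SELECT pg_is_in_recovery()");

        if (PQresultStatus(res) == PGRES_TUPLES_OK)
            primary = (strcmp(PQgetvalue(res, 0, 0), "f") == 0);

        PQclear(res);
        return primary;
    }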
In some places we were still providing "false" from the original
implementation, which was intended to indicate whether a negative value
was allowed. This has not been a problem, as it merely means we have been
providing "0", which is the same thing; however we can fine-tune some of
the calls (e.g. node ID must be 1 or greater).
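A sketch of the affected pattern, using an illustrative conversion
helper (not repmgr's actual function) whose final argument is a minimum
value rather than an "allow negative" flag:

    #include <stdio.h>
    #include <stdlib.h>

    /* illustrative helper: parse an integer option, enforcing a minimum */
    static int
    parse_int_option(const char *value, const char *name, int minval)
    {
        int parsed = atoi(value);

        if (parsed < minval)
        {
            fprintf(stderr, "%s must be %d or greater\n", name, minval);
            exit(1);
        }

        return parsed;
    }

    int
    main(void)
    {
        /* previously: parse_int_option(value, name, false), i.e. minimum 0 */
        int node_id = parse_int_option("3", "--node-id", 1);

        printf("node_id = %d\n", node_id);
        return 0;
    }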
The initial implementation was designed to fall back to "manual"
start/stop of repmgrd if the "repmgrd_service_..._command" parameters
were not set.
However, on reflection, this is too much of a potential footgun, so
we will mandate provision of these parameters.
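With this change, repmgr.conf must provide both commands explicitly,
along these lines (the actual service manager invocations will vary
from system to system):

    repmgrd_service_start_command = 'sudo systemctl start repmgrd'
    repmgrd_service_stop_command  = 'sudo systemctl stop repmgrd'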