This enables selection of the method repmgrd uses to check whether the upstream
node is available. Possible values are:
- "ping" (default): uses PQping() to check server availability
- "connection": executes a query on the connection to check server
availability (similar to repmgr3.x).
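For illustration, a minimal repmgr.conf excerpt; this assumes the parameter
corresponds to the connection_check_type setting found in current repmgr
releases:
    # Method repmgrd uses to check upstream availability:
    # "ping" (default) uses libpq's PQping(); "connection" executes
    # a query on the connection, as repmgr 3.x did.
    connection_check_type = 'ping'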
In a situation where only some standbys are cut off from the primary,
a failover would result in a split brain/split cluster situation,
as it's likely one of the cut-off standbys will promote itself, and
other cut-off standbys (but not all standbys) will follow it.
To prevent this from happening, interrogate the other sibling nodes to
check whether they've seen the primary within a reasonably short interval;
if this is the case, do not take any failover action.
This feature is experimental.
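A minimal sketch of enabling this in repmgr.conf, assuming the behaviour is
governed by the primary_visibility_consensus parameter (the name used for
this feature in later repmgr releases):
    # Before initiating failover, ask the sibling nodes whether any of
    # them has seen the primary recently; if one has, take no action.
    primary_visibility_consensus = true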
If, for whatever reason, repmgrd is not running on a node, but that
node qualifies as a promotion candidate, failover will not take place,
as that node will never promote itself.
We therefore discount nodes where repmgrd is not running as promotion
candidates, which will ensure one node is always promoted.
There is a slight risk here that the node(s) where repmgrd is not running
are further ahead, leading to a timeline fork. It might be possible
to mitigate that by having the "election" leader perform the promote
(or follow) operation.
If WAL replay is paused, and there is WAL pending replay, a promote command
will be queued until replay is resumed.
As it's conceivable that there are corner cases where one standby with
replay paused has actually received the most WAL, we'll forcibly
resume WAL replay so it can be reliably promoted, if needed.
Related to GitHub #540.
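For reference, the replay state involved can be inspected and resumed
manually using PostgreSQL's built-in functions (names as of PostgreSQL 10;
shown here via psql):
    # Check whether WAL replay is currently paused on this standby.
    psql -c 'SELECT pg_is_wal_replay_paused()'
    # Resume replay so pending WAL is applied before any promotion.
    psql -c 'SELECT pg_wal_replay_resume()'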
Eventually we'll want this to contain the optional replication info
from the t_node_info struct, which should then hold a pointer to a
ReplInfo struct.
If the upstream node has changed, for example after "repmgr standby follow"
was manually executed, restart monitoring to ensure repmgrd is monitoring the
correct node.
Previously it would refuse to start if the primary was not reachable,
the thinking being that it's pointless trying to monitor an incomplete
cluster.
However, following an aborted failover, repmgrd will restart monitoring;
on the witness server this would lead to it aborting itself due to the
continuing absence of a primary.
To resolve this, repmgrd on a witness will now start monitoring in
degraded mode if no primary is found, in the hope that a primary will
reappear at some point.
If the upstream comes back online (e.g. after a switchover), and its
status is no longer primary, restart monitoring to ensure the correct
primary (potentially the current node) is being monitored.
While scanning for a new primary following a promotion script failure,
repmgrd was treating a witness server as a potential new primary
and would attempt to "follow" it. Fortunately "repmgr standby follow"
would do the right thing and choose the actual primary, if available,
otherwise do nothing, so the cluster would eventually end up in the
correct state, albeit for the wrong reason.
By skipping the witness server as a potential new primary,
repmgrd will do the right thing if the original primary does come
back online, i.e. resume monitoring as before.
In some circumstances, e.g. while performing a switchover, it is essential
that repmgrd does not take any kind of failover action, as this will put
the cluster into an incorrect state.
Previously it was necessary to stop repmgrd on all nodes (or at least
those nodes which repmgrd would consider as promotion candidates);
however, this is a cumbersome and potentially risk-prone operation, particularly if the
replication cluster contains more than a couple of servers.
To prevent this issue from occurring, this patch introduces the ability
to "pause" repmgrd on all nodes wth a single command ("repmgr daemon pause")
which notifies repmgrd not to take any failover action until the node
is "unpaused" ("repmgr daemon unpause").
"repmgr daemon status" provides an overview of each node and whether repmgrd
is running, and if so whether it is paused.
"repmgr standby switchover" has been modified to automatically pause repmgrd
while carrying out the switchover.
See documentation for further details.
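A typical sequence (the configuration file path is illustrative):
    # Disable failover actions cluster-wide before maintenance.
    repmgr -f /etc/repmgr.conf daemon pause
    # Show each node's repmgrd state, including its paused status.
    repmgr -f /etc/repmgr.conf daemon status
    # Re-enable failover actions once maintenance is complete.
    repmgr -f /etc/repmgr.conf daemon unpause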
Though we note this in the DEBUG output, it's not immediately obvious
from the logs, especially at log levels above DEBUG, why a node
didn't promote itself if it is in a different location to the primary.
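For context, the location in question is the repmgr.conf location
parameter; by default a node whose location differs from the primary's
will not be considered as a promotion candidate:
    # Arbitrary label grouping nodes by physical/network location;
    # the value "dc1" here is purely illustrative.
    location = 'dc1'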
Previously, if the server being monitored was not available, repmgrd
would always close the existing connection handle and open a new one.
However, in some cases, e.g. a brief network outage, the existing
connection handle is still good and does not need to be reopened.
This could be particularly problematic if monitoring_history is on,
as this risks leaving orphan sessions on the primary which (given
a sufficiently unstable network) could lead to all available backends
being occupied.
Instead, during an outage we now use a new connection to verify
the server is accessible; if the old connection is still available
(e.g. following a short network interruption) we continue using that;
if not (e.g. the server was restarted), we use the new one.
Add more granular logging to help diagnose issues, and also keep track
of when the monitoring statistics were last updated, emitting that
as DETAIL with every log status update.
Previously, when running on a witness server, repmgrd didn't consider
that the local cache of the "repmgr.nodes" table might be outdated, e.g.
as repmgrd wasn't running on the witness server during a failover,
so could potentially end up monitoring a former primary now running
as a standby.
When running on a witness server, at startup repmgrd will now scan
all nodes to determine the current primary, and refresh its local
cache from there. This will also ensure it can start up even if the
node currently registered as primary in the local cache is not available.
Implements GitHub #488 and #489.
When promoting the local node, repmgrd was only logging the contents
of "promote_command" at DEBUG level; it would be useful to see this at
the default log level.
Related to GitHub #473.
The documentation implied "service_promote_command" would override
"promote_command", which is not the case.
"promote_command" is used by repmgrd to execute "repmgr standby promote"
(either directly or via a custom script).
"service_promote_command" can be set to specify a package-level service
command to promote the local PostgreSQL instance from standby to primary,
e.g. Debian's pg_ctlcluster. If set, this will be executed by "repmgr standby promote".
Also update code comments to clarify usage.
Related to GitHub #473.
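An illustrative repmgr.conf excerpt showing how the two parameters fit
together (paths and the pg_ctlcluster invocation assume a Debian-style
layout and cluster name, and are not prescriptive):
    # Executed by repmgrd to perform the promotion.
    promote_command = '/usr/bin/repmgr standby promote -f /etc/repmgr.conf'
    # If set, executed by "repmgr standby promote" in place of "pg_ctl promote".
    service_promote_command = 'pg_ctlcluster 11 main promote'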
Currently the (very generic sounding) "standby_reconnect_timeout" configuration
file parameter is used in several different contexts and it would be useful
to have more granular control over the different timeouts it's used to configure.
This patch introduces "node_rejoin_timeout", used in place of "standby_reconnect_timeout"
(which wasn't documented) when "repmgr node rejoin" is executed, to determine
how long to wait for the node to rejoin the replication cluster.
Additionally "repmgrd_standby_startup_timeout" is introduced as a timeout for
failover situations, when repmgrd executes "repmgr standby follow" to follow
a new primary, and waits for the standby to restart and become available
for connections.
"standby_reconnect_timeout" is now only relevant for "repmgr standby switchover".
Implements GitHub #454.
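An illustrative repmgr.conf excerpt with the three timeouts as of this
change (values in seconds, chosen arbitrarily):
    # "repmgr node rejoin": how long to wait for the node to rejoin
    # the replication cluster.
    node_rejoin_timeout = 60
    # repmgrd failover: how long to wait for a standby to restart and
    # accept connections after "repmgr standby follow".
    repmgrd_standby_startup_timeout = 60
    # Now only used by "repmgr standby switchover".
    standby_reconnect_timeout = 60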
If "pg_ctl promote" fails due to a timeout, but the promotion itself succeeds,
have repmgrd on the new primary explicitly notify any sibling nodes to
follow it.
Previously the sibling nodes would wait "primary_notification_timeout" seconds
before attempting to discover the new primary.
This (together with the preceding commit eac80ae) addresses GitHub #425.
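The interval mentioned above is set in repmgr.conf (the value shown is
illustrative):
    # How long sibling nodes wait for notification from a new primary
    # before attempting to discover it themselves.
    primary_notification_timeout = 60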