Previously it would refuse to start if the primary was not reachable,
the thinking being that it's pointless trying to monitor an incomplete
cluster.
However, following an aborted failover, repmgrd will restart
monitoring, and on the witness server this would lead to it aborting
itself due to the continuing absence of a primary.
To resolve this, witness repmgrd will now start monitoring in degraded
mode if no primary is found, in the hope that a primary will reappear
at some point.
If the upstream comes back online (e.g. after a switchover), and its
status is no longer primary, restart monitoring to ensure the correct
primary (potentially the current node) is being monitored.
While scanning for a new primary following a promotion script failure,
repmgrd was treating a witness server as a potential new primary
and would attempt to "follow" it. Fortunately "repmgr standby follow"
would do the right thing and choose the actual primary, if available,
otherwise do nothing, so the cluster would eventually end up in the
correct state, albeit for the wrong reason.
By skipping the witness server as a potential new primary,
repmgrd will do the right thing if the original primary does come
back online, i.e. resume monitoring as before.
In some circumstances, e.g. while performing a switchover, it is essential
that repmgrd does not take any kind of failover action, as this will put
the cluster into an incorrect state.
Previously it was necessary to stop repmgrd on all nodes (or at least
on those nodes which repmgrd would consider promotion candidates);
however, this is a cumbersome and potentially risky operation,
particularly if the replication cluster contains more than a couple of servers.
To prevent this issue from occurring, this patch introduces the ability
to "pause" repmgrd on all nodes with a single command ("repmgr daemon pause"),
which notifies repmgrd not to take any failover action until the node
is "unpaused" ("repmgr daemon unpause").
"repmgr daemon status" provides an overview of each node and whether repmgrd
is running, and if so whether it is paused.
"repmgr standby switchover" has been modified to automatically pause repmgrd
while carrying out the switchover.
See documentation for further details.
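As an illustration, a typical planned-maintenance workflow might look
like this (a sketch based on the commands described above; the usual
"-f /path/to/repmgr.conf" option is omitted for brevity):

    # pause failover decision-making on all nodes before maintenance
    repmgr daemon pause

    # check each node: is repmgrd running, and if so, is it paused?
    repmgr daemon status

    # ... perform maintenance, e.g. a manual switchover ...

    # resume normal failover behaviour afterwards
    repmgr daemon unpause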
Though this is noted in the DEBUG output, it's not immediately obvious
from the logs, especially at log levels above DEBUG, why a node
didn't promote itself if it is in a different location to the primary.
Previously, if the server being monitored was not available, repmgrd
would always close the existing connection handle and open a new one.
However, in some cases, e.g. a brief network outage, the existing
connection handle is still good and does not need to be reopened.
This could be particularly problematic if "monitoring_history" is
enabled, as it risks leaving orphaned sessions on the primary, which
(given a sufficiently unstable network) could lead to all available
backends being occupied.
Instead, during an outage we now use a new connection to verify
the server is accessible; if the old connection is still available
(e.g. following a short network interruption) we continue using that;
if not (e.g. the server was restarted), we use the new one.
Add more granular logging to help diagnose issues; also keep track
of when the monitoring statistics were last updated, and emit that
as DETAIL with every log status update.
Previously, when running on a witness server, repmgrd didn't consider
that the local copy of the "repmgr.nodes" table might be outdated
(e.g. because repmgrd wasn't running on the witness server during a
failover), so it could potentially end up monitoring a former primary
now running as a standby.
When running on a witness server, at startup repmgrd will now scan
all nodes to determine the current primary, and refresh its local
cache from there. This will also ensure it can start up even if the
node currently registered as primary in the local cache is not available.
Implements GitHub #488 and #489.
If repmgrd is promoting the local node, it was only logging the contents
of "promote_command" at DEBUG level; it would be useful to see this at
the default log level.
Related to GitHub #473.
The documentation implied "service_promote_command" would override
"promote_command", which is not the case.
"promote_command" is used by repmgrd to execute "repmgr standby promote"
(either directly or via a custom script).
"service_promote_command" can be set to specify a package-level service
command to promote the local PostgreSQL instance from standby to primary,
e.g. Debian's pg_ctlcluster. If set, this will be executed by "repmgr standby promote".
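For illustration, the two settings might appear together in "repmgr.conf"
like this (values are examples only; the pg_ctlcluster invocation assumes
a Debian-style installation with PostgreSQL 12 and cluster name "main"):

    # executed by repmgrd to promote the local node
    promote_command='repmgr standby promote -f /etc/repmgr.conf'

    # optional; if set, executed by "repmgr standby promote" to perform
    # the service-level promotion (it does not replace "promote_command")
    service_promote_command='pg_ctlcluster 12 main promote'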
Also update code comments to clarify usage.
Related to GitHub #473.
Currently the (very generic-sounding) "standby_reconnect_timeout" configuration
file parameter is used in several different contexts, and it would be useful
to have more granular control over the different timeouts it is used to configure.
This patch introduces "node_rejoin_timeout", used in place of "standby_reconnect_timeout"
(which wasn't documented) when "repmgr node rejoin" is executed, to determine
how long to wait for the node to rejoin the replication cluster.
Additionally "repmgrd_standby_startup_timeout" is introduced as a timeout for
failover situations, when repmgrd executes "repmgr standby follow" to follow
a new primary, and waits for the standby to restart and become available
for connections.
"standby_reconnect_timeout" is now only relevant for "repmgr standby switchover".
Implements GitHub #454.
If "pg_ctl promote" fails due to a timeout, but the promotion itself succeeds,
have repmgrd on the new primary explicitly notify any sibling nodes to
follow it.
Previously the sibling nodes would wait "primary_notification_timeout" seconds
before attempting to discover the new primary.
This (and the preceding commit eac80ae) addresses GitHub #425.
It's possible "pg_ctl promote" will timeout, causing "repmgr standby
follow" to return with an error; however the promotion itself will usually
succeed, so detect this case and handle accordingly.
If repmgrd marks the local node as unavailable, and the node was actually
restarting but a failover event occurred before the next local node
check, the failover would continue with a stale connection handle.
Add a final local node check just before starting the failover
process, so repmgrd can reconnect if it wasn't able to do so before.
If monitoring history is not in use, there's no activity on the standby's
connection handle, so if e.g. the standby is restarted, PQstatus()
never returns CONNECTION_BAD and repmgrd never notices the connection
is stale. Therefore execute a throw-away statement at each
"monitor_interval_secs" interval.
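For reference, the polling interval involved is set in "repmgr.conf";
a minimal example (shown with what we understand to be the default value):

    # interval (in seconds) at which repmgrd checks the monitored server;
    # the throw-away keepalive statement is executed at the same interval
    monitor_interval_secs=2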
The event notification was only being created if there was a valid
primary connection; it should be created in any case, so an event
notification script can be executed.
- emit explicit startup NOTICE
- emit NOTICE when falling back to degraded monitoring on a primary node
- improve log message and event notification details when monitoring
a former primary which has been reconnected as a standby
If two nodes were in the primary location, and at least one node in
another location, the non-failed node in the primary location was not
recognising itself as a promotion candidate.
Addresses GitHub #407.
This is used to determine the timeout when reconnecting to the standby
after executing the "follow_command". This will normally not need to be
set explicitly, but may be useful in cases where the standby's startup
phase lasts longer than usual.
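Assuming this entry refers to "repmgrd_standby_startup_timeout" as
introduced above, an explicit override in "repmgr.conf" might look like
this (value is illustrative only):

    # hypothetical override; only needed if the standby's startup phase
    # is unusually long, e.g. due to extended crash recovery
    repmgrd_standby_startup_timeout=120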