Add functionality to "pause" repmgrd

In some circumstances, e.g. while performing a switchover, it is essential
that repmgrd does not take any kind of failover action, as this will put
the cluster into an incorrect state.

Previously it was necessary to stop repmgrd on all nodes (or at least
those nodes which repmgrd would consider as promotion candidates), however
this is a cumbersome and potentially risk-prone operation, particularly if the
replication cluster contains more than a couple of servers.

To prevent this issue from occurring, this patch introduces the ability
to "pause" repmgrd on all nodes wth a single command ("repmgr daemon pause")
which notifies repmgrd not to take any failover action until the node
is "unpaused" ("repmgr daemon unpause").

"repmgr daemon status" provides an overview of each node and whether repmgrd
is running, and if so whether it is paused.

"repmgr standby switchover" has been modified to automatically pause repmgrd
while carrying out the switchover.

See documentation for further details.
This commit is contained in:
Ian Barwick
2018-09-27 16:42:10 +09:00
parent fce3c02760
commit 2491b8ae52
27 changed files with 1943 additions and 121 deletions

View File

@@ -35,7 +35,7 @@
static char *config_file = NULL;
static bool verbose = false;
static char pid_file[MAXPGPATH];
char pid_file[MAXPGPATH];
static bool daemonize = true;
static bool show_pid_file = false;
static bool no_pid_file = false;
@@ -488,6 +488,9 @@ main(int argc, char **argv)
check_and_create_pid_file(pid_file);
}
repmgrd_set_pid(local_conn, getpid(), pid_file);
#ifndef WIN32
setup_event_handlers();
#endif
@@ -901,6 +904,9 @@ print_monitoring_state(MonitoringState monitoring_state)
void
terminate(int retval)
{
if (PQstatus(local_conn) == CONNECTION_OK)
repmgrd_set_pid(local_conn, UNKNOWN_PID, NULL);
logger_shutdown();
if (pid_file[0] != '\0')