repmgr: add parameter "shutdown_check_timeout"

Previously, "repmgr standby switchover" used the configuration file parameters
"reconnect_interval" and "reconnect_attempts" to define a timeout to determine
whether the current primary (demotion candidate) has shut down.

However, these parameters are intended for primary failure detection and are
generally lower in value, while a controlled shutdown may take longer, resulting
in the switchover being aborted as repmgr was not waiting long enough.

To prevent this, the parameter "shutdown_check_timeout" has been added. It sets
the maximum time, in seconds, to wait for the demotion candidate to shut down,
and complements the existing "standby_reconnect_timeout" parameter used
by "repmgr standby switchover".

Implements GitHub #504.
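
The change in waiting behaviour can be illustrated with a minimal sketch. This
is not repmgr's code: "wait_for_shutdown", "shutdown_probe_fn",
"countdown_probe" and the "sleep_seconds" argument are hypothetical names used
only for illustration. In repmgr itself the check is a PQping() call and the
interval between checks is fixed at one second, so the loop bound is
effectively a timeout in seconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the PQping() check: true once the primary is down. */
typedef bool (*shutdown_probe_fn)(void *ctx);

/*
 * Check up to "shutdown_check_timeout" times, sleeping "sleep_seconds"
 * between checks (repmgr uses a fixed 1 second).
 * Returns the number of checks performed when shutdown is detected,
 * or -1 if the timeout expires first.
 */
static int
wait_for_shutdown(shutdown_probe_fn probe, void *ctx,
				  int shutdown_check_timeout, unsigned int sleep_seconds)
{
	int			i;

	assert(probe != NULL);

	for (i = 0; i < shutdown_check_timeout; i++)
	{
		if (probe(ctx))
			return i + 1;

		sleep(sleep_seconds);
	}

	return -1;
}

/* Example probe: reports "still running" for *remaining checks, then "down". */
static bool
countdown_probe(void *ctx)
{
	int		   *remaining = ctx;

	if (*remaining <= 0)
		return true;

	(*remaining)--;
	return false;
}
```

With a one-second interval, the number of attempts equals the timeout in
seconds, so the wait no longer depends on the failure-detection settings
"reconnect_attempts" and "reconnect_interval".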
Ian Barwick
2018-09-25 11:30:01 +09:00
parent 80bef0eb28
commit 38e3aae053
7 changed files with 34 additions and 20 deletions


@@ -3666,13 +3666,14 @@ do_standby_switchover(void)
     /* loop for timeout waiting for current primary to stop */
-    for (i = 0; i < config_file_options.reconnect_attempts; i++)
+    for (i = 0; i < config_file_options.shutdown_check_timeout; i++)
     {
         /* Check whether primary is available */
         PGPing ping_res;
-        log_info(_("checking primary status; %i of %i attempts"),
-                 i + 1, config_file_options.reconnect_attempts);
+        log_info(_("checking for primary shutdown; %i of %i attempts (\"shutdown_check_timeout\")"),
+                 i + 1, config_file_options.shutdown_check_timeout);
         ping_res = PQping(remote_conninfo);
         log_debug("ping status is: %s", print_pqping_status(ping_res));
@@ -3741,9 +3742,8 @@ do_standby_switchover(void)
             termPQExpBuffer(&command_output);
         }
-        log_debug("sleeping %i seconds (\"reconnect_interval\") until next check",
-                  config_file_options.reconnect_interval);
-        sleep(config_file_options.reconnect_interval);
+        log_debug("sleeping 1 second until next check");
+        sleep(1);
     }
     if (shutdown_success == false)