repmgr: add parameter "shutdown_check_timeout"

Previously, "repmgr standby switchover" used the configuration file parameters
"reconnect_interval" and "reconnect_attempts" to define a timeout to determine
whether the current primary (demotion candidate) has shut down.

However, these parameters are intended for primary failure detection and are
generally lower in value, while a controlled shutdown may take longer, resulting
in the switchover being aborted as repmgr was not waiting long enough.

To prevent this happening, parameter "shutdown_check_timeout" has been added.
This complements the existing "standby_reconnect_timeout" parameter used
by "repmgr standby switchover".

Implements GitHub #504.
This commit is contained in:
Ian Barwick
2018-09-25 11:30:01 +09:00
parent 80bef0eb28
commit 38e3aae053
7 changed files with 34 additions and 20 deletions

View File

@@ -141,19 +141,7 @@
Note that following parameters in <filename>repmgr.conf</filename> are relevant to the
switchover operation:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>reconnect_attempts</literal>: number of times to check the original primary
for a clean shutdown after executing the shutdown command, before aborting
</simpara>
</listitem>
<listitem>
<simpara>
<literal>reconnect_interval</literal>: interval (in seconds) to check the original
primary for a clean shutdown after executing the shutdown command (up to a maximum
of <literal>reconnect_attempts</literal> tries)
</simpara>
</listitem>
<listitem>
<simpara>
<literal>replication_lag_critical</literal>:
@@ -163,10 +151,24 @@
</simpara>
</listitem>
<listitem>
<simpara>
<literal>shutdown_check_timeout</literal>: maximum number of seconds to wait for the
demotion candidate (current primary) to shut down, before aborting the switchover.
</simpara>
<note>
<para>
In versions prior to &repmgr; 4.2, <command>repmgr standby switchover</command> would
use the values defined in <literal>reconnect_attempts</literal> and <literal>reconnect_interval</literal>
to determine the timeout for demotion candidate shutdown.
</para>
</note>
</listitem>
<listitem>
<simpara>
<literal>standby_reconnect_timeout</literal>:
number of seconds to attempt to wait for the demoted primary
maximum number of seconds to attempt to wait for the demoted primary
to reconnect to the promoted primary (default: 60 seconds)
</simpara>
</listitem>