mirror of
https://github.com/EnterpriseDB/repmgr.git
synced 2026-03-22 22:56:29 +00:00
Currently the (very generic sounding) "standby_reconnect_timeout" configuration file parameter is used in several different contexts and it would be useful to have more granular control over the different timeouts it's used to configure. This patch introduces "node_rejoin_timeout", used in place of "standby_reconnect_timeout" (which wasn't documented) when "repmgr node rejoin" is executed, to determine how long to wait for the node to rejoin the replication cluster. Additionally "repmgrd_standby_startup_timeout" is introduced as a timeout for failover situations, when repmgrd executes "repmgr standby follow" to follow a new primary, and waits for the standby to restart and become available for connections. "standby_reconnect_timeout" is now only relevant for "repmgr standby switchover". Implements GitHub #454.
248 lines
8.2 KiB
Plaintext
248 lines
8.2 KiB
Plaintext
<refentry id="repmgr-standby-switchover">
|
|
<indexterm>
|
|
<primary>repmgr standby switchover</primary>
|
|
</indexterm>
|
|
|
|
<refmeta>
|
|
<refentrytitle>repmgr standby switchover</refentrytitle>
|
|
</refmeta>
|
|
|
|
<refnamediv>
|
|
<refname>repmgr standby switchover</refname>
|
|
<refpurpose>promote a standby to primary and demote the existing primary to a standby</refpurpose>
|
|
</refnamediv>
|
|
|
|
<refsect1>
|
|
<title>Description</title>
|
|
|
|
<para>
|
|
Promotes a standby to primary and demotes the existing primary to a standby.
|
|
This command must be run on the standby to be promoted, and requires a
|
|
passwordless SSH connection to the current primary.
|
|
</para>
|
|
<para>
|
|
If other standbys are connected to the demotion candidate, &repmgr; can instruct
|
|
these to follow the new primary if the option <literal>--siblings-follow</literal>
|
|
is specified. This requires a passwordless SSH connection between the promotion
|
|
candidate (new primary) and the standbys attached to the demotion candidate
|
|
(existing primary).
|
|
</para>
|
|
<note>
|
|
<para>
|
|
Performing a switchover is a non-trivial operation. In particular it
|
|
relies on the current primary being able to shut down cleanly and quickly.
|
|
&repmgr; will attempt to check for potential issues but cannot guarantee
|
|
a successful switchover.
|
|
</para>
|
|
</note>
|
|
<para>
|
|
For more details on performing a switchover, including preparation and configuration,
|
|
see section <xref linkend="performing-switchover">.
|
|
</para>
|
|
</refsect1>
|
|
|
|
<refsect1>
|
|
<title>Options</title>
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
<term><option>--always-promote</option></term>
|
|
<listitem>
|
|
<para>
|
|
Promote standby to primary, even if it is behind original primary
|
|
(original primary will be shut down in any case).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><option>--dry-run</option></term>
|
|
<listitem>
|
|
<para>
|
|
Check prerequisites but don't actually execute a switchover.
|
|
</para>
|
|
<important>
|
|
<para>
|
|
Success of <option>--dry-run</option> does not imply the switchover will
|
|
complete successfully, only that
|
|
the prerequisites for performing the operation are met.
|
|
</para>
|
|
</important>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><option>-F</option></term>
|
|
<term><option>--force</option></term>
|
|
<listitem>
|
|
<para>
|
|
Ignore warnings and continue anyway.
|
|
</para>
|
|
<para>
|
|
Specifically, if a problem is encountered when shutting down the current primary,
|
|
using <option>-F/--force</option> will cause &repmgr; to continue by promoting
|
|
the standby to be the new primary, and if <option>--siblings-follow</option> is
|
|
specified, attach any other standbys to the new primary.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><option>--force-rewind[=/path/to/pg_rewind]</option></term>
|
|
<listitem>
|
|
<para>
|
|
Use <application>pg_rewind</application> to reintegrate the old primary if necessary
|
|
(and the prerequisites for using <application>pg_rewind</application> are met).
|
|
If using PostgreSQL 9.3 or 9.4, and the <application>pg_rewind</application>
|
|
binary is not installed in the PostgreSQL <filename>bin</filename> directory,
|
|
provide its full path. For more details see also <xref linkend="switchover-pg-rewind">.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><option>-R</option></term>
|
|
<term><option>--remote-user</option></term>
|
|
<listitem>
|
|
<para>
|
|
System username for remote SSH operations (defaults to local system user).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><option>--siblings-follow</option></term>
|
|
<listitem>
|
|
<para>
|
|
Have standbys attached to the old primary follow the new primary.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
|
|
</refsect1>
|
|
|
|
<refsect1>
|
|
<title>Configuration file settings</title>
|
|
|
|
<para>
|
|
Note that following parameters in <filename>repmgr.conf</filename> are relevant to the
|
|
switchover operation:
|
|
<itemizedlist spacing="compact" mark="bullet">
|
|
<listitem>
|
|
<simpara>
|
|
<literal>reconnect_attempts</literal>: number of times to check the original primary
|
|
for a clean shutdown after executing the shutdown command, before aborting
|
|
</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>
|
|
<literal>reconnect_interval</literal>: interval (in seconds) to check the original
|
|
primary for a clean shutdown after executing the shutdown command (up to a maximum
|
|
of <literal>reconnect_attempts</literal> tries)
|
|
</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>
|
|
<literal>replication_lag_critical</literal>:
|
|
if replication lag (in seconds) on the standby exceeds this value, the
|
|
switchover will be aborted (unless the <literal>-F/--force</literal> option
|
|
is provided)
|
|
</simpara>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<simpara>
|
|
<literal>standby_reconnect_timeout</literal>:
|
|
number of seconds to attempt to wait for the demoted primary
|
|
to reconnect to the promoted primary (default: 60 seconds)
|
|
</simpara>
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
</para>
|
|
</refsect1>
|
|
|
|
|
|
<refsect1>
|
|
<title>Execution</title>
|
|
|
|
<para>
|
|
Execute with the <literal>--dry-run</literal> option to test the switchover as far as
|
|
possible without actually changing the status of either node.
|
|
</para>
|
|
<important>
|
|
<para>
|
|
<application>repmgrd</application> should not be active on any nodes while a switchover is being
|
|
executed. This restriction may be lifted in a later version.
|
|
</para>
|
|
</important>
|
|
<para>
|
|
External database connections, e.g. from an application, should not be permitted while
|
|
the switchover is taking place. In particular, active transactions on the primary
|
|
can potentially disrupt the shutdown process.
|
|
</para>
|
|
</refsect1>
|
|
|
|
<refsect1>
|
|
<title>Event notifications</title>
|
|
<para>
|
|
<literal>standby_switchover</literal> and <literal>standby_promote</literal>
|
|
<link linkend="event-notifications">event notifications</link> will be generated for the new primary,
|
|
and a <literal>node_rejoin</literal> event notification for the former primary (new standby).
|
|
</para>
|
|
<para>
|
|
If using an event notification script, <literal>standby_switchover</literal>
|
|
will populate the placeholder parameter <literal>%p</literal> with the node ID of
|
|
the former primary.
|
|
</para>
|
|
</refsect1>
|
|
|
|
<refsect1>
|
|
<title>Exit codes</title>
|
|
<para>
|
|
Following exit codes can be emitted by <command>repmgr standby switchover</command>:
|
|
</para>
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
<term><option>SUCCESS (0)</option></term>
|
|
<listitem>
|
|
<para>
|
|
The switchover completed successfully.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><option>ERR_SWITCHOVER_FAIL (18)</option></term>
|
|
<listitem>
|
|
<para>
|
|
The switchover could not be executed.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><option>ERR_SWITCHOVER_INCOMPLETE (22)</option></term>
|
|
<listitem>
|
|
<para>
|
|
The switchover was executed but a problem was encountered.
|
|
Typically this means the former primary could not be reattached
|
|
as a standby. Check preceding log messages for more information.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
</refsect1>
|
|
|
|
<refsect1>
|
|
<title>See also</title>
|
|
<para>
|
|
For more details see the section <xref linkend="performing-switchover">.
|
|
</para>
|
|
</refsect1>
|
|
|
|
</refentry>
|