
<chapter id="performing-switchover" xreflabel="Performing a switchover with repmgr">
 <title>Performing a switchover with repmgr</title>

 <indexterm>
  <primary>switchover</primary>
 </indexterm>

 <para>
  A typical use-case for replication is a combination of primary and standby
  server, with the standby serving as a backup which can easily be activated
  in case of a problem with the primary. Such an unplanned failover would
  normally be handled by promoting the standby, after which an appropriate
  action must be taken to restore the old primary.
 </para>
 <para>
  In some cases, however, it is desirable to promote the standby in a planned
  way, e.g. so that maintenance can be performed on the primary; this kind of switchover
  is supported by the <xref linkend="repmgr-standby-switchover"/> command.
 </para>
 <para>
  <command>repmgr standby switchover</command> differs from other &repmgr;
  actions in that it also performs actions on other servers (the demotion
  candidate, and optionally any other servers which are to follow the new primary),
  which means passwordless SSH access is required to those servers from the one where
  <command>repmgr standby switchover</command> is executed.
 </para>
 <note>
  <simpara>
   <command>repmgr standby switchover</command> performs a relatively complex
   series of operations on two servers, and should therefore be performed after
   careful preparation and with adequate attention. In particular you should
   be confident that your network environment is stable and reliable.
  </simpara>
  <simpara>
   Additionally you should be sure that the current primary can be shut down
   quickly and cleanly. In particular, access from applications should be
   minimized or preferably blocked completely. Also be aware that if there
   is a backlog of files waiting to be archived, PostgreSQL will not shut
   down until archiving completes.
  </simpara>
  <simpara>
    We recommend running <command>repmgr standby switchover</command> at the
    most verbose logging level (<literal>--log-level=DEBUG --verbose</literal>)
    and capturing all output to assist troubleshooting any problems.
  </simpara>
  <simpara>
   Please also read carefully the sections <xref linkend="preparing-for-switchover"/> and
   <xref linkend="switchover-caveats"/> below.
  </simpara>
 </note>
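
 <para>
  As a minimal sketch of the logging recommendation above, the command output can be
  captured with standard shell redirection. The log file name
  <filename>switchover.log</filename> is an arbitrary choice for illustration, and this
  assumes &repmgr; is writing its log output to the terminal rather than to a log file:
  <programlisting>
   $ repmgr -f /etc/repmgr.conf standby switchover --log-level=DEBUG --verbose 2&gt;&amp;1 | tee switchover.log</programlisting>
 </para>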

 <sect1 id="preparing-for-switchover" xreflabel="Preparing for switchover">
   <title>Preparing for switchover</title>

   <indexterm>
     <primary>switchover</primary>
     <secondary>preparation</secondary>
   </indexterm>

   <para>
    As mentioned in the previous section, success of the switchover operation depends on
    &repmgr; being able to shut down the current primary server quickly and cleanly.
   </para>

   <para>
     Ensure that the promotion candidate has sufficient free walsenders available
     (PostgreSQL configuration item <varname>max_wal_senders</varname>), and if replication
     slots are in use, that at least one free slot is available for the demotion candidate
     (PostgreSQL configuration item <varname>max_replication_slots</varname>).
   </para>
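
   <para>
     One possible way to check this is to compare the configured limits with current
     usage by running queries such as the following on the promotion candidate. This is
     only a sketch: the database name <literal>repmgr</literal> is an assumption, and the
     connection options will need adjusting for your environment:
     <programlisting>
      $ psql -d repmgr -c "SHOW max_wal_senders"
      $ psql -d repmgr -c "SELECT count(*) AS walsenders_in_use FROM pg_stat_replication"
      $ psql -d repmgr -c "SHOW max_replication_slots"
      $ psql -d repmgr -c "SELECT count(*) AS slots_in_use FROM pg_replication_slots"</programlisting>
   </para>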

   <para>
     Ensure that a passwordless SSH connection is possible from the promotion candidate
     (standby) to the demotion candidate (current primary). If <literal>--siblings-follow</literal>
     will be used, ensure that passwordless SSH connections are possible from the
     promotion candidate to all nodes attached to the demotion candidate
     (including the witness server, if in use).
   </para>
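
   <para>
     A quick manual check, run on the promotion candidate as the system user which will
     execute &repmgr;, might look like the following. The host name <literal>node1</literal>
     stands for the demotion candidate; <literal>BatchMode=yes</literal> makes
     <command>ssh</command> fail rather than prompt if a password would be required:
     <programlisting>
      $ ssh -o BatchMode=yes node1 true &amp;&amp; echo "passwordless SSH to node1 OK"</programlisting>
   </para>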

   <note>
     <simpara>
       &repmgr; expects to find the &repmgr; binary in the same path on the remote
       server as on the local server.
     </simpara>
   </note>

   <para>
    Double-check which commands will be used to stop/start/restart the current
    primary; this can be done by e.g. executing <command><link linkend="repmgr-node-service">repmgr node service</link></command>
    on the current primary:
    <programlisting>
     repmgr -f /etc/repmgr.conf node service --list-actions --action=stop
     repmgr -f /etc/repmgr.conf node service --list-actions --action=start
     repmgr -f /etc/repmgr.conf node service --list-actions --action=restart</programlisting>

   </para>

   <para>
     These commands can be defined in <filename>repmgr.conf</filename> with
     <option>service_start_command</option>, <option>service_stop_command</option>
     and <option>service_restart_command</option>.
   </para>

   <important>
     <para>
       If &repmgr; is installed from a package, you should set these commands
       to use the appropriate service commands defined by the package/operating
       system, as these will ensure PostgreSQL is stopped/started properly,
       taking into account configuration and log file locations etc.
     </para>
     <para>
       If the <option>service_*_command</option> options aren't defined, &repmgr; will
       fall back to using <application>pg_ctl</application> to stop/start/restart
       PostgreSQL, which may not work properly, particularly when executed on a remote
       server.
     </para>
     <para>
       For more details, see <xref linkend="configuration-file-service-commands"/>.
     </para>
   </important>

   <note>
    <para>
     On <literal>systemd</literal> systems we strongly recommend using the appropriate
     <command>systemctl</command> commands (typically run via <command>sudo</command>) to ensure
     <literal>systemd</literal> is informed about the status of the PostgreSQL service.
    </para>
    <para>
     If using <command>sudo</command> for the <command>systemctl</command> calls, make sure the
     <command>sudo</command> specification doesn't require a real tty for the user. If not set
     this way, <command>repmgr</command> will fail to stop the primary.
    </para>
    <para>
      See the <xref linkend="configuration-file-service-commands"/> documentation section for further details.
    </para>
   </note>
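
   <para>
     As an illustrative sketch only, a <literal>systemd</literal>-based configuration in
     <filename>repmgr.conf</filename> might look like this; the unit name
     <literal>postgresql</literal> is an assumption and will differ between packages
     and PostgreSQL versions:
     <programlisting>
      service_start_command   = 'sudo systemctl start postgresql'
      service_stop_command    = 'sudo systemctl stop postgresql'
      service_restart_command = 'sudo systemctl restart postgresql'</programlisting>
     A matching <filename>sudoers</filename> entry, assuming PostgreSQL runs as the
     <literal>postgres</literal> system user and <command>systemctl</command> is located
     in <filename>/usr/bin</filename>, could be along these lines:
     <programlisting>
      Defaults:postgres !requiretty
      postgres ALL = NOPASSWD: /usr/bin/systemctl stop postgresql, \
        /usr/bin/systemctl start postgresql, \
        /usr/bin/systemctl restart postgresql</programlisting>
   </para>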

   <para>
     Check that access from applications is minimized or preferably blocked
     completely, so applications are not unexpectedly interrupted.
   </para>

   <note>
     <para>
       If an exclusive backup is running on the current primary, or if WAL replay is paused on the standby,
       &repmgr; will <emphasis>not</emphasis> perform the switchover.
     </para>
   </note>
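
   <para>
     Both conditions can be checked manually beforehand. The following sketch assumes a
     database named <literal>repmgr</literal> and a PostgreSQL version which provides these
     functions (<literal>pg_is_in_backup()</literal> is not available from PostgreSQL 15,
     and <literal>pg_is_wal_replay_paused()</literal> had a different name before
     PostgreSQL 10). Both queries should return <literal>f</literal> before a switchover
     is attempted:
     <programlisting>
      # on the current primary
      $ psql -d repmgr -c "SELECT pg_is_in_backup()"

      # on the standby
      $ psql -d repmgr -c "SELECT pg_is_wal_replay_paused()"</programlisting>
   </para>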

   <para>
     Check there is no significant replication lag on standbys attached to the
     current primary.
   </para>
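
   <para>
     One way to inspect this is to query <literal>pg_stat_replication</literal> on the
     current primary as a superuser. This is a sketch assuming PostgreSQL 10 or later
     (earlier versions use different function and column names) and a database named
     <literal>repmgr</literal>; a <literal>replay_lag_bytes</literal> value at or near
     zero indicates the standby has caught up:
     <programlisting>
      $ psql -d repmgr -c "SELECT application_name, state, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes FROM pg_stat_replication"</programlisting>
   </para>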

   <para>
    If WAL file archiving is set up, check that there is no backlog of files waiting
    to be archived, as PostgreSQL will not complete its shutdown until all of these have been
    archived. If there is a backlog exceeding <varname>archive_ready_warning</varname> WAL files,
    &repmgr; will emit a warning before attempting to perform a switchover; you can also check
    manually with <command>repmgr node check --archive-ready</command>.
   </para>

    <note>
      <para>
        From <link linkend="release-4.2">repmgr 4.2</link>, &repmgr; will instruct any running
        &repmgrd; instances to pause operations while the switchover
        is being carried out, to prevent &repmgrd; from
        unintentionally promoting a node. For more details, see <xref linkend="repmgrd-pausing"/>.
      </para>
      <para>
        Users of &repmgr; versions prior to 4.2 should ensure that &repmgrd;
        is not running on any nodes while a switchover is being executed.
      </para>
    </note>


   <para>
    Finally, consider executing <command>repmgr standby switchover</command> with the
    <literal>--dry-run</literal> option; this will perform any necessary checks and inform you about
    success/failure, and stop before the first actual command is run (which would be the shutdown of the
    current primary). Example output:
    <programlisting>
      $ repmgr standby switchover -f /etc/repmgr.conf --siblings-follow --dry-run
      NOTICE: checking switchover on node "node2" (ID: 2) in --dry-run mode
      INFO: SSH connection to host "node1" succeeded
      INFO: archive mode is "off"
      INFO: replication lag on this standby is 0 seconds
      INFO: all sibling nodes are reachable via SSH
      NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
      INFO: following shutdown command would be run on node "node1":
        "pg_ctl -l /var/log/postgresql/startup.log -D '/var/lib/postgresql/data' -m fast -W stop"
      INFO: parameter "shutdown_check_timeout" is set to 60 seconds
    </programlisting>
   </para>

   <important>
     <para>
       Be aware that <option>--dry-run</option> checks the prerequisites
       for performing the switchover and some basic sanity checks on the
       state of the database which might affect the switchover operation
       (e.g. replication lag); it cannot however guarantee the switchover
       operation will succeed. In particular, if the current primary
       does not shut down cleanly, &repmgr; will not be able to reliably
       execute the switchover (as there would be a danger of divergence
       between the former and new primary nodes).
     </para>
   </important>


   <note>
     <simpara>
       See <xref linkend="repmgr-standby-switchover"/> for a full list of available
       command line options and <filename>repmgr.conf</filename> settings relevant
       to performing a switchover.
     </simpara>
   </note>

   <sect2 id="switchover-pg-rewind" xreflabel="Switchover and pg_rewind">
    <title>Switchover and pg_rewind</title>

    <indexterm>
      <primary>pg_rewind</primary>
      <secondary>using with "repmgr standby switchover"</secondary>
    </indexterm>
    <para>
      If the demotion candidate does not shut down smoothly or cleanly, there's a risk it
      will have a slightly divergent timeline and will not be able to attach to the new
      primary. To fix this situation without needing to reclone the old primary, it's
      possible to use the <application>pg_rewind</application> utility, which will usually be
      able to resync the two servers.
    </para>
    <para>
      To have &repmgr; execute <application>pg_rewind</application> if it detects this
      situation after promoting the new primary, add the <option>--force-rewind</option>
      option.
    </para>
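    <para>
      For example, run on the promotion candidate (the configuration file location
      <filename>/etc/repmgr.conf</filename> follows the other examples in this chapter
      and may differ on your system):
      <programlisting>
       $ repmgr -f /etc/repmgr.conf standby switchover --force-rewind</programlisting>
    </para>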
    <note>
      <simpara>
        If &repmgr; detects a situation where it needs to execute <application>pg_rewind</application>,
        it will execute a <literal>CHECKPOINT</literal> on the new primary before executing
        <application>pg_rewind</application>.
      </simpara>
    </note>
    <para>
      For more details on <application>pg_rewind</application>, see:
      <ulink url="https://www.postgresql.org/docs/current/app-pgrewind.html">https://www.postgresql.org/docs/current/app-pgrewind.html</ulink>.
    </para>
    <para>
      <application>pg_rewind</application> has been part of the core PostgreSQL distribution since
      version 9.5. Users of PostgreSQL 9.4 will need to manually install it; the source code is available here:
      <ulink url="https://github.com/vmware/pg_rewind">https://github.com/vmware/pg_rewind</ulink>.
      If the <application>pg_rewind</application> binary is not installed in the
      PostgreSQL <filename>bin</filename> directory, provide its full path on the
      demotion candidate with <option>--force-rewind</option>.
    </para>
    <para>
      Note that building the 9.4 version of <application>pg_rewind</application> requires the PostgreSQL
      source code.
    </para>
  </sect2>


 </sect1>

 <sect1 id="switchover-execution" xreflabel="Executing the switchover command">
  <title>Executing the switchover command</title>

  <indexterm>
   <primary>switchover</primary>
    <secondary>execution</secondary>
  </indexterm>
  <para>
   To demonstrate switchover, we will assume a replication cluster with a
   primary (<literal>node1</literal>) and one standby (<literal>node2</literal>);
   after the switchover <literal>node2</literal> should become the primary with
   <literal>node1</literal> following it.
  </para>
  <para>
   The switchover command must be run from the standby which is to be promoted,
   and in its simplest form looks like this:
   <programlisting>
    $ repmgr -f /etc/repmgr.conf standby switchover
    NOTICE: executing switchover on node "node2" (ID: 2)
    INFO: searching for primary node
    INFO: checking if node 1 is primary
    INFO: current primary node is 1
    INFO: SSH connection to host "node1" succeeded
    INFO: archive mode is "off"
    INFO: replication lag on this standby is 0 seconds
    NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
    NOTICE: stopping current primary node "node1" (ID: 1)
    NOTICE: issuing CHECKPOINT
    DETAIL: executing server command "pg_ctl -l /var/log/postgres/startup.log -D '/var/lib/pgsql/data' -m fast -W stop"
    INFO: checking primary status; 1 of 6 attempts
    NOTICE: current primary has been cleanly shut down at location 0/3001460
    NOTICE: promoting standby to primary
    DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' promote"
    server promoting
    NOTICE: STANDBY PROMOTE successful
    DETAIL: server "node2" (ID: 2) was successfully promoted to primary
    INFO: setting node 1's primary to node 2
    NOTICE: starting server using  "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' restart"
    NOTICE: NODE REJOIN successful
    DETAIL: node 1 is now attached to node 2
    NOTICE: switchover was successful
    DETAIL: node "node2" is now primary
    NOTICE: STANDBY SWITCHOVER is complete
   </programlisting>
  </para>
  <para>
   The old primary is now replicating as a standby from the new primary, and the
   cluster status will now look like this:
   <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | standby |   running | node2    | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | primary | * running |          | default  | host=node2 dbname=repmgr user=repmgr
   </programlisting>
  </para>
  <para>
    If &repmgrd; is in use, it's worth double-checking that
    all nodes are unpaused by executing
    <command><link linkend="repmgr-service-status">repmgr service status</link></command>
    (&repmgr; 4.2 - 4.4: <command><link linkend="repmgr-service-status">repmgr daemon status</link></command>).
  </para>

   <note>
     <para>
       Users of &repmgr; versions prior to 4.2 will need to manually restart &repmgrd;
       on all nodes after the switchover is completed.
     </para>
    </note>

 </sect1>


 <sect1 id="switchover-caveats" xreflabel="Caveats">
  <title>Caveats</title>
  <indexterm>
   <primary>switchover</primary>
    <secondary>caveats</secondary>
  </indexterm>
  <para>
   <itemizedlist spacing="compact" mark="bullet">
    <listitem>
     <simpara>
      If using PostgreSQL 9.4, you should ensure that the shutdown command
      is configured to use PostgreSQL's <varname>fast</varname> shutdown mode (the default in 9.5
      and later). If relying on <command>pg_ctl</command> to perform database server operations,
      you should include <literal>-m fast</literal> in <varname>pg_ctl_options</varname>
      in <filename>repmgr.conf</filename>.
     </simpara>
    </listitem>
    <listitem>
     <simpara>
      <command>pg_rewind</command> <emphasis>requires</emphasis> that either <varname>wal_log_hints</varname> is enabled, or that
      data checksums were enabled when the cluster was initialized. See the
      <ulink url="https://www.postgresql.org/docs/current/app-pgrewind.html">pg_rewind documentation</ulink>
      for details.
     </simpara>
    </listitem>
   </itemizedlist>
  </para>
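  <para>
   A sketch of the corresponding settings, assuming <command>pg_ctl</command> is used
   for database server operations and that data checksums were not enabled when the
   cluster was initialized:
   <programlisting>
    # repmgr.conf
    pg_ctl_options = '-m fast'

    # postgresql.conf (changing this setting requires a server restart)
    wal_log_hints = on</programlisting>
  </para>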
 </sect1>

 <sect1 id="switchover-troubleshooting" xreflabel="Troubleshooting">
   <title>Troubleshooting switchover issues</title>

   <indexterm>
     <primary>switchover</primary>
     <secondary>troubleshooting</secondary>
   </indexterm>

   <para>
     As <link linkend="performing-switchover">emphasised previously</link>, performing a switchover
     is a non-trivial operation and there are a number of potential issues which can occur.
     While &repmgr; attempts to perform sanity checks, there's no guaranteed way of determining the success of
     a switchover without actually carrying it out.
   </para>

   <sect2 id="switchover-troubleshooting-primary-shutdown">
     <title>Demotion candidate (old primary) does not shut down</title>
     <para>
       &repmgr; may abort a switchover with a message like:
       <programlisting>
ERROR: shutdown of the primary server could not be confirmed
HINT: check the primary server status before performing any further actions</programlisting>
     </para>
     <para>
       This means the shutdown of the old primary has taken longer than &repmgr; expected,
       and it has given up waiting.
     </para>
     <para>
       In this case, check the PostgreSQL log on the primary server to see what is going
       on. It's entirely possible the shutdown process is just taking longer than the
       timeout set by the configuration parameter <varname>shutdown_check_timeout</varname>
       (default: 60 seconds), in which case you may need to adjust this parameter.
     </para>
     <note>
       <para>
         Note that <varname>shutdown_check_timeout</varname> is set on the node where
         <command>repmgr standby switchover</command> is executed (promotion candidate); setting it on the
         demotion candidate (former primary) will have no effect.
       </para>
     </note>
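     <para>
       For example, to allow up to three minutes for the shutdown to complete, the
       following could be set in <filename>repmgr.conf</filename> on the promotion
       candidate. The value <literal>180</literal> is an arbitrary illustration and
       should be chosen to suit the typical shutdown time of your primary:
       <programlisting>
        shutdown_check_timeout = 180</programlisting>
     </para>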
     <para>
       If the primary server has shut down cleanly, and no other node has been promoted,
       it is safe to restart it, in which case the replication cluster will be restored
       to its original configuration.
     </para>
   </sect2>

   <sect2 id="switchover-troubleshooting-exclusive-backup">
     <title>Switchover aborts with an &quot;exclusive backup&quot; error</title>
     <para>
       &repmgr; may abort a switchover with a message like:
       <programlisting>
ERROR: unable to perform a switchover while primary server is in exclusive backup mode
HINT: stop backup before attempting the switchover</programlisting>
     </para>
     <para>
       This means an exclusive backup is running on the current primary; interrupting this
       will not only abort the backup, but potentially leave the primary with an ambiguous
       backup state.
     </para>
     <para>
       To proceed, either wait until the backup has finished, or cancel it with the command
       <command>SELECT pg_stop_backup()</command>. For more details see the PostgreSQL
       documentation section
       <ulink url="https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-EXCLUSIVE">Making an exclusive low level backup</ulink>.
     </para>
   </sect2>
 </sect1>

</chapter>