repmgr/doc/quickstart.sgml

<chapter id="quickstart" xreflabel="Quick-start guide">
 <title>Quick-start guide</title>

 <indexterm>
   <primary>quickstart</primary>
 </indexterm>

 <para>
  This section gives a quick introduction to &repmgr;, including setting up a
  sample &repmgr; installation and a basic replication cluster.
 </para>
 <para>
  These instructions for demonstration purposes and are not suitable for a production
  install, as issues such as account security considerations, and system administration
  best practices are omitted.
 </para>
 <note>
   <simpara>
     To upgrade an existing &repmgr; 3.x installation, see section
     <xref linkend="upgrading-from-repmgr-3">.
   </simpara>
 </note>

 <sect1 id="quickstart-prerequisites">
   <title>Prerequisites for setting up a basic replication cluster with &repmgr;</title>
    <para>
     The following section will describe how to set up a basic replication cluster
     with a primary and a standby server using the <application>repmgr</application>
     command line tool.
    </para>
    <para>
      We'll assume the primary is called <literal>node1</literal> with IP address
      <literal>192.168.1.11</literal>, and the standby is called <literal>node2</literal>
      with IP address <literal>192.168.1.12</literal>
    </para>
    <para>
     Following software must be installed on both servers:
     <itemizedlist spacing="compact" mark="bullet">
      <listitem>
       <simpara><application>PostgreSQL</application></simpara>
      </listitem>
      <listitem>
       <simpara>
        <application>repmgr</application> (matching the installed
        <application>PostgreSQL</application> major version)
       </simpara>
      </listitem>
     </itemizedlist>
    </para>

    <para>
      At network level, connections between the PostgreSQL port (default: <literal>5432</literal>)
      must be possible in both directions.
    </para>
    <para>
      If you want <application>repmgr</application> to copy configuration files which are
      located outside the PostgreSQL data directory, and/or to test
      <command><link linkend="repmgr-standby-switchover">switchover</link></command>
      functionality, you will also need passwordless SSH connections between both servers, and
      <application>rsync</application> should be installed.
    </para>
    <tip>
     <simpara>
      For testing <application>repmgr</application>, it's possible to use multiple PostgreSQL
      instances running on different ports on the same computer, with
      passwordless SSH access to <filename>localhost</filename> enabled.
     </simpara>
    </tip>
 </sect1>

 <sect1 id="quickstart-postgresql-configuration" xreflabel="PostgreSQL configuration">
   <title>PostgreSQL configuration</title>
   <para>
    On the primary server, a PostgreSQL instance must be initialised and running.
    The following replication settings may need to be adjusted:
   </para>
   <programlisting>

    # Enable replication connections; set this value to at least one more
    # than the number of standbys which will connect to this server
    # (note that repmgr will execute "pg_basebackup" in WAL streaming mode,
    # which requires two free WAL senders).
    #
    # See: https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-WAL-SENDERS

    max_wal_senders = 10

    # If using replication slots, set this value to at least one more
    # than the number of standbys which will connect to this server.
    # Note that repmgr will only make use of replication slots if
    # "use_replication_slots" is set to "true" in "repmgr.conf".
    # (If you are not intending to use replication slots, this value
    # can be set to "0").
    #
    # See: https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-REPLICATION-SLOTS

    max_replication_slots = 10

    # Ensure WAL files contain enough information to enable read-only queries
    # on the standby.
    #
    #  PostgreSQL 9.5 and earlier: one of 'hot_standby' or 'logical'
    #  PostgreSQL 9.6 and later: one of 'replica' or 'logical'
    #    ('hot_standby' will still be accepted as an alias for 'replica')
    #
    # See: https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-LEVEL

    wal_level = 'hot_standby'

    # Enable read-only queries on a standby
    # (Note: this will be ignored on a primary but we recommend including
    # it anyway, in case the primary later becomes a standby)
    #
    # See: https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-HOT-STANDBY

    hot_standby = on

    # Enable WAL file archiving
    #
    # See: https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-MODE

    archive_mode = on

    # Set archive command to a dummy command; this can later be changed without
    # needing to restart the PostgreSQL instance.
    #
    # See: https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-COMMAND

    archive_command = '/bin/true'
   </programlisting>
   <tip>
    <simpara>
      Rather than editing these settings in the default <filename>postgresql.conf</filename>
      file, create a separate file such as <filename>postgresql.replication.conf</filename> and
      include it from the end of the main configuration file with:
     <command>include 'postgresql.replication.conf'</command>.
    </simpara>
   </tip>
   <para>
     Additionally, if you are intending to use <application>pg_rewind</application>,
     and the cluster was not initialised using data checksums, you may want to consider enabling
     <varname>wal_log_hints</varname>; for more details see <xref linkend="repmgr-node-rejoin-pg-rewind">.
   </para>
    <para>
      See also the <link linkend="configuration-postgresql">PostgreSQL configuration</link> section in the
      <link linkend="configuration">repmgr configuration guide</link>.
    </para>
 </sect1>

 <sect1 id="quickstart-repmgr-user-database">
  <title>Create the repmgr user and database</title>
  <para>
   Create a dedicated PostgreSQL superuser account and a database for
   the &repmgr; metadata, e.g.
  </para>
  <programlisting>
   createuser -s repmgr
   createdb repmgr -O repmgr
  </programlisting>

  <para>
   For the examples in this document, the name <literal>repmgr</literal> will be
   used for both user and database, but any names can be used.
  </para>
  <note>
   <para>
    For the sake of simplicity, the <literal>repmgr</literal> user is created
    as a superuser. If desired, it's possible to create the <literal>repmgr</literal>
    user as a normal user. However for certain operations superuser permissions
    are requiredl; in this case the command line option <command>--superuser</command>
    can be provided to specify a superuser.
   </para>
   <para>
    It's also assumed that the <literal>repmgr</literal> user will be used to make the
    replication connection from the standby to the primary; again this can be
    overridden by specifying a separate replication user when registering each node.
   </para>
  </note>

  <tip>
    <para>
     &repmgr; will install the <literal>repmgr</literal> extension, which creates a
     <literal>repmgr</literal> schema containing the &repmgr;'s metadata tables as
     well as other functions and views. We also recommend that you set the
     <literal>repmgr</literal> user's search path to include this schema name, e.g.
     <programlisting>
       ALTER USER repmgr SET search_path TO repmgr, "$user", public;</programlisting>
    </para>
  </tip>

 </sect1>

 <sect1 id="quickstart-authentication">
  <title>Configuring authentication in pg_hba.conf</title>
  <para>
   Ensure the <literal>repmgr</literal> user has appropriate permissions in <filename>pg_hba.conf</filename> and
   can connect in replication mode; <filename>pg_hba.conf</filename> should contain entries
   similar to the following:
  </para>
  <programlisting>
    local   replication   repmgr                              trust
    host    replication   repmgr      127.0.0.1/32            trust
    host    replication   repmgr      192.168.1.0/24          trust

    local   repmgr        repmgr                              trust
    host    repmgr        repmgr      127.0.0.1/32            trust
    host    repmgr        repmgr      192.168.1.0/24          trust
  </programlisting>
  <para>
   Note that these are simple settings for testing purposes.
   Adjust according to your network environment and authentication requirements.
  </para>
 </sect1>

 <sect1 id="quickstart-standby-preparation">
  <title>Preparing the standby</title>
  <para>
   On the standby, do <emphasis>not</emphasis> create a PostgreSQL instance (i.e.
   do not execute <application>initdb</application> or any database creation
   scripts provided by packages), but do ensure the destination
   data directory (and any other directories which you want PostgreSQL to use)
   exist and are owned by the <literal>postgres</literal> system user. Permissions
   must be set to <literal>0700</literal> (<literal>drwx------</literal>).
  </para>
  <tip>
    <simpara>
      &repmgr; will place a copy of the primary's database files in this directory.
      It will however refuse to run if a PostgreSQL instance has already been
      created there.
    </simpara>
  </tip>
  <para>
   Check the primary database is reachable from the standby using <application>psql</application>:
  </para>
  <programlisting>
    psql 'host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlisting>

  <note>
   <para>
    &repmgr; stores connection information as <ulink
    url="https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING">libpq
    connection strings</ulink> throughout. This documentation refers to them as <literal>conninfo</literal>
    strings; an alternative name is <literal>DSN</literal> (<literal>data source name</literal>).
    We'll use these in place of the <command>-h hostname -d databasename -U username</command> syntax.
   </para>
  </note>
 </sect1>

 <sect1 id="quickstart-repmgr-conf">
  <title>repmgr configuration file</title>
  <para>
   Create a <filename>repmgr.conf</filename> file on the primary server. The file must
   contain at least the following parameters:
  </para>
  <programlisting>
    node_id=1
    node_name=node1
    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/data'
  </programlisting>

  <para>
   <filename>repmgr.conf</filename> should not be stored inside the PostgreSQL data directory,
   as it could be overwritten when setting up or reinitialising the PostgreSQL
   server. See sections <xref linkend="configuration"> and <xref linkend="configuration-file">
   for further details about <filename>repmgr.conf</filename>.
  </para>

  <note>
    <para>
      &repmgr; only uses <option>pg_bindir</option> when it executes
      PostgreSQL binaries directly.
    </para>
    <para>
      For user-defined scripts such as <option>promote_command</option> and the
      various <option>service_*_command</option>s, you <emphasis>must</emphasis>
      always explicitly provide the full path to the binary or script being
      executed, even if it is &repmgr; itself.
    </para>
    <para>
      This is because these options can contain user-defined scripts in arbitrary
      locations, so prepending <option>pg_bindir</option> may break them.
    </para>
  </note>

  <tip>
   <simpara>
    For Debian-based distributions we recommend explictly setting
    <option>pg_bindir</option> to the directory where <command>pg_ctl</command> and other binaries
    not in the standard path are located. For PostgreSQL 9.6 this would be <filename>/usr/lib/postgresql/9.6/bin/</filename>.
   </simpara>
  </tip>

  <tip>
    <simpara>
      If your distribution places the &repmgr; binaries in a location other than the
      PostgreSQL installation directory, specify this with <option>repmgr_bindir</option>
      to enable &repmgr; to perform operations (e.g.
      <command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command>)
      on other nodes.
    </simpara>
  </tip>

  <para>
   See the file
   <ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</>
    for details of all available configuration parameters.
  </para>

 </sect1>


 <sect1 id="quickstart-primary-register">
  <title>Register the primary server</title>
  <para>
   To enable &repmgr; to support a replication cluster, the primary node must
   be registered with &repmgr;. This installs the <literal>repmgr</literal>
   extension and metadata objects, and adds a metadata record for the primary server:
  </para>

  <programlisting>
    $ repmgr -f /etc/repmgr.conf primary register
    INFO: connecting to primary database...
    NOTICE: attempting to install extension "repmgr"
    NOTICE: "repmgr" extension successfully installed
    NOTICE: primary node record (id: 1) registered</programlisting>

  <para>
    Verify status of the cluster like this:
  </para>
  <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Connection string
    ----+-------+---------+-----------+----------+--------------------------------------------------------
     1  | node1 | primary | * running |          | host=node1 dbname=repmgr user=repmgr connect_timeout=2
  </programlisting>
  <para>
    The record in the <literal>repmgr</literal> metadata table will look like this:
  </para>
  <programlisting>
    repmgr=# SELECT * FROM repmgr.nodes;
    -[ RECORD 1 ]----+-------------------------------------------------------
    node_id          | 1
    upstream_node_id |
    active           | t
    node_name        | node1
    type             | primary
    location         | default
    priority         | 100
    conninfo         | host=node1 dbname=repmgr user=repmgr connect_timeout=2
    repluser         | repmgr
    slot_name        |
    config_file      | /etc/repmgr.conf</programlisting>
  <para>
    Each server in the replication cluster will have its own record. If <application>repmgrd</application>
    is in use, the fields <literal>upstream_node_id</literal>, <literal>active</literal> and
    <literal>type</literal> will be updated when the node's status or role changes.
  </para>
 </sect1>

 <sect1 id="quickstart-standby-clone">
  <title>Clone the standby server</title>
  <para>
   Create a <filename>repmgr.conf</filename> file on the standby server. It must contain at
   least the same parameters as the primary's <filename>repmgr.conf</filename>, but with
   the mandatory values <literal>node</literal>, <literal>node_name</literal>, <literal>conninfo</literal>
   (and possibly <literal>data_directory</literal>) adjusted accordingly, e.g.:
  </para>
  <programlisting>
    node_id=2
    node_name=node2
    conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/data'</programlisting>
  <para>
   Use the <command>--dry-run</command> option to check the standby can be cloned:
  </para>
  <programlisting>
    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run
    NOTICE: using provided configuration file "/etc/repmgr.conf"
    NOTICE: destination directory "/var/lib/postgresql/data" provided
    INFO: connecting to source node
    NOTICE: checking for available walsenders on source node (2 required)
    INFO: sufficient walsenders available on source node (2 required)
    NOTICE: standby will attach to upstream node 1
    HINT: consider using the -c/--fast-checkpoint option
    INFO: all prerequisites for "standby clone" are met</programlisting>
  <para>
    If no problems are reported, the standby can then be cloned with:
  </para>
  <programlisting>
    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone

    NOTICE: using configuration file "/etc/repmgr.conf"
    NOTICE: destination directory "/var/lib/postgresql/data" provided
    INFO: connecting to source node
    NOTICE: checking for available walsenders on source node (2 required)
    INFO: sufficient walsenders available on source node (2 required)
    INFO: creating directory "/var/lib/postgresql/data"...
    NOTICE: starting backup (using pg_basebackup)...
    HINT: this may take some time; consider using the -c/--fast-checkpoint option
    INFO: executing:
      pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h node1 -U repmgr -X stream
    NOTICE: standby clone (using pg_basebackup) complete
    NOTICE: you can now start your PostgreSQL server
    HINT: for example: pg_ctl -D /var/lib/postgresql/data start
  </programlisting>
  <para>
   This has cloned the PostgreSQL data directory files from the primary <literal>node1</literal>
   using PostgreSQL's <command>pg_basebackup</command> utility. A <filename>recovery.conf</filename>
   file containing the correct parameters to start streaming from this primary server will be created
   automatically.
  </para>
  <note>
   <simpara>
    By default, any configuration files in the primary's data directory will be
    copied to the standby. Typically these will be <filename>postgresql.conf</filename>,
    <filename>postgresql.auto.conf</filename>, <filename>pg_hba.conf</filename> and
    <filename>pg_ident.conf</filename>. These may require modification before the standby
    is started.
   </simpara>
  </note>
  <para>
   Make any adjustments to the standby's PostgreSQL configuration files now,
   then start the server.
  </para>
  <para>
   For more details on <command>repmgr standby clone</command>, see the
   <link linkend="repmgr-standby-clone">command reference</link>.
   A more detailed overview of cloning options is available in the
   <link linkend="cloning-standbys">administration manual</link>.
  </para>
 </sect1>

 <sect1 id="quickstart-verify-replication">
  <title>Verify replication is functioning</title>
  <para>
   Connect to the primary server and execute:
   <programlisting>
    repmgr=# SELECT * FROM pg_stat_replication;
    -[ RECORD 1 ]----+------------------------------
    pid              | 19111
    usesysid         | 16384
    usename          | repmgr
    application_name | node2
    client_addr      | 192.168.1.12
    client_hostname  |
    client_port      | 50378
    backend_start    | 2017-08-28 15:14:19.851581+09
    backend_xmin     |
    state            | streaming
    sent_location    | 0/7000318
    write_location   | 0/7000318
    flush_location   | 0/7000318
    replay_location  | 0/7000318
    sync_priority    | 0
    sync_state       | async</programlisting>
   This shows that the previously cloned standby (<literal>node2</literal> shown in the field
   <literal>application_name</literal>) has connected to the primary from IP address
   <literal>192.168.1.12</literal>.
  </para>
  <para>
    From PostgreSQL 9.6 you can also use the view
    <ulink url="https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-WAL-RECEIVER-VIEW">
    <literal>pg_stat_wal_receiver</literal></ulink> to check the replication status from the standby.

   <programlisting>
    repmgr=# SELECT * FROM pg_stat_wal_receiver;
    Expanded display is on.
    -[ RECORD 1 ]---------+--------------------------------------------------------------------------------
    pid                   | 18236
    status                | streaming
    receive_start_lsn     | 0/3000000
    receive_start_tli     | 1
    received_lsn          | 0/7000538
    received_tli          | 1
    last_msg_send_time    | 2017-08-28 15:21:26.465728+09
    last_msg_receipt_time | 2017-08-28 15:21:26.465774+09
    latest_end_lsn        | 0/7000538
    latest_end_time       | 2017-08-28 15:20:56.418735+09
    slot_name             |
    conninfo              | user=repmgr dbname=replication host=node1 application_name=node2
   </programlisting>
   Note that the <varname>conninfo</varname> value is that generated in <filename>recovery.conf</filename>
   and will differ slightly from the primary's <varname>conninfo</varname> as set in <filename>repmgr.conf</filename> -
   among others it will contain the connecting node's name as <varname>application_name</varname>.
  </para>
 </sect1>

 <sect1 id="quickstart-register-standby">
  <title>Register the standby</title>
  <para>
    Register the standby server with:
    <programlisting>
    $ repmgr -f /etc/repmgr.conf standby register
    NOTICE: standby node "node2" (ID: 2) successfully registered</programlisting>
  </para>
  <para>
    Check the node is registered by executing <command>repmgr cluster show</command> on the standby:
    <programlisting>
    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr</programlisting>
  </para>
  <para>
   Both nodes are now registered with &repmgr; and the records have been copied to the standby server.
  </para>
 </sect1>

</chapter>