mirror of
https://github.com/EnterpriseDB/repmgr.git
synced 2026-03-23 07:06:30 +00:00
Split up command reference
@@ -1,610 +0,0 @@
|
||||
<chapter id="command-reference" xreflabel="command reference">
|
||||
<title>repmgr command reference</title>
|
||||
|
||||
<para>
|
||||
Overview of repmgr commands.
|
||||
</para>
|
||||
|
||||
<sect1 id="repmgr-primary-register" xreflabel="repmgr primary register">
|
||||
<indexterm><primary>repmgr primary register</primary></indexterm>
|
||||
<title>repmgr primary register</title>
|
||||
<para>
|
||||
<command>repmgr primary register</command> registers a primary node in a
|
||||
streaming replication cluster, and configures it for use with repmgr, including
|
||||
installing the &repmgr; extension. This command needs to be executed before any
|
||||
standby nodes are registered.
|
||||
</para>
|
||||
<para>
|
||||
Execute with the <literal>--dry-run</literal> option to check what would happen without
|
||||
actually registering the primary.
|
||||
</para>
|
||||
<para>
|
||||
<command>repmgr master register</command> can be used as an alias for
|
||||
<command>repmgr primary register</command>.
|
||||
</para>
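<para>
The dry-run check and the actual registration can be chained so that the primary is
only registered once the check has passed. The following is a minimal sketch, not part
of &repmgr; itself: the function name <literal>register_primary</literal> and the
<literal>REPMGR</literal> override variable are hypothetical, and it assumes that
<literal>--dry-run</literal> exits with a non-zero status when its checks fail.
</para>

```shell
# Command used to invoke repmgr. It can be overridden (e.g. REPMGR=echo)
# to rehearse the sequence on a machine without repmgr installed.
REPMGR=${REPMGR:-repmgr}

# Hypothetical helper: run the dry-run check first and register the primary
# only if that check succeeds.
# Assumption: --dry-run exits non-zero when its checks fail.
register_primary() {
    conf=${1:-/etc/repmgr.conf}
    "$REPMGR" -f "$conf" primary register --dry-run || return 1
    "$REPMGR" -f "$conf" primary register
}
```

<para>
Calling <literal>register_primary /etc/repmgr.conf</literal> with
<literal>REPMGR=echo</literal> set prints the two commands instead of executing them,
which allows the sequence to be reviewed first.
</para>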
|
||||
</sect1>
|
||||
|
||||
<sect1 id="repmgr-primary-unregister" xreflabel="repmgr primary unregister">
|
||||
<indexterm><primary>repmgr primary unregister</primary></indexterm>
|
||||
<title>repmgr primary unregister</title>
|
||||
<para>
|
||||
<command>repmgr primary unregister</command> unregisters an inactive primary node
|
||||
from the &repmgr; metadata. This is typically done when the primary has failed and is
|
||||
being removed from the cluster after a new primary has been promoted.
|
||||
</para>
|
||||
<para>
|
||||
Execute with the <literal>--dry-run</literal> option to check what would happen without
|
||||
actually unregistering the node.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<command>repmgr master unregister</command> can be used as an alias for
|
||||
<command>repmgr primary unregister</command>.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="repmgr-standby-clone" xreflabel="repmgr standby clone">
|
||||
<indexterm>
|
||||
<primary>repmgr standby clone</primary>
|
||||
<seealso>cloning</seealso>
|
||||
</indexterm>
|
||||
<title>repmgr standby clone</title>
|
||||
<para>
|
||||
<command>repmgr standby clone</command> clones a PostgreSQL node from another
|
||||
PostgreSQL node, typically the primary, but optionally from any other node in
|
||||
the cluster or from Barman. It creates the <filename>recovery.conf</filename> file required
|
||||
to attach the cloned node to the primary node (or another standby, if cascading replication
|
||||
is in use).
|
||||
</para>
|
||||
<note>
|
||||
<simpara>
|
||||
<command>repmgr standby clone</command> does not start the standby, and after cloning
|
||||
<command>repmgr standby register</command> must be executed to notify &repmgr; of its presence.
|
||||
</simpara>
|
||||
</note>
|
||||
|
||||
|
||||
<sect2 id="repmgr-standby-clone-config-file-copying" xreflabel="Copying configuration files">
|
||||
<title>Handling configuration files</title>
|
||||
|
||||
<para>
|
||||
Note that by default, all configuration files in the source node's data
|
||||
directory will be copied to the cloned node. Typically these will be
|
||||
<filename>postgresql.conf</filename>, <filename>postgresql.auto.conf</filename>,
|
||||
<filename>pg_hba.conf</filename> and <filename>pg_ident.conf</filename>.
|
||||
These may require modification before the standby is started.
|
||||
</para>
|
||||
<para>
|
||||
In some cases (e.g. on Debian or Ubuntu Linux installations), PostgreSQL's
|
||||
configuration files are located outside of the data directory and will
|
||||
not be copied by default. &repmgr; can copy these files, either to the same
|
||||
location on the standby server (provided appropriate directory and file permissions
|
||||
are available), or into the standby's data directory. This requires passwordless
|
||||
SSH access to the primary server. Add the option <literal>--copy-external-config-files</literal>
|
||||
to the <command>repmgr standby clone</command> command; by default files will be copied to
|
||||
the same path as on the upstream server. Note that the user executing <command>repmgr</command>
|
||||
must have write access to those directories.
|
||||
</para>
|
||||
<para>
|
||||
To have the configuration files placed in the standby's data directory, specify
|
||||
<literal>--copy-external-config-files=pgdata</literal>, but note that
|
||||
any include directives in the copied files may need to be updated.
|
||||
</para>
|
||||
<tip>
|
||||
<simpara>
|
||||
For reliable configuration file management we recommend using a
|
||||
configuration management tool such as Ansible, Chef, Puppet or Salt.
|
||||
</simpara>
|
||||
</tip>
|
||||
</sect2>
|
||||
|
||||
<sect2 id="repmgr-standby-clone-wal-management" xreflabel="Managing WAL during the cloning process">
|
||||
<title>Managing WAL during the cloning process</title>
|
||||
<para>
|
||||
When initially cloning a standby, you will need to ensure
|
||||
that all required WAL files remain available while the cloning is taking
|
||||
place. To ensure this happens when using the default <command>pg_basebackup</command> method,
|
||||
&repmgr; will set <command>pg_basebackup</command>'s <literal>--xlog-method</literal>
|
||||
parameter to <literal>stream</literal>,
|
||||
which will ensure all WAL files generated during the cloning process are
|
||||
streamed in parallel with the main backup. Note that this requires two
|
||||
replication connections to be available (&repmgr; will verify sufficient
|
||||
connections are available before attempting to clone, and this can be checked
|
||||
before performing the clone using the <literal>--dry-run</literal> option).
|
||||
</para>
|
||||
<para>
|
||||
To override this behaviour, in <filename>repmgr.conf</filename> set
|
||||
<command>pg_basebackup</command>'s <literal>--xlog-method</literal>
|
||||
parameter to <literal>fetch</literal>:
|
||||
<programlisting>
|
||||
pg_basebackup_options='--xlog-method=fetch'</programlisting>
|
||||
|
||||
and ensure that <literal>wal_keep_segments</literal> is set to an appropriately high value.
|
||||
See the <ulink url="https://www.postgresql.org/docs/current/static/app-pgbasebackup.html">
|
||||
pg_basebackup</ulink> documentation for details.
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<simpara>
|
||||
From PostgreSQL 10, <command>pg_basebackup</command>'s
|
||||
<literal>--xlog-method</literal> parameter has been renamed to
|
||||
<literal>--wal-method</literal>.
|
||||
</simpara>
|
||||
</note>
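<para>
Because the option name depends on the PostgreSQL version, a provisioning script may
need to pick the right one when generating <filename>repmgr.conf</filename>. The
following is a minimal sketch of that choice; the helper name
<literal>wal_method_option</literal> is hypothetical, and the caller is assumed to
pass the PostgreSQL major version as an integer (e.g. <literal>9</literal> for 9.6).
</para>

```shell
# Hypothetical helper: print the pg_basebackup WAL method option that matches
# a PostgreSQL major version, following the rename described in the note above.
wal_method_option() {
    major=$1    # PostgreSQL major version as an integer, e.g. 9 or 10
    method=$2   # "stream" or "fetch"
    if [ "$major" -ge 10 ]; then
        opt='--wal-method'
    else
        opt='--xlog-method'
    fi
    printf '%s=%s\n' "$opt" "$method"
}

# Build the repmgr.conf line for a PostgreSQL 10 server using the fetch method.
printf "pg_basebackup_options='%s'\n" "$(wal_method_option 10 fetch)"
```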
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="repmgr-standby-register" xreflabel="repmgr standby register">
|
||||
<indexterm><primary>repmgr standby register</primary></indexterm>
|
||||
<title>repmgr standby register</title>
|
||||
<para>
|
||||
<command>repmgr standby register</command> adds a standby's information to
|
||||
the &repmgr; metadata. This command needs to be executed to enable
|
||||
promote/follow operations and to allow <command>repmgrd</command> to work with the node.
|
||||
An existing standby can be registered using this command. Execute with the
|
||||
<literal>--dry-run</literal> option to check what would happen without actually registering the
|
||||
standby.
|
||||
</para>
|
||||
|
||||
<sect2 id="repmgr-standby-register-wait" xreflabel="repmgr standby register --wait">
|
||||
<title>Waiting for the registration to propagate to the standby</title>
|
||||
<para>
|
||||
Depending on your environment and workload, it may take some time for
|
||||
the standby's node record to propagate from the primary to the standby. Some
|
||||
actions (such as starting <command>repmgrd</command>) require that the standby's node record
|
||||
is present and up-to-date to function correctly.
|
||||
</para>
|
||||
<para>
|
||||
By providing the option <literal>--wait-sync</literal> to the
|
||||
<command>repmgr standby register</command> command, &repmgr; will wait
|
||||
until the record is synchronised before exiting. An optional timeout (in
|
||||
seconds) can be added to this option (e.g. <literal>--wait-sync=60</literal>).
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2 id="rempgr-standby-register-inactive-node" xreflabel="Registering an inactive node">
|
||||
<title>Registering an inactive node</title>
|
||||
<para>
|
||||
Under some circumstances you may wish to register a standby which is not
|
||||
yet running; this can be the case when using provisioning tools to create
|
||||
a complex replication cluster. In this case, by using the <literal>-F/--force</literal>
|
||||
option and providing the connection parameters to the primary server,
|
||||
the standby can be registered.
|
||||
</para>
|
||||
<para>
|
||||
Similarly, with cascading replication it may be necessary to register
|
||||
a standby whose upstream node has not yet been registered - in this case,
|
||||
using <literal>-F/--force</literal> will result in the creation of an inactive placeholder
|
||||
record for the upstream node, which will later need to be registered
|
||||
with the <literal>-F/--force</literal> option too.
|
||||
</para>
|
||||
<para>
|
||||
When used with <command>repmgr standby register</command>, care should be taken that use of the
|
||||
<literal>-F/--force</literal> option does not result in an incorrectly configured cluster.
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="repmgr-standby-unregister" xreflabel="repmgr standby unregister">
|
||||
<indexterm><primary>repmgr standby unregister</primary></indexterm>
|
||||
<title>repmgr standby unregister</title>
|
||||
<para>
|
||||
Unregisters a standby with &repmgr;. This command does not affect the actual
|
||||
replication; it only removes the standby's entry from the &repmgr; metadata.
|
||||
</para>
|
||||
<para>
|
||||
To unregister a running standby, execute:
|
||||
<programlisting>
|
||||
repmgr standby unregister -f /etc/repmgr.conf</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
This will remove the standby record from &repmgr;'s internal metadata
|
||||
table (<literal>repmgr.nodes</literal>). A <literal>standby_unregister</literal>
|
||||
event notification will be recorded in the <literal>repmgr.events</literal> table.
|
||||
</para>
|
||||
<para>
|
||||
If the standby is not running, the command can be executed on another
|
||||
node by providing the id of the node to be unregistered using
|
||||
the command line parameter <literal>--node-id</literal>, e.g. executing the following
|
||||
command on the primary server will unregister the standby with
|
||||
id <literal>3</literal>:
|
||||
<programlisting>
|
||||
repmgr standby unregister -f /etc/repmgr.conf --node-id=3
|
||||
</programlisting>
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="repmgr-standby-promote" xreflabel="repmgr standby promote">
|
||||
<indexterm>
|
||||
<primary>repmgr standby promote</primary>
|
||||
</indexterm>
|
||||
<title>repmgr standby promote</title>
|
||||
<para>
|
||||
Promotes a standby to a primary if the current primary has failed. This
|
||||
command requires a valid <filename>repmgr.conf</filename> file for the standby, either
|
||||
specified explicitly with <literal>-f/--config-file</literal> or located in a
|
||||
default location; no additional arguments are required.
|
||||
</para>
|
||||
<para>
|
||||
If the standby promotion succeeds, the server will not need to be
|
||||
restarted. However, any other standbys will need to follow the new primary,
|
||||
by using <xref linkend="repmgr-standby-follow">; if <command>repmgrd</command> is active, it will
|
||||
handle this automatically.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1 id="repmgr-standby-follow" xreflabel="repmgr standby follow">
|
||||
<indexterm>
|
||||
<primary>repmgr standby follow</primary>
|
||||
</indexterm>
|
||||
<title>repmgr standby follow</title>
|
||||
<para>
|
||||
Attaches the standby to a new primary. This command requires a valid
|
||||
<filename>repmgr.conf</filename> file for the standby, either specified
|
||||
explicitly with <literal>-f/--config-file</literal> or located in a
|
||||
default location; no additional arguments are required.
|
||||
</para>
|
||||
<para>
|
||||
This command will force a restart of the standby server, which must be
|
||||
running. It can only be used to attach a standby to a new primary node.
|
||||
</para>
|
||||
<para>
|
||||
To re-add an inactive node to the replication cluster, see
|
||||
<xref linkend="repmgr-node-rejoin">.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
|
||||
|
||||
<sect1 id="repmgr-standby-switchover" xreflabel="repmgr standby switchover">
|
||||
<indexterm>
|
||||
<primary>repmgr standby switchover</primary>
|
||||
</indexterm>
|
||||
<title>repmgr standby switchover</title>
|
||||
<para>
|
||||
Promotes a standby to primary and demotes the existing primary to a standby.
|
||||
This command must be run on the standby to be promoted, and requires a
|
||||
passwordless SSH connection to the current primary.
|
||||
</para>
|
||||
<para>
|
||||
If other standbys are connected to the demotion candidate, &repmgr; can instruct
|
||||
these to follow the new primary if the option <literal>--siblings-follow</literal>
|
||||
is specified.
|
||||
</para>
|
||||
<para>
|
||||
Execute with the <literal>--dry-run</literal> option to test the switchover as far as
|
||||
possible without actually changing the status of either node.
|
||||
</para>
|
||||
<para>
|
||||
<command>repmgrd</command> should not be active on any nodes while a switchover is being
|
||||
executed. This restriction may be lifted in a later version.
|
||||
</para>
|
||||
<para>
|
||||
For more details see the section <xref linkend="performing-switchover">.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="repmgr-node-status" xreflabel="repmgr node status">
|
||||
<indexterm>
|
||||
<primary>repmgr node status</primary>
|
||||
</indexterm>
|
||||
<title>repmgr node status</title>
|
||||
<para>
|
||||
Displays an overview of a node's basic information and replication
|
||||
status. This command must be run on the local node.
|
||||
</para>
|
||||
<para>
|
||||
Sample output (execute <command>repmgr node status</command>):
|
||||
<programlisting>
|
||||
Node "node1":
|
||||
PostgreSQL version: 10beta1
|
||||
Total data size: 30 MB
|
||||
Conninfo: host=node1 dbname=repmgr user=repmgr connect_timeout=2
|
||||
Role: primary
|
||||
WAL archiving: off
|
||||
Archive command: (none)
|
||||
Replication connections: 2 (of maximal 10)
|
||||
Replication slots: 0 (of maximal 10)
|
||||
Replication lag: n/a
|
||||
</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
See <xref linkend="repmgr-node-check"> to diagnose issues.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="repmgr-node-check" xreflabel="repmgr node check">
|
||||
<indexterm>
|
||||
<primary>repmgr node check</primary>
|
||||
</indexterm>
|
||||
<title>repmgr node check</title>
|
||||
<para>
|
||||
Performs some health checks on a node from a replication perspective.
|
||||
This command must be run on the local node.
|
||||
</para>
|
||||
<para>
|
||||
Sample output (execute <command>repmgr node check</command>):
|
||||
<programlisting>
|
||||
Node "node1":
|
||||
Server role: OK (node is primary)
|
||||
Replication lag: OK (N/A - node is primary)
|
||||
WAL archiving: OK (0 pending files)
|
||||
Downstream servers: OK (2 of 2 downstream nodes attached)
|
||||
Replication slots: OK (node has no replication slots)
|
||||
</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
Additionally each check can be performed individually by supplying
|
||||
an additional command line parameter, e.g.:
|
||||
<programlisting>
|
||||
$ repmgr node check --role
|
||||
OK (node is primary)
|
||||
</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
Parameters for individual checks are as follows:
|
||||
<itemizedlist spacing="compact" mark="bullet">
|
||||
|
||||
<listitem>
|
||||
<simpara>
|
||||
<literal>--role</literal>: checks if the node has the expected role
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<simpara>
|
||||
<literal>--replication-lag</literal>: checks if the node is lagging by more than
|
||||
<varname>replication_lag_warning</varname> or <varname>replication_lag_critical</varname>
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<simpara>
|
||||
<literal>--archive-ready</literal>: checks for WAL files which have not yet been archived
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<simpara>
|
||||
<literal>--downstream</literal>: checks that the expected downstream nodes are attached
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<simpara>
|
||||
<literal>--slots</literal>: checks there are no inactive replication slots
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
||||
</itemizedlist>
|
||||
</para>
|
||||
<para>
|
||||
Individual checks can also be output in a Nagios-compatible format by additionally
|
||||
providing the option <literal>--nagios</literal>.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="repmgr-node-rejoin" xreflabel="repmgr node rejoin">
|
||||
<indexterm>
|
||||
<primary>repmgr node rejoin</primary>
|
||||
</indexterm>
|
||||
<title>repmgr node rejoin</title>
|
||||
<para>
|
||||
Enables a dormant (stopped) node to be rejoined to the replication cluster.
|
||||
</para>
|
||||
<para>
|
||||
This can optionally use <command>pg_rewind</command> to re-integrate a node which has diverged
|
||||
from the rest of the cluster, typically a failed primary.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="repmgr-cluster-show" xreflabel="repmgr cluster show">
|
||||
<indexterm>
|
||||
<primary>repmgr cluster show</primary>
|
||||
</indexterm>
|
||||
<title>repmgr cluster show</title>
|
||||
<para>
|
||||
Displays information about each active node in the replication cluster. This
|
||||
command polls each registered server and shows its role (<literal>primary</literal> /
|
||||
<literal>standby</literal> / <literal>bdr</literal>) and status. It polls each server
|
||||
directly and can be run on any node in the cluster; this is also useful when analyzing
|
||||
connectivity from a particular node.
|
||||
</para>
|
||||
<para>
|
||||
This command requires either a valid <filename>repmgr.conf</filename> file or a database
|
||||
connection string to one of the registered nodes; no additional arguments are needed.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Example:
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster show
|
||||
|
||||
ID | Name | Role | Status | Upstream | Location | Connection string
|
||||
----+-------+---------+-----------+----------+----------+-----------------------------------------
|
||||
1 | node1 | primary | * running | | default | host=db_node1 dbname=repmgr user=repmgr
|
||||
2 | node2 | standby | running | node1 | default | host=db_node2 dbname=repmgr user=repmgr
|
||||
3 | node3 | standby | running | node1 | default | host=db_node3 dbname=repmgr user=repmgr</programlisting>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
To show database connection errors when polling nodes, run the command in
|
||||
<literal>--verbose</literal> mode.
|
||||
</para>
|
||||
<para>
|
||||
The <command>repmgr cluster show</command> command accepts an optional parameter <literal>--csv</literal>, which
|
||||
outputs the replication cluster's status in a simple CSV format, suitable for
|
||||
parsing by scripts:
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster show --csv
|
||||
1,-1,-1
|
||||
2,0,0
|
||||
3,0,1</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
The columns have the following meanings:
|
||||
<itemizedlist spacing="compact" mark="bullet">
|
||||
<listitem>
<simpara>
node ID
</simpara>
</listitem>
<listitem>
<simpara>
availability (0 = available, -1 = unavailable)
</simpara>
</listitem>
<listitem>
<simpara>
recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown)
</simpara>
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
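<para>
The three CSV columns described above can be turned into a readable summary with a
short script. The following is a minimal sketch rather than a &repmgr; feature: the
helper name <literal>summarise_cluster_csv</literal> is hypothetical, and it assumes
the output contains exactly the three comma-separated columns listed above.
</para>

```shell
# Hypothetical helper: summarise `repmgr cluster show --csv` output from stdin.
# Column meanings, as documented above:
#   1: node ID
#   2: availability   (0 = available, -1 = unavailable)
#   3: recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown)
summarise_cluster_csv() {
    awk -F, '
        NF == 3 {
            avail = ($2 == 0) ? "available" : "unavailable"
            if ($3 == 0)      state = "not in recovery"
            else if ($3 == 1) state = "in recovery"
            else              state = "unknown"
            printf "node %s: %s, %s\n", $1, avail, state
        }'
}

# With a live cluster the helper would be used as:
#   repmgr -f /etc/repmgr.conf cluster show --csv | summarise_cluster_csv
# Here the sample output shown above is fed in directly.
printf '1,-1,-1\n2,0,0\n3,0,1\n' | summarise_cluster_csv
```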
|
||||
|
||||
<para>
|
||||
Note that the availability is tested by connecting from the node where
|
||||
<command>repmgr cluster show</command> is executed, and does not necessarily imply the node
|
||||
is down. See <xref linkend="repmgr-cluster-matrix"> and <xref linkend="repmgr-cluster-crosscheck"> to get
|
||||
a better overview of connections between nodes.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="repmgr-cluster-matrix" xreflabel="repmgr cluster matrix">
|
||||
<indexterm>
|
||||
<primary>repmgr cluster matrix</primary>
|
||||
</indexterm>
|
||||
<title>repmgr cluster matrix</title>
|
||||
<para>
|
||||
<command>repmgr cluster matrix</command> runs <command>repmgr cluster show</command> on each
|
||||
node and arranges the results in a matrix, recording success or failure.
|
||||
</para>
|
||||
<para>
|
||||
<command>repmgr cluster matrix</command> requires a valid <filename>repmgr.conf</filename>
|
||||
file on each node. Additionally, passwordless SSH connections are required between
|
||||
all nodes.
|
||||
</para>
|
||||
<para>
|
||||
Example 1 (all nodes up):
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster matrix
|
||||
|
||||
Name | Id | 1 | 2 | 3
|
||||
-------+----+----+----+----
|
||||
node1 | 1 | * | * | *
|
||||
node2 | 2 | * | * | *
|
||||
node3 | 3 | * | * | *</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
Example 2 (<literal>node1</literal> and <literal>node2</literal> up, <literal>node3</literal> down):
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster matrix
|
||||
|
||||
Name | Id | 1 | 2 | 3
|
||||
-------+----+----+----+----
|
||||
node1 | 1 | * | * | x
|
||||
node2 | 2 | * | * | x
|
||||
node3 | 3 | ? | ? | ?
|
||||
</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
Each row corresponds to one server, and indicates the result of
|
||||
testing an outbound connection from that server.
|
||||
</para>
|
||||
<para>
|
||||
Since <literal>node3</literal> is down, all the entries in its row are filled with
|
||||
<literal>?</literal>, meaning that we cannot test its outbound connections.
|
||||
</para>
|
||||
<para>
|
||||
The other two nodes are up; the corresponding rows have <literal>x</literal> in the
|
||||
column corresponding to <literal>node3</literal>, meaning that inbound connections to
|
||||
that node have failed, and <literal>*</literal> in the columns corresponding to
|
||||
<literal>node1</literal> and <literal>node2</literal>, meaning that inbound connections
|
||||
to these nodes have succeeded.
|
||||
</para>
|
||||
<para>
|
||||
Example 3 (all nodes up, firewall dropping packets originating
|
||||
from <literal>node1</literal> and directed to port 5432 on <literal>node3</literal>) -
|
||||
running <command>repmgr cluster matrix</command> from <literal>node1</literal> gives the following output:
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster matrix
|
||||
|
||||
Name | Id | 1 | 2 | 3
|
||||
-------+----+----+----+----
|
||||
node1 | 1 | * | * | x
|
||||
node2 | 2 | * | * | *
|
||||
node3 | 3 | ? | ? | ?</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
Note that this may take some time, depending on the <varname>connect_timeout</varname>
setting in the node <varname>conninfo</varname> strings. The default is
<literal>1 minute</literal>, which means that without modification the above
command would take around 2 minutes to run (see the comment elsewhere about setting
<varname>connect_timeout</varname>).
|
||||
</para>
|
||||
<para>
|
||||
The matrix tells us that we cannot connect from <literal>node1</literal> to <literal>node3</literal>,
|
||||
and that (therefore) we don't know the state of any outbound
|
||||
connection from <literal>node3</literal>.
|
||||
</para>
|
||||
<para>
|
||||
In this case, the <xref linkend="repmgr-cluster-crosscheck"> command will produce a more
|
||||
useful result.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="repmgr-cluster-crosscheck" xreflabel="repmgr cluster crosscheck">
|
||||
<indexterm>
|
||||
<primary>repmgr cluster crosscheck</primary>
|
||||
</indexterm>
|
||||
<title>repmgr cluster crosscheck</title>
|
||||
<para>
|
||||
<command>repmgr cluster crosscheck</command> is similar to <xref linkend="repmgr-cluster-matrix">,
|
||||
but cross-checks connections between each combination of nodes. In "Example 3" in
|
||||
<xref linkend="repmgr-cluster-matrix"> we have no information about the state of <literal>node3</literal>.
|
||||
However by running <command>repmgr cluster crosscheck</command> it's possible to get a better
|
||||
overview of the cluster situation:
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster crosscheck
|
||||
|
||||
Name | Id | 1 | 2 | 3
|
||||
-------+----+----+----+----
|
||||
node1 | 1 | * | * | x
|
||||
node2 | 2 | * | * | *
|
||||
node3 | 3 | * | * | *</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
What happened is that <command>repmgr cluster crosscheck</command> merged its own
|
||||
<command>repmgr cluster matrix</command> with the <command>repmgr cluster matrix</command>
|
||||
output from <literal>node2</literal>; the latter is able to connect to <literal>node3</literal>
|
||||
and therefore determine the state of outbound connections from that node.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="repmgr-cluster-cleanup" xreflabel="repmgr cluster cleanup">
|
||||
<indexterm>
|
||||
<primary>repmgr cluster cleanup</primary>
|
||||
</indexterm>
|
||||
<title>repmgr cluster cleanup</title>
|
||||
<para>
|
||||
Purges monitoring history from the <literal>repmgr.monitoring_history</literal> table to
|
||||
prevent excessive table growth. Use the <literal>-k/--keep-history</literal> option to specify the
|
||||
number of days of monitoring history to retain. This command can be used
|
||||
manually or as a cronjob.
|
||||
</para>
|
||||
<para>
|
||||
This command requires a valid <filename>repmgr.conf</filename> file for the node on which it is
|
||||
executed; no additional arguments are required.
|
||||
</para>
|
||||
<note>
|
||||
<simpara>
|
||||
Monitoring history will only be written if <command>repmgrd</command> is active, and
|
||||
<varname>monitoring_history</varname> is set to <literal>true</literal> in <filename>repmgr.conf</filename>.
|
||||
</simpara>
|
||||
</note>
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
||||
@@ -43,7 +43,23 @@
|
||||
<!ENTITY promoting-standby SYSTEM "promoting-standby.sgml">
|
||||
<!ENTITY follow-new-primary SYSTEM "follow-new-primary.sgml">
|
||||
<!ENTITY switchover SYSTEM "switchover.sgml">
|
||||
<!ENTITY command-reference SYSTEM "command-reference.sgml">
|
||||
<!ENTITY repmgr-primary-register SYSTEM "repmgr-primary-register.sgml">
|
||||
<!ENTITY repmgr-primary-unregister SYSTEM "repmgr-primary-unregister.sgml">
|
||||
<!ENTITY repmgr-standby-clone SYSTEM "repmgr-standby-clone.sgml">
|
||||
<!ENTITY repmgr-standby-register SYSTEM "repmgr-standby-register.sgml">
|
||||
<!ENTITY repmgr-standby-unregister SYSTEM "repmgr-standby-unregister.sgml">
|
||||
<!ENTITY repmgr-standby-promote SYSTEM "repmgr-standby-promote.sgml">
|
||||
<!ENTITY repmgr-standby-follow SYSTEM "repmgr-standby-follow.sgml">
|
||||
<!ENTITY repmgr-standby-switchover SYSTEM "repmgr-standby-switchover.sgml">
|
||||
<!ENTITY repmgr-node-status SYSTEM "repmgr-node-status.sgml">
|
||||
<!ENTITY repmgr-node-check SYSTEM "repmgr-node-check.sgml">
|
||||
<!ENTITY repmgr-node-rejoin SYSTEM "repmgr-node-rejoin.sgml">
|
||||
<!ENTITY repmgr-cluster-show SYSTEM "repmgr-cluster-show.sgml">
|
||||
<!ENTITY repmgr-cluster-matrix SYSTEM "repmgr-cluster-matrix.sgml">
|
||||
<!ENTITY repmgr-cluster-crosscheck SYSTEM "repmgr-cluster-crosscheck.sgml">
|
||||
<!ENTITY repmgr-cluster-cleanup SYSTEM "repmgr-cluster-cleanup.sgml">
|
||||
|
||||
|
||||
<!ENTITY appendix-signatures SYSTEM "appendix-signatures.sgml">
|
||||
|
||||
<!ENTITY bookindex SYSTEM "bookindex.sgml">
|
||||
|
||||
22
doc/repmgr-cluster-cleanup.sgml
Normal file
@@ -0,0 +1,22 @@
|
||||
<chapter id="repmgr-cluster-cleanup" xreflabel="repmgr cluster cleanup">
|
||||
<indexterm>
|
||||
<primary>repmgr cluster cleanup</primary>
|
||||
</indexterm>
|
||||
<title>repmgr cluster cleanup</title>
|
||||
<para>
|
||||
Purges monitoring history from the <literal>repmgr.monitoring_history</literal> table to
|
||||
prevent excessive table growth. Use the <literal>-k/--keep-history</literal> option to specify the
|
||||
number of days of monitoring history to retain. This command can be used
|
||||
manually or as a cron job.
|
||||
</para>
|
||||
<para>
|
||||
This command requires a valid <filename>repmgr.conf</filename> file for the node on which it is
|
||||
executed; no additional arguments are required.
|
||||
</para>
|
||||
<note>
|
||||
<simpara>
|
||||
Monitoring history will only be written if <command>repmgrd</command> is active, and
|
||||
<varname>monitoring_history</varname> is set to <literal>true</literal> in <filename>repmgr.conf</filename>.
|
||||
</simpara>
|
||||
</note>
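<para>
When the command is run from cron, the retention period is passed on the command line.
The following is a minimal sketch that prints a suitable user crontab entry; the helper
name <literal>cleanup_cron_entry</literal>, the daily 03:00 schedule and the 30-day
retention are illustrative assumptions, and <command>repmgr</command> is assumed to be
on the <literal>PATH</literal> of the cron environment.
</para>

```shell
# Hypothetical helper: print a crontab entry that purges monitoring history
# once a day at 03:00, keeping the given number of days.
cleanup_cron_entry() {
    keep_days=$1
    conf=${2:-/etc/repmgr.conf}
    printf '0 3 * * * repmgr -f %s cluster cleanup --keep-history=%s\n' \
        "$conf" "$keep_days"
}

# Print an entry that retains 30 days of monitoring history.
cleanup_cron_entry 30
```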
|
||||
</chapter>
|
||||
28
doc/repmgr-cluster-crosscheck.sgml
Normal file
@@ -0,0 +1,28 @@
|
||||
<chapter id="repmgr-cluster-crosscheck" xreflabel="repmgr cluster crosscheck">
|
||||
<indexterm>
|
||||
<primary>repmgr cluster crosscheck</primary>
|
||||
</indexterm>
|
||||
<title>repmgr cluster crosscheck</title>
|
||||
<para>
|
||||
<command>repmgr cluster crosscheck</command> is similar to <xref linkend="repmgr-cluster-matrix">,
|
||||
but cross-checks connections between each combination of nodes. In "Example 3" in
|
||||
<xref linkend="repmgr-cluster-matrix"> we have no information about the state of <literal>node3</literal>.
|
||||
However by running <command>repmgr cluster crosscheck</command> it's possible to get a better
|
||||
overview of the cluster situation:
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster crosscheck
|
||||
|
||||
Name | Id | 1 | 2 | 3
|
||||
-------+----+----+----+----
|
||||
node1 | 1 | * | * | x
|
||||
node2 | 2 | * | * | *
|
||||
node3 | 3 | * | * | *</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
What happened is that <command>repmgr cluster crosscheck</command> merged its own
|
||||
<command>repmgr cluster matrix</command> with the <command>repmgr cluster matrix</command>
|
||||
output from <literal>node2</literal>; the latter is able to connect to <literal>node3</literal>
|
||||
and therefore determine the state of outbound connections from that node.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
83
doc/repmgr-cluster-matrix.sgml
Normal file
@@ -0,0 +1,83 @@
|
||||
<chapter id="repmgr-cluster-matrix" xreflabel="repmgr cluster matrix">
|
||||
<indexterm>
|
||||
<primary>repmgr cluster matrix</primary>
|
||||
</indexterm>
|
||||
<title>repmgr cluster matrix</title>
|
||||
<para>
|
||||
<command>repmgr cluster matrix</command> runs <command>repmgr cluster show</command> on each
|
||||
node and arranges the results in a matrix, recording success or failure.
|
||||
</para>
|
||||
<para>
|
||||
<command>repmgr cluster matrix</command> requires a valid <filename>repmgr.conf</filename>
|
||||
file on each node. Additionally, passwordless SSH connections are required between
|
||||
all nodes.
|
||||
</para>
|
||||
<para>
|
||||
Example 1 (all nodes up):
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster matrix
|
||||
|
||||
Name | Id | 1 | 2 | 3
|
||||
-------+----+----+----+----
|
||||
node1 | 1 | * | * | *
|
||||
node2 | 2 | * | * | *
|
||||
node3 | 3 | * | * | *</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
Example 2 (<literal>node1</literal> and <literal>node2</literal> up, <literal>node3</literal> down):
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster matrix
|
||||
|
||||
Name | Id | 1 | 2 | 3
|
||||
-------+----+----+----+----
|
||||
node1 | 1 | * | * | x
|
||||
node2 | 2 | * | * | x
|
||||
node3 | 3 | ? | ? | ?
|
||||
</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
Each row corresponds to one server, and indicates the result of
|
||||
testing an outbound connection from that server.
|
||||
</para>
|
||||
<para>
|
||||
Since <literal>node3</literal> is down, all the entries in its row are filled with
|
||||
<literal>?</literal>, meaning that we cannot test outbound connections.
|
||||
</para>
|
||||
<para>
|
||||
The other two nodes are up; the corresponding rows have <literal>x</literal> in the
|
||||
column corresponding to <literal>node3</literal>, meaning that inbound connections to
|
||||
that node have failed, and <literal>*</literal> in the columns corresponding to
|
||||
<literal>node1</literal> and <literal>node2</literal>, meaning that inbound connections
|
||||
to these nodes have succeeded.
|
||||
</para>
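<para>
To make the row/column reading concrete, the following sketch parses matrix output and lists failed connections and nodes whose outbound state is unknown. It is an illustration only: the function name <literal>parse_matrix</literal> is hypothetical, and the sample text assumes the pipe-separated layout shown in the examples.
</para>

```python
# Hypothetical sketch: read "repmgr cluster matrix" style output.
# Rows are source nodes (outbound connections), columns are target nodes.
# The sample layout below is assumed from the examples in this chapter.

SAMPLE = """\
 Name  | Id |  1 |  2 |  3
-------+----+----+----+----
 node1 |  1 |  * |  * |  x
 node2 |  2 |  * |  * |  x
 node3 |  3 |  ? |  ? |  ?
"""


def parse_matrix(text):
    """Return {(source_id, target_id): state} for each cell of the matrix."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [cell.strip() for cell in lines[0].split("|")]
    target_ids = [int(cell) for cell in header[2:]]
    result = {}
    # lines[1] is the separator row, so data rows start at index 2.
    for line in lines[2:]:
        cells = [cell.strip() for cell in line.split("|")]
        source_id = int(cells[1])
        for target_id, state in zip(target_ids, cells[2:]):
            result[(source_id, target_id)] = state
    return result


matrix = parse_matrix(SAMPLE)
failed = sorted(pair for pair, state in matrix.items() if state == "x")
unknown_sources = sorted({src for (src, _), state in matrix.items() if state == "?"})
print("failed connections:", failed)
print("nodes with unknown outbound state:", unknown_sources)
```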
|
||||
<para>
|
||||
Example 3 (all nodes up, firewall dropping packets originating
|
||||
from <literal>node1</literal> and directed to port 5432 on <literal>node3</literal>) -
|
||||
running <command>repmgr cluster matrix</command> from <literal>node1</literal> gives the following output:
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster matrix
|
||||
|
||||
Name | Id | 1 | 2 | 3
|
||||
-------+----+----+----+----
|
||||
node1 | 1 | * | * | x
|
||||
node2 | 2 | * | * | *
|
||||
node3 | 3 | ? | ? | ?</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
Note this may take some time depending on the <varname>connect_timeout</varname>
|
||||
setting in the node <varname>conninfo</varname> strings; default is
|
||||
<literal>1 minute</literal> which means without modification the above
|
||||
command would take around 2 minutes to run; see comment elsewhere about setting
|
||||
<varname>connect_timeout</varname>.
|
||||
</para>
|
||||
<para>
|
||||
The matrix tells us that we cannot connect from <literal>node1</literal> to <literal>node3</literal>,
|
||||
and that (therefore) we don't know the state of any outbound
|
||||
connection from <literal>node3</literal>.
|
||||
</para>
|
||||
<para>
|
||||
In this case, the <xref linkend="repmgr-cluster-crosscheck"> command will produce a more
|
||||
useful result.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
67
doc/repmgr-cluster-show.sgml
Normal file
@@ -0,0 +1,67 @@
|
||||
<chapter id="repmgr-cluster-show" xreflabel="repmgr cluster show">
|
||||
<indexterm>
|
||||
<primary>repmgr cluster show</primary>
|
||||
</indexterm>
|
||||
<title>repmgr cluster show</title>
|
||||
<para>
|
||||
Displays information about each active node in the replication cluster. This
|
||||
command polls each registered server and shows its role (<literal>primary</literal> /
|
||||
<literal>standby</literal> / <literal>bdr</literal>) and status. It polls each server
|
||||
directly and can be run on any node in the cluster; this is also useful when analyzing
|
||||
connectivity from a particular node.
|
||||
</para>
|
||||
<para>
|
||||
This command requires either a valid <filename>repmgr.conf</filename> file or a database
|
||||
connection string to one of the registered nodes; no additional arguments are needed.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Example:
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster show
|
||||
|
||||
ID | Name | Role | Status | Upstream | Location | Connection string
|
||||
----+-------+---------+-----------+----------+----------+-----------------------------------------
|
||||
1 | node1 | primary | * running | | default | host=db_node1 dbname=repmgr user=repmgr
|
||||
2 | node2 | standby | running | node1 | default | host=db_node2 dbname=repmgr user=repmgr
|
||||
3 | node3 | standby | running | node1 | default | host=db_node3 dbname=repmgr user=repmgr</programlisting>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
To show database connection errors when polling nodes, run the command in
|
||||
<literal>--verbose</literal> mode.
|
||||
</para>
|
||||
<para>
|
||||
The <command>repmgr cluster show</command> command accepts an optional parameter <literal>--csv</literal>, which
|
||||
outputs the replication cluster's status in a simple CSV format, suitable for
|
||||
parsing by scripts:
|
||||
<programlisting>
|
||||
$ repmgr -f /etc/repmgr.conf cluster show --csv
|
||||
1,-1,-1
|
||||
2,0,0
|
||||
3,0,1</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
The columns have the following meanings:
|
||||
<itemizedlist spacing="compact" mark="bullet">
|
||||
<listitem>
<simpara>
node ID
</simpara>
</listitem>
<listitem>
<simpara>
availability (0 = available, -1 = unavailable)
</simpara>
</listitem>
<listitem>
<simpara>
recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown)
</simpara>
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
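<para>
As an illustration of consuming this format from a script, the sketch below decodes the three columns into readable values. The function name <literal>parse_cluster_csv</literal> and the lookup tables are hypothetical; the sample input is the output shown above.
</para>

```python
# Hypothetical sketch: decode "repmgr cluster show --csv" output.
# Column meanings are taken from the list above; names are assumptions.
import csv
import io

SAMPLE = "1,-1,-1\n2,0,0\n3,0,1\n"

AVAILABILITY = {0: "available", -1: "unavailable"}
RECOVERY = {0: "not in recovery", 1: "in recovery", -1: "unknown"}


def parse_cluster_csv(text):
    """Return one dict per node with decoded availability and recovery state."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        node_id, availability, recovery = (int(field) for field in record)
        rows.append({
            "node_id": node_id,
            "availability": AVAILABILITY[availability],
            "recovery": RECOVERY[recovery],
        })
    return rows


for row in parse_cluster_csv(SAMPLE):
    print(row)
```

<para>
In practice the input text would be captured from the command itself, for example with Python's <literal>subprocess.run()</literal>, rather than taken from a fixed string.
</para>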
|
||||
|
||||
<para>
|
||||
Note that the availability is tested by connecting from the node where
|
||||
<command>repmgr cluster show</command> is executed, and does not necessarily imply the node
|
||||
is down. See <xref linkend="repmgr-cluster-matrix"> and <xref linkend="repmgr-cluster-crosscheck"> to get
|
||||
a better overview of connections between nodes.
|
||||
</para>
|
||||
</chapter>
|
||||
70
doc/repmgr-node-check.sgml
Normal file
@@ -0,0 +1,70 @@
|
||||
<chapter id="repmgr-node-check" xreflabel="repmgr node check">
|
||||
<indexterm>
|
||||
<primary>repmgr node check</primary>
|
||||
</indexterm>
|
||||
<title>repmgr node check</title>
|
||||
<para>
|
||||
Performs some health checks on a node from a replication perspective.
|
||||
This command must be run on the local node.
|
||||
</para>
|
||||
<para>
|
||||
Sample output (execute <command>repmgr node check</command>):
|
||||
<programlisting>
|
||||
Node "node1":
|
||||
Server role: OK (node is primary)
|
||||
Replication lag: OK (N/A - node is primary)
|
||||
WAL archiving: OK (0 pending files)
|
||||
Downstream servers: OK (2 of 2 downstream nodes attached)
|
||||
Replication slots: OK (node has no replication slots)
|
||||
</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
Additionally, each check can be performed individually by supplying
|
||||
an additional command line parameter, e.g.:
|
||||
<programlisting>
|
||||
$ repmgr node check --role
|
||||
OK (node is primary)
|
||||
</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
Parameters for individual checks are as follows:
|
||||
<itemizedlist spacing="compact" mark="bullet">
|
||||
|
||||
<listitem>
|
||||
<simpara>
|
||||
<literal>--role</literal>: checks if the node has the expected role
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<simpara>
|
||||
<literal>--replication-lag</literal>: checks if the node is lagging by more than
|
||||
<varname>replication_lag_warning</varname> or <varname>replication_lag_critical</varname>
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<simpara>
|
||||
<literal>--archive-ready</literal>: checks for WAL files which have not yet been archived
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<simpara>
|
||||
<literal>--downstream</literal>: checks that the expected downstream nodes are attached
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<simpara>
|
||||
<literal>--slots</literal>: checks there are no inactive replication slots
|
||||
</simpara>
|
||||
</listitem>
|
||||
|
||||
</itemizedlist>
|
||||
</para>
|
||||
<para>
|
||||
Individual checks can also be output in a Nagios-compatible format by additionally
|
||||
providing the option <literal>--nagios</literal>.
|
||||
</para>
|
||||
</chapter>
|
||||
13
doc/repmgr-node-rejoin.sgml
Normal file
@@ -0,0 +1,13 @@
|
||||
<chapter id="repmgr-node-rejoin" xreflabel="repmgr node rejoin">
|
||||
<indexterm>
|
||||
<primary>repmgr node rejoin</primary>
|
||||
</indexterm>
|
||||
<title>repmgr node rejoin</title>
|
||||
<para>
|
||||
Enables a dormant (stopped) node to be rejoined to the replication cluster.
|
||||
</para>
|
||||
<para>
|
||||
This can optionally use <command>pg_rewind</command> to re-integrate a node which has diverged
|
||||
from the rest of the cluster, typically a failed primary.
|
||||
</para>
|
||||
</chapter>
|
||||
29
doc/repmgr-node-status.sgml
Normal file
@@ -0,0 +1,29 @@
|
||||
|
||||
<chapter id="repmgr-node-status" xreflabel="repmgr node status">
|
||||
<indexterm>
|
||||
<primary>repmgr node status</primary>
|
||||
</indexterm>
|
||||
<title>repmgr node status</title>
|
||||
<para>
|
||||
Displays an overview of a node's basic information and replication
|
||||
status. This command must be run on the local node.
|
||||
</para>
|
||||
<para>
|
||||
Sample output (execute <command>repmgr node status</command>):
|
||||
<programlisting>
|
||||
Node "node1":
|
||||
PostgreSQL version: 10beta1
|
||||
Total data size: 30 MB
|
||||
Conninfo: host=node1 dbname=repmgr user=repmgr connect_timeout=2
|
||||
Role: primary
|
||||
WAL archiving: off
|
||||
Archive command: (none)
|
||||
Replication connections: 2 (of maximal 10)
|
||||
Replication slots: 0 (of maximal 10)
|
||||
Replication lag: n/a
|
||||
</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
See <xref linkend="repmgr-node-check"> to diagnose issues.
|
||||
</para>
|
||||
</chapter>
|
||||
18
doc/repmgr-primary-register.sgml
Normal file
@@ -0,0 +1,18 @@
|
||||
<chapter id="repmgr-primary-register" xreflabel="repmgr primary register">
|
||||
<indexterm><primary>repmgr primary register</primary></indexterm>
|
||||
<title>repmgr primary register</title>
|
||||
<para>
|
||||
<command>repmgr primary register</command> registers a primary node in a
|
||||
streaming replication cluster, and configures it for use with repmgr, including
|
||||
installing the &repmgr; extension. This command needs to be executed before any
|
||||
standby nodes are registered.
|
||||
</para>
|
||||
<para>
|
||||
Execute with the <literal>--dry-run</literal> option to check what would happen without
|
||||
actually registering the primary.
|
||||
</para>
|
||||
<para>
|
||||
<command>repmgr master register</command> can be used as an alias for
|
||||
<command>repmgr primary register</command>.
|
||||
</para>
|
||||
</chapter>
|
||||
18
doc/repmgr-primary-unregister.sgml
Normal file
@@ -0,0 +1,18 @@
|
||||
<chapter id="repmgr-primary-unregister" xreflabel="repmgr primary unregister">
|
||||
<indexterm><primary>repmgr primary unregister</primary></indexterm>
|
||||
<title>repmgr primary unregister</title>
|
||||
<para>
|
||||
<command>repmgr primary unregister</command> unregisters an inactive primary node
|
||||
from the &repmgr; metadata. This is typically done when the primary has failed and is
|
||||
being removed from the cluster after a new primary has been promoted.
|
||||
</para>
|
||||
<para>
|
||||
Execute with the <literal>--dry-run</literal> option to check what would happen without
|
||||
actually unregistering the node.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<command>repmgr master unregister</command> can be used as an alias for
|
||||
<command>repmgr primary unregister</command>.
|
||||
</para>
|
||||
</chapter>
|
||||
91
doc/repmgr-standby-clone.sgml
Normal file
@@ -0,0 +1,91 @@
|
||||
<chapter id="repmgr-standby-clone" xreflabel="repmgr standby clone">
|
||||
<indexterm>
|
||||
<primary>repmgr standby clone</primary>
|
||||
<seealso>cloning</seealso>
|
||||
</indexterm>
|
||||
<title>repmgr standby clone</title>
|
||||
<para>
|
||||
<command>repmgr standby clone</command> clones a PostgreSQL node from another
|
||||
PostgreSQL node, typically the primary, but optionally from any other node in
|
||||
the cluster or from Barman. It creates the <filename>recovery.conf</filename> file required
|
||||
to attach the cloned node to the primary node (or another standby, if cascading replication
|
||||
is in use).
|
||||
</para>
|
||||
<note>
|
||||
<simpara>
|
||||
<command>repmgr standby clone</command> does not start the standby, and after cloning
|
||||
<command>repmgr standby register</command> must be executed to notify &repmgr; of its presence.
|
||||
</simpara>
|
||||
</note>
|
||||
|
||||
|
||||
<sect1 id="repmgr-standby-clone-config-file-copying" xreflabel="Copying configuration files">
|
||||
<title>Handling configuration files</title>
|
||||
|
||||
<para>
|
||||
Note that by default, all configuration files in the source node's data
|
||||
directory will be copied to the cloned node. Typically these will be
|
||||
<filename>postgresql.conf</filename>, <filename>postgresql.auto.conf</filename>,
|
||||
<filename>pg_hba.conf</filename> and <filename>pg_ident.conf</filename>.
|
||||
These may require modification before the standby is started.
|
||||
</para>
|
||||
<para>
|
||||
In some cases (e.g. on Debian or Ubuntu Linux installations), PostgreSQL's
|
||||
configuration files are located outside of the data directory and will
|
||||
not be copied by default. &repmgr; can copy these files, either to the same
|
||||
location on the standby server (provided appropriate directory and file permissions
|
||||
are available), or into the standby's data directory. This requires passwordless
|
||||
SSH access to the primary server. Add the option <literal>--copy-external-config-files</literal>
|
||||
to the <command>repmgr standby clone</command> command; by default files will be copied to
|
||||
the same path as on the upstream server. Note that the user executing <command>repmgr</command>
|
||||
must have write access to those directories.
|
||||
</para>
|
||||
<para>
|
||||
To have the configuration files placed in the standby's data directory, specify
|
||||
<literal>--copy-external-config-files=pgdata</literal>, but note that
|
||||
any include directives in the copied files may need to be updated.
|
||||
</para>
|
||||
<tip>
|
||||
<simpara>
|
||||
For reliable configuration file management we recommend using a
|
||||
configuration management tool such as Ansible, Chef, Puppet or Salt.
|
||||
</simpara>
|
||||
</tip>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="repmgr-standby-clone-wal-management" xreflabel="Managing WAL during the cloning process">
|
||||
<title>Managing WAL during the cloning process</title>
|
||||
<para>
|
||||
When initially cloning a standby, you will need to ensure
|
||||
that all required WAL files remain available while the cloning is taking
|
||||
place. To ensure this happens when using the default <command>pg_basebackup</command> method,
|
||||
&repmgr; will set <command>pg_basebackup</command>'s <literal>--xlog-method</literal>
|
||||
parameter to <literal>stream</literal>,
|
||||
which will ensure all WAL files generated during the cloning process are
|
||||
streamed in parallel with the main backup. Note that this requires two
|
||||
replication connections to be available (&repmgr; will verify sufficient
|
||||
connections are available before attempting to clone, and this can be checked
|
||||
before performing the clone using the <literal>--dry-run</literal> option).
|
||||
</para>
|
||||
<para>
|
||||
To override this behaviour, in <filename>repmgr.conf</filename> set
|
||||
<command>pg_basebackup</command>'s <literal>--xlog-method</literal>
|
||||
parameter to <literal>fetch</literal>:
|
||||
<programlisting>
|
||||
pg_basebackup_options='--xlog-method=fetch'</programlisting>
|
||||
|
||||
and ensure that <literal>wal_keep_segments</literal> is set to an appropriately high value.
|
||||
See the <ulink url="https://www.postgresql.org/docs/current/static/app-pgbasebackup.html">
|
||||
pg_basebackup</ulink> documentation for details.
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<simpara>
|
||||
From PostgreSQL 10, <command>pg_basebackup</command>'s
|
||||
<literal>--xlog-method</literal> parameter has been renamed to
|
||||
<literal>--wal-method</literal>.
|
||||
</simpara>
|
||||
</note>
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
21
doc/repmgr-standby-follow.sgml
Normal file
@@ -0,0 +1,21 @@
|
||||
<chapter id="repmgr-standby-follow" xreflabel="repmgr standby follow">
|
||||
<indexterm>
|
||||
<primary>repmgr standby follow</primary>
|
||||
</indexterm>
|
||||
<title>repmgr standby follow</title>
|
||||
<para>
|
||||
Attaches the standby to a new primary. This command requires a valid
|
||||
<filename>repmgr.conf</filename> file for the standby, either specified
|
||||
explicitly with <literal>-f/--config-file</literal> or located in a
|
||||
default location; no additional arguments are required.
|
||||
</para>
|
||||
<para>
|
||||
This command will force a restart of the standby server, which must be
|
||||
running. It can only be used to attach a standby to a new primary node.
|
||||
</para>
|
||||
<para>
|
||||
To re-add an inactive node to the replication cluster, see
|
||||
<xref linkend="repmgr-node-rejoin">.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
18
doc/repmgr-standby-promote.sgml
Normal file
@@ -0,0 +1,18 @@
|
||||
<chapter id="repmgr-standby-promote" xreflabel="repmgr standby promote">
|
||||
<indexterm>
|
||||
<primary>repmgr standby promote</primary>
|
||||
</indexterm>
|
||||
<title>repmgr standby promote</title>
|
||||
<para>
|
||||
Promotes a standby to a primary if the current primary has failed. This
|
||||
command requires a valid <filename>repmgr.conf</filename> file for the standby, either
|
||||
specified explicitly with <literal>-f/--config-file</literal> or located in a
|
||||
default location; no additional arguments are required.
|
||||
</para>
|
||||
<para>
|
||||
If the standby promotion succeeds, the server will not need to be
|
||||
restarted. However, any other standbys will need to follow the new server,
|
||||
by using <xref linkend="repmgr-standby-follow">; if <command>repmgrd</command>
|
||||
is active, it will handle this automatically.
|
||||
</para>
|
||||
</chapter>
|
||||
50
doc/repmgr-standby-register.sgml
Normal file
@@ -0,0 +1,50 @@
|
||||
<chapter id="repmgr-standby-register" xreflabel="repmgr standby register">
|
||||
<indexterm><primary>repmgr standby register</primary></indexterm>
|
||||
<title>repmgr standby register</title>
|
||||
<para>
|
||||
<command>repmgr standby register</command> adds a standby's information to
|
||||
the &repmgr; metadata. This command needs to be executed to enable
|
||||
promote/follow operations and to allow <command>repmgrd</command> to work with the node.
|
||||
An existing standby can be registered using this command. Execute with the
|
||||
<literal>--dry-run</literal> option to check what would happen without actually registering the
|
||||
standby.
|
||||
</para>
|
||||
|
||||
<sect1 id="repmgr-standby-register-wait" xreflabel="repmgr standby register --wait">
|
||||
<title>Waiting for the registration to propagate to the standby</title>
|
||||
<para>
|
||||
Depending on your environment and workload, it may take some time for
|
||||
the standby's node record to propagate from the primary to the standby. Some
|
||||
actions (such as starting <command>repmgrd</command>) require that the standby's node record
|
||||
is present and up-to-date to function correctly.
|
||||
</para>
|
||||
<para>
|
||||
By providing the option <literal>--wait-sync</literal> to the
|
||||
<command>repmgr standby register</command> command, &repmgr; will wait
|
||||
until the record is synchronised before exiting. An optional timeout (in
|
||||
seconds) can be added to this option (e.g. <literal>--wait-sync=60</literal>).
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="rempgr-standby-register-inactive-node" xreflabel="Registering an inactive node">
|
||||
<title>Registering an inactive node</title>
|
||||
<para>
|
||||
Under some circumstances you may wish to register a standby which is not
|
||||
yet running; this can be the case when using provisioning tools to create
|
||||
a complex replication cluster. In this case, by using the <literal>-F/--force</literal>
|
||||
option and providing the connection parameters to the primary server,
|
||||
the standby can be registered.
|
||||
</para>
|
||||
<para>
|
||||
Similarly, with cascading replication it may be necessary to register
|
||||
a standby whose upstream node has not yet been registered - in this case,
|
||||
using <literal>-F/--force</literal> will result in the creation of an inactive placeholder
|
||||
record for the upstream node, which will however later need to be registered
|
||||
with the <literal>-F/--force</literal> option too.
|
||||
</para>
|
||||
<para>
|
||||
When used with <command>repmgr standby register</command>, care should be taken that use of the
|
||||
<literal>-F/--force</literal> option does not result in an incorrectly configured cluster.
|
||||
</para>
|
||||
</sect1>
|
||||
</chapter>
|
||||
27
doc/repmgr-standby-switchover.sgml
Normal file
@@ -0,0 +1,27 @@
|
||||
<chapter id="repmgr-standby-switchover" xreflabel="repmgr standby switchover">
|
||||
<indexterm>
|
||||
<primary>repmgr standby switchover</primary>
|
||||
</indexterm>
|
||||
<title>repmgr standby switchover</title>
|
||||
<para>
|
||||
Promotes a standby to primary and demotes the existing primary to a standby.
|
||||
This command must be run on the standby to be promoted, and requires a
|
||||
passwordless SSH connection to the current primary.
|
||||
</para>
|
||||
<para>
|
||||
If other standbys are connected to the demotion candidate, &repmgr; can instruct
|
||||
these to follow the new primary if the option <literal>--siblings-follow</literal>
|
||||
is specified.
|
||||
</para>
|
||||
<para>
|
||||
Execute with the <literal>--dry-run</literal> option to test the switchover as far as
|
||||
possible without actually changing the status of either node.
|
||||
</para>
|
||||
<para>
|
||||
<command>repmgrd</command> should not be active on any nodes while a switchover is being
|
||||
executed. This restriction may be lifted in a later version.
|
||||
</para>
|
||||
<para>
|
||||
For more details see the section <xref linkend="performing-switchover">.
|
||||
</para>
|
||||
</chapter>
|
||||
29
doc/repmgr-standby-unregister.sgml
Normal file
@@ -0,0 +1,29 @@
|
||||
<chapter id="repmgr-standby-unregister" xreflabel="repmgr standby unregister">
|
||||
<indexterm><primary>repmgr standby unregister</primary></indexterm>
|
||||
<title>repmgr standby unregister</title>
|
||||
<para>
|
||||
Unregisters a standby with &repmgr;. This command does not affect the actual
|
||||
replication, just removes the standby's entry from the &repmgr; metadata.
|
||||
</para>
|
||||
<para>
|
||||
To unregister a running standby, execute:
|
||||
<programlisting>
|
||||
repmgr standby unregister -f /etc/repmgr.conf</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
This will remove the standby record from &repmgr;'s internal metadata
|
||||
table (<literal>repmgr.nodes</literal>). A <literal>standby_unregister</literal>
|
||||
event notification will be recorded in the <literal>repmgr.events</literal> table.
|
||||
</para>
|
||||
<para>
|
||||
If the standby is not running, the command can be executed on another
|
||||
node by providing the id of the node to be unregistered using
|
||||
the command line parameter <literal>--node-id</literal>, e.g. executing the following
|
||||
command on the primary server will unregister the standby with
|
||||
id <literal>3</literal>:
|
||||
<programlisting>
|
||||
repmgr standby unregister -f /etc/repmgr.conf --node-id=3
|
||||
</programlisting>
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
@@ -52,6 +52,7 @@
|
||||
<keyword>PostgreSQL</keyword>
|
||||
<keyword>replication</keyword>
|
||||
<keyword>asynchronous</keyword>
|
||||
<keyword>HA</keyword>
|
||||
<keyword>high-availability</keyword>
|
||||
</keywordset>
|
||||
</bookinfo>
|
||||
@@ -72,9 +73,27 @@
|
||||
&promoting-standby;
|
||||
&follow-new-primary;
|
||||
&switchover;
|
||||
&command-reference;
|
||||
</part>
|
||||
|
||||
<part id="repmgr-command-reference">
|
||||
<title>repmgr command reference</title>
|
||||
|
||||
&repmgr-primary-register;
|
||||
&repmgr-primary-unregister;
|
||||
&repmgr-standby-clone;
|
||||
&repmgr-standby-register;
|
||||
&repmgr-standby-unregister;
|
||||
&repmgr-standby-promote;
|
||||
&repmgr-standby-follow;
|
||||
&repmgr-standby-switchover;
|
||||
&repmgr-node-status;
|
||||
&repmgr-node-check;
|
||||
&repmgr-node-rejoin;
|
||||
&repmgr-cluster-show;
|
||||
&repmgr-cluster-matrix;
|
||||
&repmgr-cluster-crosscheck;
|
||||
&repmgr-cluster-cleanup;
|
||||
</part>
|
||||
|
||||
&appendix-signatures;