<!-- mirror of https://github.com/EnterpriseDB/repmgr.git, synced 2026-03-23 07:06:30 +00:00 -->
<chapter id="command-reference" xreflabel="command reference">
 <title>repmgr command reference</title>

 <para>
  Overview of repmgr commands.
 </para>

 <sect1 id="repmgr-standby-clone" xreflabel="repmgr standby clone">
  <indexterm>
   <primary>repmgr standby clone</primary>
   <seealso>cloning</seealso>
  </indexterm>
  <title>repmgr standby clone</title>
  <para>
   <command>repmgr standby clone</command> clones a PostgreSQL node from another
   PostgreSQL node, typically the primary, but optionally from any other node in
   the cluster or from Barman. It creates the <filename>recovery.conf</filename> file required
   to attach the cloned node to the primary node (or another standby, if cascading replication
   is in use).
  </para>
  <note>
   <simpara>
    <command>repmgr standby clone</command> does not start the standby, and after cloning
    <command>repmgr standby register</command> must be executed to notify &repmgr; of its presence.
   </simpara>
  </note>
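  <para>
   As an illustration, a typical clone operation might look like the following
   (the hostname and connection parameters shown are placeholders):
<programlisting>
$ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run
$ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone</programlisting>
  </para>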
  <sect2 id="repmgr-standby-clone-config-file-copying" xreflabel="Copying configuration files">
   <title>Handling configuration files</title>

   <para>
    Note that by default, all configuration files in the source node's data
    directory will be copied to the cloned node. Typically these will be
    <filename>postgresql.conf</filename>, <filename>postgresql.auto.conf</filename>,
    <filename>pg_hba.conf</filename> and <filename>pg_ident.conf</filename>.
    These may require modification before the standby is started.
   </para>
   <para>
    In some cases (e.g. on Debian or Ubuntu Linux installations), PostgreSQL's
    configuration files are located outside of the data directory and will
    not be copied by default. &repmgr; can copy these files, either to the same
    location on the standby server (provided appropriate directory and file permissions
    are available), or into the standby's data directory. This requires passwordless
    SSH access to the primary server. Add the option <literal>--copy-external-config-files</literal>
    to the <command>repmgr standby clone</command> command; by default files will be copied to
    the same path as on the upstream server. Note that the user executing <command>repmgr</command>
    must have write access to those directories.
   </para>
   <para>
    To have the configuration files placed in the standby's data directory, specify
    <literal>--copy-external-config-files=pgdata</literal>, but note that
    any include directives in the copied files may need to be updated.
   </para>
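   <para>
    For example, to copy external configuration files into the standby's data
    directory during the clone (hostname and connection parameters are illustrative):
<programlisting>
$ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone \
    --copy-external-config-files=pgdata</programlisting>
   </para>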
   <tip>
    <simpara>
     For reliable configuration file management we recommend using a
     configuration management tool such as Ansible, Chef, Puppet or Salt.
    </simpara>
   </tip>
  </sect2>
  <sect2 id="repmgr-standby-clone-wal-management" xreflabel="Managing WAL during the cloning process">
   <title>Managing WAL during the cloning process</title>
   <para>
    When initially cloning a standby, you will need to ensure
    that all required WAL files remain available while the cloning is taking
    place. To ensure this happens when using the default <command>pg_basebackup</command> method,
    &repmgr; will set <command>pg_basebackup</command>'s <literal>--xlog-method</literal>
    parameter to <literal>stream</literal>,
    which will ensure all WAL files generated during the cloning process are
    streamed in parallel with the main backup. Note that this requires two
    replication connections to be available (&repmgr; will verify sufficient
    connections are available before attempting to clone, and this can be checked
    before performing the clone using the <literal>--dry-run</literal> option).
   </para>
   <para>
    To override this behaviour, set <command>pg_basebackup</command>'s
    <literal>--xlog-method</literal> parameter to <literal>fetch</literal>
    via <varname>pg_basebackup_options</varname> in <filename>repmgr.conf</filename>:
<programlisting>
pg_basebackup_options='--xlog-method=fetch'</programlisting>
    and ensure that <literal>wal_keep_segments</literal> is set to an appropriately high value.
    See the <ulink url="https://www.postgresql.org/docs/current/static/app-pgbasebackup.html">
    pg_basebackup</ulink> documentation for details.
   </para>
   <note>
    <simpara>
     From PostgreSQL 10, <command>pg_basebackup</command>'s
     <literal>--xlog-method</literal> parameter has been renamed to
     <literal>--wal-method</literal>.
    </simpara>
   </note>
  </sect2>
 </sect1>
 <sect1 id="repmgr-standby-register" xreflabel="repmgr standby register">
  <indexterm><primary>repmgr standby register</primary></indexterm>
  <title>repmgr standby register</title>
  <para>
   <command>repmgr standby register</command> adds a standby's information to
   the &repmgr; metadata. This command needs to be executed to enable
   promote/follow operations and to allow <command>repmgrd</command> to work with the node.
   An existing standby can be registered using this command. Execute with the
   <literal>--dry-run</literal> option to check what would happen without actually registering the
   standby.
  </para>
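  <para>
   A typical registration, run on the standby itself, might look like this:
<programlisting>
$ repmgr -f /etc/repmgr.conf standby register --dry-run
$ repmgr -f /etc/repmgr.conf standby register</programlisting>
  </para>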
  <sect2 id="rempgr-standby-register-wait" xreflabel="repmgr standby register --wait">
   <title>Waiting for the registration to propagate to the standby</title>
   <para>
    Depending on your environment and workload, it may take some time for
    the standby's node record to propagate from the primary to the standby. Some
    actions (such as starting <command>repmgrd</command>) require that the standby's node record
    is present and up-to-date to function correctly.
   </para>
   <para>
    By providing the option <literal>--wait-sync</literal> to the
    <command>repmgr standby register</command> command, &repmgr; will wait
    until the record is synchronised before exiting. An optional timeout (in
    seconds) can be added to this option (e.g. <literal>--wait-sync=60</literal>).
   </para>
  </sect2>
  <sect2 id="rempgr-standby-register-inactive-node" xreflabel="Registering an inactive node">
   <title>Registering an inactive node</title>
   <para>
    Under some circumstances you may wish to register a standby which is not
    yet running; this can be the case when using provisioning tools to create
    a complex replication cluster. In this case, by using the <literal>-F/--force</literal>
    option and providing the connection parameters to the primary server,
    the standby can be registered.
   </para>
   <para>
    Similarly, with cascading replication it may be necessary to register
    a standby whose upstream node has not yet been registered. In this case,
    using <literal>-F/--force</literal> will create an inactive placeholder
    record for the upstream node; that node will itself need to be registered
    later, also with the <literal>-F/--force</literal> option.
   </para>
   <para>
    When used with <command>repmgr standby register</command>, care should be taken that use of the
    <literal>-F/--force</literal> option does not result in an incorrectly configured cluster.
   </para>
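   <para>
    A forced registration of a not-yet-running standby might look like the
    following (the primary's connection parameters shown are illustrative):
<programlisting>
$ repmgr -h primary_host -U repmgr -d repmgr -f /etc/repmgr.conf \
    standby register --force</programlisting>
   </para>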
  </sect2>
 </sect1>
 <sect1 id="repmgr-standby-unregister" xreflabel="repmgr standby unregister">
  <indexterm><primary>repmgr standby unregister</primary></indexterm>
  <title>repmgr standby unregister</title>
  <para>
   Unregisters a standby with &repmgr;. This command does not affect the actual
   replication; it just removes the standby's entry from the &repmgr; metadata.
  </para>
  <para>
   To unregister a running standby, execute:
<programlisting>
repmgr standby unregister -f /etc/repmgr.conf</programlisting>
  </para>
  <para>
   This will remove the standby record from &repmgr;'s internal metadata
   table (<literal>repmgr.nodes</literal>). A <literal>standby_unregister</literal>
   event notification will be recorded in the <literal>repmgr.events</literal> table.
  </para>
  <para>
   If the standby is not running, the command can be executed on another
   node by providing the id of the node to be unregistered using
   the command line parameter <literal>--node-id</literal>; e.g. executing the following
   command on the primary server will unregister the standby with
   id <literal>3</literal>:
<programlisting>
repmgr standby unregister -f /etc/repmgr.conf --node-id=3</programlisting>
  </para>
 </sect1>
 <sect1 id="repmgr-standby-promote" xreflabel="repmgr standby promote">
  <indexterm>
   <primary>repmgr standby promote</primary>
  </indexterm>
  <title>repmgr standby promote</title>
  <para>
   Promotes a standby to a primary if the current primary has failed. This
   command requires a valid <filename>repmgr.conf</filename> file for the standby, either
   specified explicitly with <literal>-f/--config-file</literal> or located in a
   default location; no additional arguments are required.
  </para>
  <para>
   If the standby promotion succeeds, the server will not need to be
   restarted. However any other standbys will need to follow the new primary,
   by using <xref linkend="repmgr-standby-follow">; if <command>repmgrd</command> is active, it will
   handle this automatically.
  </para>
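  <para>
   For example, executed on the standby which is to be promoted:
<programlisting>
$ repmgr -f /etc/repmgr.conf standby promote</programlisting>
  </para>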
 </sect1>
 <sect1 id="repmgr-standby-follow" xreflabel="repmgr standby follow">
  <indexterm>
   <primary>repmgr standby follow</primary>
  </indexterm>
  <title>repmgr standby follow</title>
  <para>
   Attaches the standby to a new primary. This command requires a valid
   <filename>repmgr.conf</filename> file for the standby, either specified
   explicitly with <literal>-f/--config-file</literal> or located in a
   default location; no additional arguments are required.
  </para>
  <para>
   This command will force a restart of the standby server, which must be
   running. It can only be used to attach a standby to a new primary node.
  </para>
  <para>
   To re-add an inactive node to the replication cluster, see
   <xref linkend="repmgr-node-rejoin">.
  </para>
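  <para>
   For example, executed on the standby which is to follow the new primary:
<programlisting>
$ repmgr -f /etc/repmgr.conf standby follow</programlisting>
  </para>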
 </sect1>
 <sect1 id="repmgr-node-rejoin" xreflabel="repmgr node rejoin">
  <indexterm>
   <primary>repmgr node rejoin</primary>
  </indexterm>
  <title>repmgr node rejoin</title>
  <para>
   Enables a dormant (stopped) node to be rejoined to the replication cluster.
  </para>
  <para>
   This can optionally use <command>pg_rewind</command> to re-integrate a node which has diverged
   from the rest of the cluster, typically a failed primary.
  </para>
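  <para>
   A rejoin invocation might look like the following, where the connection
   string points to an active cluster node (parameters shown are illustrative):
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf \
    -d 'host=node1 dbname=repmgr user=repmgr'</programlisting>
  </para>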
 </sect1>
 <sect1 id="repmgr-cluster-show" xreflabel="repmgr cluster show">
  <indexterm>
   <primary>repmgr cluster show</primary>
  </indexterm>
  <title>repmgr cluster show</title>
  <para>
   Displays information about each active node in the replication cluster. This
   command polls each registered server and shows its role (<literal>primary</literal> /
   <literal>standby</literal> / <literal>bdr</literal>) and status. It polls each server
   directly and can be run on any node in the cluster; this is also useful when analyzing
   connectivity from a particular node.
  </para>
  <para>
   This command requires either a valid <filename>repmgr.conf</filename> file or a database
   connection string to one of the registered nodes; no additional arguments are needed.
  </para>

  <para>
   Example:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show

 ID | Name  | Role    | Status    | Upstream | Location | Connection string
----+-------+---------+-----------+----------+----------+-----------------------------------------
 1  | node1 | primary | * running |          | default  | host=db_node1 dbname=repmgr user=repmgr
 2  | node2 | standby |   running | node1    | default  | host=db_node2 dbname=repmgr user=repmgr
 3  | node3 | standby |   running | node1    | default  | host=db_node3 dbname=repmgr user=repmgr</programlisting>
  </para>
  <para>
   To show database connection errors when polling nodes, run the command in
   <literal>--verbose</literal> mode.
  </para>
  <para>
   The <command>cluster show</command> command accepts an optional parameter <literal>--csv</literal>, which
   outputs the replication cluster's status in a simple CSV format, suitable for
   parsing by scripts:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show --csv
1,-1,-1
2,0,0
3,0,1</programlisting>
  </para>
  <para>
   The columns have the following meanings:
   <itemizedlist spacing="compact" mark="bullet">
    <listitem>
     <simpara>
      node ID
     </simpara>
    </listitem>
    <listitem>
     <simpara>
      availability (0 = available, -1 = unavailable)
     </simpara>
    </listitem>
    <listitem>
     <simpara>
      recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown)
     </simpara>
    </listitem>
   </itemizedlist>
  </para>
  <para>
   Note that the availability is tested by connecting from the node where
   <command>repmgr cluster show</command> is executed, and does not necessarily imply the node
   is down. See <xref linkend="repmgr-cluster-matrix"> and <xref linkend="repmgr-cluster-crosscheck"> to get
   a better overview of connections between nodes.
  </para>
 </sect1>
 <sect1 id="repmgr-cluster-matrix" xreflabel="repmgr cluster matrix">
  <indexterm>
   <primary>repmgr cluster matrix</primary>
  </indexterm>
  <title>repmgr cluster matrix</title>
  <para>
   <command>repmgr cluster matrix</command> runs <command>repmgr cluster show</command> on each
   node and arranges the results in a matrix, recording success or failure.
  </para>
  <para>
   <command>repmgr cluster matrix</command> requires a valid <filename>repmgr.conf</filename>
   file on each node. Additionally, passwordless <command>ssh</command> connections are required between
   all nodes.
  </para>
  <para>
   Example 1 (all nodes up):
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster matrix

 Name  | Id |  1 |  2 |  3
-------+----+----+----+----
 node1 |  1 |  * |  * |  *
 node2 |  2 |  * |  * |  *
 node3 |  3 |  * |  * |  *</programlisting>
  </para>
  <para>
   Example 2 (<literal>node1</literal> and <literal>node2</literal> up, <literal>node3</literal> down):
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster matrix

 Name  | Id |  1 |  2 |  3
-------+----+----+----+----
 node1 |  1 |  * |  * |  x
 node2 |  2 |  * |  * |  x
 node3 |  3 |  ? |  ? |  ?</programlisting>
  </para>
  <para>
   Each row corresponds to one server, and indicates the result of
   testing an outbound connection from that server.
  </para>
  <para>
   Since <literal>node3</literal> is down, all the entries in its row are filled with
   <literal>?</literal>, meaning that we cannot test outbound connections from it.
  </para>
  <para>
   The other two nodes are up; the corresponding rows have <literal>x</literal> in the
   column corresponding to <literal>node3</literal>, meaning that inbound connections to
   that node have failed, and <literal>*</literal> in the columns corresponding to
   <literal>node1</literal> and <literal>node2</literal>, meaning that inbound connections
   to these nodes have succeeded.
  </para>
  <para>
   Example 3 (all nodes up, firewall dropping packets originating
   from <literal>node1</literal> and directed to port 5432 on <literal>node3</literal>) -
   running <command>repmgr cluster matrix</command> from <literal>node1</literal> gives the following output:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster matrix

 Name  | Id |  1 |  2 |  3
-------+----+----+----+----
 node1 |  1 |  * |  * |  x
 node2 |  2 |  * |  * |  *
 node3 |  3 |  ? |  ? |  ?</programlisting>
  </para>
  <para>
   Note that this may take some time, depending on the <varname>connect_timeout</varname>
   setting in the node <varname>conninfo</varname> strings; the default of
   <literal>1 minute</literal> means that, without modification, the above
   command would take around two minutes to run. Consider setting
   <varname>connect_timeout</varname> to a lower value in the node
   <varname>conninfo</varname> strings.
  </para>
  <para>
   The matrix tells us that we cannot connect from <literal>node1</literal> to <literal>node3</literal>,
   and that (therefore) we don't know the state of any outbound
   connection from <literal>node3</literal>.
  </para>
  <para>
   In this case, the <xref linkend="repmgr-cluster-crosscheck"> command will produce a more
   useful result.
  </para>
 </sect1>
 <sect1 id="repmgr-cluster-crosscheck" xreflabel="repmgr cluster crosscheck">
  <indexterm>
   <primary>repmgr cluster crosscheck</primary>
  </indexterm>
  <title>repmgr cluster crosscheck</title>
  <para>
   <command>repmgr cluster crosscheck</command> is similar to <xref linkend="repmgr-cluster-matrix">,
   but cross-checks connections between each combination of nodes. In "Example 3" in
   <xref linkend="repmgr-cluster-matrix"> we have no information about the state of <literal>node3</literal>.
   However by running <command>repmgr cluster crosscheck</command> it's possible to get a better
   overview of the cluster situation:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster crosscheck

 Name  | Id |  1 |  2 |  3
-------+----+----+----+----
 node1 |  1 |  * |  * |  x
 node2 |  2 |  * |  * |  *
 node3 |  3 |  * |  * |  *</programlisting>
  </para>
  <para>
   What happened is that <command>repmgr cluster crosscheck</command> merged its own
   <command>repmgr cluster matrix</command> output with the <command>repmgr cluster matrix</command>
   output from <literal>node2</literal>; the latter is able to connect to <literal>node3</literal>
   and therefore determine the state of outbound connections from that node.
  </para>
 </sect1>

</chapter>