doc: define entity for repmgrd

Ian Barwick
2019-05-01 10:36:54 +09:00
parent 4d1e11533e
commit dbeffbf29a
31 changed files with 298 additions and 297 deletions
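
The entity declaration itself is not among the hunks shown here. A minimal sketch, assuming it simply wraps the markup being replaced and is declared alongside the existing `&repmgr;` entity (both the exact form and the location are assumptions):

```xml
<!-- Assumed form of the new entity; the actual declaration is not shown in this diff. -->
<!ENTITY repmgrd "<application>repmgrd</application>">
```

With a declaration along these lines, each `&repmgrd;` reference expands to the same `<application>` markup it replaces, so the rendered output should be unchanged.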

@@ -19,7 +19,7 @@
<para>
&repmgr; 3.x builds on the improved replication facilities added
in PostgreSQL 9.3, as well as improved automated failover support
via <application>repmgrd</application>, and is not compatible with PostgreSQL 9.2
via &repmgrd;, and is not compatible with PostgreSQL 9.2
and earlier. We recommend upgrading to &repmgr; 4, as the &repmgr; 3.x
series is no longer maintained.
</para>
@@ -135,7 +135,7 @@
No.
</para>
<para>
&repmgr; (together with <application>repmgrd</application>) assists with
&repmgr; (together with &repmgrd;) assists with
<emphasis>managing</emphasis> replication. It does not actually perform replication, which
is part of the core PostgreSQL functionality.
</para>
@@ -152,8 +152,8 @@
<title>Does it matter if different &repmgr; versions are present in the replication cluster?</title>
<para>
Yes. If different &quot;major&quot; &repmgr; versions (e.g. 3.3.x and 4.1.x) are present,
&repmgr; (in particular <application>repmgrd</application>)
may not run, or run properly, or in the worst case (if different <application>repmgrd</application>
&repmgr; (in particular &repmgrd;)
may not run, or run properly, or in the worst case (if different &repmgrd;
versions are running and there are differences in the failover implementation) break
your replication cluster.
</para>
@@ -282,19 +282,19 @@
<sect2 id="faq-repmgr-shared-preload-libaries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
<title>Do I need to include <literal>shared_preload_libraries = 'repmgr'</literal>
in <filename>postgresql.conf</filename> if I'm not using <application>repmgrd</application>?</title>
in <filename>postgresql.conf</filename> if I'm not using &repmgrd;?</title>
<para>
No, the <literal>repmgr</literal> shared library is only needed when running <application>repmgrd</application>.
If you later decide to run <application>repmgrd</application>, you just need to add
No, the <literal>repmgr</literal> shared library is only needed when running &repmgrd;.
If you later decide to run &repmgrd;, you just need to add
<literal>shared_preload_libraries = 'repmgr'</literal> and restart PostgreSQL.
</para>
</sect2>
<sect2 id="faq-repmgr-permissions" xreflabel="Replication permission problems">
<title>I've provided replication permission for the <literal>repmgr</literal> user in <filename>pg_hba.conf</filename>
but <command>repmgr</command>/<application>repmgrd</application> complains it can't connect to the server... Why?</title>
but <command>repmgr</command>/&repmgrd; complains it can't connect to the server... Why?</title>
<para>
<command>repmgr</command> and <application>repmgrd</application> need to be able to connect to the repmgr database
<command>repmgr</command> and &repmgrd; need to be able to connect to the repmgr database
with a normal connection to query metadata. The <literal>replication</literal> connection
permission is for PostgreSQL's streaming replication (and doesn't necessarily need to be the <literal>repmgr</literal> user).
</para>
@@ -349,7 +349,7 @@
</sect1>
<sect1 id="faq-repmgrd" xreflabel="repmgrd">
<title><application>repmgrd</application></title>
<title>&repmgrd;</title>
<sect2 id="faq-repmgrd-prevent-promotion" xreflabel="Prevent standby from being promoted to primary">
@@ -365,12 +365,12 @@
</sect2>
<sect2 id="faq-repmgrd-delayed-standby" xreflabel="Delayed standby support">
<title>Does <application>repmgrd</application> support delayed standbys?</title>
<title>Does &repmgrd; support delayed standbys?</title>
<para>
<application>repmgrd</application> can monitor delayed standbys - those set up with
&repmgrd; can monitor delayed standbys - those set up with
<varname>recovery_min_apply_delay</varname> set to a non-zero value
in <filename>recovery.conf</filename> - but as it's not currently possible
to directly examine the value applied to the standby, <application>repmgrd</application>
to directly examine the value applied to the standby, &repmgrd;
may not be able to properly evaluate the node as a promotion candidate.
</para>
<para>
@@ -379,13 +379,13 @@
<filename>repmgr.conf</filename>.
</para>
<para>
Note that after registering a delayed standby, <application>repmgrd</application> will only start
Note that after registering a delayed standby, &repmgrd; will only start
once the metadata added in the primary node has been replicated.
</para>
</sect2>
<sect2 id="faq-repmgrd-logfile-rotate" xreflabel="repmgrd logfile rotation">
<title>How can I get <application>repmgrd</application> to rotate its logfile?</title>
<title>How can I get &repmgrd; to rotate its logfile?</title>
<para>
Configure your system's <literal>logrotate</literal> service to do this; see <xref linkend="repmgrd-log-rotation">.
</para>
@@ -393,11 +393,11 @@
</sect2>
<sect2 id="faq-repmgrd-recloned-no-start" xreflabel="repmgrd not restarting after node cloned">
<title>I've recloned a failed primary as a standby, but <application>repmgrd</application> refuses to start?</title>
<title>I've recloned a failed primary as a standby, but &repmgrd; refuses to start?</title>
<para>
Check you registered the standby after recloning. If unregistered, the standby
cannot be considered as a promotion candidate even if <varname>failover</varname> is set to
<literal>automatic</literal>, which is probably not what you want. <application>repmgrd</application> will start if
<literal>automatic</literal>, which is probably not what you want. &repmgrd; will start if
<varname>failover</varname> is set to <literal>manual</literal> so the node's replication status can still
be monitored, if desired.
</para>
@@ -405,7 +405,7 @@
<sect2 id="faq-repmgrd-pg-bindir" xreflabel="repmgrd does not apply pg_bindir to promote_command or follow_command">
<title>
<application>repmgrd</application> ignores pg_bindir when executing <varname>promote_command</varname> or <varname>follow_command</varname>
&repmgrd; ignores pg_bindir when executing <varname>promote_command</varname> or <varname>follow_command</varname>
</title>
<para>
<varname>promote_command</varname> or <varname>follow_command</varname> can be user-defined scripts,
@@ -416,13 +416,13 @@
<sect2 id="faq-repmgrd-startup-no-upstream" xreflabel="repmgrd does not start if upstream node is not running">
<title>
<application>repmgrd</application> aborts startup with the error "<literal>upstream node must be running before repmgrd can start</literal>"
&repmgrd; aborts startup with the error "<literal>upstream node must be running before repmgrd can start</literal>"
</title>
<para>
<application>repmgrd</application> does this to avoid starting up on a replication cluster
which is not in a healthy state. If the upstream is unavailable, <application>repmgrd</application>
&repmgrd; does this to avoid starting up on a replication cluster
which is not in a healthy state. If the upstream is unavailable, &repmgrd;
may initiate a failover immediately after starting up, which could have unintended side-effects,
particularly if <application>repmgrd</application> is not running on other nodes.
particularly if &repmgrd; is not running on other nodes.
</para>
<para>
In particular, it's possible that the node's local copy of the <literal>repmgr.nodes</literal> table
@@ -430,7 +430,7 @@
</para>
<para>
The onus is therefore on the administrator to manually set the cluster to a stable, healthy state before
starting <application>repmgrd</application>.
starting &repmgrd;.
</para>
</sect2>

@@ -310,7 +310,7 @@
</para>
<para>
See also <xref linkend="repmgrd-configuration-debian-ubuntu"> for some specifics related
to configuring the <application>repmgrd</application> daemon.
to configuring the &repmgrd; daemon.
</para>
<table id="debian-9-packages">
@@ -552,7 +552,7 @@ repmgr96-4.1.1-0.0git320.g5113ab0.1.el7.x86_64.rpm</programlisting>
<listitem>
<para>
PID file location: the default <application>repmgrd</application> PID file
PID file location: the default &repmgrd; PID file
location can be hard-coded by patching <varname>package_pid_file</varname>
in <filename>repmgrd.c</filename>:
<programlisting>

@@ -105,7 +105,7 @@
documentation section <link linkend="upgrading-major-version">Upgrading a major version release</link>.
</para>
<para>
If <application>repmgrd</application> is in use, a PostgreSQL restart <emphasis>is</emphasis> required;
If &repmgrd; is in use, a PostgreSQL restart <emphasis>is</emphasis> required;
in that case we suggest combining this &repmgr; upgrade with the next PostgreSQL
minor release, which will require a PostgreSQL restart in any case.
</para>
@@ -113,7 +113,7 @@
<important>
<para>
On Debian-based systems, including Ubuntu, if using <application>repmgrd</application>
On Debian-based systems, including Ubuntu, if using &repmgrd;
please ensure that in the file <filename>/etc/init.d/repmgrd</filename>, the parameter
<varname>REPMGRD_OPTS</varname> contains &quot;<literal>--daemonize=false</literal>&quot;, e.g.:
<programlisting>
@@ -156,7 +156,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<para>
New commands <link linkend="repmgr-daemon-start"><command>repmgr daemon start</command></link> and
<link linkend="repmgr-daemon-stop"><command>repmgr daemon stop</command></link>:
these provide a standardized way of starting and stopping <application>repmgrd</application>.
these provide a standardized way of starting and stopping &repmgrd;.
GitHub #528.
</para>
<note>
@@ -172,7 +172,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<para>
<link linkend="repmgr-daemon-status"><command>repmgr daemon status</command></link>
additionally displays the node priority and the interval (in seconds) since the
<application>repmgrd</application> instance last verified its upstream node was available.
&repmgrd; instance last verified its upstream node was available.
</para>
</listitem>
@@ -240,20 +240,20 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application> will no longer consider nodes where <application>repmgrd</application>
&repmgrd; will no longer consider nodes where &repmgrd;
is not running as promotion candidates.
</para>
<para>
Previously, if <application>repmgrd</application> was not running on a node, but
Previously, if &repmgrd; was not running on a node, but
that node qualified as the promotion candidate, it would never be promoted due to
the absence of a running <application>repmgrd</application>.
the absence of a running &repmgrd;.
</para>
</listitem>
<listitem>
<para>
Add option <option>connection_check_type</option> to enable selection of the method
<application>repmgrd</application> uses to determine whether the upstream node is available.
&repmgrd; uses to determine whether the upstream node is available.
</para>
<para>
Possible values are <literal>ping</literal> (default; uses <command>PQping()</command> to
@@ -266,7 +266,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
New configuration option <link linkend="repmgrd-failover-validation"><option>failover_validation_command</option></link>
to allow an external mechanism to validate the failover decision made by <application>repmgrd</application>.
to allow an external mechanism to validate the failover decision made by &repmgrd;.
</para>
</listitem>
@@ -279,7 +279,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
In a failover situation, <application>repmgrd</application> will not attempt to promote a
In a failover situation, &repmgrd; will not attempt to promote a
node if another primary has already appeared (e.g. by being promoted manually).
GitHub #420.
</para>
@@ -364,7 +364,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application>: on a cascaded standby, don't fail over if
&repmgrd;: on a cascaded standby, don't fail over if
<literal>failover=manual</literal>. GitHub #531.
</para>
</listitem>
@@ -393,7 +393,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<important>
<para>
On Debian-based systems, including Ubuntu, if using <application>repmgrd</application>
On Debian-based systems, including Ubuntu, if using &repmgrd;
please ensure that in the file <filename>/etc/init.d/repmgrd</filename>, the parameter
<varname>REPMGRD_OPTS</varname> contains &quot;<literal>--daemonize=false</literal>&quot;, e.g.:
<programlisting>
@@ -488,12 +488,12 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application> can now be &quot;paused&quot;, i.e. instructed
&repmgrd; can now be &quot;paused&quot;, i.e. instructed
not to take any action such as a failover, even if the prerequisites for such an
action are detected.
</para>
<para>
This removes the need to stop <application>repmgrd</application> on all nodes when
This removes the need to stop &repmgrd; on all nodes when
performing a planned operation such as a switchover.
</para>
<para>
@@ -519,7 +519,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application>: fix parsing of <option>-d/--daemonize</option> option.
&repmgrd;: fix parsing of <option>-d/--daemonize</option> option.
</para>
</listitem>
@@ -537,7 +537,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<para>
We recommend upgrading to this version as soon as possible.
This release can be installed as a simple package upgrade from repmgr 4.0 ~ 4.1.0;
<application>repmgrd</application> (if running) should be restarted.
&repmgrd; (if running) should be restarted.
See <xref linkend="upgrading-repmgr"> for more details.
</para>
@@ -619,8 +619,8 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<para>
Check <varname>promote_command</varname> and <varname>follow_command</varname>
are defined when reloading configuration. These were checked on startup but
not reload by <application>repmgrd</application>, which made it possible to
make <application>repmgrd</application> run with invalid values. It's unlikely
not reload by &repmgrd;, which made it possible to
make &repmgrd; run with invalid values. It's unlikely
anyone would want to do this, but we should make it impossible anyway.
(GitHub #486).
</para>
@@ -661,7 +661,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application>: fix startup on witness node when local data is stale. (GitHub #488, #489).
&repmgrd;: fix startup on witness node when local data is stale. (GitHub #488, #489).
</para>
</listitem>
@@ -687,7 +687,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<title>Release 4.1.0</title>
<para><emphasis>Tue July 31, 2018</emphasis></para>
<para>
&repmgr; 4.1.0 introduces some changes to <application>repmgrd</application>
&repmgr; 4.1.0 introduces some changes to &repmgrd;
behaviour and some additional configuration parameters.
</para>
<para>
@@ -703,7 +703,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
</listitem>
<listitem>
<para>
<application>repmgrd</application> must be restarted on all nodes where it is running.
&repmgrd; must be restarted on all nodes where it is running.
</para>
</listitem>
@@ -825,14 +825,14 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application>: create a PID file by default
&repmgrd;: create a PID file by default
(GitHub #457). For details, see <xref linkend="repmgrd-pid-file">.
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: daemonize process by default.
&repmgrd;: daemonize process by default.
In case, for whatever reason, the user does not wish to daemonize the
process, provide <option>--daemonize=false</option>.
(GitHub #458).
@@ -901,7 +901,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<para>
We recommend upgrading to this version as soon as possible.
This release can be installed as a simple package upgrade from repmgr 4.0 ~ 4.0.5;
<application>repmgrd</application> (if running) should be restarted. See <xref linkend="upgrading-repmgr">
&repmgrd; (if running) should be restarted. See <xref linkend="upgrading-repmgr">
for more details.
</para>
@@ -988,7 +988,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application>: ensure local node is counted as quorum member
&repmgrd;: ensure local node is counted as quorum member
(GitHub #439)
</para>
</listitem>
@@ -1005,7 +1005,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<para>
&repmgr; 4.0.5 contains a number of usability enhancements related to
<application>pg_rewind</application> usage, <filename>recovery.conf</filename>
generation and (in <application>repmgrd</application>) handling of various
generation and (in &repmgrd;) handling of various
corner-case situations, as well as a number of bug fixes.
</para>
@@ -1070,7 +1070,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application>: set <literal>connect_timeout=2</literal> (if not explicitly set)
&repmgrd;: set <literal>connect_timeout=2</literal> (if not explicitly set)
when pinging a server.
</para>
</listitem>
@@ -1126,20 +1126,20 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application>: handle <command>pg_ctl promote</command> timeout (GitHub #425).
&repmgrd;: handle <command>pg_ctl promote</command> timeout (GitHub #425).
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: handle failover situation with only two nodes in the primary
&repmgrd;: handle failover situation with only two nodes in the primary
location, and at least one node in another location (GitHub #407).
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: prevent standby connection handle from going stale.
&repmgrd;: prevent standby connection handle from going stale.
</para>
</listitem>
@@ -1163,7 +1163,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
</para>
<para>
This release can be installed as a simple package upgrade from repmgr 4.0 ~ 4.0.3;
<application>repmgrd</application> (if running) should be restarted. See <xref linkend="upgrading-repmgr">
&repmgrd; (if running) should be restarted. See <xref linkend="upgrading-repmgr">
for more details.
</para>
@@ -1242,14 +1242,14 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application>: improve detection of status change from primary to
&repmgrd;: improve detection of status change from primary to
standby
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: improve reconnection to the local node after a
&repmgrd;: improve reconnection to the local node after a
failover (previously a connection error due to the node starting up was being
interpreted as the node being unavailable)
</para>
@@ -1257,14 +1257,14 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<application>repmgrd</application>: when running on a witness server, correctly connect
&repmgrd;: when running on a witness server, correctly connect
to new primary after a failover
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: add <link linkend="event-notifications">event notification</link>
&repmgrd;: add <link linkend="event-notifications">event notification</link>
<literal>repmgrd_shutdown</literal> (GitHub #393)
</para>
</listitem>
@@ -1433,7 +1433,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
</para>
<para>
This release can be installed as a simple package upgrade from &repmgr; 4.0.1 or 4.0;
<application>repmgrd</application> (if running) should be restarted.
&repmgrd; (if running) should be restarted.
</para>
<sect2>
@@ -1653,10 +1653,10 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<emphasis>improved logging output</emphasis>:
&repmgr; (and <application>repmgrd</application>) now provide more explicit
&repmgr; (and &repmgrd;) now provide more explicit
logging output giving a better picture of what is going on. Where appropriate,
<literal>DETAIL</literal> and <literal>HINT</literal> log lines provide additional
detail and suggestions for resolving problems. Additionally, <application>repmgrd</application>
detail and suggestions for resolving problems. Additionally, &repmgrd;
now emits informational log lines at regular, configurable intervals
to confirm that it's running correctly and which node(s) it's monitoring.
</para>
@@ -1703,11 +1703,11 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<emphasis>automatic failover</emphasis>:
improved detection of node status; promotion decision based on a consensual
model, with the promoted primary explicitly informing other standbys to
follow it. The <application>repmgrd</application> daemon will continue
follow it. The &repmgrd; daemon will continue
functioning even if the monitored PostgreSQL instance is down, and resume
monitoring if it reappears. Additionally, if the instance's role has changed
(typically from a primary to a standby, e.g. following reintegration of a
failed primary using <xref linkend="repmgr-node-rejoin">) <application>repmgrd</application>
failed primary using <xref linkend="repmgr-node-rejoin">) &repmgrd;
will automatically resume monitoring it as a standby.
</para>
</listitem>
@@ -1793,7 +1793,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
</para>
<para>
<application>repmgrd</application>
&repmgrd;
<itemizedlist>
<listitem><para>
@@ -1915,7 +1915,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<simpara>
new parameter <varname>log_status_interval</varname>, which causes
<application>repmgrd</application> to emit a status log
&repmgrd; to emit a status log
line at the specified interval
</simpara>
</listitem>

@@ -80,7 +80,7 @@
the maximum level of logging output.
</para>
<para>
If issues are encountered with <application>repmgrd</application>,
If issues are encountered with &repmgrd;,
please provide relevant extracts from the &repmgr; log files
and if possible the PostgreSQL log itself. Please ensure these
logs do not contain any confidential data.

@@ -10,7 +10,7 @@
<title>Log settings</title>
<para>
By default, &repmgr; and <application>repmgrd</application> write log output to
By default, &repmgr; and &repmgrd; write log output to
<literal>STDERR</literal>. An alternative log destination can be specified
(either a file or <literal>syslog</literal>).
</para>
@@ -24,7 +24,7 @@
<para>
This behaviour can be overridden with the command line option <option>--log-to-file</option>,
which will redirect all logging output to the configured log destination. This is recommended
when &repmgr; is executed by another application, particularly <application>repmgrd</application>,
when &repmgr; is executed by another application, particularly &repmgrd;,
to enable log output generated by the &repmgr; application to be stored for later reference.
</para>
</note>
@@ -93,9 +93,9 @@
</term>
<listitem>
<para>
This setting causes <application>repmgrd</application> to emit a status log
This setting causes &repmgrd; to emit a status log
line at the specified interval (in seconds, default <literal>300</literal>)
describing <application>repmgrd</application>'s current state, e.g.:
describing &repmgrd;'s current state, e.g.:
</para>
<programlisting>
[2018-07-12 00:47:32] [INFO] monitoring connection to upstream node "node1" (ID: 1)</programlisting>

@@ -99,7 +99,7 @@
<ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>.
</para>
<para>
For <application>repmgrd</application>-specific settings, see <xref linkend="repmgrd-configuration">.
For &repmgrd;-specific settings, see <xref linkend="repmgrd-configuration">.
</para>
<note>

@@ -10,7 +10,7 @@
<title>Service command settings</title>
<para>
In some circumstances, &repmgr; (and <application>repmgrd</application>) need to
In some circumstances, &repmgr; (and &repmgrd;) need to
be able to stop, start or restart PostgreSQL. &repmgr; commands which need to do this
include <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>,
<link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link> and
@@ -68,7 +68,7 @@
</para>
<para>
Do not confuse this with <varname>promote_command</varname>, which is used
by <application>repmgrd</application> to execute <xref linkend="repmgr-standby-promote">.
by &repmgrd; to execute <xref linkend="repmgr-standby-promote">.
</para>
</note>

@@ -11,7 +11,7 @@
<title>Configuration file</title>
<para>
<application>repmgr</application> and <application>repmgrd</application>
<application>repmgr</application> and &repmgrd;
use a common configuration file, by default called
<filename>repmgr.conf</filename> (although any name can be used if explicitly specified).
<filename>repmgr.conf</filename> must contain a number of required parameters, including

@@ -6,7 +6,7 @@
<title>Event Notifications</title>
<para>
Each time &repmgr; or <application>repmgrd</application> perform a significant event, a record
Each time &repmgr; or &repmgrd; perform a significant event, a record
of that event is written into the <literal>repmgr.events</literal> table together with
a timestamp, an indication of failure or success, and further details
if appropriate. This is useful for gaining an overview of events
@@ -207,7 +207,7 @@
</para>
<para>
Events generated by <application>repmgrd</application> (streaming replication mode):
Events generated by &repmgrd; (streaming replication mode):
<itemizedlist spacing="compact" mark="bullet">
<listitem>
@@ -274,7 +274,7 @@
</para>
<para>
Events generated by <application>repmgrd</application> (BDR mode):
Events generated by &repmgrd; (BDR mode):
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal>bdr_failover</literal></simpara>

@@ -45,14 +45,14 @@
</simpara>
<simpara>
If different &quot;major&quot; &repmgr; versions (e.g. 3.3.x and 4.1.x)
are installed on different nodes, in the best case &repmgr; (in particular <application>repmgrd</application>)
are installed on different nodes, in the best case &repmgr; (in particular &repmgrd;)
will not run. In the worst case, you will end up with a broken cluster.
</simpara>
</note>
<para>
A dedicated system user for &repmgr; is <emphasis>not</emphasis> required; as many &repmgr; and
<application>repmgrd</application> actions require direct access to the PostgreSQL data directory,
&repmgrd; actions require direct access to the PostgreSQL data directory,
these commands should be executed by the <literal>postgres</literal> user.
</para>

@@ -58,7 +58,7 @@
<listitem>
<simpara>
This is the action which occurs if a primary server fails and a suitable standby
is promoted as the new primary. The <application>repmgrd</application> daemon supports automatic failover
is promoted as the new primary. The &repmgrd; daemon supports automatic failover
to minimise downtime.
</simpara>
</listitem>
@@ -107,7 +107,7 @@
promotes a (local) standby.
</para>
<para>
A witness server only needs to be created if <application>repmgrd</application>
A witness server only needs to be created if &repmgrd;
is in use.
</para>
</listitem>
@@ -198,7 +198,7 @@
</listitem>
<listitem>
<simpara><literal>repmgr.monitoring_history</literal>: historical standby monitoring information
written by <application>repmgrd</application></simpara>
written by &repmgrd;</simpara>
</listitem>
</itemizedlist>
</para>
@@ -214,7 +214,7 @@
name of the server's upstream node</simpara>
</listitem>
<listitem>
<simpara>repmgr.replication_status: when <application>repmgrd</application>'s monitoring is enabled, shows
<simpara>repmgr.replication_status: when &repmgrd;'s monitoring is enabled, shows
current monitoring status for each standby.</simpara>
</listitem>
</itemizedlist>

@@ -352,7 +352,7 @@
slot_name |
config_file | /etc/repmgr.conf</programlisting>
<para>
Each server in the replication cluster will have its own record. If <application>repmgrd</application>
Each server in the replication cluster will have its own record. If &repmgrd;
is in use, the fields <literal>upstream_node_id</literal>, <literal>active</literal> and
<literal>type</literal> will be updated when the node's status or role changes.
</para>

@@ -38,7 +38,7 @@
<title>Notes</title>
<para>
Monitoring history will only be written if <application>repmgrd</application> is active, and
Monitoring history will only be written if &repmgrd; is active, and
<varname>monitoring_history</varname> is set to <literal>true</literal> in
<filename>repmgr.conf</filename>.
</para>

@@ -14,30 +14,30 @@
<refnamediv>
<refname>repmgr daemon pause</refname>
<refpurpose>Instruct all <application>repmgrd</application> instances in the replication cluster to pause failover operations</refpurpose>
<refpurpose>Instruct all &repmgrd; instances in the replication cluster to pause failover operations</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command can be run on any active node in the replication cluster to instruct all
running <application>repmgrd</application> instances to &quot;pause&quot; themselves, i.e. take no
running &repmgrd; instances to &quot;pause&quot; themselves, i.e. take no
action (such as promoting themselves or following a new primary) if a failover event is detected.
</para>
<para>
This functionality is useful for performing maintenance operations, such as switchovers
or upgrades, which might otherwise trigger a failover if <application>repmgrd</application>
or upgrades, which might otherwise trigger a failover if &repmgrd;
is running normally.
</para>
<note>
<para>
It's important to wait a few seconds after restarting PostgreSQL on any node before running
<command>repmgr daemon pause</command>, as the <application>repmgrd</application> instance
<command>repmgr daemon pause</command>, as the &repmgrd; instance
on the restarted node will take a second or two before it has updated its status.
</para>
</note>
<para>
<xref linkend="repmgr-daemon-unpause"> will instruct all previously paused <application>repmgrd</application>
<xref linkend="repmgr-daemon-unpause"> will instruct all previously paused &repmgrd;
instances to resume normal failover operation.
</para>
</refsect1>
@@ -69,7 +69,7 @@ NOTICE: node 3 (node3) paused</programlisting>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check if nodes are reachable but don't pause <application>repmgrd</application>.
Check if nodes are reachable but don't pause &repmgrd;.
</para>
</listitem>
</varlistentry>
@@ -87,7 +87,7 @@ NOTICE: node 3 (node3) paused</programlisting>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
<application>repmgrd</application> could be paused on all nodes.
&repmgrd; could be paused on all nodes.
</para>
</listitem>
</varlistentry>
@@ -96,7 +96,7 @@ NOTICE: node 3 (node3) paused</programlisting>
<term><option>ERR_REPMGRD_PAUSE (26)</option></term>
<listitem>
<para>
<application>repmgrd</application> could not be paused on one or more nodes.
&repmgrd; could not be paused on one or more nodes.
</para>
</listitem>
</varlistentry>

@@ -14,17 +14,17 @@
<refnamediv>
<refname>repmgr daemon start</refname>
<refpurpose>Start the <application>repmgrd</application> daemon</refpurpose>
<refpurpose>Start the &repmgrd; daemon</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command starts the <application>repmgrd</application> daemon on the
This command starts the &repmgrd; daemon on the
local node.
</para>
<para>
By default, &repmgr; will wait for up to 15 seconds to confirm that <application>repmgrd</application>
By default, &repmgr; will wait for up to 15 seconds to confirm that &repmgrd;
started. This behaviour can be overridden by specifying a different value using the <option>--wait</option>
option, or disabled altogether with the <option>--no-wait</option> option.
</para>
@@ -50,7 +50,7 @@
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually attempt to start <application>repmgrd</application>.
Check prerequisites but don't actually attempt to start &repmgrd;.
</para>
<para>
This action will output the command which would be executed.
@@ -63,7 +63,7 @@
<term><option>--wait</option></term>
<listitem>
<para>
Wait for the specified number of seconds to confirm that <application>repmgrd</application>
Wait for the specified number of seconds to confirm that &repmgrd;
started successfully.
</para>
<para>
@@ -77,7 +77,7 @@
<term><option>--no-wait</option></term>
<listitem>
<para>
Don't wait to confirm that <application>repmgrd</application>
Don't wait to confirm that &repmgrd;
started successfully.
</para>
<para>
@@ -109,7 +109,7 @@
<para>
<command>repmgr daemon start</command> will execute the command defined by the
<varname>repmgrd_service_start_command</varname> parameter in <filename>repmgr.conf</filename>.
This must be set to a shell command which will start <application>repmgrd</application>;
This must be set to a shell command which will start &repmgrd;;
if &repmgr; was installed from a package, this will be the service command defined by the
package. For more details see <link linkend="appendix-packages">Appendix: &repmgr; package details</link>.
</para>
@@ -117,7 +117,7 @@
<para>
If &repmgr; was installed from a system package, and you do not configure
<varname>repmgrd_service_start_command</varname> to an appropriate service command, this may
result in the system becoming confused about the state of the <application>repmgrd</application>
result in the system becoming confused about the state of the &repmgrd;
service; this is particularly the case with <literal>systemd</literal>.
</para>
</important>
@@ -139,12 +139,12 @@
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The <application>repmgrd</application> start command (defined in
The &repmgrd; start command (defined in
<varname>repmgrd_service_start_command</varname>) was successfully executed.
</para>
<para>
If the <option>--wait</option> option was provided, &repmgr; will confirm that
<application>repmgrd</application> has actually started up.
&repmgrd; has actually started up.
</para>
</listitem>
</varlistentry>
@@ -167,10 +167,10 @@
&repmgr; was unable to connect to the local PostgreSQL node.
</para>
<para>
PostgreSQL must be running before <application>repmgrd</application>
PostgreSQL must be running before &repmgrd;
can be started. Additionally, unless the <option>--no-wait</option> option was
provided, &repmgr; needs to be able to connect to the local PostgreSQL node
to determine the state of <application>repmgrd</application>.
to determine the state of &repmgrd;.
</para>
</listitem>
</varlistentry>
@@ -180,11 +180,11 @@
<term><option>ERR_REPMGRD_SERVICE (27)</option></term>
<listitem>
<para>
The <application>repmgrd</application> start command (defined in
The &repmgrd; start command (defined in
<varname>repmgrd_service_start_command</varname>) was not successfully executed.
</para>
<para>
This can also mean that &repmgr; was unable to confirm whether <application>repmgrd</application>
This can also mean that &repmgr; was unable to confirm whether &repmgrd;
successfully started (unless the <option>--no-wait</option> option was provided).
</para>
</listitem>

View File

@@ -14,14 +14,14 @@
<refnamediv>
<refname>repmgr daemon status</refname>
<refpurpose>display information about the status of <application>repmgrd</application> on each node in the cluster</refpurpose>
<refpurpose>display information about the status of &repmgrd; on each node in the cluster</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command provides an overview of all active nodes in the cluster and the state
of each node's <application>repmgrd</application> instance. It can be used to check
of each node's &repmgrd; instance. It can be used to check
the result of <xref linkend="repmgr-daemon-pause"> and <xref linkend="repmgr-daemon-unpause">
operations.
</para>
@@ -35,13 +35,13 @@
</para>
<para>
If PostgreSQL is not running on a node, &repmgr; will not be able to determine the
status of that node's <application>repmgrd</application> instance.
status of that node's &repmgrd; instance.
</para>
<note>
<para>
After restarting PostgreSQL on any node, the <application>repmgrd</application> instance
After restarting PostgreSQL on any node, the &repmgrd; instance
will take a second or two before it is able to update its status. Until then,
<application>repmgrd</application> will be shown as not running.
&repmgrd; will be shown as not running.
</para>
</note>
@@ -50,7 +50,7 @@
<refsect1>
<title>Examples</title>
<para>
<application>repmgrd</application> running normally on all nodes:
&repmgrd; running normally on all nodes:
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
@@ -60,7 +60,7 @@
</para>
<para>
<application>repmgrd</application> paused on all nodes (using <xref linkend="repmgr-daemon-pause">):
&repmgrd; paused on all nodes (using <xref linkend="repmgr-daemon-pause">):
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
@@ -70,7 +70,7 @@
</para>
<para>
<application>repmgrd</application> not running on one node:
&repmgrd; not running on one node:
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+-------------+-------+---------+--------------------
@@ -127,25 +127,25 @@
<listitem>
<simpara>
<application>repmgrd</application> running (1 = running, 0 = not running, -1 = unknown)
&repmgrd; running (1 = running, 0 = not running, -1 = unknown)
</simpara>
</listitem>
<listitem>
<simpara>
<application>repmgrd</application> PID (-1 if not running or status unknown)
&repmgrd; PID (-1 if not running or status unknown)
</simpara>
</listitem>
<listitem>
<simpara>
<application>repmgrd</application> paused (1 = paused, 0 = not paused, -1 = unknown)
&repmgrd; paused (1 = paused, 0 = not paused, -1 = unknown)
</simpara>
</listitem>
<listitem>
<simpara>
<application>repmgrd</application> node priority
&repmgrd; node priority
</simpara>
</listitem>

View File

@@ -14,25 +14,25 @@
<refnamediv>
<refname>repmgr daemon stop</refname>
<refpurpose>Stop the <application>repmgrd</application> daemon</refpurpose>
<refpurpose>Stop the &repmgrd; daemon</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command stops the <application>repmgrd</application> daemon on the
This command stops the &repmgrd; daemon on the
local node.
</para>
<para>
By default, &repmgr; will wait for up to 15 seconds to confirm that <application>repmgrd</application>
By default, &repmgr; will wait for up to 15 seconds to confirm that &repmgrd;
stopped. This behaviour can be overridden by specifying a different value using the <option>--wait</option>
option, or disabled altogether with the <option>--no-wait</option> option.
</para>
<note>
<para>
If PostgreSQL is not running on the local node, under some circumstances &repmgr; may not
be able to confirm if <application>repmgrd</application> has actually stopped.
be able to confirm if &repmgrd; has actually stopped.
</para>
</note>
@@ -50,7 +50,7 @@
<para>
<command>repmgr daemon stop</command> will execute the command defined by the
<varname>repmgrd_service_stop_command</varname> parameter in <filename>repmgr.conf</filename>.
This must be set to a shell command which will stop <application>repmgrd</application>;
This must be set to a shell command which will stop &repmgrd;;
if &repmgr; was installed from a package, this will be the service command defined by the
package. For more details see <link linkend="appendix-packages">Appendix: &repmgr; package details</link>.
</para>
@@ -59,7 +59,7 @@
<para>
If &repmgr; was installed from a system package, and you do not configure
<varname>repmgrd_service_stop_command</varname> to an appropriate service command, this may
result in the system becoming confused about the state of the <application>repmgrd</application>
result in the system becoming confused about the state of the &repmgrd;
service; this is particularly the case with <literal>systemd</literal>.
</para>
</important>
@@ -76,7 +76,7 @@
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually attempt to stop <application>repmgrd</application>.
Check prerequisites but don't actually attempt to stop &repmgrd;.
</para>
<para>
This action will output the command which would be executed.
@@ -89,7 +89,7 @@
<term><option>--wait</option></term>
<listitem>
<para>
Wait for the specified number of seconds to confirm that <application>repmgrd</application>
Wait for the specified number of seconds to confirm that &repmgrd;
stopped successfully.
</para>
<para>
@@ -103,7 +103,7 @@
<term><option>--no-wait</option></term>
<listitem>
<para>
Don't wait to confirm that <application>repmgrd</application>
Don't wait to confirm that &repmgrd;
stopped successfully.
</para>
<para>
@@ -134,7 +134,7 @@
<para>
<command>repmgr daemon stop</command> will execute the command defined by the
<varname>repmgrd_service_stop_command</varname> parameter in <filename>repmgr.conf</filename>.
This must be set to a shell command which will stop <application>repmgrd</application>;
This must be set to a shell command which will stop &repmgrd;;
if &repmgr; was installed from a package, this will be the service command defined by the
package. For more details see <link linkend="appendix-packages">Appendix: &repmgr; package details</link>.
</para>
@@ -142,7 +142,7 @@
<para>
If &repmgr; was installed from a system package, and you do not configure
<varname>repmgrd_service_stop_command</varname> to an appropriate service command, this may
result in the system becoming confused about the state of the <application>repmgrd</application>
result in the system becoming confused about the state of the &repmgrd;
service; this is particularly the case with <literal>systemd</literal>.
</para>
</important>
@@ -163,7 +163,7 @@
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
<application>repmgrd</application> could be stopped.
&repmgrd; could be stopped.
</para>
</listitem>
</varlistentry>
@@ -182,7 +182,7 @@
<term><option>ERR_REPMGRD_SERVICE (27)</option></term>
<listitem>
<para>
<application>repmgrd</application> could not be stopped.
&repmgrd; could not be stopped.
</para>
</listitem>
</varlistentry>

View File

@@ -15,14 +15,14 @@
<refnamediv>
<refname>repmgr daemon unpause</refname>
<refpurpose>Instruct all <application>repmgrd</application> instances in the replication cluster to resume failover operations</refpurpose>
<refpurpose>Instruct all &repmgrd; instances in the replication cluster to resume failover operations</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command can be run on any active node in the replication cluster to instruct all
running <application>repmgrd</application> instances to &quot;unpause&quot;
running &repmgrd; instances to &quot;unpause&quot;
(following a previous execution of <xref linkend="repmgr-daemon-pause">)
and resume normal failover/monitoring operation.
</para>
@@ -30,7 +30,7 @@
<note>
<para>
It's important to wait a few seconds after restarting PostgreSQL on any node before running
<command>repmgr daemon pause</command>, as the <application>repmgrd</application> instance
<command>repmgr daemon pause</command>, as the &repmgrd; instance
on the restarted node will take a second or two before it has updated its status.
</para>
</note>
@@ -64,7 +64,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check if nodes are reachable but don't unpause <application>repmgrd</application>.
Check if nodes are reachable but don't unpause &repmgrd;.
</para>
</listitem>
</varlistentry>
@@ -82,7 +82,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
<application>repmgrd</application> could be unpaused on all nodes.
&repmgrd; could be unpaused on all nodes.
</para>
</listitem>
</varlistentry>
@@ -91,7 +91,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<term><option>ERR_REPMGRD_PAUSE (26)</option></term>
<listitem>
<para>
<application>repmgrd</application> could not be unpaused on one or more nodes.
&repmgrd; could not be unpaused on one or more nodes.
</para>
</listitem>
</varlistentry>

View File

@@ -122,7 +122,7 @@
If not provided, &repmgr; will attempt to follow the current primary node.
</para>
<para>
Note that when using <application>repmgrd</application>, <option>--upstream-node-id</option>
Note that when using &repmgrd;, <option>--upstream-node-id</option>
should always be configured;
see <link linkend="repmgrd-automatic-failover-configuration">Automatic failover configuration</link>
for details.

View File

@@ -23,7 +23,7 @@
<para>
If the standby promotion succeeds, the server will not need to be
restarted. However any other standbys will need to follow the new server,
by using <xref linkend="repmgr-standby-follow">; if <application>repmgrd</application>
by using <xref linkend="repmgr-standby-follow">; if &repmgrd;
is active, it will handle this automatically.
</para>
<para>

View File

@@ -17,7 +17,7 @@
<para>
<command>repmgr standby register</command> adds a standby's information to
the &repmgr; metadata. This command needs to be executed to enable
promote/follow operations and to allow <application>repmgrd</application> to work with the node.
promote/follow operations and to allow &repmgrd; to work with the node.
An existing standby can be registered using this command. Execute with the
<literal>--dry-run</literal> option to check what would happen without actually registering the
standby.
@@ -59,7 +59,7 @@
<para>
Depending on your environment and workload, it may take some time for the standby's node record
to propagate from the primary to the standby. Some actions (such as starting
<application>repmgrd</application>) require that the standby's node record
&repmgrd;) require that the standby's node record
is present and up-to-date to function correctly.
</para>
<para>

View File

@@ -48,12 +48,12 @@
<note>
<para>
From <link linkend="release-4.2">repmgr 4.2</link>, &repmgr; will instruct any running
<application>repmgrd</application> instances to pause operations while the switchover
is being carried out, to prevent <application>repmgrd</application> from
&repmgrd; instances to pause operations while the switchover
is being carried out, to prevent &repmgrd; from
unintentionally promoting a node. For more details, see <xref linkend="repmgrd-pausing">.
</para>
<para>
Users of &repmgr; versions prior to 4.2 should ensure that <application>repmgrd</application>
Users of &repmgr; versions prior to 4.2 should ensure that &repmgrd;
is not running on any nodes while a switchover is being executed.
</para>
</note>
@@ -134,11 +134,11 @@
<term><option>--repmgrd-no-pause</option></term>
<listitem>
<para>
Don't pause <application>repmgrd</application> while executing a switchover.
Don't pause &repmgrd; while executing a switchover.
</para>
<para>
This option should not be used unless you take steps by other means
to ensure <application>repmgrd</application> is paused or not
to ensure &repmgrd; is paused or not
running on all nodes.
</para>
</listitem>

View File

@@ -20,7 +20,7 @@
record to the &repmgr; metadata, and if necessary initialises the witness
node by installing the &repmgr; extension and copying the &repmgr; metadata
to the witness server. This command needs to be executed to enable
use of the witness server with <application>repmgrd</application>.
use of the witness server with &repmgrd;.
</para>
<para>
When executing <command>repmgr witness register</command>, database connection

View File

@@ -1,4 +1,4 @@
<!-- doc/src/sgml/postgres.sgml -->
<!-- doc/repmgr.sgml -->
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
@@ -9,6 +9,7 @@
%filelist;
<!ENTITY repmgr "<productname>repmgr</productname>">
<!ENTITY repmgrd "<productname>repmgrd</productname>">
<!ENTITY postgres "<productname>PostgreSQL</productname>">
]>

View File

@@ -7,7 +7,7 @@
<title>Automatic failover with repmgrd</title>
<para>
<application>repmgrd</application> is a management and monitoring daemon which runs
&repmgrd; is a management and monitoring daemon which runs
on each node in a replication cluster. It can automate actions such as
failover and updating standbys to follow the new primary, as well as
providing monitoring information about the state of each standby.
@@ -60,7 +60,7 @@
<note>
<simpara>
A witness server will only be useful if <application>repmgrd</application>
A witness server will only be useful if &repmgrd;
is in use.
</simpara>
</note>
@@ -96,11 +96,11 @@
<simpara>
As the witness server is not part of the replication cluster, further
changes to the &repmgr; metadata will be synchronised by
<application>repmgrd</application>.
&repmgrd;.
</simpara>
</note>
<para>
Once the witness server has been configured, <application>repmgrd</application>
Once the witness server has been configured, &repmgrd;
should be started.
</para>
@@ -156,8 +156,8 @@
location='dc1'</programlisting>
</para>
<para>
In a failover situation, <application>repmgrd</application> will check if any servers in the
same location as the current primary node are visible. If not, <application>repmgrd</application>
In a failover situation, &repmgrd; will check if any servers in the
same location as the current primary node are visible. If not, &repmgrd;
will assume a network interruption and not promote any node in any
other location (it will however enter <link linkend="repmgrd-degraded-monitoring">degraded monitoring</link>
mode until a primary becomes visible).
@@ -184,7 +184,7 @@
</para>
</note>
<para>
When running on the primary node, <application>repmgrd</application> can
When running on the primary node, &repmgrd; can
monitor connections and in particular disconnections by its attached
child nodes (standbys), and optionally execute a custom command
if certain criteria are met (such as the number of attached nodes falling to
@@ -195,7 +195,7 @@
<note>
<para>
Currently <application>repmgrd</application> can only detect disconnections
Currently &repmgrd; can only detect disconnections
of streaming replication standbys and cannot determine whether a standby
has disconnected and fallen back to archive recovery.
</para>
@@ -207,7 +207,7 @@
<sect2 id="repmgrd-primary-child-disconnection-monitoring-process">
<title>Standby disconnections monitoring process and criteria</title>
<para>
<application>repmgrd</application> monitors attached child nodes and decides
&repmgrd; monitors attached child nodes and decides
whether to invoke the user-defined command based on the following process
and criteria:
<itemizedlist>
@@ -215,7 +215,7 @@
<listitem>
<para>
Every few seconds (defined by the configuration parameter <varname>child_nodes_check_interval</varname>;
default: <literal>5</literal> seconds, a value of <literal>0</literal> disables this altogether), <application>repmgrd</application> queries
default: <literal>5</literal> seconds, a value of <literal>0</literal> disables this altogether), &repmgrd; queries
the <literal>pg_stat_replication</literal> system view and compares
the nodes present there against the list of nodes registered with &repmgr; which
should be attached to the primary.
@@ -225,7 +225,7 @@
<listitem>
<para>
If a child node (standby) is no longer present in <literal>pg_stat_replication</literal>,
<application>repmgrd</application> notes the time it detected the node's absence, and additionally generates a
&repmgrd; notes the time it detected the node's absence, and additionally generates a
<literal>child_node_disconnect</literal> event.
</para>
</listitem>
@@ -233,14 +233,14 @@
<listitem>
<para>
If a child node (standby) which was absent from <literal>pg_stat_replication</literal> reappears,
<application>repmgrd</application> clears the time it detected the node's absence, and additionally generates a
&repmgrd; clears the time it detected the node's absence, and additionally generates a
<literal>child_node_reconnect</literal> event.
</para>
</listitem>
<listitem>
<para>
If an entirely new child node (standby) is detected, <application>repmgrd</application> adds it to its internal list
If an entirely new child node (standby) is detected, &repmgrd; adds it to its internal list
and additionally generates a <literal>child_node_new_connect</literal> event.
</para>
</listitem>
@@ -248,10 +248,10 @@
<listitem>
<para>
If the <varname>child_nodes_disconnect_command</varname> parameter is set in
<filename>repmgr.conf</filename>, <application>repmgrd</application> will then loop through all child nodes.
<filename>repmgr.conf</filename>, &repmgrd; will then loop through all child nodes.
If it determines that insufficient child nodes are connected, and a
minimum of <varname>child_nodes_disconnect_timeout</varname> seconds (default: <literal>30</literal>)
has elapsed since the last node became disconnected, <application>repmgrd</application> will then execute the
has elapsed since the last node became disconnected, &repmgrd; will then execute the
<varname>child_nodes_disconnect_command</varname> script.
</para>
<para>
@@ -267,8 +267,8 @@
<listitem>
<para>
Note that child nodes which are not attached when <application>repmgrd</application>
starts will <emphasis>not</emphasis> be considered as missing, as <application>repmgrd</application>
Note that child nodes which are not attached when &repmgrd;
starts will <emphasis>not</emphasis> be considered as missing, as &repmgrd;
cannot know why they are not attached.
</para>
</listitem>
@@ -280,12 +280,12 @@
<sect2 id="repmgrd-primary-child-disconnection-example">
<title>Standby disconnections monitoring process example</title>
<para>
This example shows typical <application>repmgrd</application> log output from a three-node cluster
This example shows typical &repmgrd; log output from a three-node cluster
(primary and two child nodes), with <varname>child_nodes_connected_min_count</varname>
set to <literal>2</literal>.
</para>
<para>
<application>repmgrd</application> on the primary has started up, while two child
&repmgrd; on the primary has started up, while two child
nodes are being provisioned:
<programlisting>
[2019-04-24 15:25:33] [INFO] monitoring primary node "node1" (ID: 1) in normal state
@@ -298,7 +298,7 @@
(...)</programlisting>
</para>
<para>
One of the child nodes has disconnected; <application>repmgrd</application>
One of the child nodes has disconnected; &repmgrd;
is now waiting <varname>child_nodes_disconnect_timeout</varname> seconds
before executing <varname>child_nodes_disconnect_command</varname>:
<programlisting>
@@ -333,7 +333,7 @@
<para>
If a child node is configured to use archive recovery, it's possible that
the child node will disconnect from the primary node and fall back to
archive recovery. In this case <application>repmgrd</application>
archive recovery. In this case &repmgrd;
will nevertheless register a node disconnection.
</para>
</listitem>
@@ -374,7 +374,7 @@
<term><varname>child_nodes_check_interval</varname></term>
<listitem>
<para>
Interval (in seconds) after which <application>repmgrd</application> queries the
Interval (in seconds) after which &repmgrd; queries the
<literal>pg_stat_replication</literal> system view and compares the nodes present
there against the list of nodes registered with repmgr which should be attached to the primary.
</para>
@@ -393,7 +393,7 @@
<term><varname>child_nodes_disconnect_command</varname></term>
<listitem>
<para>
User-definable script to be executed when <application>repmgrd</application>
User-definable script to be executed when &repmgrd;
determines that an insufficient number of child nodes are connected. By default
the script is executed when no child nodes are connected, but the execution
threshold can be modified by setting one of <varname>child_nodes_connected_min_count</varname>
@@ -435,7 +435,7 @@
</para>
<para>
The <varname>child_nodes_disconnect_command</varname> script will not be executed if
<application>repmgrd</application> is <link linkend="repmgrd-pausing">paused</link>.
&repmgrd; is <link linkend="repmgrd-pausing">paused</link>.
</para>
</listitem>
@@ -449,7 +449,7 @@
<term><varname>child_nodes_disconnect_timeout</varname></term>
<listitem>
<para>
If <application>repmgrd</application> determines that an insufficient number of
If &repmgrd; determines that an insufficient number of
child nodes are connected, it will wait for the specified number of seconds
to execute the <varname>child_nodes_disconnect_command</varname>.
</para>
@@ -543,7 +543,7 @@
<term><varname>child_node_disconnect</varname></term>
<listitem>
<para>
This event is generated after <application>repmgrd</application>
This event is generated after &repmgrd;
detects that a child node is no longer streaming from the primary node.
</para>
<para>
@@ -565,7 +565,7 @@ $ repmgr cluster event --event=child_node_disconnect
<term><varname>child_node_reconnect</varname></term>
<listitem>
<para>
This event is generated after <application>repmgrd</application>
This event is generated after &repmgrd;
detects that a child node has resumed streaming from the primary node.
</para>
<para>
@@ -587,7 +587,7 @@ $ repmgr cluster event --event=child_node_reconnect
<term><varname>child_node_new_connect</varname></term>
<listitem>
<para>
This event is generated after <application>repmgrd</application>
This event is generated after &repmgrd;
detects that a new child node has been registered with &repmgr; and has
connected to the primary.
</para>
@@ -610,7 +610,7 @@ $ repmgr cluster event --event=child_node_new_connect
<term><varname>child_nodes_disconnect_command</varname></term>
<listitem>
<para>
This event is generated after <application>repmgrd</application> detects
This event is generated after &repmgrd; detects
that sufficient child nodes have been disconnected for a sufficient amount
of time to trigger execution of the <varname>child_nodes_disconnect_command</varname>.
</para>
@@ -645,7 +645,7 @@ $ repmgr cluster event --event=child_nodes_disconnect_command
<title>Standby disconnection on failover</title>
<para>
If <option>standby_disconnect_on_failover</option> is set to <literal>true</literal> in
<filename>repmgr.conf</filename>, in a failover situation <application>repmgrd</application> will forcibly disconnect
<filename>repmgr.conf</filename>, in a failover situation &repmgrd; will forcibly disconnect
the local node's WAL receiver before making a failover decision.
</para>
<note>
@@ -667,7 +667,7 @@ $ repmgr cluster event --event=child_nodes_disconnect_command
<para>
Note that when using <option>standby_disconnect_on_failover</option> there will be a delay of 5 seconds
plus however many seconds it takes to confirm the WAL receiver is disconnected before
<application>repmgrd</application> proceeds with the failover decision.
&repmgrd; proceeds with the failover decision.
</para>
<para>
Following the failover operation, no matter what the outcome, each node will reconnect its WAL receiver.
@@ -692,7 +692,7 @@ $ repmgr cluster event --event=child_nodes_disconnect_command
<title>Failover validation</title>
<para>
From <link linkend="release-4.3">repmgr 4.3</link>, &repmgr; makes it possible to provide a script
to <application>repmgrd</application> which, in a failover situation,
to &repmgrd; which, in a failover situation,
will be executed by the promotion candidate (the node which has been selected
to be the new primary) to confirm whether the node should actually be promoted.
</para>
@@ -712,7 +712,7 @@ $ repmgr cluster event --event=child_nodes_disconnect_command
There is a pause of <option>election_rerun_interval</option> seconds before the election is rerun.
</para>
<para>
Sample <application>repmgrd</application> log file output during which the failover validation
Sample &repmgrd; log file output during which the failover validation
script rejects the proposed promotion candidate:
<programlisting>
[2019-03-13 21:01:30] [INFO] visible nodes: 2; total nodes: 2; no nodes have seen the primary within the last 4 seconds
@@ -748,7 +748,7 @@ INFO: node 3 received notification to rerun promotion candidate election
<para>
Cascading replication - where a standby can connect to an upstream node and not
the primary server itself - was introduced in PostgreSQL 9.2. &repmgr; and
<application>repmgrd</application> support cascading replication by keeping track of the relationship
&repmgrd; support cascading replication by keeping track of the relationship
between standby servers - each node record is stored with the node id of its
upstream ("parent") server (except of course the primary server).
</para>

View File

@@ -24,10 +24,10 @@
<para>
In contrast to streaming replication, there's no concept of "promoting" a new
primary node with BDR. Instead, "failover" involves monitoring both nodes
with <application>repmgrd</application> and redirecting queries from the failed node to the remaining
with &repmgrd; and redirecting queries from the failed node to the remaining
active node. This can be done by using an
<link linkend="event-notifications">event notification</link> script
which is called by <application>repmgrd</application> to dynamically
which is called by &repmgrd; to dynamically
reconfigure a proxy server/connection pooler such as <application>PgBouncer</application>.
</para>
@@ -60,7 +60,7 @@
<para>
Application database connections *must* be passed through a proxy server/
connection pooler such as <application>PgBouncer</application>, and it must be possible to dynamically
reconfigure that from <application>repmgrd</application>. The example demonstrated in this document
reconfigure that from &repmgrd;. The example demonstrated in this document
will use <application>PgBouncer</application>.
</para>
<para>
@@ -296,7 +296,7 @@
</listitem>
<listitem>
<simpara>recreates the <application>PgBouncer</application> configuration file on each
node using the information provided by <application>repmgrd</application>
node using the information provided by &repmgrd;
(primarily the <varname>conninfo</varname> string) to configure
<application>PgBouncer</application></simpara>
</listitem>
@@ -318,21 +318,21 @@
<title>Node monitoring and failover</title>
<para>
At the intervals specified by <varname>monitor_interval_secs</varname>
in <filename>repmgr.conf</filename>, <application>repmgrd</application>
in <filename>repmgr.conf</filename>, &repmgrd;
will ping each node to check if it's available. If a node isn't available,
<application>repmgrd</application> will enter failover mode and check <varname>reconnect_attempts</varname>
&repmgrd; will enter failover mode and check <varname>reconnect_attempts</varname>
times at intervals of <varname>reconnect_interval</varname> to confirm the node is definitely unreachable.
This buffer period is necessary to avoid false positives caused by transient
network outages.
</para>
<para>
If the node is still unavailable, <application>repmgrd</application> will enter failover mode and execute
If the node is still unavailable, &repmgrd; will enter failover mode and execute
the script defined in <varname>event_notification_command</varname>; an entry will be logged
in the <literal>repmgr.events</literal> table and <application>repmgrd</application> will
in the <literal>repmgr.events</literal> table and &repmgrd; will
(unless otherwise configured) resume monitoring of the node in "degraded" mode until it reappears.
</para>
<para>
<application>repmgrd</application> logfile output during a failover event will look something like this
&repmgrd; logfile output during a failover event will look something like this
on one node (usually the node which has failed, here <literal>node2</literal>):
<programlisting>
...
@@ -388,8 +388,8 @@
</para>
<para>
This assumes only the PostgreSQL instance on <literal>node2</literal> has failed. In this case the
<application>repmgrd</application> instance running on <literal>node2</literal> has performed the failover. However if
the entire server becomes unavailable, <application>repmgrd</application> on <literal>node1</literal> will perform
&repmgrd; instance running on <literal>node2</literal> has performed the failover. However if
the entire server becomes unavailable, &repmgrd; on <literal>node1</literal> will perform
the failover.
</para>
</sect1>
@@ -404,7 +404,7 @@
</para>
<para>
If the failed node comes back up and connects correctly, output similar to this
will be visible in the <application>repmgrd</application> log:
will be visible in the &repmgrd; log:
<programlisting>
[2017-07-27 21:25:30] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
[2017-07-27 21:25:46] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
@@ -417,10 +417,10 @@
<sect1 id="bdr-complete-shutdown" xreflabel="Shutdown of both nodes">
<title>Shutdown of both nodes</title>
<para>
If both PostgreSQL instances are shut down, <application>repmgrd</application> will try and handle the
If both PostgreSQL instances are shut down, &repmgrd; will try and handle the
situation as gracefully as possible, though with no failover candidates available
there's not much it can do. Should this case ever occur, we recommend shutting
down <application>repmgrd</application> on both nodes and restarting it once the PostgreSQL instances
down &repmgrd; on both nodes and restarting it once the PostgreSQL instances
are running properly.
</para>
</sect1>

@@ -8,13 +8,13 @@
<title>repmgrd setup and configuration</title>
<para>
<application>repmgrd</application> is a daemon which runs on each PostgreSQL node,
&repmgrd; is a daemon which runs on each PostgreSQL node,
monitoring the local node, and (unless it's the primary node) the upstream server
(the primary server or with cascading replication, another standby) which it's
connected to.
</para>
<para>
<application>repmgrd</application> can be configured to provide failover
&repmgrd; can be configured to provide failover
capability in case the primary upstream node becomes unreachable, and/or
provide monitoring data to the &repmgr; metadatabase.
</para>
@@ -23,7 +23,7 @@
<title>repmgrd configuration</title>
<para>
To use <application>repmgrd</application>, its associated function library <emphasis>must</emphasis> be
To use &repmgrd;, its associated function library <emphasis>must</emphasis> be
included via <filename>postgresql.conf</filename> with:
<programlisting>
@@ -35,7 +35,7 @@
</para>
<para>
The following configuration options apply to <application>repmgrd</application> in all circumstances:
The following configuration options apply to &repmgrd; in all circumstances:
</para>
<variablelist>
@@ -62,7 +62,7 @@
<listitem>
<para>
The option <option>connection_check_type</option> is used to select the method
<application>repmgrd</application> uses to determine whether the upstream node is available.
&repmgrd; uses to determine whether the upstream node is available.
</para>
<para>
Possible values are:
@@ -132,7 +132,7 @@
<term><option>degraded_monitoring_timeout</option></term>
<listitem>
<para>
Interval (in seconds) after which <application>repmgrd</application> will terminate if
Interval (in seconds) after which &repmgrd; will terminate if
either of the servers (local node and/or upstream node) being monitored is no longer available
(<link linkend="repmgrd-degraded-monitoring">degraded monitoring mode</link>).
</para>
@@ -152,7 +152,7 @@
<title>Required configuration for automatic failover</title>
<para>
The following <application>repmgrd</application> options <emphasis>must</emphasis> be set in
The following &repmgrd; options <emphasis>must</emphasis> be set in
<filename>repmgr.conf</filename>:
<itemizedlist spacing="compact" mark="bullet">
@@ -192,7 +192,7 @@
</para>
<note>
<para>
If <option>failover</option> is set to <literal>manual</literal>, <application>repmgrd</application>
If <option>failover</option> is set to <literal>manual</literal>, &repmgrd;
will not take any action if a failover situation is detected, and the node may need to
be modified manually (e.g. by executing <command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command>).
</para>
@@ -209,7 +209,7 @@
<listitem>
<para>
The program or script defined in <option>promote_command</option> will be executed
in a failover situation when <application>repmgrd</application> determines that
in a failover situation when &repmgrd; determines that
the current node is to become the new primary node.
</para>
<para>
@@ -231,8 +231,8 @@
<para>
Note that the <literal>--log-to-file</literal> option will cause
output generated by the &repmgr; command, when executed by <application>repmgrd</application>,
to be logged to the same destination configured to receive log output for <application>repmgrd</application>.
output generated by the &repmgr; command, when executed by &repmgrd;,
to be logged to the same destination configured to receive log output for &repmgrd;.
</para>
<note>
<para>
@@ -252,7 +252,7 @@
<listitem>
<para>
The program or script defined in <option>follow_command</option> will be executed
in a failover situation when <application>repmgrd</application> determines that
in a failover situation when &repmgrd; determines that
the current node is to follow the new primary node.
</para>
<para>
@@ -263,7 +263,7 @@
The <option>follow_command</option> parameter
should provide the <literal>--upstream-node-id=%n</literal>
option to <command>repmgr standby follow</command>; the <literal>%n</literal> will be replaced by
<application>repmgrd</application> with the ID of the new primary node. If this is not provided,
&repmgrd; with the ID of the new primary node. If this is not provided,
<command>repmgr standby follow</command> will attempt to determine the new primary by itself, but if the
original primary comes back online after the new primary is promoted, there is a risk that
<command>repmgr standby follow</command> will result in the node continuing to follow
@@ -284,8 +284,8 @@
<para>
Note that the <literal>--log-to-file</literal> option will cause
output generated by the &repmgr; command, when executed by <application>repmgrd</application>,
to be logged to the same destination configured to receive log output for <application>repmgrd</application>.
output generated by the &repmgr; command, when executed by &repmgrd;,
to be logged to the same destination configured to receive log output for &repmgrd;.
</para>
<note>
<para>
@@ -337,7 +337,7 @@
<listitem>
<para>
User-defined script to execute for an external mechanism to validate the failover
decision made by <application>repmgrd</application>.
decision made by &repmgrd;.
</para>
<note>
<para>
@@ -408,7 +408,7 @@
for this option.
</para>
<para>
<application>repmgrd</application> will refuse to start if this option is set
&repmgrd; will refuse to start if this option is set
but either of these prerequisites is not met.
</para>
</note>
@@ -469,14 +469,14 @@
</indexterm>
<title>PostgreSQL service configuration</title>
<para>
If using automatic failover, currently <application>repmgrd</application> will need to execute
If using automatic failover, currently &repmgrd; will need to execute
<link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>
to restart PostgreSQL on standbys to have them follow a new primary.
</para>
<para>
To ensure this happens smoothly, it's essential to provide the system/service restart
command appropriate to your operating system via <varname>service_restart_command</varname>
in <filename>repmgr.conf</filename>. If you don't do this, <application>repmgrd</application>
in <filename>repmgr.conf</filename>. If you don't do this, &repmgrd;
will default to using <command>pg_ctl</command>, which can result in unexpected problems,
particularly on <application>systemd</application>-based systems.
</para>
@@ -549,21 +549,21 @@ repmgrd_service_stop_command='sudo systemctl repmgr11 stop'
</indexterm>
<title>Applying configuration changes to repmgrd</title>
<para>
To apply configuration file changes to a running <application>repmgrd</application>
daemon, execute the operating system's <application>repmgrd</application> service reload command
To apply configuration file changes to a running &repmgrd;
daemon, execute the operating system's &repmgrd; service reload command
(see <xref linkend="appendix-packages"> for examples),
or for instances which were manually started, execute <command>kill -HUP</command>, e.g.
<command>kill -HUP `cat /tmp/repmgrd.pid`</command>.
</para>
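<para>
For manually started instances, the reload step can be wrapped in a small
helper which checks that the PID file is readable before sending the signal.
This is a minimal sketch only; the function name
<literal>reload_from_pid_file</literal> is hypothetical, and the PID file path
is the one used in the example above.

```shell
#!/bin/sh
# Hypothetical helper: send SIGHUP to the process whose PID is stored in the
# given PID file, so that it re-reads its configuration file.
reload_from_pid_file() {
    pid_file=$1
    if [ ! -r "$pid_file" ]; then
        echo "PID file not readable: $pid_file"
        return 1
    fi
    kill -HUP "$(cat "$pid_file")"
}

# Path as used in the example above; adjust as appropriate.
# reload_from_pid_file /tmp/repmgrd.pid
```

The helper returns a non-zero status if the PID file is missing, or if
<command>kill</command> itself fails (for example because the process has
already exited).
</para>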
<tip>
<para>
Check the <application>repmgrd</application> log to see what changes were
Check the &repmgrd; log to see what changes were
applied, or if any issues were encountered when reloading the configuration.
</para>
</tip>
<para>
Note that only the following subset of configuration file parameters can be changed on a
running <application>repmgrd</application> daemon:
running &repmgrd; daemon:
</para>
<itemizedlist spacing="compact" mark="bullet">
@@ -770,7 +770,7 @@ repmgrd_service_stop_command='sudo systemctl repmgr11 stop'
<note>
<para>
After executing <command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>,
<application>repmgrd</application> <emphasis>must</emphasis> be restarted for the changes to take effect.
&repmgrd; <emphasis>must</emphasis> be restarted for the changes to take effect.
</para>
</note>
@@ -785,7 +785,7 @@ repmgrd_service_stop_command='sudo systemctl repmgr11 stop'
</indexterm>
<title>repmgrd daemon</title>
<para>
If installed from a package, <application>repmgrd</application> can be started
If installed from a package, &repmgrd; can be started
via the operating system's service command, e.g. in <application>systemd</application>
using <command>systemctl</command>.
</para>
@@ -796,7 +796,7 @@ repmgrd_service_stop_command='sudo systemctl repmgr11 stop'
<para>
The commands <link linkend="repmgr-daemon-start"><command>repmgr daemon start</command></link> and
<link linkend="repmgr-daemon-stop"><command>repmgr daemon stop</command></link> can be used
as convenience wrappers to start and stop <application>repmgrd</application>.
as convenience wrappers to start and stop &repmgrd;.
</para>
<important>
<para>
@@ -808,7 +808,7 @@ repmgrd_service_stop_command='sudo systemctl repmgr11 stop'
</para>
</important>
<para>
<application>repmgrd</application> can be started manually like this:
&repmgrd; can be started manually like this:
<programlisting>
repmgrd -f /etc/repmgr.conf --pid-file /tmp/repmgrd.pid</programlisting>
and stopped with <command>kill `cat /tmp/repmgrd.pid`</command>. Adjust paths as appropriate.
@@ -825,7 +825,7 @@ repmgrd_service_stop_command='sudo systemctl repmgr11 stop'
</indexterm>
<title>repmgrd's PID file</title>
<para>
<application>repmgrd</application> will generate a PID file by default.
&repmgrd; will generate a PID file by default.
</para>
<note>
<simpara>
@@ -845,12 +845,12 @@ repmgrd_service_stop_command='sudo systemctl repmgr11 stop'
<option>--pid-file</option> may be deprecated in future releases.
</para>
<para>
If a PID file location was specified by the package maintainer, <application>repmgrd</application>
If a PID file location was specified by the package maintainer, &repmgrd;
will use that. This only applies if &repmgr; was installed from a package and the package
maintainer has specified the PID file location.
</para>
<para>
If none of the above apply, <application>repmgrd</application> will create a PID file
If none of the above apply, &repmgrd; will create a PID file
in the operating system's temporary directory (as determined by the environment variable
<varname>TMPDIR</varname>, or if that is not set, will use <filename>/tmp</filename>).
</para>
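<para>
The fallback described above (use <varname>TMPDIR</varname> if it is set,
otherwise <filename>/tmp</filename>) corresponds to the following shell
expansion. This is a sketch of the lookup rule only, not of how &repmgrd;
implements it; the function name <literal>default_pid_dir</literal> is
hypothetical, and the name of the PID file within that directory is not shown
because it is not stated here.

```shell
#!/bin/sh
# Resolve the temporary directory as described above: use TMPDIR if it is
# set, otherwise fall back to /tmp.
# Note: "${TMPDIR-/tmp}" falls back only when TMPDIR is unset; an empty
# value is passed through unchanged.
default_pid_dir() {
    echo "${TMPDIR-/tmp}"
}

default_pid_dir
```

To find out which file &repmgrd; would actually use, rely on the
<option>--show-pid-file</option> option rather than on a reconstruction like this one.
</para>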
@@ -859,10 +859,10 @@ repmgrd_service_stop_command='sudo systemctl repmgr11 stop'
<option>--no-pid-file</option>.
</para>
<para>
To see which PID file <application>repmgrd</application> would use, execute <application>repmgrd</application>
with the option <option>--show-pid-file</option>. <application>repmgrd</application>
To see which PID file &repmgrd; would use, execute &repmgrd;
with the option <option>--show-pid-file</option>. &repmgrd;
will not start if this option is provided. Note that the value shown is the
file <application>repmgrd</application> would use next time it starts, and is
file &repmgrd; would use next time it starts, and is
not necessarily the PID file currently in use.
</para>
</sect2>
@@ -881,7 +881,7 @@ repmgrd_service_stop_command='sudo systemctl repmgr11 stop'
<para>
If &repmgr; was installed from Debian/Ubuntu packages, additional configuration
is required before <application>repmgrd</application> is started as a daemon.
is required before &repmgrd; is started as a daemon.
</para>
<para>
This is done via the file <filename>/etc/default/repmgrd</filename>, which by default
@@ -920,12 +920,12 @@ REPMGRD_OPTS="--daemonize=false"
</para>
</tip>
<para>
From <application>repmgrd</application> 4.1, ensure <varname>REPMGRD_OPTS</varname> includes
From &repmgrd; 4.1, ensure <varname>REPMGRD_OPTS</varname> includes
<option>--daemonize=false</option>, as daemonization is handled by the service command.
</para>
<para>
If using <application>systemd</application>, you may need to execute <command>systemctl daemon-reload</command>.
Also, if you attempted to start <application>repmgrd</application> using <command>systemctl start repmgrd</command>,
Also, if you attempted to start &repmgrd; using <command>systemctl start repmgrd</command>,
you'll need to execute <command>systemctl stop repmgrd</command>. Because that's how <application>systemd</application>
rolls.
</para>
@@ -972,7 +972,7 @@ REPMGRD_OPTS="--daemonize=false"
<title>repmgrd log rotation</title>
<para>
To ensure the current <application>repmgrd</application> logfile
To ensure the current &repmgrd; logfile
(specified in <filename>repmgr.conf</filename> with the parameter
<option>log_file</option>) does not grow indefinitely, configure your
system's <command>logrotate</command> to regularly rotate it.

@@ -21,42 +21,42 @@
<title>Pausing repmgrd</title>
<para>
In normal operation, <application>repmgrd</application> monitors the state of the
In normal operation, &repmgrd; monitors the state of the
PostgreSQL node it is running on, and will take appropriate action if problems
are detected, e.g. (if so configured) promote the node to primary, if the existing
primary has been determined as failed.
</para>
<para>
However, <application>repmgrd</application> is unable to distinguish between
However, &repmgrd; is unable to distinguish between
planned outages (such as performing a <link linkend="performing-switchover">switchover</link>
or installing PostgreSQL maintenance releases), and an actual server outage. In versions prior to
&repmgr; 4.2 it was necessary to stop <application>repmgrd</application> on all nodes (or at least
on all nodes where <application>repmgrd</application> is
&repmgr; 4.2 it was necessary to stop &repmgrd; on all nodes (or at least
on all nodes where &repmgrd; is
<link linkend="repmgrd-automatic-failover">configured for automatic failover</link>)
to prevent <application>repmgrd</application> from making unintentional changes to the
to prevent &repmgrd; from making unintentional changes to the
replication cluster.
</para>
<para>
From <link linkend="release-4.2">&repmgr; 4.2</link>, <application>repmgrd</application>
From <link linkend="release-4.2">&repmgr; 4.2</link>, &repmgrd;
can now be &quot;paused&quot;, i.e. instructed not to take any action such as performing a failover.
This can be done from any node in the cluster, removing the need to stop/restart
each <application>repmgrd</application> individually.
each &repmgrd; individually.
</para>
<note>
<para>
For major PostgreSQL upgrades, e.g. from PostgreSQL 10 to PostgreSQL 11,
<application>repmgrd</application> should be shut down completely and only started up
&repmgrd; should be shut down completely and only started up
once the &repmgr; packages for the new PostgreSQL major version have been installed.
</para>
</note>
<sect2 id="repmgrd-pausing-prerequisites">
<title>Prerequisites for pausing <application>repmgrd</application></title>
<title>Prerequisites for pausing &repmgrd;</title>
<para>
In order to be able to pause/unpause <application>repmgrd</application>, the following
In order to be able to pause/unpause &repmgrd;, the following
prerequisites must be met:
<itemizedlist spacing="compact" mark="bullet">
@@ -86,9 +86,9 @@
</sect2>
<sect2 id="repmgrd-pausing-execution">
<title>Pausing/unpausing <application>repmgrd</application></title>
<title>Pausing/unpausing &repmgrd;</title>
<para>
To pause <application>repmgrd</application>, execute <link linkend="repmgr-daemon-pause"><command>repmgr daemon pause</command></link>, e.g.:
To pause &repmgrd;, execute <link linkend="repmgr-daemon-pause"><command>repmgr daemon pause</command></link>, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf daemon pause
NOTICE: node 1 (node1) paused
@@ -96,7 +96,7 @@ NOTICE: node 2 (node2) paused
NOTICE: node 3 (node3) paused</programlisting>
</para>
<para>
The state of <application>repmgrd</application> on each node can be checked with
The state of &repmgrd; on each node can be checked with
<link linkend="repmgr-daemon-status"><command>repmgr daemon status</command></link>, e.g.:
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | repmgrd | PID | Paused?
@@ -109,12 +109,12 @@ NOTICE: node 3 (node3) paused</programlisting>
<note>
<para>
If executing a switchover with <link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>,
&repmgr; will automatically pause/unpause <application>repmgrd</application> as part of the switchover process.
&repmgr; will automatically pause/unpause &repmgrd; as part of the switchover process.
</para>
</note>
<para>
If the primary (in this example, <literal>node1</literal>) is stopped, <application>repmgrd</application>
If the primary (in this example, <literal>node1</literal>) is stopped, &repmgrd;
running on one of the standbys (here: <literal>node2</literal>) will react like this:
<programlisting>
[2018-09-20 12:22:21] [WARNING] unable to connect to upstream node "node1" (ID: 1)
@@ -130,14 +130,14 @@ NOTICE: node 3 (node3) paused</programlisting>
[2018-09-20 12:22:33] [HINT] execute "repmgr daemon unpause" to resume normal failover mode</programlisting>
</para>
<para>
If the primary becomes available again (e.g. following a software upgrade), <application>repmgrd</application>
If the primary becomes available again (e.g. following a software upgrade), &repmgrd;
will automatically reconnect, e.g.:
<programlisting>
[2018-09-20 13:12:41] [NOTICE] reconnected to upstream node 1 after 8 seconds, resuming monitoring</programlisting>
</para>
<para>
To unpause <application>repmgrd</application>, execute <link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link>, e.g.:
To unpause &repmgrd;, execute <link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link>, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf daemon unpause
NOTICE: node 1 (node1) unpaused
@@ -147,7 +147,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<note>
<para>
If the previous primary is no longer accessible when <application>repmgrd</application>
If the previous primary is no longer accessible when &repmgrd;
is unpaused, no failover action will be taken. Instead, a new primary must be manually promoted using
<link linkend="repmgr-standby-promote"><command>repmgr standby promote</command></link>,
and any standbys attached to the new primary with
@@ -156,13 +156,13 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<para>
This is to prevent <link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link>
resulting in the automatic promotion of a new primary, which may be a problem particularly
in larger clusters, where <application>repmgrd</application> could select a different promotion
in larger clusters, where &repmgrd; could select a different promotion
candidate to the one intended by the administrator.
</para>
</note>
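<para>
For planned maintenance, the pause and unpause commands shown above can be
combined into a wrapper which ensures that &repmgrd; is unpaused again even if
the maintenance step fails. The following is a minimal sketch under stated
assumptions: the function name <literal>with_repmgrd_paused</literal>, the
<varname>REPMGR_BIN</varname> and <varname>REPMGR_CONF</varname> variables and
the maintenance script name are all hypothetical.

```shell
#!/bin/sh
# Assumed locations; override via the environment as appropriate.
REPMGR_BIN=${REPMGR_BIN:-repmgr}
REPMGR_CONF=${REPMGR_CONF:-/etc/repmgr.conf}

# Hypothetical wrapper: pause repmgrd, run the given maintenance command,
# then unpause repmgrd whether or not the command succeeded.
# Returns the exit status of the maintenance command, or 1 if pausing or
# unpausing failed.
with_repmgrd_paused() {
    "$REPMGR_BIN" -f "$REPMGR_CONF" daemon pause || return 1
    "$@"
    status=$?
    "$REPMGR_BIN" -f "$REPMGR_CONF" daemon unpause || return 1
    return "$status"
}

# Example (the maintenance script is a placeholder):
# with_repmgrd_paused ./apply-maintenance.sh
```

As the note above explains, unpausing does not trigger a failover, so the
state of the primary should still be verified once the wrapper has finished.
</para>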
</sect2>
<sect2 id="repmgrd-pausing-details">
<title>Details on the <application>repmgrd</application> pausing mechanism</title>
<title>Details on the &repmgrd; pausing mechanism</title>
<para>
The pause state of each node will be stored over a PostgreSQL restart.
@@ -171,14 +171,14 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<para>
<link linkend="repmgr-daemon-pause"><command>repmgr daemon pause</command></link> and
<link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link> can be
executed even if <application>repmgrd</application> is not running; in this case,
<application>repmgrd</application> will start up in whichever pause state has been set.
executed even if &repmgrd; is not running; in this case,
&repmgrd; will start up in whichever pause state has been set.
</para>
<note>
<para>
<link linkend="repmgr-daemon-pause"><command>repmgr daemon pause</command></link> and
<link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link>
<emphasis>do not</emphasis> stop/start <application>repmgrd</application>.
<emphasis>do not</emphasis> stop/start &repmgrd;.
</para>
</note>
</sect2>
@@ -194,7 +194,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<para>
If WAL replay has been paused (using <command>pg_wal_replay_pause()</command>,
on PostgreSQL 9.6 and earlier <command>pg_xlog_replay_pause()</command>),
in a failover situation <application>repmgrd</application> will
in a failover situation &repmgrd; will
automatically resume WAL replay.
</para>
<para>
@@ -225,9 +225,9 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<title>"degraded monitoring" mode</title>
<para>
In certain circumstances, <application>repmgrd</application> is not able to fulfill its primary mission
In certain circumstances, &repmgrd; is not able to fulfill its primary mission
of monitoring the node's upstream server. In these cases it enters &quot;degraded monitoring&quot;
mode, where <application>repmgrd</application> remains active but is waiting for the situation
mode, where &repmgrd; remains active but is waiting for the situation
to be resolved.
</para>
<para>
@@ -287,12 +287,12 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<para>
By default, <literal>repmgrd</literal> will continue in degraded monitoring mode indefinitely.
However a timeout (in seconds) can be set with <varname>degraded_monitoring_timeout</varname>,
after which <application>repmgrd</application> will terminate.
after which &repmgrd; will terminate.
</para>
<note>
<para>
If <application>repmgrd</application> is monitoring a primary node which has been stopped
If &repmgrd; is monitoring a primary node which has been stopped
and manually restarted as a standby attached to a new primary, it will automatically detect
the status change and update the node record to reflect the node's new status
as an active standby. It will then resume monitoring the node as a standby.
@@ -313,7 +313,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<title>Storing monitoring data</title>
<para>
When <application>repmgrd</application> is running with the option <literal>monitoring_history=true</literal>,
When &repmgrd; is running with the option <literal>monitoring_history=true</literal>,
it will constantly write standby node status information to the
<varname>monitoring_history</varname> table, providing a near-real time
overview of replication status on all nodes
@@ -351,7 +351,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
specify how many days' worth of data should be retained.
</para>
<para>
It's possible to use <application>repmgrd</application> to run in monitoring
It's possible to use &repmgrd; to run in monitoring
mode only (without automatic failover capability) for some or all
nodes by setting <literal>failover=manual</literal> in the node's
<filename>repmgr.conf</filename> file. In the event of the node's upstream failing,

@@ -7,18 +7,18 @@
<title>repmgrd overview</title>
<para>
<application>repmgrd</application> (&quot;<literal>replication manager daemon</literal>&quot;)
&repmgrd; (&quot;<literal>replication manager daemon</literal>&quot;)
is a management and monitoring daemon which runs
on each node in a replication cluster. It can automate actions such as
failover and updating standbys to follow the new primary, as well as
providing monitoring information about the state of each standby.
</para>
<para>
<application>repmgrd</application> is designed to be straightforward to set up
&repmgrd; is designed to be straightforward to set up
and does not require additional external infrastructure.
</para>
<para>
Functionality provided by <application>repmgrd</application> includes:
Functionality provided by &repmgrd; includes:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
@@ -93,11 +93,11 @@
<tip>
<para>
See section <link linkend="repmgrd-automatic-failover-configuration">Required configuration for automatic failover</link>
for an example of minimal <filename>repmgr.conf</filename> file settings suitable for use with <application>repmgrd</application>.
for an example of minimal <filename>repmgr.conf</filename> file settings suitable for use with &repmgrd;.
</para>
</tip>
<para>
Start <application>repmgrd</application> on each standby and verify that it's running by examining the
Start &repmgrd; on each standby and verify that it's running by examining the
log output, which at log level <literal>INFO</literal> will look like this:
<programlisting>
[2019-03-15 06:32:05] [NOTICE] repmgrd (repmgrd 4.3) starting up
@@ -107,7 +107,7 @@
[2019-03-15 06:32:05] [INFO] monitoring connection to upstream node "node1" (ID: 1)</programlisting>
</para>
<para>
Each <application>repmgrd</application> should also have recorded its successful startup as an event:
Each &repmgrd; should also have recorded its successful startup as an event:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster event --event=repmgrd_start
Node ID | Name | Event | OK | Timestamp | Details
@@ -123,8 +123,8 @@
</para>
<para>
This will force the primary to shut down straight away, aborting all processes
and transactions. This will cause a flurry of activity in the <application>repmgrd</application> log
files as each <application>repmgrd</application> detects the failure of the primary and a failover
and transactions. This will cause a flurry of activity in the &repmgrd; log
files as each &repmgrd; detects the failure of the primary and a failover
decision is made. This is an extract from the log of a standby server (<literal>node2</literal>)
which has been promoted to the new primary after failure of the original primary (<literal>node1</literal>).
<programlisting>

@@ -159,12 +159,12 @@
<note>
<para>
From <link linkend="release-4.2">repmgr 4.2</link>, &repmgr; will instruct any running
<application>repmgrd</application> instances to pause operations while the switchover
is being carried out, to prevent <application>repmgrd</application> from
&repmgrd; instances to pause operations while the switchover
is being carried out, to prevent &repmgrd; from
unintentionally promoting a node. For more details, see <xref linkend="repmgrd-pausing">.
</para>
<para>
Users of &repmgr; versions prior to 4.2 should ensure that <application>repmgrd</application>
Users of &repmgr; versions prior to 4.2 should ensure that &repmgrd;
is not running on any nodes while a switchover is being executed.
</para>
</note>
@@ -313,13 +313,13 @@
</programlisting>
</para>
<para>
If <application>repmgrd</application> is in use, it's worth double-checking that
If &repmgrd; is in use, it's worth double-checking that
all nodes are unpaused by executing <command><link linkend="repmgr-daemon-status">repmgr daemon status</link></command>.
</para>
<note>
<para>
Users of &repmgr; versions prior to 4.2 will need to manually restart <application>repmgrd</application>
Users of &repmgr; versions prior to 4.2 will need to manually restart &repmgrd;
on all nodes after the switchover is completed.
</para>
</note>

@@ -24,7 +24,7 @@
<listitem>
<simpara>
the <application>repmgr</application> and <application>repmgrd</application> executables
the <application>repmgr</application> and &repmgrd; executables
</simpara>
</listitem>
@@ -37,7 +37,7 @@
<listitem>
<simpara>
the shared library module used by <application>repmgrd</application> which
the shared library module used by &repmgrd; which
is resident in the PostgreSQL backend
</simpara>
</listitem>
@@ -45,8 +45,8 @@
</para>
<para>
With <emphasis>minor releases</emphasis>, usually changes are only made to the <application>repmgr</application>
and <application>repmgrd</application> executables. In this case, the upgrade is quite straightforward,
and is simply a case of installing the new version, and restarting <application>repmgrd</application>
and &repmgrd; executables. In this case, the upgrade is quite straightforward,
and is simply a case of installing the new version, and restarting &repmgrd;
(if running).
</para>
@@ -82,7 +82,7 @@
<listitem>
<simpara>
restart <application>repmgrd</application> on all nodes where it is running
restart &repmgrd; on all nodes where it is running
</simpara>
</listitem>
@@ -93,7 +93,7 @@
<note>
<para>
Some packaging systems (e.g. <link linkend="packages-debian-ubuntu">Debian/Ubuntu</link>)
may restart <application>repmgrd</application> as part of the package upgrade process.
may restart &repmgrd; as part of the package upgrade process.
</para>
</note>
@@ -126,7 +126,7 @@
<para>
&quot;major version&quot; upgrades need to be planned more carefully, as they may include
changes to the &repmgr; metadata (which need to be propagated from the primary to all
standbys) and/or changes to the shared object file used by <application>repmgrd</application>
standbys) and/or changes to the shared object file used by &repmgrd;
(which require a PostgreSQL restart).
</para>
<para>
@@ -138,14 +138,14 @@
<listitem>
<simpara>
Stop <application>repmgrd</application> (if in use) on all nodes where it is running.
Stop &repmgrd; (if in use) on all nodes where it is running.
</simpara>
</listitem>
<listitem>
<simpara>
Disable the <application>repmgrd</application> service on all nodes where it is in use;
this is to prevent packages from prematurely restarting <application>repmgrd</application>.
Disable the &repmgrd; service on all nodes where it is in use;
this is to prevent packages from prematurely restarting &repmgrd;.
</simpara>
</listitem>
@@ -167,12 +167,12 @@ systemctl daemon-reload</programlisting>
<listitem>
<simpara>
If the &repmgr; shared library module has been updated (check the <link linkend="appendix-release-notes">release notes</link>!),
restart PostgreSQL, then <application>repmgrd</application> (if in use) on each node.
restart PostgreSQL, then &repmgrd; (if in use) on each node.
The order in which this is applied to individual nodes is not critical,
and it's also fine to restart PostgreSQL on all nodes first before starting <application>repmgrd</application>.
and it's also fine to restart PostgreSQL on all nodes first before starting &repmgrd;.
</simpara>
<simpara>
Note that if the upgrade requires a PostgreSQL restart, <application>repmgrd</application>
Note that if the upgrade requires a PostgreSQL restart, &repmgrd;
will only function correctly once all nodes have been restarted.
</simpara>
</listitem>
@@ -188,7 +188,7 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<listitem>
<simpara>
Reenable the <application>repmgrd</application> service on all nodes where it is in use, and
Reenable the &repmgrd; service on all nodes where it is in use, and
ensure it is running.
</simpara>
</listitem>
@@ -212,7 +212,7 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<title>Checking repmgrd status after an upgrade</title>
<para>
From <link linkend="release-4.2">repmgr 4.2</link>, once the upgrade is complete, execute the <command><link linkend="repmgr-daemon-status">repmgr daemon status</link></command>
command (on any node) to show an overview of the status of <application>repmgrd</application> on all nodes.
command (on any node) to show an overview of the status of &repmgrd; on all nodes.
</para>
</sect2>
</sect1>
@@ -332,7 +332,7 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
</listitem>
<listitem>
<simpara><varname>monitoring_history</varname>: this replaces the
<application>repmgrd</application> command line option
&repmgrd; command line option
<literal>--monitoring-history</literal></simpara>
</listitem>
</itemizedlist>
@@ -433,7 +433,7 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<sect2>
<title>Upgrading the repmgr schema</title>
<para>
Ensure that neither <application>repmgrd</application> nor any cron jobs which execute the
Ensure that neither &repmgrd; nor any cron jobs which execute the
<command>repmgr</command> binary are running.
</para>
<para>
@@ -499,7 +499,7 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
</para>
<para>
Check the data is updated as expected by examining the <structname>repmgr.nodes</structname>
table; restart <application>repmgrd</application> if required.
table; restart &repmgrd; if required.
</para>
<para>
The original <literal>repmgr_$cluster</literal> schema can be dropped at any time.