FAQ (Frequently Asked Questions)
General
What's the difference between the repmgr versions?
&repmgr; 4 is a complete rewrite of the previous &repmgr; code base
and implements &repmgr; as a PostgreSQL extension. It
supports all PostgreSQL versions from 9.3 (although some &repmgr;
features are not available for PostgreSQL 9.3 and 9.4).
&repmgr; 5 is fundamentally the same code base as &repmgr; 4, but provides
support for the revised replication configuration mechanism in PostgreSQL 12.
&repmgr; 3.x builds on the improved replication facilities added
in PostgreSQL 9.3, as well as improved automated failover support
via &repmgrd;, and is not compatible with PostgreSQL 9.2
and earlier. We recommend upgrading to &repmgr; 4, as the &repmgr; 3.x
series is no longer maintained.
&repmgr; 2.x supports PostgreSQL 9.0 to 9.3. While it is compatible
with PostgreSQL 9.3, we recommend using &repmgr; 4.x. &repmgr; 2.x is
no longer maintained.
See also &repmgr; compatibility matrix
and Should I upgrade &repmgr;?.
What's the advantage of using replication slots?
Replication slots, introduced in PostgreSQL 9.4, ensure that the
primary server will retain WAL files until they have been consumed
by all standby servers. This means standby servers should never
fail due to not being able to retrieve required WAL files from the
primary.
However, this does mean that if a standby is no longer connected to the
primary, the presence of the replication slot will cause WAL files
to be retained indefinitely, and eventually lead to disk space
exhaustion.
2ndQuadrant's recommended configuration is to configure
Barman as a fallback
source of WAL files, rather than maintain replication slots for
each standby. See also: Using Barman as a WAL file source.
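One way to watch for the situation described above is to list replication slots which currently have no connected consumer. The following is a minimal sketch, assuming psql can connect to the primary as a suitably privileged user; the PSQL variable and the function name are illustrative and not part of &repmgr;.

```shell
# PSQL can be overridden, e.g. with the full path to the psql binary.
PSQL="${PSQL:-psql}"

# Print the name of every replication slot with no active consumer.
list_inactive_slots() {
    "$PSQL" -At -c "SELECT slot_name FROM pg_replication_slots WHERE NOT active"
}

# Usage (on the primary):
#   list_inactive_slots
# A slot which is definitely no longer needed can then be dropped with:
#   SELECT pg_drop_replication_slot('name_of_slot');
```

An inactive slot is not necessarily a problem (the standby may simply be restarting), so the output is best treated as a prompt for investigation rather than a list of slots to drop.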
How many replication slots should I define in max_replication_slots?
Normally at least the same number as the number of standbys which will connect
to the node. Note that changes to max_replication_slots require a server
restart to take effect, and as there is no particular penalty for unused
replication slots, setting a higher figure will make adding new nodes
easier.
Does &repmgr; support hash indexes?
Before PostgreSQL 10, hash indexes were not WAL-logged, and are therefore not suitable
for use in streaming replication in PostgreSQL 9.6 and earlier. See the
PostgreSQL documentation
for details.
From PostgreSQL 10, this restriction has been lifted and hash indexes can be used
in a streaming replication cluster.
Can &repmgr; assist with upgrading a PostgreSQL cluster?
For minor version upgrades, e.g. from 9.6.7 to 9.6.8, a common
approach is to upgrade a standby to the latest version, perform a
switchover promoting it to a primary,
then upgrade the former primary.
For major version upgrades (e.g. from PostgreSQL 9.6 to PostgreSQL 10),
the traditional approach is to "reseed" a cluster by upgrading a single
node with pg_upgrade
and recloning standbys from this.
To minimize downtime during major upgrades from PostgreSQL 9.4 and later,
pglogical
can be used to set up a parallel cluster using the newer PostgreSQL version,
which can be kept in sync with the existing production cluster until the
new cluster is ready to be put into production.
What does this error mean: ERROR: could not access file "$libdir/repmgr"?
It means the &repmgr; extension code is not installed in the
PostgreSQL application directory. This typically happens when using PostgreSQL
packages provided by a third-party vendor, which often have different
filesystem layouts.
Use PostgreSQL packages provided by the community or 2ndQuadrant; if this
is not possible, contact your vendor for assistance.
How can I obtain old versions of &repmgr; packages?
See appendix for details.
Is &repmgr; required for streaming replication?
No.
&repmgr; (together with &repmgrd;) assists with
managing replication. It does not actually perform replication, which
is part of the core PostgreSQL functionality.
Will replication stop working if &repmgr; is uninstalled?
No. See preceding question.
Does it matter if different &repmgr; versions are present in the replication cluster?
Yes. If different "major" &repmgr; versions (e.g. 3.3.x and 4.1.x) are present,
&repmgr; (in particular &repmgrd;)
may not run, or may not run properly, or in the worst case (if different &repmgrd;
versions are running and there are differences in the failover implementation) may break
your replication cluster.
If different "minor" &repmgr; versions (e.g. 4.1.1 and 4.1.6) are installed,
&repmgr; will function, but we strongly recommend always running the same version
to avoid surprises, e.g. a newer version behaving slightly
differently to the older version.
See also Should I upgrade &repmgr;?.
Should I upgrade &repmgr;?
Yes.
We don't release new versions for fun, you know. Upgrading may require a little effort,
but running an older &repmgr; version with bugs which have since been fixed may end up
costing you more effort. The same applies to PostgreSQL itself.
Why do I need to specify the data directory location in repmgr.conf?
In some circumstances &repmgr; may need to access a PostgreSQL data
directory while the PostgreSQL server is not running, e.g. to confirm
it shut down cleanly during a switchover.
Additionally, this provides support when using &repmgr; on PostgreSQL 9.6 and
earlier, where the repmgr user is not a superuser; in that
case the repmgr user will not be able to access the
data_directory configuration setting, access to which is restricted
to superusers.
In PostgreSQL 10 and later, non-superusers can be added to the
default role
(or the meta-role )
which will enable them to read this setting.
Are &repmgr; packages compatible with $third_party_vendor's packages?
&repmgr; packages provided by 2ndQuadrant are compatible with the community-provided PostgreSQL
packages and any software provided by 2ndQuadrant.
A number of other vendors provide their own versions of PostgreSQL packages, often with different
package naming schemes and/or file locations.
We cannot guarantee that &repmgr; packages will be compatible with these packages.
It may be possible to override package dependencies (e.g. rpm --nodeps
for CentOS-based systems or dpkg --force-depends for Debian-based systems).
repmgr
Can I register an existing PostgreSQL server with repmgr?
Yes, any existing PostgreSQL server which is part of the same replication
cluster can be registered with &repmgr;. There's no requirement for a
standby to have been cloned using &repmgr;.
Can I use a standby not cloned by &repmgr; as a &repmgr; node?
For a standby which has been manually cloned or recovered from an external
backup manager such as Barman, the command
repmgr standby clone --recovery-conf-only
can be used to create the correct recovery.conf file for
use with &repmgr; (and will create a replication slot if required). Once this has been done,
register the node as usual.
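The two steps above can be sketched as a pair of shell functions. The host name, database and user names, and the repmgr.conf path are assumptions for illustration; substitute the values for your own upstream node.

```shell
# REPMGR can be overridden with the full path to the repmgr binary.
REPMGR="${REPMGR:-repmgr}"
# Hypothetical values; adjust for your environment.
UPSTREAM_HOST="node1"
REPMGR_CONF="/etc/repmgr.conf"

# Step 1: generate recovery.conf for the already-restored data directory.
create_recovery_conf() {
    "$REPMGR" -h "$UPSTREAM_HOST" -U repmgr -d repmgr -f "$REPMGR_CONF" \
        standby clone --recovery-conf-only
}

# Step 2: once PostgreSQL is running as a standby, register the node.
register_standby() {
    "$REPMGR" -f "$REPMGR_CONF" standby register
}
```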
What does &repmgr; write in recovery.conf, and what options can be set there?
See section Customising recovery.conf.
How can a failed primary be re-added as a standby?
This is a two-stage process. First, the failed primary's data directory
must be re-synced with the current primary; secondly the failed primary
needs to be re-registered as a standby.
It's possible to use pg_rewind to re-synchronise the existing data
directory, which will usually be much
faster than re-cloning the server. However, pg_rewind can only
be used if PostgreSQL either has wal_log_hints enabled, or
data checksums were enabled when the cluster was initialized.
Note that pg_rewind is available as part of the core PostgreSQL
distribution from PostgreSQL 9.5, and as a third-party utility for PostgreSQL 9.3 and 9.4.
&repmgr; provides the command repmgr node rejoin which can
optionally execute pg_rewind; see the
documentation for details, in particular the section .
If pg_rewind cannot be used, then the data directory will need
to be re-cloned from scratch.
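As a rough sketch of the rejoin procedure, the functions below first check whether the stopped former primary can be reattached, then perform the rejoin with pg_rewind permitted. The conninfo string and the repmgr.conf path are hypothetical; the conninfo should point at a node in the current cluster, such as the new primary.

```shell
# REPMGR can be overridden with the full path to the repmgr binary.
REPMGR="${REPMGR:-repmgr}"
REPMGR_CONF="/etc/repmgr.conf"
# Hypothetical conninfo for a node in the current replication cluster.
CLUSTER_CONNINFO="host=node2 dbname=repmgr user=repmgr"

# Check whether the node can rejoin; nothing is changed.
rejoin_dry_run() {
    "$REPMGR" -f "$REPMGR_CONF" node rejoin -d "$CLUSTER_CONNINFO" \
        --force-rewind --dry-run
}

# Perform the rejoin, allowing repmgr to run pg_rewind if required.
rejoin() {
    "$REPMGR" -f "$REPMGR_CONF" node rejoin -d "$CLUSTER_CONNINFO" \
        --force-rewind
}
```

Both functions are intended to be run on the failed primary, with its PostgreSQL instance shut down.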
Is there an easy way to check my primary server is correctly configured for use with &repmgr;?
Execute repmgr standby clone
with the --dry-run option; this will report any configuration problems
which need to be rectified.
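For example, the check could be wrapped as follows and run on the prospective standby. The primary's host name and the repmgr.conf path are assumptions; the repmgr user and database names follow the defaults used elsewhere in this document.

```shell
# REPMGR can be overridden with the full path to the repmgr binary.
REPMGR="${REPMGR:-repmgr}"
# Hypothetical values; adjust for your environment.
PRIMARY_HOST="node1"
REPMGR_CONF="/etc/repmgr.conf"

# Report configuration problems without copying any data.
check_clone_prerequisites() {
    "$REPMGR" -h "$PRIMARY_HOST" -U repmgr -d repmgr -f "$REPMGR_CONF" \
        standby clone --dry-run
}
```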
When cloning a standby, how can I get &repmgr; to copy
postgresql.conf and pg_hba.conf from the PostgreSQL configuration
directory in /etc?
Use the command line option --copy-external-config-files. For more details
see .
Do I need to include shared_preload_libraries = 'repmgr'
in postgresql.conf if I'm not using &repmgrd;?
No, the repmgr shared library is only needed when running &repmgrd;.
If you later decide to run &repmgrd;, you just need to add
shared_preload_libraries = 'repmgr' and restart PostgreSQL.
I've provided replication permission for the repmgr user in pg_hba.conf
but repmgr/&repmgrd; complains it can't connect to the server... Why?
repmgr and &repmgrd; need to be able to connect to the repmgr database
with a normal connection to query metadata. The replication connection
permission is for PostgreSQL's streaming replication (and doesn't necessarily need to be the repmgr user).
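To illustrate the distinction, the sketch below writes a sample pg_hba.conf fragment to a scratch file for review, containing one entry for normal connections and one for replication connections. The network range and the md5 authentication method are assumptions; use whatever is appropriate for your site.

```shell
# Write the sample to a scratch file rather than editing the live pg_hba.conf.
fragment="$(mktemp)"
cat > "$fragment" <<'EOF'
# TYPE  DATABASE     USER    ADDRESS         METHOD
# Normal connections to the repmgr metadata database
host    repmgr       repmgr  192.168.1.0/24  md5
# Streaming replication connections
host    replication  repmgr  192.168.1.0/24  md5
EOF
cat "$fragment"
```

Without the first entry, streaming replication itself can work while repmgr and &repmgrd; are still refused when they try to query the metadata.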
When cloning a standby, why do I need to provide the connection parameters
for the primary server on the command line, not in the configuration file?
Cloning a standby is a one-time action; the role of the server being cloned
from could change, so fixing it in the configuration file would create
confusion. If &repmgr; needs to establish a connection to the primary
server, it can retrieve this from the repmgr.nodes table on the local
node, and if necessary scan the replication cluster until it locates the active primary.
When cloning a standby, how do I ensure the WAL files are placed in a custom directory?
Provide the option --waldir (--xlogdir in PostgreSQL 9.6
and earlier) with the absolute path to the WAL directory in pg_basebackup_options.
For more details see .
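As an illustration, the following writes a sample repmgr.conf line to a scratch file; the WAL directory path is hypothetical.

```shell
# Write the sample to a scratch file rather than editing the live repmgr.conf.
fragment="$(mktemp)"
cat > "$fragment" <<'EOF'
# PostgreSQL 10 and later; use --xlogdir for PostgreSQL 9.6 and earlier.
pg_basebackup_options='--waldir=/var/lib/pgsql/wal'
EOF
cat "$fragment"
```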
Why is there no foreign key on the node_id column in the repmgr.events
table?
Under some circumstances event notifications can be generated for servers
which have not yet been registered; it's also useful to retain a record
of events which includes servers removed from the replication cluster
which no longer have an entry in the repmgr.nodes table.
Why are some values in recovery.conf surrounded by pairs of single quotes?
This is to ensure that user-supplied values which are written as parameter values in recovery.conf
are escaped correctly and do not cause errors when recovery.conf is parsed.
The escaping is performed by an internal PostgreSQL routine, which leaves strings consisting
of digits and alphabetical characters only as-is, but wraps everything else in pairs of single quotes,
even if the string does not contain any characters which need escaping.
&repmgrd;
How can I prevent a node from ever being promoted to primary?
In repmgr.conf, set its priority to a value of 0; apply the changed setting with
repmgr standby register --force.
Additionally, if failover is set to manual, the node will never
be considered as a promotion candidate.
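A minimal sketch of applying the change, assuming repmgr.conf lives at the hypothetical path shown and has already been edited to contain priority=0:

```shell
# REPMGR can be overridden with the full path to the repmgr binary.
REPMGR="${REPMGR:-repmgr}"
# Hypothetical path; this file should already contain the line:
#   priority=0
REPMGR_CONF="/etc/repmgr.conf"

# Update the node's existing metadata record with the new priority.
apply_priority_change() {
    "$REPMGR" -f "$REPMGR_CONF" standby register --force
}
```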
Does &repmgrd; support delayed standbys?
&repmgrd; can monitor delayed standbys - those set up with
recovery_min_apply_delay set to a non-zero value
in recovery.conf - but as it's not currently possible
to directly examine the value applied to the standby, &repmgrd;
may not be able to properly evaluate the node as a promotion candidate.
We recommend that delayed standbys are explicitly excluded from promotion
by setting priority to 0 in
repmgr.conf.
Note that after registering a delayed standby, &repmgrd; will only start
once the metadata added in the primary node has been replicated.
How can I get &repmgrd; to rotate its logfile?
Configure your system's logrotate service to do this; see .
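A logrotate configuration along the following lines is one possibility. The log file path, rotation schedule, and file ownership are assumptions, and the postrotate step sends SIGHUP to &repmgrd; on the assumption that it reopens its log file on that signal. The sketch writes the sample to a scratch file for review; on most systems the real file would live under /etc/logrotate.d/.

```shell
# Write the sample to a scratch file rather than installing it directly.
fragment="$(mktemp)"
cat > "$fragment" <<'EOF'
/var/log/repmgr/repmgrd.log {
    missingok
    compress
    rotate 52
    maxsize 100M
    weekly
    create 0600 postgres postgres
    postrotate
        /usr/bin/killall -HUP repmgrd
    endscript
}
EOF
cat "$fragment"
```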
I've recloned a failed primary as a standby, but &repmgrd; refuses to start?
Check you registered the standby after recloning. If unregistered, the standby
cannot be considered as a promotion candidate even if failover is set to
automatic, which is probably not what you want. &repmgrd; will start if
failover is set to manual so the node's replication status can still
be monitored, if desired.
&repmgrd; ignores pg_bindir when executing promote_command or follow_command
promote_command or follow_command can be user-defined scripts,
so &repmgr; will not apply pg_bindir even if executing &repmgr;. Always provide the full
path; see for more details.
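For instance, repmgr.conf entries with full paths might look like the sample below, written here to a scratch file for review. The binary location /usr/pgsql-12/bin is hypothetical and depends on how PostgreSQL and &repmgr; were packaged.

```shell
# Write the sample to a scratch file rather than editing the live repmgr.conf.
fragment="$(mktemp)"
cat > "$fragment" <<'EOF'
promote_command='/usr/pgsql-12/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='/usr/pgsql-12/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'
EOF
cat "$fragment"
```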
&repmgrd; aborts startup with the error "upstream node must be running before repmgrd can start"
&repmgrd; does this to avoid starting up on a replication cluster
which is not in a healthy state. If the upstream is unavailable, &repmgrd;
may initiate a failover immediately after starting up, which could have unintended side-effects,
particularly if &repmgrd; is not running on other nodes.
In particular, it's possible that the node's local copy of the repmgr.nodes table
is out-of-date, which may lead to incorrect failover behaviour.
The onus is therefore on the administrator to manually set the cluster to a stable, healthy state before
starting &repmgrd;.
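Before starting &repmgrd;, the cluster state can be inspected with &repmgr; itself. The sketch below wraps two commands which may help with this, assuming a &repmgr; 4 or later installation and the hypothetical repmgr.conf path shown.

```shell
# REPMGR can be overridden with the full path to the repmgr binary.
REPMGR="${REPMGR:-repmgr}"
# Hypothetical path; adjust for your environment.
REPMGR_CONF="/etc/repmgr.conf"

# Show each registered node, its role, and its upstream as seen from this node.
show_cluster() {
    "$REPMGR" -f "$REPMGR_CONF" cluster show
}

# Run repmgr's health checks against the local node.
check_local_node() {
    "$REPMGR" -f "$REPMGR_CONF" node check
}
```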