Compare commits

16 Commits

Author SHA1 Message Date
Ian Barwick
0cafeb3828 repmgrd: fix upstream role check
Only take action if it's confirmed as a standby.
2018-10-23 12:50:04 +09:00
Ian Barwick
79e79bd5f2 "standby switchover": close all connections used to check repmgrd status
The connections used to check repmgrd status on all nodes were not being
closed if repmgrd was not running. Normally this wouldn't be a huge
problem as they will go away when repmgr terminates or the PostgreSQL
server is restarted. However, if shutdown mode is "smart", the open
connection on the demotion candidate will cause the shutdown operation
to fail until repmgr times out.
2018-10-23 10:59:24 +09:00
Ian Barwick
a4e21fd8fe Update doc version 2018-10-23 09:28:46 +09:00
Ian Barwick
e826f72312 Bump version number
4.2
2018-10-23 09:24:17 +09:00
Ian Barwick
1e8b3313ee doc: fix typos 2018-10-23 09:22:04 +09:00
Ian Barwick
b5772d88dd doc: fix typo
Per user report on mailing list.
2018-10-23 09:00:09 +09:00
Ian Barwick
22614573b9 Fix Makefile for VPATH builds under PostgreSQL 11 2018-10-22 20:05:09 +09:00
Ian Barwick
77c9092794 repmgrd: improve node role change detection 2018-10-19 11:33:08 +09:00
Ian Barwick
15bbe04a6f Speed up witness "failover" during a switchover 2018-10-18 18:35:23 +09:00
Ian Barwick
0842560a88 repmgrd: handle case where upstream is no longer primary
If the upstream comes back on line (e.g. after a switchover), and its
status is no longer primary, restart monitoring to ensure the correct
primary (potentially the current node) is being monitored.
2018-10-18 17:04:14 +09:00
Ian Barwick
8bec4946bc Ensure witness repmgrd detects change in upstream's role
This ensures that e.g. after a switchover, repmgrd running on a witness
node will automatically detect the new primary and monitor that.
2018-10-18 16:15:52 +09:00
Ian Barwick
3ab22f9442 repmgrd: ensure witness node doesn't try and follow another witness
Theoretically there should never be more than one witness node
visible here, but it's not possible to rule it out, so add a
check just in case.
2018-10-18 12:20:04 +09:00
Ian Barwick
3a9c36a36c doc: improve upgrade instructions
Note requirement to execute "systemctl daemon-reload" for systemd
systems...
2018-10-17 17:08:54 +09:00
Ian Barwick
2ded8987ac doc: improve upgrade instructions 2018-10-17 14:35:36 +09:00
Ian Barwick
6311f3f30a Handle NULL strings when parsing boolean arguments 2018-10-17 11:46:29 +09:00
Ian Barwick
12ec6c7abc Doc: update HISTORY and 4.2 release notes 2018-10-17 09:50:36 +09:00
185 changed files with 13564 additions and 34931 deletions
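
Commit 6311f3f30a in the list above ("Handle NULL strings when parsing boolean arguments") is not shown with its code in this excerpt. The following is only a minimal sketch of that kind of guard, not the code from the commit: the function names `parse_bool_arg` and `equals_ignore_case`, the accepted words, and the "NULL means use the default" behaviour are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: case-insensitive comparison of two C strings. */
static bool
equals_ignore_case(const char *a, const char *b)
{
	while (*a != '\0' && *b != '\0')
	{
		if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
			return false;
		a++;
		b++;
	}

	/* Equal only if both strings ended at the same position. */
	return *a == *b;
}

/*
 * Hypothetical parser (not repmgr's actual function): returns true if
 * "value" could be interpreted, writing the outcome to *result.
 * A NULL "value" is treated as "no argument given" and yields the
 * caller-supplied default instead of being dereferenced.
 */
static bool
parse_bool_arg(const char *value, bool default_value, bool *result)
{
	/* Accepted spellings are an assumption for this sketch. */
	static const char *const true_words[] = {"true", "on", "yes", "1"};
	static const char *const false_words[] = {"false", "off", "no", "0"};
	size_t		i;

	assert(result != NULL);

	/* The NULL guard: never pass a NULL pointer to the comparisons below. */
	if (value == NULL)
	{
		*result = default_value;
		return true;
	}

	for (i = 0; i < sizeof(true_words) / sizeof(true_words[0]); i++)
	{
		if (equals_ignore_case(value, true_words[i]))
		{
			*result = true;
			return true;
		}
	}

	for (i = 0; i < sizeof(false_words) / sizeof(false_words[0]); i++)
	{
		if (equals_ignore_case(value, false_words[i]))
		{
			*result = false;
			return true;
		}
	}

	/* Unrecognised input: leave *result untouched and report failure. */
	return false;
}
```

Returning a success flag separately from the parsed value lets a caller distinguish "explicitly false" from "could not parse", which a single `bool` return cannot express.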

.gitignore (8 lines changed)

@@ -42,17 +42,13 @@ lib*.pc
/regression.diffs
/regression.out
/doc/Makefile
# other
/.lineno
*.dSYM
*.orig
*.rej
# generated binaries
repmgr
repmgrd
repmgr4
repmgrd4
# generated files
configfile-scan.c

@@ -2,7 +2,7 @@ License and Contributions
=========================
`repmgr` is licensed under the GPL v3. All of its code and documentation is
Copyright 2010-2021, EnterpriseDB Corporation. See the files COPYRIGHT and LICENSE for
Copyright 2010-2018, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
details.
The development of repmgr has primarily been sponsored by 2ndQuadrant customers.
@@ -12,10 +12,10 @@ which has received funding from the European Union's Seventh Framework Programme
(FP7/2007-2013) under grant agreement 258862.
Contributions to `repmgr` are welcome, and will be listed in the file `CREDITS`.
EnterpriseDB Corporation requires that any contributions provide a copyright
2ndQuadrant Limited requires that any contributions provide a copyright
assignment and a disclaimer of any work-for-hire ownership claims from the
employer of the developer. This lets us make sure that all of the repmgr
distribution remains free code. Please contact info@enterprise.com for a
distribution remains free code. Please contact info@2ndQuadrant.com for a
copy of the relevant Copyright Assignment Form.
Code style
@@ -24,7 +24,7 @@ Code style
Code in repmgr should be formatted to the same standards as the main PostgreSQL
project. For more details see:
https://www.postgresql.org/docs/current/source-format.html
https://www.postgresql.org/docs/current/static/source-format.html
Contributors should reformat their code similarly before submitting code to
the project, in order to minimize merge conflicts with other work.

@@ -1,4 +1,4 @@
Copyright (c) 2010-2021, EnterpriseDB Corporation
Copyright (c) 2010-2018, 2ndQuadrant Limited
All rights reserved.
This program is free software: you can redistribute it and/or modify

FAQ.md (6 lines changed)

@@ -1,10 +1,8 @@
FAQ - Frequently Asked Questions about repmgr
=============================================
The repmgr 4 FAQ is located here: [repmgr FAQ (Frequently Asked Questions)](https://repmgr.org/docs/current/appendix-faq.html "repmgr FAQ")
The repmgr 4 FAQ is located here: [repmgr FAQ (Frequently Asked Questions)](https://repmgr.org/docs/4.0/appendix-faq.html "repmgr FAQ")
The repmgr 3.x FAQ can be found here:
https://github.com/EnterpriseDB/repmgr/blob/REL3_3_STABLE/FAQ.md
Note that repmgr 3.x is no longer supported.
https://github.com/2ndQuadrant/repmgr/blob/REL3_3_STABLE/FAQ.md

HISTORY (170 lines changed)

@@ -1,172 +1,4 @@
5.4.1 2023-??-??
repmgrd: ensure witness node metadata is updated (Ian)
5.4.0 2023-03-16
Support cloning replicas using pg-backup-api
5.3.3 2022-10-17
Support for PostgreSQL added
repmgrd: ensure event notification script is called for event
"repmgrd_upstream_disconnect"; GitHub #760 (Ian)
5.3.2 2022-05-25
standby clone: don't error out if unable to determine cluster size (Ian)
node check: fix --downstream --nagios output; GitHub #749 (Ian)
repmgrd: ensure witness node marked active (hslightdb)
repmgrd: improve walsender disable check (Ian)
general: ensure replication slots can be dropped by a
replication-only user (Ian)
5.3.1 2022-02-15
repmgrd: fixes for potential connection leaks (hslightdb)
repmgr: fix upgrade path from repmgr 4.2 and 4.3 to repmgr 5.3 (Ian)
5.3.0 2021-10-12
standby switchover: improve handling of node rejoin failure (Ian)
repmgrd: prefix all shared library functions with "repmgr_" to
minimize the risk of clashes with other shared libraries (Ian)
repmgrd: at startup, if node record is marked as "inactive", attempt
to set it to "active" (Ian)
standby clone: set "slot_name" in node record if required (Ian)
node rejoin: emit rejoin target note information as NOTICE (Ian)
repmgrd: ensure short option "-s" is accepted (Ian)
5.2.1 2020-12-07
config: fix parsing of "replication_type"; GitHub #672 (Ian)
standby clone: handle missing "postgresql.auto.conf" (Ian)
standby clone: add option --recovery-min-apply-delay (Ian)
standby clone: fix data directory permissions handling for
PostgreSQL 11 and later (Ian)
repmgrd: prevent termination when local node not available and
standby_disconnect_on_failover; GitHub #675 (Ian)
repmgrd: ensure "reconnect_interval" is correctly handled;
GitHub #673 (Ian)
5.2.0 2020-10-22
general: add support for PostgreSQL 13 (Ian)
general: remove support for PostgreSQL 9.3 (Ian)
config: add support for file inclusion directives (Ian)
repmgr: "primary unregister --force" will unregister an active primary
with no registered standby nodes (Ian)
repmgr: add option --verify-backup to "standby clone" (Ian)
repmgr: "standby clone" honours --waldir option if set in
"pg_basebackup_options" (Ian)
repmgr: add option --db-connection to "node check" (Ian)
repmgr: report database connection error if the --optformat option was
provided to "node check" (Ian)
repmgr: improve "node rejoin" checks (Ian)
repmgr: enable "node rejoin" to join a target with a lower timeline (Ian)
repmgr: support pg_rewind's automatic crash recovery in Pg13 and later (Ian)
repmgr: improve output formatting for cluster matrix/crosscheck (Ian)
repmgr: improve database connection failure error checking on the
demotion candidate during "standby switchover" (Ian)
repmgr: make repmgr metadata tables dumpable (Ian)
repmgr: fix issue with tablespace mapping when cloning from Barman;
GitHub #650 (Ian)
repmgr: improve handling of pg_control read errors (Ian)
repmgrd: add additional optional parameters to "failover_validation command"
(spaskalev; GitHub #651)
repmgrd: ensure primary connection is reset if same as upstream;
GitHub #633 (Ian)
5.1.0 2020-04-13
repmgr: remove BDR 2.x support
repmgr: don't query upstream's data directory (Ian)
repmgr: rename --recovery-conf-only to --replication-conf-only (Ian)
repmgr: ensure postgresql.auto.conf is created with correct permissions (Ian)
repmgr: minimize requirement to check upstream data directory location
during "standby clone" (Ian)
repmgr: warn about missing pg_rewind prerequisites when executing
"standby clone" (Ian)
repmgr: add --upstream option to "node check"
repmgr: report error code on follow/rejoin failure due to non-available
replication slot (Ian)
repmgr: ensure "node rejoin" checks for available replication slots (Ian)
repmgr: improve "standby switchover" completion checks (Ian)
repmgr: add replication configuration file ownership check to
"standby switchover" (Ian)
repmgr: check the demotion candidate's registered repmgr.conf file can
be found (laixiong; GitHub 615)
repmgr: consolidate replication connection code (Ian)
repmgr: check permissions for "pg_promote()" and fall back to pg_ctl
if necessary (Ian)
repmgr: in --dry-run mode, display promote command which will be used (Ian)
repmgr: enable "service_promote_command" in PostgreSQL 12 (Ian)
repmgr: accept option -S/--superuser for "node check"; GitHub #612 (Ian)
5.0 2019-10-15
general: add PostgreSQL 12 support (Ian)
general: parse configuration file using flex (Ian)
repmgr: rename "repmgr daemon ..." commands to "repmgr service ..." (Ian)
repmgr: improve data directory check (Ian)
repmgr: improve extension check during "standby clone" (Ian)
repmgr: pass provided log level when executing repmgr remotely (Ian)
repmgrd: fix handling of upstream node change check (Ian)
4.4 2019-06-27
repmgr: improve "daemon status" output (Ian)
repmgr: add "--siblings-follow" option to "standby promote" (Ian)
repmgr: add "--repmgrd-force-unpause" option to "standby switchover" (Ian)
repmgr: fix data directory permissions issue in barman mode where
an existing directory is being overwritten (Ian)
repmgr: improve "--dry-run" behaviour for "standby promote" and
"standby switchover" (Ian)
repmgr: when running "standby clone" with the "--upstream-conninfo" option
ensure that "application_name" is set correctly in "primary_conninfo" (Ian)
repmgr: ensure "--dry-run" together with --force when running "standby clone"
in barman mode does not modify an existing data directory (Ian)
repmgr: improve "--dry-run" output when running "standby clone" in
basebackup mode (Ian)
repmgr: improve upstream walsender checks when running "standby clone" (Ian)
repmgr: display node timeline ID in "cluster show" output (Ian)
repmgr: in "cluster show" and "daemon status", show upstream node name
as reported by each individual node (Ian)
repmgr: in "cluster show" and "daemon status", check if a node is attached
to its advertised upstream node
repmgr: use --compact rather than --terse option in "cluster event" (Ian)
repmgr: prevent a standby being cloned from a witness server (Ian)
repmgr: prevent a witness server being registered on the cluster primary (John)
repmgr: ensure BDR2-specific functionality cannot be used on
BDR3 and later (Ian)
repmgr: canonicalize the data directory path (Ian)
repmgr: note that "standby follow" requires a primary to be available (Ian)
repmgrd: monitor standbys attached to primary (Ian)
repmgrd: add "primary visibility consensus" functionality (Ian)
repmgrd: fix memory leak which occurs while the monitored PostgreSQL
node is not running (Ian)
general: documentation converted to DocBook XML format (Ian)
4.3 2019-04-02
repmgr: add "daemon (start|stop)" command; GitHub #528 (Ian)
repmgr: add --version-number command line option (Ian)
repmgr: add --compact option to "cluster show"; GitHub #521 (Ian)
repmgr: cluster show - differentiate between unreachable nodes
and nodes which are running but rejecting connections (Ian)
repmgr: add --dry-run option to "standby promote"; GitHub #522 (Ian)
repmgr: add "node check --data-directory-config"; GitHub #523 (Ian)
repmgr: prevent potential race condition in "standby switchover"
when checking received WAL location; GitHub #518 (Ian)
repmgr: ensure "standby switchover" verifies repmgr can read the
data directory on the demotion candidate; GitHub #523 (Ian)
repmgr: ensure "standby switchover" verifies replication connection
exists; GitHub #519 (Ian)
repmgr: add sanity check for correct extension version (Ian)
repmgr: ensure "witness register --dry-run" does not attempt to read node
tables if repmgr extension not installed; GitHub #513 (Ian)
repmgr: ensure "standby register" fails when --upstream-node-id is the
same as the local node ID (Ian)
repmgrd: check binary and extension major versions match; GitHub #515 (Ian)
repmgrd: on a cascaded standby, don't fail over if "failover=manual";
GitHub #531 (Ian)
repmgrd: don't consider nodes where repmgrd is not running as promotion
candidates (Ian)
repmgrd: add option "connection_check_type" (Ian)
repmgrd: improve witness monitoring when primary node not available (Ian)
repmgrd: handle situation where a primary has unexpectedly appeared
during failover; GitHub #420 (Ian)
general: fix Makefile (John)
4.2 2018-10-24
4.2.0 2018-??-??
repmgr: add parameter "shutdown_check_timeout" for use by "standby switchover";
GitHub #504 (Ian)
repmgr: add "--node-id" option to "repmgr cluster cleanup"; GitHub #493 (Ian)

@@ -2,7 +2,6 @@
# Makefile.global.in
# @configure_input@
# Can only be built using pgxs
USE_PGXS=1
@@ -15,26 +14,14 @@ ifeq ($(vpath_build),yes)
VPATH := $(repmgr_abs_srcdir)/$(repmgr_subdir)
USE_VPATH :=$(VPATH)
endif
SED=@SED@
GIT_WORK_TREE=${repmgr_abs_srcdir}
GIT_DIR=${repmgr_abs_srcdir}/.git
export GIT_DIR
export GIT_WORK_TREE
PG_LDFLAGS=-lcurl -ljson-c
include $(PGXS)
-include ${repmgr_abs_srcdir}/Makefile.custom
REPMGR_VERSION=$(shell awk '/^\#define REPMGR_VERSION / { print $3; }' ${repmgr_abs_srcdir}/repmgr_version.h.in | cut -d '"' -f 2)
REPMGR_RELEASE_DATE=$(shell awk '/^\#define REPMGR_RELEASE_DATE / { print $3; }' ${repmgr_abs_srcdir}/repmgr_version.h.in | cut -d '"' -f 2)
FLEX = flex
##########################################################################
#
# Global targets and rules
%.c: %.l
$(FLEX) $(FLEXFLAGS) -o'$@' $<

@@ -11,28 +11,11 @@ EXTENSION = repmgr
DATA = \
repmgr--unpackaged--4.0.sql \
repmgr--unpackaged--5.1.sql \
repmgr--unpackaged--5.2.sql \
repmgr--unpackaged--5.3.sql \
repmgr--4.0.sql \
repmgr--4.0--4.1.sql \
repmgr--4.1.sql \
repmgr--4.1--4.2.sql \
repmgr--4.2.sql \
repmgr--4.2--4.3.sql \
repmgr--4.3.sql \
repmgr--4.3--4.4.sql \
repmgr--4.4.sql \
repmgr--4.4--5.0.sql \
repmgr--5.0.sql \
repmgr--5.0--5.1.sql \
repmgr--5.1.sql \
repmgr--5.1--5.2.sql \
repmgr--5.2.sql \
repmgr--5.2--5.3.sql \
repmgr--5.3.sql \
repmgr--5.3--5.4.sql \
repmgr--5.4.sql
repmgr--4.2.sql
REGRESS = repmgr_extension
@@ -64,19 +47,13 @@ $(info Building against PostgreSQL $(MAJORVERSION))
REPMGR_CLIENT_OBJS = repmgr-client.o \
repmgr-action-primary.o repmgr-action-standby.o repmgr-action-witness.o \
repmgr-action-cluster.o repmgr-action-node.o repmgr-action-service.o repmgr-action-daemon.o \
configdata.o configfile.o configfile-scan.o log.o strutil.o controldata.o dirutil.o compat.o \
dbutils.o sysutils.o pgbackupapi.o
REPMGRD_OBJS = repmgrd.o repmgrd-physical.o configdata.o configfile.o configfile-scan.o log.o \
dbutils.o strutil.o controldata.o compat.o sysutils.o
repmgr-action-bdr.o repmgr-action-cluster.o repmgr-action-node.o repmgr-action-daemon.o \
configfile.o log.o strutil.o controldata.o dirutil.o compat.o dbutils.o
REPMGRD_OBJS = repmgrd.o repmgrd-physical.o repmgrd-bdr.o configfile.o log.o dbutils.o strutil.o controldata.o compat.o
DATE=$(shell date "+%Y-%m-%d")
repmgr_version.h: repmgr_version.h.in
$(SED) -E 's/REPMGR_VERSION_DATE.*""/REPMGR_VERSION_DATE "$(DATE)"/' $< >$@; \
$(SED) -i -E 's/PG_ACTUAL_VERSION_NUM/PG_ACTUAL_VERSION_NUM $(VERSION_NUM)/' $@
configfile-scan.c: configfile-scan.l
sed '0,/REPMGR_VERSION_DATE/s,\(REPMGR_VERSION_DATE\).*,\1 "$(DATE)",' $< >$@
$(REPMGR_CLIENT_OBJS): repmgr-client.h repmgr_version.h
@@ -96,19 +73,10 @@ Makefile: Makefile.in config.status configure
Makefile.global: Makefile.global.in config.status configure
./config.status $@
doc: repmgr_version.h
$(MAKE) -C doc html
doc:
$(MAKE) -C doc all
doc-repmgr.html: repmgr_version.h
$(MAKE) -C doc repmgr.html
doc-repmgr-A4.pdf: repmgr_version.h
$(MAKE) -C doc repmgr-A4.pdf
doc-repmgr-US.pdf: repmgr_version.h
$(MAKE) -C doc repmgr-US.pdf
install-doc: doc
install-doc:
$(MAKE) -C doc install
clean: additional-clean
@@ -116,17 +84,29 @@ clean: additional-clean
maintainer-clean: additional-maintainer-clean
additional-clean:
rm -f *.o
rm -f repmgr_version.h
$(MAKE) -C doc clean
rm -f repmgr-client.o
rm -f repmgr-action-primary.o
rm -f repmgr-action-standby.o
rm -f repmgr-action-witness.o
rm -f repmgr-action-bdr.o
rm -f repmgr-action-node.o
rm -f repmgr-action-cluster.o
rm -f repmgr-action-daemon.o
rm -f repmgrd.o
rm -f repmgrd-physical.o
rm -f repmgrd-bdr.o
rm -f compat.o
rm -f configfile.o
rm -f controldata.o
rm -f dbutils.o
rm -f dirutil.o
rm -f log.o
rm -f strutil.o
additional-maintainer-clean: clean
$(MAKE) -C doc maintainer-clean
maintainer-additional-clean: clean
rm -f configure
rm -f config.status config.log
rm -f config.h
rm -f repmgr_version.h
rm -f Makefile
rm -f Makefile.global
@rm -rf autom4te.cache/
ifeq ($(MAJORVERSION),$(filter $(MAJORVERSION),9.3 9.4))
@@ -141,4 +121,3 @@ installdirs-scripts:
.PHONY: installdirs-scripts
endif
.PHONY: doc doc-repmgr.html doc-repmgr-A4.pdf doc-repmgr-US.pdf install-doc

@@ -7,23 +7,32 @@ replication capabilities with utilities to set up standby servers, monitor
replication, and perform administrative tasks such as failover or switchover
operations.
The most recent `repmgr` version (5.4.1) supports all PostgreSQL versions from
10 to 16.
`repmgr 4` is a complete rewrite of the existing `repmgr` codebase, allowing
the use of all of the latest features in PostgreSQL replication.
PostgreSQL 10, 9.6 and 9.5 are fully supported.
PostgreSQL 9.4 and 9.3 are supported, with some restrictions.
`repmgr` is distributed under the GNU GPL 3 and maintained by 2ndQuadrant.
### BDR support
`repmgr 4` supports monitoring of a two-node BDR 2.0 cluster on PostgreSQL 9.6
only. Note that BDR 2.0 is not publicly available; please contact 2ndQuadrant
for details. `repmgr 4` will support future public BDR releases.
`repmgr` is distributed under the GNU GPL 3 and maintained by EnterpriseDB.
Documentation
-------------
The full `repmgr` documentation is available here:
The main `repmgr` documentation is available here:
> [repmgr documentation](https://repmgr.org/docs/current/index.html)
> [repmgr 4 documentation](https://repmgr.org/docs/4.0/index.html)
Versions
--------
The `README` file for `repmgr` 3.x is available here:
> https://github.com/2ndQuadrant/repmgr/blob/REL3_3_STABLE/README.md
For an overview of `repmgr` versions and PostgreSQL compatibility, see the
[repmgr compatibility matrix](https://repmgr.org/docs/current/install-requirements.html#INSTALL-COMPATIBILITY-MATRIX).
Files
------
@@ -40,41 +49,46 @@ Directories
- `contrib/`: additional utilities
- `doc/`: DocBook-based documentation files
- `expected/`: expected regression test output
- `scripts/`: example scripts
- `sql/`: regression test input
Support and Assistance
----------------------
EnterpriseDB provides 24x7 production support for `repmgr`, including
2ndQuadrant provides 24x7 production support for `repmgr`, including
configuration assistance, installation verification and training for
running a robust replication cluster. For further details see:
* [EDB Support Services](https://www.enterprisedb.com/support/postgresql-support-overview-get-the-most-out-of-postgresql)
* https://2ndquadrant.com/en/support/
There is a mailing list/forum to discuss contributions or issues:
* https://groups.google.com/group/repmgr
The IRC channel #repmgr is registered with freenode.
Please report bugs and other issues to:
* https://github.com/EnterpriseDB/repmgr
* https://github.com/2ndQuadrant/repmgr
Further information is available at https://repmgr.org/
Further information is available at https://www.repmgr.org/
We'd love to hear from you about how you use repmgr. Case studies and
news are always welcome.
news are always welcome. Send us an email at info@2ndQuadrant.com, or
send a postcard to
repmgr
c/o 2ndQuadrant
7200 The Quorum
Oxford Business Park North
Oxford
OX4 2JZ
United Kingdom
Thanks from the repmgr core team.
* Ian Barwick
* Israel Barth
* Mario González
* Martín Marqués
* Gianni Ciolli
Past contributors:
* Jaime Casanova
* Abhijit Menon-Sen
* Simon Riggs
@@ -83,7 +97,7 @@ Past contributors:
Further reading
---------------
* [repmgr documentation](https://repmgr.org/docs/current/index.html)
* [How to Automate PostgreSQL 12 Replication and Failover with repmgr - Part 1](https://www.2ndquadrant.com/en/blog/how-to-automate-postgresql-12-replication-and-failover-with-repmgr-part-1/)
* [How to Automate PostgreSQL 12 Replication and Failover with repmgr - Part 2](https://www.2ndquadrant.com/en/blog/how-to-automate-postgresql-12-replication-and-failover-with-repmgr-part-2/)
* [How to implement repmgr for PostgreSQL automatic failover](https://www.enterprisedb.com/postgres-tutorials/how-implement-repmgr-postgresql-automatic-failover)
* https://blog.2ndquadrant.com/repmgr-3-2-is-here-barman-support-brand-new-high-availability-features/
* https://blog.2ndquadrant.com/improvements-in-repmgr-3-1-4/
* https://blog.2ndquadrant.com/managing-useful-clusters-repmgr/
* https://blog.2ndquadrant.com/easier_postgresql_90_clusters/

@@ -1,7 +1,7 @@
TODO
====
This file contains a list of improvements which are desirable and/or have
This file contains a list of improvements which are desireable and/or have
been requested, and which we aim to address/implement when time and resources
permit.
@@ -17,4 +17,4 @@ repmgrd nodes to prevent unintended failover; this is obviously inconvenient.
We'll need to implement some way of notifying each repmgrd to suspend automatic
failover until further notice.
Requested in GitHub #410 ( https://github.com/EnterpriseDB/repmgr/issues/410 )
Requested in GitHub #410 ( https://github.com/2ndQuadrant/repmgr/issues/410 )

@@ -6,7 +6,7 @@
* supported PostgreSQL versions. They're unlikely to change but
* it would be worth keeping an eye on them for any fixes/improvements.
*
* Copyright (c) EnterpriseDB Corporation, 2010-2021
* Copyright (c) 2ndQuadrant, 2010-2018
*
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -98,42 +98,9 @@ appendShellString(PQExpBuffer buf, const char *str)
if (*p == '\'')
appendPQExpBufferStr(buf, "'\"'\"'");
else if (*p == '&')
appendPQExpBufferStr(buf, "\\&");
else
appendPQExpBufferChar(buf, *p);
}
appendPQExpBufferChar(buf, '\'');
}
/*
* Adapted from: src/fe_utils/string_utils.c
*/
void
appendRemoteShellString(PQExpBuffer buf, const char *str)
{
const char *p;
appendPQExpBufferStr(buf, "\\'");
for (p = str; *p; p++)
{
if (*p == '\n' || *p == '\r')
{
fprintf(stderr,
_("shell command argument contains a newline or carriage return: \"%s\"\n"),
str);
exit(ERR_BAD_CONFIG);
}
if (*p == '\'')
appendPQExpBufferStr(buf, "'\"'\"'");
else if (*p == '&')
appendPQExpBufferStr(buf, "\\&");
else
appendPQExpBufferChar(buf, *p);
}
appendPQExpBufferStr(buf, "\\'");
}

@@ -1,6 +1,6 @@
/*
* compat.h
* Copyright (c) EnterpriseDB Corporation, 2010-2021
* Copyright (c) 2ndQuadrant, 2010-2018
*
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -27,6 +27,4 @@ extern void appendConnStrVal(PQExpBuffer buf, const char *str);
extern void appendShellString(PQExpBuffer buf, const char *str);
extern void appendRemoteShellString(PQExpBuffer buf, const char *str);
#endif

@@ -1,986 +0,0 @@
/*
* configdata.c - contains structs with parsed configuration data
*
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "repmgr.h"
#include "configfile.h"
/*
* Parsed configuration settings are stored here
*/
t_configuration_options config_file_options;
/*
* Configuration settings are defined here
*/
struct ConfigFileSetting config_file_settings[] =
{
/* ================
* node information
* ================
*/
/* node_id */
{
"node_id",
CONFIG_INT,
{ .intptr = &config_file_options.node_id },
{ .intdefault = UNKNOWN_NODE_ID },
{ .intminval = MIN_NODE_ID },
{},
{}
},
/* node_name */
{
"node_name",
CONFIG_STRING,
{ .strptr = config_file_options.node_name },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.node_name) },
{}
},
/* conninfo */
{
"conninfo",
CONFIG_STRING,
{ .strptr = config_file_options.conninfo },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.conninfo) },
{}
},
/* replication_user */
{
"replication_user",
CONFIG_STRING,
{ .strptr = config_file_options.replication_user },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.replication_user) },
{}
},
/* data_directory */
{
"data_directory",
CONFIG_STRING,
{ .strptr = config_file_options.data_directory },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.data_directory) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* config_directory */
{
"config_directory",
CONFIG_STRING,
{ .strptr = config_file_options.config_directory },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.config_directory) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* pg_bindir */
{
"pg_bindir",
CONFIG_STRING,
{ .strptr = config_file_options.pg_bindir },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_bindir) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* repmgr_bindir */
{
"repmgr_bindir",
CONFIG_STRING,
{ .strptr = config_file_options.repmgr_bindir },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgr_bindir) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* replication_type */
{
"replication_type",
CONFIG_REPLICATION_TYPE,
{ .replicationtypeptr = &config_file_options.replication_type },
{ .replicationtypedefault = DEFAULT_REPLICATION_TYPE },
{},
{},
{}
},
/* ================
* logging settings
* ================
*/
/*
* log_level
* NOTE: the default for "log_level" is set in log.c and does not need
* to be initialised here
*/
{
"log_level",
CONFIG_STRING,
{ .strptr = config_file_options.log_level },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.log_level) },
{}
},
/* log_facility */
{
"log_facility",
CONFIG_STRING,
{ .strptr = config_file_options.log_facility },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.log_facility) },
{}
},
/* log_file */
{
"log_file",
CONFIG_STRING,
{ .strptr = config_file_options.log_file },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.log_file) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* log_status_interval */
{
"log_status_interval",
CONFIG_INT,
{ .intptr = &config_file_options.log_status_interval },
{ .intdefault = DEFAULT_LOG_STATUS_INTERVAL, },
{ .intminval = 0 },
{},
{}
},
/* ======================
* standby clone settings
* ======================
*/
/* use_replication_slots */
{
"use_replication_slots",
CONFIG_BOOL,
{ .boolptr = &config_file_options.use_replication_slots },
{ .booldefault = DEFAULT_USE_REPLICATION_SLOTS },
{},
{},
{}
},
/* pg_basebackup_options */
{
"pg_basebackup_options",
CONFIG_STRING,
{ .strptr = config_file_options.pg_basebackup_options },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_basebackup_options) },
{}
},
/* restore_command */
{
"restore_command",
CONFIG_STRING,
{ .strptr = config_file_options.restore_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.restore_command) },
{}
},
/* tablespace_mapping */
{
"tablespace_mapping",
CONFIG_TABLESPACE_MAPPING,
{ .tablespacemappingptr = &config_file_options.tablespace_mapping },
{},
{},
{},
{}
},
/* recovery_min_apply_delay */
{
"recovery_min_apply_delay",
CONFIG_STRING,
{ .strptr = config_file_options.recovery_min_apply_delay },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.recovery_min_apply_delay) },
{
.process_func = &parse_time_unit_parameter,
.providedptr = &config_file_options.recovery_min_apply_delay_provided
}
},
/* archive_cleanup_command */
{
"archive_cleanup_command",
CONFIG_STRING,
{ .strptr = config_file_options.archive_cleanup_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.archive_cleanup_command) },
{}
},
/* use_primary_conninfo_password */
{
"use_primary_conninfo_password",
CONFIG_BOOL,
{ .boolptr = &config_file_options.use_primary_conninfo_password },
{ .booldefault = DEFAULT_USE_PRIMARY_CONNINFO_PASSWORD },
{},
{},
{}
},
/* passfile */
{
"passfile",
CONFIG_STRING,
{ .strptr = config_file_options.passfile },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.passfile) },
{}
},
/* ======================
* standby clone settings
* ======================
*/
/* promote_check_timeout */
{
"promote_check_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.promote_check_timeout },
{ .intdefault = DEFAULT_PROMOTE_CHECK_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* promote_check_interval */
{
"promote_check_interval",
CONFIG_INT,
{ .intptr = &config_file_options.promote_check_interval },
{ .intdefault = DEFAULT_PROMOTE_CHECK_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* pg_backupapi_backup_id*/
{
"pg_backupapi_backup_id",
CONFIG_STRING,
{ .strptr = config_file_options.pg_backupapi_backup_id },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_backupapi_backup_id) },
{}
},
/* pg_backupapi_host*/
{
"pg_backupapi_host",
CONFIG_STRING,
{ .strptr = config_file_options.pg_backupapi_host },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_backupapi_host) },
{}
},
/* pg_backupapi_node_name */
{
"pg_backupapi_node_name",
CONFIG_STRING,
{ .strptr = config_file_options.pg_backupapi_node_name },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_backupapi_node_name) },
{}
},
/* pg_backupapi_remote_ssh_command */
{
"pg_backupapi_remote_ssh_command",
CONFIG_STRING,
{ .strptr = config_file_options.pg_backupapi_remote_ssh_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_backupapi_remote_ssh_command) },
{}
},
/* =======================
* standby follow settings
* =======================
*/
/* primary_follow_timeout */
{
"primary_follow_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.primary_follow_timeout },
{ .intdefault = DEFAULT_PRIMARY_FOLLOW_TIMEOUT, },
{ .intminval = 1 },
{},
{}
},
/* standby_follow_timeout */
{
"standby_follow_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.standby_follow_timeout },
{ .intdefault = DEFAULT_STANDBY_FOLLOW_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* standby_follow_restart */
{
"standby_follow_restart",
CONFIG_BOOL,
{ .boolptr = &config_file_options.standby_follow_restart },
{ .booldefault = DEFAULT_STANDBY_FOLLOW_RESTART },
{},
{},
{}
},
/* ===========================
* standby switchover settings
* ===========================
*/
/* shutdown_check_timeout */
{
"shutdown_check_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.shutdown_check_timeout },
{ .intdefault = DEFAULT_SHUTDOWN_CHECK_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* standby_reconnect_timeout */
{
"standby_reconnect_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.standby_reconnect_timeout },
{ .intdefault = DEFAULT_STANDBY_RECONNECT_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* wal_receive_check_timeout */
{
"wal_receive_check_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.wal_receive_check_timeout },
{ .intdefault = DEFAULT_WAL_RECEIVE_CHECK_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* ====================
* node rejoin settings
* ====================
*/
/* node_rejoin_timeout */
{
"node_rejoin_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.node_rejoin_timeout },
{ .intdefault = DEFAULT_NODE_REJOIN_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* ===================
* node check settings
* ===================
*/
/* archive_ready_warning */
{
"archive_ready_warning",
CONFIG_INT,
{ .intptr = &config_file_options.archive_ready_warning },
{ .intdefault = DEFAULT_ARCHIVE_READY_WARNING },
{ .intminval = 1 },
{},
{}
},
/* archive_ready_critical */
{
"archive_ready_critical",
CONFIG_INT,
{ .intptr = &config_file_options.archive_ready_critical },
{ .intdefault = DEFAULT_ARCHIVE_READY_CRITICAL },
{ .intminval = 1 },
{},
{}
},
/* replication_lag_warning */
{
"replication_lag_warning",
CONFIG_INT,
{ .intptr = &config_file_options.replication_lag_warning },
{ .intdefault = DEFAULT_REPLICATION_LAG_WARNING },
{ .intminval = 1 },
{},
{}
},
/* replication_lag_critical */
{
"replication_lag_critical",
CONFIG_INT,
{ .intptr = &config_file_options.replication_lag_critical },
{ .intdefault = DEFAULT_REPLICATION_LAG_CRITICAL },
{ .intminval = 1 },
{},
{}
},
/* ================
* witness settings
* ================
*/
/* witness_sync_interval */
{
"witness_sync_interval",
CONFIG_INT,
{ .intptr = &config_file_options.witness_sync_interval },
{ .intdefault = DEFAULT_WITNESS_SYNC_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* ================
* repmgrd settings
* ================
*/
/* failover */
{
"failover",
CONFIG_FAILOVER_MODE,
{ .failovermodeptr = &config_file_options.failover },
{ .failovermodedefault = FAILOVER_MANUAL },
{},
{},
{}
},
/* location */
{
"location",
CONFIG_STRING,
{ .strptr = config_file_options.location },
{ .strdefault = DEFAULT_LOCATION },
{},
{ .strmaxlen = sizeof(config_file_options.location) },
{}
},
/* priority */
{
"priority",
CONFIG_INT,
{ .intptr = &config_file_options.priority },
{ .intdefault = DEFAULT_PRIORITY },
{ .intminval = 0 },
{},
{}
},
/* promote_command */
{
"promote_command",
CONFIG_STRING,
{ .strptr = config_file_options.promote_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.promote_command) },
{}
},
/* follow_command */
{
"follow_command",
CONFIG_STRING,
{ .strptr = config_file_options.follow_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.follow_command) },
{}
},
/* monitor_interval_secs */
{
"monitor_interval_secs",
CONFIG_INT,
{ .intptr = &config_file_options.monitor_interval_secs },
{ .intdefault = DEFAULT_MONITORING_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* reconnect_attempts */
{
"reconnect_attempts",
CONFIG_INT,
{ .intptr = &config_file_options.reconnect_attempts },
{ .intdefault = DEFAULT_RECONNECTION_ATTEMPTS },
{ .intminval = 0 },
{},
{}
},
/* reconnect_interval */
{
"reconnect_interval",
CONFIG_INT,
{ .intptr = &config_file_options.reconnect_interval },
{ .intdefault = DEFAULT_RECONNECTION_INTERVAL },
{ .intminval = 0 },
{},
{}
},
/* monitoring_history */
{
"monitoring_history",
CONFIG_BOOL,
{ .boolptr = &config_file_options.monitoring_history },
{ .booldefault = DEFAULT_MONITORING_HISTORY },
{},
{},
{}
},
/* degraded_monitoring_timeout */
{
"degraded_monitoring_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.degraded_monitoring_timeout },
{ .intdefault = DEFAULT_DEGRADED_MONITORING_TIMEOUT },
{ .intminval = -1 },
{},
{}
},
/* async_query_timeout */
{
"async_query_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.async_query_timeout },
{ .intdefault = DEFAULT_ASYNC_QUERY_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* primary_notification_timeout */
{
"primary_notification_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.primary_notification_timeout },
{ .intdefault = DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* repmgrd_standby_startup_timeout */
{
"repmgrd_standby_startup_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.repmgrd_standby_startup_timeout },
{ .intdefault = DEFAULT_REPMGRD_STANDBY_STARTUP_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* repmgrd_pid_file */
{
"repmgrd_pid_file",
CONFIG_STRING,
{ .strptr = config_file_options.repmgrd_pid_file },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgrd_pid_file) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* repmgrd_exit_on_inactive_node */
{
"repmgrd_exit_on_inactive_node",
CONFIG_BOOL,
{ .boolptr = &config_file_options.repmgrd_exit_on_inactive_node },
{ .booldefault = DEFAULT_REPMGRD_EXIT_ON_INACTIVE_NODE },
{},
{},
{}
},
/* standby_disconnect_on_failover */
{
"standby_disconnect_on_failover",
CONFIG_BOOL,
{ .boolptr = &config_file_options.standby_disconnect_on_failover },
{ .booldefault = DEFAULT_STANDBY_DISCONNECT_ON_FAILOVER },
{},
{},
{}
},
/* sibling_nodes_disconnect_timeout */
{
"sibling_nodes_disconnect_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.sibling_nodes_disconnect_timeout },
{ .intdefault = DEFAULT_SIBLING_NODES_DISCONNECT_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* connection_check_type */
{
"connection_check_type",
CONFIG_CONNECTION_CHECK_TYPE,
{ .checktypeptr = &config_file_options.connection_check_type },
{ .checktypedefault = DEFAULT_CONNECTION_CHECK_TYPE },
{},
{},
{}
},
/* primary_visibility_consensus */
{
"primary_visibility_consensus",
CONFIG_BOOL,
{ .boolptr = &config_file_options.primary_visibility_consensus },
{ .booldefault = DEFAULT_PRIMARY_VISIBILITY_CONSENSUS },
{},
{},
{}
},
/* always_promote */
{
"always_promote",
CONFIG_BOOL,
{ .boolptr = &config_file_options.always_promote },
{ .booldefault = DEFAULT_ALWAYS_PROMOTE },
{},
{},
{}
},
/* failover_validation_command */
{
"failover_validation_command",
CONFIG_STRING,
{ .strptr = config_file_options.failover_validation_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.failover_validation_command) },
{}
},
/* election_rerun_interval */
{
"election_rerun_interval",
CONFIG_INT,
{ .intptr = &config_file_options.election_rerun_interval },
{ .intdefault = DEFAULT_ELECTION_RERUN_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* child_nodes_check_interval */
{
"child_nodes_check_interval",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_check_interval },
{ .intdefault = DEFAULT_CHILD_NODES_CHECK_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* child_nodes_disconnect_min_count */
{
"child_nodes_disconnect_min_count",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_disconnect_min_count },
{ .intdefault = DEFAULT_CHILD_NODES_DISCONNECT_MIN_COUNT },
{ .intminval = -1 },
{},
{}
},
/* child_nodes_connected_min_count */
{
"child_nodes_connected_min_count",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_connected_min_count },
{ .intdefault = DEFAULT_CHILD_NODES_CONNECTED_MIN_COUNT },
{ .intminval = -1 },
{},
{}
},
/* child_nodes_connected_include_witness */
{
"child_nodes_connected_include_witness",
CONFIG_BOOL,
{ .boolptr = &config_file_options.child_nodes_connected_include_witness },
{ .booldefault = DEFAULT_CHILD_NODES_CONNECTED_INCLUDE_WITNESS },
{},
{},
{}
},
/* child_nodes_disconnect_timeout */
{
"child_nodes_disconnect_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_disconnect_timeout },
{ .intdefault = DEFAULT_CHILD_NODES_DISCONNECT_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* child_nodes_disconnect_command */
{
"child_nodes_disconnect_command",
CONFIG_STRING,
{ .strptr = config_file_options.child_nodes_disconnect_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.child_nodes_disconnect_command) },
{}
},
/* ================
* service settings
* ================
*/
/* pg_ctl_options */
{
"pg_ctl_options",
CONFIG_STRING,
{ .strptr = config_file_options.pg_ctl_options },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_ctl_options) },
{}
},
/* service_start_command */
{
"service_start_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_start_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_start_command) },
{}
},
/* service_stop_command */
{
"service_stop_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_stop_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_stop_command) },
{}
},
/* service_restart_command */
{
"service_restart_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_restart_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_restart_command) },
{}
},
/* service_reload_command */
{
"service_reload_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_reload_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_reload_command) },
{}
},
/* service_promote_command */
{
"service_promote_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_promote_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_promote_command) },
{}
},
/* ========================
* repmgrd service settings
* ========================
*/
/* repmgrd_service_start_command */
{
"repmgrd_service_start_command",
CONFIG_STRING,
{ .strptr = config_file_options.repmgrd_service_start_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgrd_service_start_command) },
{}
},
/* repmgrd_service_stop_command */
{
"repmgrd_service_stop_command",
CONFIG_STRING,
{ .strptr = config_file_options.repmgrd_service_stop_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgrd_service_stop_command) },
{}
},
/* ===========================
* event notification settings
* ===========================
*/
/* event_notification_command */
{
"event_notification_command",
CONFIG_STRING,
{ .strptr = config_file_options.event_notification_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.event_notification_command) },
{}
},
/* event_notifications */
{
"event_notifications",
CONFIG_EVENT_NOTIFICATION_LIST,
{ .notificationlistptr = &config_file_options.event_notifications },
{},
{},
{},
{}
},
/* ===============
* barman settings
* ===============
*/
/* barman_host */
{
"barman_host",
CONFIG_STRING,
{ .strptr = config_file_options.barman_host },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.barman_host) },
{}
},
/* barman_server */
{
"barman_server",
CONFIG_STRING,
{ .strptr = config_file_options.barman_server },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.barman_server) },
{}
},
/* barman_config */
{
"barman_config",
CONFIG_STRING,
{ .strptr = config_file_options.barman_config },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.barman_config) },
{}
},
/* ==================
* rsync/ssh settings
* ==================
*/
/* rsync_options */
{
"rsync_options",
CONFIG_STRING,
{ .strptr = config_file_options.rsync_options },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.rsync_options) },
{}
},
/* ssh_options */
{
"ssh_options",
CONFIG_STRING,
{ .strptr = config_file_options.ssh_options },
{ .strdefault = DEFAULT_SSH_OPTIONS },
{},
{ .strmaxlen = sizeof(config_file_options.ssh_options) },
{}
},
/* ==================================
* undocumented experimental settings
* ==================================
*/
/* reconnect_loop_sync */
{
"reconnect_loop_sync",
CONFIG_BOOL,
{ .boolptr = &config_file_options.reconnect_loop_sync },
{ .booldefault = false },
{},
{},
{}
},
/* ==========================
* undocumented test settings
* ==========================
*/
/* promote_delay */
{
"promote_delay",
CONFIG_INT,
{ .intptr = &config_file_options.promote_delay },
{ .intdefault = 0 },
{ .intminval = 1 },
{},
{}
},
/* failover_delay */
{
"failover_delay",
CONFIG_INT,
{ .intptr = &config_file_options.failover_delay },
{ .intdefault = 0 },
{ .intminval = 1 },
{},
{}
},
/* connection_check_query */
{
"connection_check_query",
CONFIG_STRING,
{ .strptr = config_file_options.connection_check_query },
{ .strdefault = "SELECT 1" },
{},
{ .strmaxlen = sizeof(config_file_options.connection_check_query) },
{}
},
/* End-of-list marker */
{
NULL, CONFIG_INT, {}, {}, {}, {}, {}
}
};
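
The table above pairs each option name with a typed target pointer, a default, and an optional minimum, and ends with a NULL-named marker. A minimal sketch of how such a table could be walked follows. The code that actually consumes `config_file_settings` is not shown in this diff, so `DemoSetting`, `demo_apply_defaults()` and `demo_set_int()` are hypothetical, simplified stand-ins and not repmgr's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for ConfigItemType / ConfigFileSetting. */
typedef enum { DEMO_INT, DEMO_BOOL } DemoItemType;

typedef struct DemoSetting
{
	const char *name;
	DemoItemType type;
	union { int *intptr; bool *boolptr; } val;
	union { int intdefault; bool booldefault; } defval;
	int			intminval;
} DemoSetting;

/* Target variables the table entries point at. */
static int	demo_timeout;
static bool demo_restart;

static DemoSetting demo_settings[] = {
	{ "demo_timeout", DEMO_INT, { .intptr = &demo_timeout }, { .intdefault = 60 }, 1 },
	{ "demo_restart", DEMO_BOOL, { .boolptr = &demo_restart }, { .booldefault = false }, 0 },
	/* End-of-list marker: a NULL name stops every loop below */
	{ NULL, DEMO_INT, { .intptr = NULL }, { .intdefault = 0 }, 0 }
};

/* Copy each entry's default into the variable it points at. */
static void
demo_apply_defaults(DemoSetting *settings)
{
	for (DemoSetting *s = settings; s->name != NULL; s++)
	{
		if (s->type == DEMO_INT)
			*s->val.intptr = s->defval.intdefault;
		else
			*s->val.boolptr = s->defval.booldefault;
	}
}

/*
 * Look up an integer setting by name and store the value if it is not
 * below the entry's minimum. Returns false for unknown names, non-integer
 * entries and out-of-range values.
 */
static bool
demo_set_int(DemoSetting *settings, const char *name, int value)
{
	for (DemoSetting *s = settings; s->name != NULL; s++)
	{
		if (s->type == DEMO_INT && strcmp(s->name, name) == 0)
		{
			if (value < s->intminval)
				return false;
			*s->val.intptr = value;
			return true;
		}
	}
	return false;
}
```

In this sketch a caller would run `demo_apply_defaults(demo_settings)` once before parsing, then call `demo_set_int()` for each name/value pair read from the file. Keeping the defaults and limits in one table means a new option needs only one new entry.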

@@ -1,644 +0,0 @@
/*
* Scanner for the configuration file
*/
%{
#include <setjmp.h>
#include <sys/stat.h>
#include <dirent.h>
#include "repmgr.h"
#include "configfile.h"
/*
* flex emits a yy_fatal_error() function that it calls in response to
* critical errors like malloc failure, file I/O errors, and detection of
* internal inconsistency. That function prints a message and calls exit().
* Mutate it to instead call our handler, which jumps out of the parser.
*/
#undef fprintf
#define fprintf(file, fmt, msg) CONF_flex_fatal(msg)
enum
{
CONF_ID = 1,
CONF_STRING = 2,
CONF_INTEGER = 3,
CONF_REAL = 4,
CONF_EQUALS = 5,
CONF_UNQUOTED_STRING = 6,
CONF_QUALIFIED_ID = 7,
CONF_EOL = 99,
CONF_ERROR = 100
};
static unsigned int ConfigFileLineno;
static const char *CONF_flex_fatal_errmsg;
static sigjmp_buf *CONF_flex_fatal_jmp;
static char *CONF_scanstr(const char *s);
static int CONF_flex_fatal(const char *msg);
static bool ProcessConfigFile(const char *base_dir, const char *config_file, const char *calling_file, bool strict, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
static bool ProcessConfigFp(FILE *fp, const char *config_file, const char *calling_file, int depth, const char *base_dir, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
static bool ProcessConfigDirectory(const char *base_dir, const char *includedir, const char *calling_file, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
static char *AbsoluteConfigLocation(const char *base_dir, const char *location, const char *calling_file);
%}
%option 8bit
%option never-interactive
%option nodefault
%option noinput
%option nounput
%option noyywrap
%option warn
%option prefix="CONF_yy"
SIGN ("-"|"+")
DIGIT [0-9]
HEXDIGIT [0-9a-fA-F]
UNIT_LETTER [a-zA-Z]
INTEGER {SIGN}?({DIGIT}+|0x{HEXDIGIT}+){UNIT_LETTER}*
EXPONENT [Ee]{SIGN}?{DIGIT}+
REAL {SIGN}?{DIGIT}*"."{DIGIT}*{EXPONENT}?
LETTER [A-Za-z_\200-\377]
LETTER_OR_DIGIT [A-Za-z_0-9\200-\377]
ID {LETTER}{LETTER_OR_DIGIT}*
QUALIFIED_ID {ID}"."{ID}
UNQUOTED_STRING {LETTER}({LETTER_OR_DIGIT}|[-._:/])*
STRING \'([^'\\\n]|\\.|\'\')*\'
%%
\n ConfigFileLineno++; return CONF_EOL;
[ \t\r]+ /* eat whitespace */
#.* /* eat comment (.* matches anything until newline) */
{ID} return CONF_ID;
{QUALIFIED_ID} return CONF_QUALIFIED_ID;
{STRING} return CONF_STRING;
{UNQUOTED_STRING} return CONF_UNQUOTED_STRING;
{INTEGER} return CONF_INTEGER;
{REAL} return CONF_REAL;
= return CONF_EQUALS;
. return CONF_ERROR;
%%
extern bool
ProcessRepmgrConfigFile(const char *config_file, const char *base_dir, ItemList *error_list, ItemList *warning_list)
{
return ProcessConfigFile(base_dir, config_file, NULL, true, 0, NULL, error_list, warning_list);
}
extern bool
ProcessPostgresConfigFile(const char *config_file, const char *base_dir, bool strict, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
return ProcessConfigFile(base_dir, config_file, NULL, strict, 0, contents, error_list, warning_list);
}
static bool
ProcessConfigFile(const char *base_dir, const char *config_file, const char *calling_file, bool strict, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
char *abs_path;
bool success = true;
FILE *fp;
/*
* Reject file name that is all-blank (including empty), as that leads to
* confusion --- we'd try to read the containing directory as a file.
*/
if (strspn(config_file, " \t\r\n") == strlen(config_file))
{
return false;
}
/*
* Reject too-deep include nesting depth. This is just a safety check to
* avoid dumping core due to stack overflow if an include file loops back
* to itself. The maximum nesting depth is pretty arbitrary.
*/
if (depth > 10)
{
item_list_append_format(error_list,
_("could not open configuration file \"%s\": maximum nesting depth exceeded"),
config_file);
return false;
}
abs_path = AbsoluteConfigLocation(base_dir, config_file, calling_file);
/* Reject direct recursion */
if (calling_file && strcmp(abs_path, calling_file) == 0)
{
item_list_append_format(error_list,
_("configuration file recursion in \"%s\""),
calling_file);
pfree(abs_path);
return false;
}
fp = fopen(abs_path, "r");
if (!fp)
{
if (strict == false)
{
/* a missing optional include is informational, not an error */
item_list_append_format(warning_list,
"skipping configuration file \"%s\"",
abs_path);
}
else
{
item_list_append_format(error_list,
"could not open configuration file \"%s\": %s",
abs_path,
strerror(errno));
success = false;
}
}
else
{
success = ProcessConfigFp(fp, abs_path, calling_file, depth + 1, base_dir, contents, error_list, warning_list);
}
free(abs_path);
return success;
}
static bool
ProcessConfigFp(FILE *fp, const char *config_file, const char *calling_file, int depth, const char *base_dir, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
volatile bool OK = true;
volatile YY_BUFFER_STATE lex_buffer = NULL;
sigjmp_buf flex_fatal_jmp;
int errorcount;
int token;
if (sigsetjmp(flex_fatal_jmp, 1) == 0)
{
CONF_flex_fatal_jmp = &flex_fatal_jmp;
}
else
{
/*
* Regain control after a fatal, internal flex error. It may have
* corrupted parser state. Consequently, abandon the file, but trust
* that the state remains sane enough for yy_delete_buffer().
*/
item_list_append_format(error_list,
"%s at file \"%s\" line %u",
CONF_flex_fatal_errmsg, config_file, ConfigFileLineno);
OK = false;
goto cleanup;
}
/*
* Parse
*/
ConfigFileLineno = 1;
errorcount = 0;
lex_buffer = yy_create_buffer(fp, YY_BUF_SIZE);
yy_switch_to_buffer(lex_buffer);
/* This loop iterates once per logical line */
while ((token = yylex()))
{
char *opt_name = NULL;
char *opt_value = NULL;
if (token == CONF_EOL) /* empty or comment line */
continue;
/* first token on line is option name */
if (token != CONF_ID && token != CONF_QUALIFIED_ID)
goto parse_error;
opt_name = pstrdup(yytext);
/* next we have an optional equal sign; discard if present */
token = yylex();
if (token == CONF_EQUALS)
token = yylex();
/* now we must have the option value */
if (token != CONF_ID &&
token != CONF_STRING &&
token != CONF_INTEGER &&
token != CONF_REAL &&
token != CONF_UNQUOTED_STRING)
goto parse_error;
if (token == CONF_STRING) /* strip quotes and escapes */
opt_value = CONF_scanstr(yytext);
else
opt_value = pstrdup(yytext);
/* now we'd like an end of line, or possibly EOF */
token = yylex();
if (token != CONF_EOL)
{
if (token != 0)
goto parse_error;
/* treat EOF like \n for line numbering purposes, cf bug 4752 */
ConfigFileLineno++;
}
/* Handle include files */
if (base_dir != NULL && strcasecmp(opt_name, "include_dir") == 0)
{
/*
* An include_dir directive isn't a variable and should be
* processed immediately.
*/
if (!ProcessConfigDirectory(base_dir, opt_value, config_file,
depth + 1, contents,
error_list, warning_list))
OK = false;
yy_switch_to_buffer(lex_buffer);
pfree(opt_name);
pfree(opt_value);
}
else if (base_dir != NULL && strcasecmp(opt_name, "include_if_exists") == 0)
{
if (!ProcessConfigFile(base_dir, opt_value, config_file,
false, depth + 1, contents,
error_list, warning_list))
OK = false;
yy_switch_to_buffer(lex_buffer);
pfree(opt_name);
pfree(opt_value);
}
else if (base_dir != NULL && strcasecmp(opt_name, "include") == 0)
{
if (!ProcessConfigFile(base_dir, opt_value, config_file,
true, depth + 1, contents,
error_list, warning_list))
OK = false;
yy_switch_to_buffer(lex_buffer);
pfree(opt_name);
pfree(opt_value);
}
else
{
/* OK, process the option name and value */
if (contents != NULL)
{
key_value_list_replace_or_set(contents,
opt_name,
opt_value);
}
else
{
parse_configuration_item(error_list,
warning_list,
opt_name,
opt_value);
}
}
/* break out of loop if read EOF, else loop for next line */
if (token == 0)
break;
continue;
parse_error:
/* release storage if we allocated any on this line */
if (opt_name)
pfree(opt_name);
if (opt_value)
pfree(opt_value);
/* report the error */
if (token == CONF_EOL || token == 0)
{
item_list_append_format(error_list,
_("syntax error in file \"%s\" line %u, near end of line"),
config_file, ConfigFileLineno - 1);
}
else
{
item_list_append_format(error_list,
_("syntax error in file \"%s\" line %u, near token \"%s\""),
config_file, ConfigFileLineno, yytext);
}
OK = false;
errorcount++;
/*
* To avoid producing too much noise when fed a totally bogus file,
* give up after 100 syntax errors per file (an arbitrary number).
*
* Report this via error_list rather than fprintf(): fprintf is
* redefined above to call CONF_flex_fatal(), which would jump out of
* the parser instead of printing the message.
*/
if (errorcount >= 100)
{
item_list_append_format(error_list,
_("too many syntax errors found, abandoning file \"%s\""),
config_file);
break;
}
/* resync to next end-of-line or EOF */
while (token != CONF_EOL && token != 0)
token = yylex();
/* break out of loop on EOF */
if (token == 0)
break;
}
cleanup:
yy_delete_buffer(lex_buffer);
return OK;
}
/*
* Read and parse all config files in a subdirectory in alphabetical order
*
* includedir is the absolute or relative path to the subdirectory to scan.
*
* See ProcessConfigFp for further details.
*/
static bool
ProcessConfigDirectory(const char *base_dir, const char *includedir, const char *calling_file, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
char *directory;
DIR *d;
struct dirent *de;
char **filenames;
int num_filenames;
int size_filenames;
bool status;
/*
* Reject directory name that is all-blank (including empty), as that
* leads to confusion --- we'd read the containing directory, typically
* resulting in recursive inclusion of the same file(s).
*/
if (strspn(includedir, " \t\r\n") == strlen(includedir))
{
item_list_append_format(error_list,
_("empty configuration directory name: \"%s\""),
includedir);
return false;
}
directory = AbsoluteConfigLocation(base_dir, includedir, calling_file);
d = opendir(directory);
if (d == NULL)
{
item_list_append_format(error_list,
_("could not open configuration directory \"%s\": %s"),
directory,
strerror(errno));
status = false;
goto cleanup;
}
/*
* Read the directory and put the filenames in an array, so we can sort
* them prior to processing the contents.
*/
size_filenames = 32;
filenames = (char **) palloc(size_filenames * sizeof(char *));
num_filenames = 0;
while ((de = readdir(d)) != NULL)
{
struct stat st;
char filename[MAXPGPATH];
/*
* Only parse files with names ending in ".conf". Explicitly reject
* files starting with ".". This excludes things like "." and "..",
* as well as typical hidden files, backup files, and editor debris.
*/
if (strlen(de->d_name) < 6)
continue;
if (de->d_name[0] == '.')
continue;
if (strcmp(de->d_name + strlen(de->d_name) - 5, ".conf") != 0)
continue;
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
if (stat(filename, &st) == 0)
{
if (!S_ISDIR(st.st_mode))
{
/* Add file to array, increasing its size in blocks of 32 */
if (num_filenames >= size_filenames)
{
size_filenames += 32;
filenames = (char **) repalloc(filenames,
size_filenames * sizeof(char *));
}
filenames[num_filenames] = pstrdup(filename);
num_filenames++;
}
}
else
{
/*
* stat does not care about permissions, so the most likely reason
* a file can't be accessed now is if it was removed between the
* directory listing and now.
*/
item_list_append_format(error_list,
_("could not stat file \"%s\": %s"),
filename, strerror(errno));
status = false;
goto cleanup;
}
}
if (num_filenames > 0)
{
int i;
qsort(filenames, num_filenames, sizeof(char *), pg_qsort_strcmp);
for (i = 0; i < num_filenames; i++)
{
if (!ProcessConfigFile(base_dir, filenames[i], calling_file,
true, depth, contents,
error_list, warning_list))
{
status = false;
goto cleanup;
}
}
}
status = true;
cleanup:
if (d)
closedir(d);
pfree(directory);
return status;
}
/*
* scanstr
*
* Strip the quotes surrounding the given string, and collapse any embedded
* '' sequences and backslash escapes.
*
* The string returned is palloc'd and should eventually be pfree'd by the
* caller.
*/
static char *
CONF_scanstr(const char *s)
{
char *newStr;
int len,
i,
j;
Assert(s != NULL && s[0] == '\'');
len = strlen(s);
Assert(len >= 2);
Assert(s[len - 1] == '\'');
/* Skip the leading quote; we'll handle the trailing quote below */
s++, len--;
/* Since len still includes trailing quote, this is enough space */
newStr = palloc(len);
for (i = 0, j = 0; i < len; i++)
{
if (s[i] == '\\')
{
i++;
switch (s[i])
{
case 'b':
newStr[j] = '\b';
break;
case 'f':
newStr[j] = '\f';
break;
case 'n':
newStr[j] = '\n';
break;
case 'r':
newStr[j] = '\r';
break;
case 't':
newStr[j] = '\t';
break;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
{
int k;
long octVal = 0;
for (k = 0;
s[i + k] >= '0' && s[i + k] <= '7' && k < 3;
k++)
octVal = (octVal << 3) + (s[i + k] - '0');
i += k - 1;
newStr[j] = ((char) octVal);
}
break;
default:
newStr[j] = s[i];
break;
} /* switch */
}
else if (s[i] == '\'' && s[i + 1] == '\'')
{
/* doubled quote becomes just one quote */
newStr[j] = s[++i];
}
else
newStr[j] = s[i];
j++;
}
/* We copied the ending quote to newStr, so replace with \0 */
Assert(j > 0 && j <= len);
newStr[--j] = '\0';
return newStr;
}
/*
* Given a configuration file or directory location that may be a relative
* path, return an absolute one. We consider the location to be relative to
* the directory holding the calling file, or to base_dir if no calling file.
* If neither is available, the location is returned unchanged.
*/
static char *
AbsoluteConfigLocation(const char *base_dir, const char *location, const char *calling_file)
{
char abs_path[MAXPGPATH];
if (is_absolute_path(location))
return strdup(location);
if (calling_file != NULL)
{
strlcpy(abs_path, calling_file, sizeof(abs_path));
get_parent_directory(abs_path);
join_path_components(abs_path, abs_path, location);
canonicalize_path(abs_path);
}
else if (base_dir != NULL)
{
join_path_components(abs_path, base_dir, location);
canonicalize_path(abs_path);
}
else
{
strlcpy(abs_path, location, sizeof(abs_path));
}
return strdup(abs_path);
}
/*
* Flex fatal errors bring us here. Stash the error message and jump back to
* ProcessConfigFp(). Assume all msg arguments point to string constants; this
* holds for flex 2.5.31 (earliest we support) and flex 2.5.35 (latest as of
* this writing). Otherwise, we would need to copy the message.
*
* We return "int" since this takes the place of calls to fprintf().
*/
static int
CONF_flex_fatal(const char *msg)
{
CONF_flex_fatal_errmsg = msg;
siglongjmp(*CONF_flex_fatal_jmp, 1);
return 0; /* keep compiler quiet */
}
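
The scanner above avoids flex's default exit() on fatal errors by redefining fprintf to call CONF_flex_fatal(), which stashes the message and jumps back into ProcessConfigFp() via siglongjmp(). The following is a minimal, standalone sketch of that jump-out pattern only. No flex code is involved, and `demo_fatal()`, `demo_run()` and the message text are hypothetical names chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L	/* for sigsetjmp()/siglongjmp() */

#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf *demo_fatal_jmp;
static const char *demo_fatal_msg;

/*
 * Stand-in for the fatal-error hook: remember the message and jump back
 * to the recovery point instead of terminating the process.
 */
static int
demo_fatal(const char *msg)
{
	demo_fatal_msg = msg;
	siglongjmp(*demo_fatal_jmp, 1);
	return 0;					/* not reached; keeps the compiler quiet */
}

/* Returns 1 if the work completed, 0 if a fatal error was intercepted. */
static int
demo_run(int should_fail)
{
	sigjmp_buf	jmp;

	demo_fatal_msg = NULL;

	if (sigsetjmp(jmp, 1) != 0)
	{
		/* Control resumes here after siglongjmp(); abandon the work. */
		return 0;
	}
	demo_fatal_jmp = &jmp;

	/* Hypothetical failure inside the "library" code. */
	if (should_fail)
		demo_fatal("out of dynamic memory");

	return 1;
}
```

After `demo_run(1)` returns 0, `demo_fatal_msg` holds the stashed message, which the caller can report in its own way. As in the original, this stores only the pointer, so it relies on the message being a string constant.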

File diff suppressed because it is too large

@@ -1,7 +1,7 @@
/*
* configfile.h
*
* Copyright (c) EnterpriseDB Corporation, 2010-2021
* Copyright (c) 2ndQuadrant, 2010-2018
*
*
* This program is free software: you can redistribute it and/or modify
@@ -28,12 +28,6 @@
/* magic number for use in t_recovery_conf */
#define TARGET_TIMELINE_LATEST 0
/*
* This is defined in src/include/utils.h, however it's not practical
* to include that from a frontend application.
*/
#define PG_AUTOCONF_FILENAME "postgresql.auto.conf"
extern bool config_file_found;
extern char config_file_path[MAXPGPATH];
@@ -43,18 +37,6 @@ typedef enum
FAILOVER_AUTOMATIC
} failover_mode_opt;
typedef enum
{
CHECK_PING,
CHECK_QUERY,
CHECK_CONNECTION
} ConnectionCheckType;
typedef enum
{
REPLICATION_TYPE_PHYSICAL
} ReplicationType;
typedef struct EventNotificationListCell
{
struct EventNotificationListCell *next;
@@ -83,75 +65,23 @@ typedef struct TablespaceList
} TablespaceList;
typedef enum
{
CONFIG_BOOL,
CONFIG_INT,
CONFIG_STRING,
CONFIG_FAILOVER_MODE,
CONFIG_CONNECTION_CHECK_TYPE,
CONFIG_EVENT_NOTIFICATION_LIST,
CONFIG_TABLESPACE_MAPPING,
CONFIG_REPLICATION_TYPE
} ConfigItemType;
typedef struct ConfigFileSetting
{
const char *name;
ConfigItemType type;
union
{
int *intptr;
char *strptr;
bool *boolptr;
failover_mode_opt *failovermodeptr;
ConnectionCheckType *checktypeptr;
EventNotificationList *notificationlistptr;
TablespaceList *tablespacemappingptr;
ReplicationType *replicationtypeptr;
} val;
union {
int intdefault;
const char *strdefault;
bool booldefault;
failover_mode_opt failovermodedefault;
ConnectionCheckType checktypedefault;
ReplicationType replicationtypedefault;
} defval;
union {
int intminval;
} minval;
union {
int strmaxlen;
} maxval;
struct {
void (*process_func)(const char *, const char *, char *, ItemList *errors);
void (*postprocess_func)(const char *, const char *, char *, ItemList *errors);
bool *providedptr;
} process;
} ConfigFileSetting;
/* Declare the main configfile structure for client applications */
extern ConfigFileSetting config_file_settings[];
typedef struct
{
/* node information */
int node_id;
char node_name[NAMEDATALEN];
char node_name[MAXLEN];
char conninfo[MAXLEN];
char replication_user[NAMEDATALEN];
char data_directory[MAXPGPATH];
char config_directory[MAXPGPATH];
char pg_bindir[MAXPGPATH];
char repmgr_bindir[MAXPGPATH];
ReplicationType replication_type;
int replication_type;
/* log settings */
char log_level[MAXLEN];
char log_facility[MAXLEN];
char log_file[MAXPGPATH];
char log_file[MAXLEN];
int log_status_interval;
/* standby clone settings */
@@ -164,10 +94,6 @@ typedef struct
char archive_cleanup_command[MAXLEN];
bool use_primary_conninfo_password;
char passfile[MAXPGPATH];
char pg_backupapi_backup_id[NAMEDATALEN];
char pg_backupapi_host[NAMEDATALEN];
char pg_backupapi_node_name[NAMEDATALEN];
char pg_backupapi_remote_ssh_command[MAXLEN];
/* standby promote settings */
int promote_check_timeout;
@@ -176,12 +102,10 @@ typedef struct
/* standby follow settings */
int primary_follow_timeout;
int standby_follow_timeout;
bool standby_follow_restart;
/* standby switchover settings */
int shutdown_check_timeout;
int standby_reconnect_timeout;
int wal_receive_check_timeout;
/* node rejoin settings */
int node_rejoin_timeout;
@@ -210,35 +134,21 @@ typedef struct
int primary_notification_timeout;
int repmgrd_standby_startup_timeout;
char repmgrd_pid_file[MAXPGPATH];
bool repmgrd_exit_on_inactive_node;
bool standby_disconnect_on_failover;
int sibling_nodes_disconnect_timeout;
ConnectionCheckType connection_check_type;
bool primary_visibility_consensus;
bool always_promote;
char failover_validation_command[MAXPGPATH];
int election_rerun_interval;
int child_nodes_check_interval;
int child_nodes_disconnect_min_count;
int child_nodes_connected_min_count;
bool child_nodes_connected_include_witness;
int child_nodes_disconnect_timeout;
char child_nodes_disconnect_command[MAXPGPATH];
/* BDR settings */
bool bdr_local_monitoring_only;
bool bdr_recovery_timeout;
/* service settings */
char pg_ctl_options[MAXLEN];
char service_start_command[MAXPGPATH];
char service_stop_command[MAXPGPATH];
char service_restart_command[MAXPGPATH];
char service_reload_command[MAXPGPATH];
char service_promote_command[MAXPGPATH];
/* repmgrd service settings */
char repmgrd_service_start_command[MAXPGPATH];
char repmgrd_service_stop_command[MAXPGPATH];
char service_stop_command[MAXLEN];
char service_start_command[MAXLEN];
char service_restart_command[MAXLEN];
char service_reload_command[MAXLEN];
char service_promote_command[MAXLEN];
/* event notification settings */
char event_notification_command[MAXPGPATH];
char event_notification_command[MAXLEN];
char event_notifications_orig[MAXLEN];
EventNotificationList event_notifications;
@@ -251,35 +161,70 @@ typedef struct
char rsync_options[MAXLEN];
char ssh_options[MAXLEN];
/*
* undocumented settings
*
* These settings are for testing or experimental features
* and may be changed without notice.
*/
/* experimental settings */
bool reconnect_loop_sync;
/* test settings */
/* undocumented test settings */
int promote_delay;
int failover_delay;
char connection_check_query[MAXLEN];
} t_configuration_options;
/*
* The following will initialize the structure with a minimal set of options;
* actual defaults are set in parse_config() before parsing the configuration file
*/
#define T_CONFIGURATION_OPTIONS_INITIALIZER { \
/* node information */ \
UNKNOWN_NODE_ID, "", "", "", "", "", "", "", REPLICATION_TYPE_PHYSICAL, \
/* log settings */ \
"", "", "", DEFAULT_LOG_STATUS_INTERVAL, \
/* standby clone settings */ \
false, "", "", { NULL, NULL }, "", false, "", false, "", \
/* standby promote settings */ \
DEFAULT_PROMOTE_CHECK_TIMEOUT, DEFAULT_PROMOTE_CHECK_INTERVAL, \
/* standby follow settings */ \
DEFAULT_PRIMARY_FOLLOW_TIMEOUT, \
DEFAULT_STANDBY_FOLLOW_TIMEOUT, \
/* standby switchover settings */ \
DEFAULT_SHUTDOWN_CHECK_TIMEOUT, \
DEFAULT_STANDBY_RECONNECT_TIMEOUT, \
/* node rejoin settings */ \
DEFAULT_NODE_REJOIN_TIMEOUT, \
/* node check settings */ \
DEFAULT_ARCHIVE_READY_WARNING, DEFAULT_ARCHIVE_READY_CRITICAL, \
DEFAULT_REPLICATION_LAG_WARNING, DEFAULT_REPLICATION_LAG_CRITICAL, \
/* witness settings */ \
DEFAULT_WITNESS_SYNC_INTERVAL, \
/* repmgrd settings */ \
FAILOVER_MANUAL, DEFAULT_LOCATION, DEFAULT_PRIORITY, "", "", \
DEFAULT_MONITORING_INTERVAL, \
DEFAULT_RECONNECTION_ATTEMPTS, \
DEFAULT_RECONNECTION_INTERVAL, \
false, -1, \
DEFAULT_ASYNC_QUERY_TIMEOUT, \
DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT, \
-1, "", \
/* BDR settings */ \
false, DEFAULT_BDR_RECOVERY_TIMEOUT, \
/* service settings */ \
"", "", "", "", "", "", \
/* event notification settings */ \
"", "", { NULL, NULL }, \
/* barman settings */ \
"", "", "", \
/* rsync/ssh settings */ \
"", "", \
/* undocumented test settings */ \
0 \
}
/* Declare the main configfile structure for client applications */
extern t_configuration_options config_file_options;
typedef struct
{
char slot[MAXLEN];
char wal_method[MAXLEN];
char waldir[MAXPGPATH];
char xlog_method[MAXLEN];
bool no_slot; /* from PostgreSQL 10 */
} t_basebackup_options;
#define T_BASEBACKUP_OPTIONS_INITIALIZER { "", "", "", false }
#define T_BASEBACKUP_OPTIONS_INITIALIZER { "", "", false }
typedef enum
@@ -336,11 +281,8 @@ typedef struct
void set_progname(const char *argv0);
const char *progname(void);
void load_config(const char *config_file, bool verbose, bool terse, char *argv0);
bool reload_config(t_server_type server_type);
void dump_config(void);
void parse_configuration_item(ItemList *error_list, ItemList *warning_list, const char *name, const char *value);
void load_config(const char *config_file, bool verbose, bool terse, t_configuration_options *options, char *argv0);
bool reload_config(t_configuration_options *orig_options, t_server_type server_type);
bool parse_recovery_conf(const char *data_dir, t_recovery_conf *conf);
@@ -353,9 +295,6 @@ int repmgr_atoi(const char *s,
ItemList *error_list,
int minval);
void parse_time_unit_parameter(const char *name, const char *value, char *dest, ItemList *errors);
void repmgr_canonicalize_path(const char *name, const char *value, char *config_item, ItemList *errors);
bool parse_pg_basebackup_options(const char *pg_basebackup_options,
t_basebackup_options *backup_options,
int server_version_num,
@@ -363,21 +302,10 @@ bool parse_pg_basebackup_options(const char *pg_basebackup_options,
int parse_output_to_argv(const char *string, char ***argv_array);
void free_parsed_argv(char ***argv_array);
const char *format_failover_mode(failover_mode_opt failover);
/* called by repmgr-client and repmgrd */
void exit_with_cli_errors(ItemList *error_list, const char *repmgr_command);
void print_item_list(ItemList *item_list);
const char *print_replication_type(ReplicationType type);
const char *print_connection_check_type(ConnectionCheckType type);
char *print_event_notification_list(EventNotificationList *list);
char *print_tablespace_mapping(TablespaceList *tablespacemappingptr);
extern bool modify_auto_conf(const char *data_dir, KeyValueList *items);
extern bool ProcessRepmgrConfigFile(const char *config_file, const char *base_dir, ItemList *error_list, ItemList *warning_list);
extern bool ProcessPostgresConfigFile(const char *config_file, const char *base_dir, bool strict, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
#endif /* _REPMGR_CONFIGFILE_H_ */
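
Note on the initializer macros in this header: both T_CONFIGURATION_OPTIONS_INITIALIZER and T_BASEBACKUP_OPTIONS_INITIALIZER are positional, so adding or removing a struct field (for example waldir) means the macro has to change in step, as the paired { "", "", "", false } and { "", "", false } lines show. The following is a minimal sketch of that coupling, not repmgr code: the demo_ and DEMO_ names are hypothetical, and DEMO_MAXLEN is a stand-in because the real MAXLEN value does not appear in this diff. It also shows the C99 designated form for comparison.

```c
#include <stdbool.h>

/* Stand-in for MAXLEN; the real value is not shown in this diff. */
#define DEMO_MAXLEN 64

/* Hypothetical struct modelled on t_basebackup_options. */
typedef struct
{
    char slot[DEMO_MAXLEN];
    char wal_method[DEMO_MAXLEN];
    char waldir[DEMO_MAXLEN];
    bool no_slot;
} demo_basebackup_options;

/* Positional form, as used in the header: one value per field, in order. */
#define DEMO_POSITIONAL_INITIALIZER { "", "", "", false }

/* C99 designated form: each value names the field it initializes. */
#define DEMO_DESIGNATED_INITIALIZER \
    { .slot = "", .wal_method = "", .waldir = "", .no_slot = false }

/* Returns true if every field still holds its blank initial value. */
bool
demo_options_are_blank(const demo_basebackup_options *opts)
{
    return opts->slot[0] == '\0'
        && opts->wal_method[0] == '\0'
        && opts->waldir[0] == '\0'
        && !opts->no_slot;
}
```

With the positional form, dropping a field without dropping a value changes which field each remaining value lands in; the designated form avoids that at the cost of more typing.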

179
configure vendored

@@ -1,8 +1,8 @@
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.69 for repmgr 5.4.0.
# Generated by GNU Autoconf 2.69 for repmgr 4.2.
#
# Report bugs to <repmgr@googlegroups.com>.
# Report bugs to <pgsql-bugs@postgresql.org>.
#
#
# Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
@@ -11,7 +11,7 @@
# This configure script is free software; the Free Software Foundation
# gives unlimited permission to copy, distribute and modify it.
#
# Copyright (c) 2010-2021, EnterpriseDB Corporation
# Copyright (c) 2010-2018, 2ndQuadrant Ltd.
## -------------------- ##
## M4sh Initialization. ##
## -------------------- ##
@@ -269,7 +269,7 @@ fi
$as_echo "$0: be upgraded to zsh 4.3.4 or later."
else
$as_echo "$0: Please tell bug-autoconf@gnu.org and
$0: repmgr@googlegroups.com about your system, including
$0: pgsql-bugs@postgresql.org about your system, including
$0: any error possibly output before this message. Then
$0: install a modern shell, or manually run the script
$0: under such a shell if you do have one."
@@ -582,16 +582,13 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='repmgr'
PACKAGE_TARNAME='repmgr'
PACKAGE_VERSION='5.4.0'
PACKAGE_STRING='repmgr 5.4.0'
PACKAGE_BUGREPORT='repmgr@googlegroups.com'
PACKAGE_URL='https://repmgr.org/'
PACKAGE_VERSION='4.2'
PACKAGE_STRING='repmgr 4.2'
PACKAGE_BUGREPORT='pgsql-bugs@postgresql.org'
PACKAGE_URL='https://2ndquadrant.com/en/resources/repmgr/'
ac_subst_vars='LTLIBOBJS
LIBOBJS
HAVE_SED
HAVE_GSED
HAVE_GNUSED
vpath_build
SED
PG_CONFIG
@@ -1181,7 +1178,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures repmgr 5.4.0 to adapt to many kinds of systems.
\`configure' configures repmgr 4.2 to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
@@ -1242,7 +1239,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of repmgr 5.4.0:";;
short | recursive ) echo "Configuration of repmgr 4.2:";;
esac
cat <<\_ACEOF
@@ -1252,8 +1249,8 @@ Some influential environment variables:
Use these variables to override the choices made by `configure' or to help
it to find libraries and programs with nonstandard names/locations.
Report bugs to <repmgr@googlegroups.com>.
repmgr home page: <https://repmgr.org/>.
Report bugs to <pgsql-bugs@postgresql.org>.
repmgr home page: <https://2ndquadrant.com/en/resources/repmgr/>.
_ACEOF
ac_status=$?
fi
@@ -1316,14 +1313,14 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
repmgr configure 5.4.0
repmgr configure 4.2
generated by GNU Autoconf 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
This configure script is free software; the Free Software Foundation
gives unlimited permission to copy, distribute and modify it.
Copyright (c) 2010-2021, EnterpriseDB Corporation
Copyright (c) 2010-2018, 2ndQuadrant Ltd.
_ACEOF
exit
fi
@@ -1335,7 +1332,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by repmgr $as_me 5.4.0, which was
It was created by repmgr $as_me 4.2, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@
@@ -1811,11 +1808,11 @@ fi
pgac_pg_config_version=$($PG_CONFIG --version 2>/dev/null)
major_version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^[^0-9]\+ \([0-9]\{1,2\}\).*$/\1/')
$SED -e 's/^PostgreSQL \([0-9]\{1,2\}\).*$/\1/')
if test "$major_version_num" -lt '10'; then
version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^[^0-9]\+ \([0-9]*\)\.\([0-9]*\)\([a-zA-Z0-9.]*\)$/\1.\2/')
$SED -e 's/^PostgreSQL \([0-9]*\)\.\([0-9]*\)\([a-zA-Z0-9.]*\)$/\1.\2/')
if test -z "$version_num"; then
as_fn_error $? "could not detect the PostgreSQL version, wrong or broken pg_config?" "$LINENO" 5
@@ -1824,12 +1821,12 @@ if test "$major_version_num" -lt '10'; then
version_num_int=$(echo "$version_num"|
$SED -e 's/^\([0-9]*\)\.\([0-9]*\)$/\1\2/')
if test "$version_num_int" -lt '94'; then
if test "$version_num_int" -lt '93'; then
as_fn_error $? "repmgr is not compatible with detected PostgreSQL version: $version_num" "$LINENO" 5
fi
else
version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^[^0-9]\+ \(.\+\)$/\1/')
$SED -e 's/^PostgreSQL \(.\+\)$/\1/')
if test -z "$version_num"; then
as_fn_error $? "could not detect the PostgreSQL version, wrong or broken pg_config?" "$LINENO" 5
@@ -1850,137 +1847,12 @@ else
fi
# Extract the first word of "gnused", so it can be a program name with args.
set dummy gnused; ac_word=$2
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
$as_echo_n "checking for $ac_word... " >&6; }
if ${ac_cv_prog_HAVE_GNUSED+:} false; then :
$as_echo_n "(cached) " >&6
else
if test -n "$HAVE_GNUSED"; then
ac_cv_prog_HAVE_GNUSED="$HAVE_GNUSED" # Let the user override the test.
else
as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
for as_dir in $PATH
do
IFS=$as_save_IFS
test -z "$as_dir" && as_dir=.
for ac_exec_ext in '' $ac_executable_extensions; do
if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
ac_cv_prog_HAVE_GNUSED="yes"
$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
break 2
fi
done
done
IFS=$as_save_IFS
test -z "$ac_cv_prog_HAVE_GNUSED" && ac_cv_prog_HAVE_GNUSED="no"
fi
fi
HAVE_GNUSED=$ac_cv_prog_HAVE_GNUSED
if test -n "$HAVE_GNUSED"; then
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $HAVE_GNUSED" >&5
$as_echo "$HAVE_GNUSED" >&6; }
else
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
$as_echo "no" >&6; }
fi
# Extract the first word of "gsed", so it can be a program name with args.
set dummy gsed; ac_word=$2
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
$as_echo_n "checking for $ac_word... " >&6; }
if ${ac_cv_prog_HAVE_GSED+:} false; then :
$as_echo_n "(cached) " >&6
else
if test -n "$HAVE_GSED"; then
ac_cv_prog_HAVE_GSED="$HAVE_GSED" # Let the user override the test.
else
as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
for as_dir in $PATH
do
IFS=$as_save_IFS
test -z "$as_dir" && as_dir=.
for ac_exec_ext in '' $ac_executable_extensions; do
if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
ac_cv_prog_HAVE_GSED="yes"
$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
break 2
fi
done
done
IFS=$as_save_IFS
test -z "$ac_cv_prog_HAVE_GSED" && ac_cv_prog_HAVE_GSED="no"
fi
fi
HAVE_GSED=$ac_cv_prog_HAVE_GSED
if test -n "$HAVE_GSED"; then
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $HAVE_GSED" >&5
$as_echo "$HAVE_GSED" >&6; }
else
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
$as_echo "no" >&6; }
fi
# Extract the first word of "sed", so it can be a program name with args.
set dummy sed; ac_word=$2
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
$as_echo_n "checking for $ac_word... " >&6; }
if ${ac_cv_prog_HAVE_SED+:} false; then :
$as_echo_n "(cached) " >&6
else
if test -n "$HAVE_SED"; then
ac_cv_prog_HAVE_SED="$HAVE_SED" # Let the user override the test.
else
as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
for as_dir in $PATH
do
IFS=$as_save_IFS
test -z "$as_dir" && as_dir=.
for ac_exec_ext in '' $ac_executable_extensions; do
if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
ac_cv_prog_HAVE_SED="yes"
$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
break 2
fi
done
done
IFS=$as_save_IFS
test -z "$ac_cv_prog_HAVE_SED" && ac_cv_prog_HAVE_SED="no"
fi
fi
HAVE_SED=$ac_cv_prog_HAVE_SED
if test -n "$HAVE_SED"; then
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $HAVE_SED" >&5
$as_echo "$HAVE_SED" >&6; }
else
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
$as_echo "no" >&6; }
fi
if test "$HAVE_GNUSED" = yes; then
SED=gnused
else
if test "$HAVE_GSED" = yes; then
SED=gsed
else
SED=sed
fi
fi
ac_config_files="$ac_config_files Makefile"
ac_config_files="$ac_config_files Makefile.global"
ac_config_files="$ac_config_files doc/Makefile"
cat >confcache <<\_ACEOF
# This file is a shell script that caches the results of configure
# tests run on this system so they can be shared between configure
@@ -2487,7 +2359,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by repmgr $as_me 5.4.0, which was
This file was extended by repmgr $as_me 4.2, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
@@ -2543,14 +2415,14 @@ $config_files
Configuration headers:
$config_headers
Report bugs to <repmgr@googlegroups.com>.
repmgr home page: <https://repmgr.org/>."
Report bugs to <pgsql-bugs@postgresql.org>.
repmgr home page: <https://2ndquadrant.com/en/resources/repmgr/>."
_ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\
repmgr config.status 5.4.0
repmgr config.status 4.2
configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\"
@@ -2674,6 +2546,7 @@ do
"config.h") CONFIG_HEADERS="$CONFIG_HEADERS config.h" ;;
"Makefile") CONFIG_FILES="$CONFIG_FILES Makefile" ;;
"Makefile.global") CONFIG_FILES="$CONFIG_FILES Makefile.global" ;;
"doc/Makefile") CONFIG_FILES="$CONFIG_FILES doc/Makefile" ;;
*) as_fn_error $? "invalid argument: \`$ac_config_target'" "$LINENO" 5;;
esac

@@ -1,6 +1,6 @@
AC_INIT([repmgr], [5.4.0], [repmgr@googlegroups.com], [repmgr], [https://repmgr.org/])
AC_INIT([repmgr], [4.2], [pgsql-bugs@postgresql.org], [repmgr], [https://2ndquadrant.com/en/resources/repmgr/])
AC_COPYRIGHT([Copyright (c) 2010-2021, EnterpriseDB Corporation])
AC_COPYRIGHT([Copyright (c) 2010-2018, 2ndQuadrant Ltd.])
AC_CONFIG_HEADER(config.h)
@@ -19,11 +19,11 @@ fi
pgac_pg_config_version=$($PG_CONFIG --version 2>/dev/null)
major_version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^[[^0-9]]\+ \([[0-9]]\{1,2\}\).*$/\1/')
$SED -e 's/^PostgreSQL \([[0-9]]\{1,2\}\).*$/\1/')
if test "$major_version_num" -lt '10'; then
version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^[[^0-9]]\+ \([[0-9]]*\)\.\([[0-9]]*\)\([[a-zA-Z0-9.]]*\)$/\1.\2/')
$SED -e 's/^PostgreSQL \([[0-9]]*\)\.\([[0-9]]*\)\([[a-zA-Z0-9.]]*\)$/\1.\2/')
if test -z "$version_num"; then
AC_MSG_ERROR([could not detect the PostgreSQL version, wrong or broken pg_config?])
@@ -32,12 +32,12 @@ if test "$major_version_num" -lt '10'; then
version_num_int=$(echo "$version_num"|
$SED -e 's/^\([[0-9]]*\)\.\([[0-9]]*\)$/\1\2/')
if test "$version_num_int" -lt '94'; then
if test "$version_num_int" -lt '93'; then
AC_MSG_ERROR([repmgr is not compatible with detected PostgreSQL version: $version_num])
fi
else
version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^[[^0-9]]\+ \(.\+\)$/\1/')
$SED -e 's/^PostgreSQL \(.\+\)$/\1/')
if test -z "$version_num"; then
AC_MSG_ERROR([could not detect the PostgreSQL version, wrong or broken pg_config?])
@@ -57,43 +57,8 @@ else
fi
AC_SUBST(vpath_build)
AC_CHECK_PROG(HAVE_GNUSED,gnused,yes,no)
AC_CHECK_PROG(HAVE_GSED,gsed,yes,no)
AC_CHECK_PROG(HAVE_SED,sed,yes,no)
AC_CHECK_PROG(HAVE_FLEX,flex,yes,no)
if test "$HAVE_GNUSED" = yes; then
SED=gnused
else
if test "$HAVE_GSED" = yes; then
SED=gsed
else
SED=sed
fi
fi
AC_SUBST(SED)
AS_IF([test x"$HAVE_FLEX" != x"yes"], AC_MSG_ERROR([flex should be installed first]))
#Checking libraries
GENERIC_LIB_FAILED_MSG="library should be installed"
AC_CHECK_LIB(selinux, is_selinux_enabled, [],
[AC_MSG_ERROR(['selinux' $GENERIC_LIB_FAILED_MSG])])
AC_CHECK_LIB(lz4, LZ4_compress_default, [],
[AC_MSG_ERROR(['Z4' $GENERIC_LIB_FAILED_MSG])])
AC_CHECK_LIB(xslt, xsltCleanupGlobals, [],
[AC_MSG_ERROR(['xslt' $GENERIC_LIB_FAILED_MSG])])
AC_CHECK_LIB(pam, pam_start, [],
[AC_MSG_ERROR(['pam' $GENERIC_LIB_FAILED_MSG])])
AC_CHECK_LIB(gssapi_krb5, gss_init_sec_context, [],
[AC_MSG_ERROR([gssapi_krb5 $GENERIC_LIB_FAILED_MSG])])
AC_CONFIG_FILES([Makefile])
AC_CONFIG_FILES([Makefile.global])
AC_CONFIG_FILES([doc/Makefile])
AC_OUTPUT

@@ -73,16 +73,7 @@ while(<$fh>) {
if ($param eq 'data_directory') {
$data_directory_found = 1;
}
# Don't quote numbers
if ($value =~ /^\d+$/) {
push @outp, sprintf(q|%s=%s|, $param, $value);
}
# Quote everything else
else {
$value =~ s/'/''/g;
push @outp, sprintf(q|%s='%s'|, $param, $value);
}
push @outp, $line;
}
}
@@ -92,6 +83,6 @@ print join("\n", @outp);
print "\n";
if ($data_directory_found == 0) {
print "data_directory=''\n";
print "data_directory=\n";
}
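
Note on the quoting branch in the hunk above: values made up only of digits are emitted bare, and every other value is wrapped in single quotes with embedded single quotes doubled. A minimal C sketch of the same rule follows, for readers comparing it with the C side of the code base. The function name format_setting and its return convention are hypothetical and are not part of repmgr.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if s is non-empty and consists only of decimal digits. */
static int
is_all_digits(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; s++)
    {
        if (!isdigit((unsigned char) *s))
            return 0;
    }
    return 1;
}

/*
 * Hypothetical helper: write "param=value" for numeric values and
 * "param='value'" (single quotes doubled) for everything else.
 * Returns 0 on success, -1 if dest is too small.
 */
int
format_setting(const char *param, const char *value, char *dest, size_t destlen)
{
    size_t used;
    int    n;

    if (is_all_digits(value))
    {
        n = snprintf(dest, destlen, "%s=%s", param, value);
        return (n < 0 || (size_t) n >= destlen) ? -1 : 0;
    }

    n = snprintf(dest, destlen, "%s='", param);
    if (n < 0 || (size_t) n >= destlen)
        return -1;
    used = (size_t) n;

    for (; *value != '\0'; value++)
    {
        /* Worst case: two bytes for this character, closing quote, NUL. */
        if (used + 4 > destlen)
            return -1;
        if (*value == '\'')
            dest[used++] = '\'';
        dest[used++] = *value;
    }

    /* Room for the closing quote and the terminating NUL. */
    if (used + 2 > destlen)
        return -1;
    dest[used++] = '\'';
    dest[used] = '\0';
    return 0;
}
```

An empty value takes the quoted branch and yields param='', which matches the data_directory='' fallback printed at the end of the script on one side of this diff.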

@@ -1,12 +1,6 @@
/*
* controldata.c - functions for reading the pg_control file
*
* The functions provided here enable repmgr to read a pg_control file
* in a version-independent way, even if the PostgreSQL instance is not
* running. For that reason we can't use on the pg_control_*() functions
* provided in PostgreSQL 9.6 and later.
*
* Copyright (c) EnterpriseDB Corporation, 2010-2021
* controldata.c
* Copyright (c) 2ndQuadrant, 2010-2018
*
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -36,53 +30,6 @@
static ControlFileInfo *get_controlfile(const char *DataDir);
int
get_pg_version(const char *data_directory, char *version_string)
{
char PgVersionPath[MAXPGPATH] = "";
FILE *fp = NULL;
char *endptr = NULL;
char file_version_string[MAX_VERSION_STRING] = "";
long file_major, file_minor;
int ret;
snprintf(PgVersionPath, MAXPGPATH, "%s/PG_VERSION", data_directory);
fp = fopen(PgVersionPath, "r");
if (fp == NULL)
{
log_warning(_("could not open file \"%s\" for reading"),
PgVersionPath);
log_detail("%s", strerror(errno));
return UNKNOWN_SERVER_VERSION_NUM;
}
file_version_string[0] = '\0';
ret = fscanf(fp, "%23s", file_version_string);
fclose(fp);
if (ret != 1 || endptr == file_version_string)
{
log_warning(_("unable to determine major version number from PG_VERSION"));
return UNKNOWN_SERVER_VERSION_NUM;
}
file_major = strtol(file_version_string, &endptr, 10);
file_minor = 0;
if (*endptr == '.')
file_minor = strtol(endptr + 1, NULL, 10);
if (version_string != NULL)
strncpy(version_string, file_version_string, MAX_VERSION_STRING);
return ((int) file_major * 10000) + ((int) file_minor * 100);
}
uint64
get_system_identifier(const char *data_directory)
{
@@ -90,35 +37,30 @@ get_system_identifier(const char *data_directory)
uint64 system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
control_file_info = get_controlfile(data_directory);
if (control_file_info->control_file_processed == true)
system_identifier = control_file_info->system_identifier;
system_identifier = control_file_info->system_identifier;
pfree(control_file_info);
return system_identifier;
}
bool
get_db_state(const char *data_directory, DBState *state)
DBState
get_db_state(const char *data_directory)
{
ControlFileInfo *control_file_info = NULL;
bool control_file_processed;
DBState state;
control_file_info = get_controlfile(data_directory);
control_file_processed = control_file_info->control_file_processed;
if (control_file_processed == true)
*state = control_file_info->state;
state = control_file_info->state;
pfree(control_file_info);
return control_file_processed;
return state;
}
XLogRecPtr
extern XLogRecPtr
get_latest_checkpoint_location(const char *data_directory)
{
ControlFileInfo *control_file_info = NULL;
@@ -126,8 +68,7 @@ get_latest_checkpoint_location(const char *data_directory)
control_file_info = get_controlfile(data_directory);
if (control_file_info->control_file_processed == true)
checkPoint = control_file_info->checkPoint;
checkPoint = control_file_info->checkPoint;
pfree(control_file_info);
@@ -139,12 +80,11 @@ int
get_data_checksum_version(const char *data_directory)
{
ControlFileInfo *control_file_info = NULL;
int data_checksum_version = UNKNOWN_DATA_CHECKSUM_VERSION;
int data_checksum_version = -1;
control_file_info = get_controlfile(data_directory);
if (control_file_info->control_file_processed == true)
data_checksum_version = (int) control_file_info->data_checksum_version;
data_checksum_version = (int) control_file_info->data_checksum_version;
pfree(control_file_info);
@@ -172,59 +112,10 @@ describe_db_state(DBState state)
case DB_IN_PRODUCTION:
return _("in production");
}
return _("unrecognized status code");
}
TimeLineID
get_timeline(const char *data_directory)
{
ControlFileInfo *control_file_info = NULL;
TimeLineID timeline = -1;
control_file_info = get_controlfile(data_directory);
timeline = (int) control_file_info->timeline;
pfree(control_file_info);
return timeline;
}
TimeLineID
get_min_recovery_end_timeline(const char *data_directory)
{
ControlFileInfo *control_file_info = NULL;
TimeLineID timeline = -1;
control_file_info = get_controlfile(data_directory);
timeline = (int) control_file_info->minRecoveryPointTLI;
pfree(control_file_info);
return timeline;
}
XLogRecPtr
get_min_recovery_location(const char *data_directory)
{
ControlFileInfo *control_file_info = NULL;
XLogRecPtr minRecoveryPoint = InvalidXLogRecPtr;
control_file_info = get_controlfile(data_directory);
minRecoveryPoint = control_file_info->minRecoveryPoint;
pfree(control_file_info);
return minRecoveryPoint;
}
/*
* We maintain our own version of get_controlfile() as we need cross-version
* compatibility, and also don't care if the file isn't readable.
@@ -232,10 +123,14 @@ get_min_recovery_location(const char *data_directory)
static ControlFileInfo *
get_controlfile(const char *DataDir)
{
char file_version_string[MAX_VERSION_STRING] = "";
ControlFileInfo *control_file_info;
int fd, version_num;
FILE *fp = NULL;
int fd, ret, version_num;
char PgVersionPath[MAXPGPATH] = "";
char ControlFilePath[MAXPGPATH] = "";
char file_version_string[64] = "";
long file_major, file_minor;
char *endptr = NULL;
void *ControlFileDataPtr = NULL;
int expected_size = 0;
@@ -247,32 +142,50 @@ get_controlfile(const char *DataDir)
control_file_info->state = DB_SHUTDOWNED;
control_file_info->checkPoint = InvalidXLogRecPtr;
control_file_info->data_checksum_version = -1;
control_file_info->timeline = -1;
control_file_info->minRecoveryPointTLI = -1;
control_file_info->minRecoveryPoint = InvalidXLogRecPtr;
/*
* Read PG_VERSION, as we'll need to determine which struct to read
* the control file contents into
*/
snprintf(PgVersionPath, MAXPGPATH, "%s/PG_VERSION", DataDir);
version_num = get_pg_version(DataDir, file_version_string);
fp = fopen(PgVersionPath, "r");
if (version_num == UNKNOWN_SERVER_VERSION_NUM)
if (fp == NULL)
{
log_warning(_("unable to determine server version number from PG_VERSION"));
log_warning(_("could not open file \"%s\" for reading"),
PgVersionPath);
log_detail("%s", strerror(errno));
return control_file_info;
}
if (version_num < MIN_SUPPORTED_VERSION_NUM)
file_version_string[0] = '\0';
ret = fscanf(fp, "%63s", file_version_string);
fclose(fp);
if (ret != 1 || endptr == file_version_string)
{
log_warning(_("data directory appears to be initialised for %s"),
file_version_string);
log_detail(_("minimum supported PostgreSQL version is %s"),
MIN_SUPPORTED_VERSION);
log_warning(_("unable to determine major version number from PG_VERSION"));
return control_file_info;
}
file_major = strtol(file_version_string, &endptr, 10);
file_minor = 0;
if (*endptr == '.')
file_minor = strtol(endptr + 1, NULL, 10);
version_num = ((int) file_major * 10000) + ((int) file_minor * 100);
if (version_num < 90300)
{
log_warning(_("Data directory appears to be initialised for %s"), file_version_string);
return control_file_info;
}
snprintf(ControlFilePath, MAXPGPATH, "%s/global/pg_control", DataDir);
if ((fd = open(ControlFilePath, O_RDONLY | PG_BINARY, 0)) == -1)
@@ -283,19 +196,8 @@ get_controlfile(const char *DataDir)
return control_file_info;
}
if (version_num >= 120000)
{
#if PG_ACTUAL_VERSION_NUM >= 120000
expected_size = sizeof(ControlFileData12);
ControlFileDataPtr = palloc0(expected_size);
#endif
}
else if (version_num >= 110000)
{
expected_size = sizeof(ControlFileData11);
ControlFileDataPtr = palloc0(expected_size);
}
else if (version_num >= 90500)
if (version_num >= 90500)
{
expected_size = sizeof(ControlFileData95);
ControlFileDataPtr = palloc0(expected_size);
@@ -305,6 +207,12 @@ get_controlfile(const char *DataDir)
expected_size = sizeof(ControlFileData94);
ControlFileDataPtr = palloc0(expected_size);
}
else if (version_num >= 90300)
{
expected_size = sizeof(ControlFileData93);
ControlFileDataPtr = palloc0(expected_size);
}
if (read(fd, ControlFileDataPtr, expected_size) != expected_size)
{
@@ -312,8 +220,6 @@ get_controlfile(const char *DataDir)
ControlFilePath);
log_detail("%s", strerror(errno));
close(fd);
return control_file_info;
}
@@ -321,32 +227,13 @@ get_controlfile(const char *DataDir)
control_file_info->control_file_processed = true;
if (version_num >= 120000)
{
#if PG_ACTUAL_VERSION_NUM >= 120000
ControlFileData12 *ptr = (struct ControlFileData12 *)ControlFileDataPtr;
control_file_info->system_identifier = ptr->system_identifier;
control_file_info->state = ptr->state;
control_file_info->checkPoint = ptr->checkPoint;
control_file_info->data_checksum_version = ptr->data_checksum_version;
control_file_info->timeline = ptr->checkPointCopy.ThisTimeLineID;
control_file_info->minRecoveryPointTLI = ptr->minRecoveryPointTLI;
control_file_info->minRecoveryPoint = ptr->minRecoveryPoint;
#else
fprintf(stderr, "ERROR: please use a repmgr version built for PostgreSQL 12 or later\n");
exit(ERR_BAD_CONFIG);
#endif
}
else if (version_num >= 110000)
if (version_num >= 110000)
{
ControlFileData11 *ptr = (struct ControlFileData11 *)ControlFileDataPtr;
control_file_info->system_identifier = ptr->system_identifier;
control_file_info->state = ptr->state;
control_file_info->checkPoint = ptr->checkPoint;
control_file_info->data_checksum_version = ptr->data_checksum_version;
control_file_info->timeline = ptr->checkPointCopy.ThisTimeLineID;
control_file_info->minRecoveryPointTLI = ptr->minRecoveryPointTLI;
control_file_info->minRecoveryPoint = ptr->minRecoveryPoint;
}
else if (version_num >= 90500)
{
@@ -355,9 +242,6 @@ get_controlfile(const char *DataDir)
control_file_info->state = ptr->state;
control_file_info->checkPoint = ptr->checkPoint;
control_file_info->data_checksum_version = ptr->data_checksum_version;
control_file_info->timeline = ptr->checkPointCopy.ThisTimeLineID;
control_file_info->minRecoveryPointTLI = ptr->minRecoveryPointTLI;
control_file_info->minRecoveryPoint = ptr->minRecoveryPoint;
}
else if (version_num >= 90400)
{
@@ -366,9 +250,14 @@ get_controlfile(const char *DataDir)
control_file_info->state = ptr->state;
control_file_info->checkPoint = ptr->checkPoint;
control_file_info->data_checksum_version = ptr->data_checksum_version;
control_file_info->timeline = ptr->checkPointCopy.ThisTimeLineID;
control_file_info->minRecoveryPointTLI = ptr->minRecoveryPointTLI;
control_file_info->minRecoveryPoint = ptr->minRecoveryPoint;
}
else if (version_num >= 90300)
{
ControlFileData93 *ptr = (struct ControlFileData93 *)ControlFileDataPtr;
control_file_info->system_identifier = ptr->system_identifier;
control_file_info->state = ptr->state;
control_file_info->checkPoint = ptr->checkPoint;
control_file_info->data_checksum_version = ptr->data_checksum_version;
}
pfree(ControlFileDataPtr);
@@ -376,7 +265,9 @@ get_controlfile(const char *DataDir)
/*
* We don't check the CRC here as we're potentially checking a pg_control
* file from a different PostgreSQL version to the one repmgr was compiled
* against.
* against. However we're only interested in the first few fields, which
* should be constant across supported versions
*
*/
return control_file_info;
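
Note on the PG_VERSION handling in this file: both sides of the diff turn the version string into an integer as major * 10000 + minor * 100, so "9.3" becomes 90300 and "11" becomes 110000, which is what the version_num comparisons rely on. A minimal sketch of that arithmetic follows. The helper name version_num_from_string is hypothetical, and -1 stands in for UNKNOWN_SERVER_VERSION_NUM, whose value is not shown here.

```c
#include <stdlib.h>

/*
 * Hypothetical helper, not part of repmgr: convert the text stored in
 * PG_VERSION ("9.6", "11", ...) into the integer form compared in the diff.
 * Returns -1 if the string does not begin with a number.
 */
int
version_num_from_string(const char *version_string)
{
    char *endptr = NULL;
    long  major;
    long  minor = 0;

    major = strtol(version_string, &endptr, 10);

    /* strtol() leaves endptr at the start when no digits were consumed. */
    if (endptr == version_string)
        return -1;

    /* A minor component is present only for pre-10 style versions. */
    if (*endptr == '.')
        minor = strtol(endptr + 1, NULL, 10);

    return ((int) major * 10000) + ((int) minor * 100);
}
```

The sketch compares endptr after calling strtol(). In the diff, the endptr == file_version_string test sits before the strtol() call, where endptr is still NULL, so only the fscanf() return value can trigger that warning.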

@@ -1,6 +1,6 @@
/*
* controldata.h
* Copyright (c) EnterpriseDB Corporation, 2010-2021
* Copyright (c) 2ndQuadrant, 2010-2018
*
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -12,8 +12,6 @@
#include "postgres_fe.h"
#include "catalog/pg_control.h"
#define MAX_VERSION_STRING 24
/*
* A simplified representation of pg_control containing only those fields
* required by repmgr.
@@ -25,13 +23,12 @@ typedef struct
DBState state;
XLogRecPtr checkPoint;
uint32 data_checksum_version;
TimeLineID timeline;
TimeLineID minRecoveryPointTLI;
XLogRecPtr minRecoveryPoint;
} ControlFileInfo;
typedef struct CheckPoint94
/* Same for 9.3, 9.4 */
typedef struct CheckPoint93
{
XLogRecPtr redo; /* next RecPtr available when we began to
* create CheckPoint (i.e. REDO start point) */
@@ -51,10 +48,10 @@ typedef struct CheckPoint94
pg_time_t time; /* time stamp of checkpoint */
TransactionId oldestActiveXid;
} CheckPoint94;
} CheckPoint93;
/* Same for 9.5, 9.6, 10, 11 */
/* Same for 9.5, 9.6, 10, HEAD */
typedef struct CheckPoint95
{
XLogRecPtr redo; /* next RecPtr available when we began to
@@ -82,51 +79,68 @@ typedef struct CheckPoint95
} CheckPoint95;
#if PG_ACTUAL_VERSION_NUM >= 120000
/*
* Following fields removed in PostgreSQL 12;
*
* uint32 nextXidEpoch;
* TransactionId nextXid;
*
* and replaced by:
*
* FullTransactionId nextFullXid;
*/
typedef struct CheckPoint12
typedef struct ControlFileData93
{
XLogRecPtr redo; /* next RecPtr available when we began to
* create CheckPoint (i.e. REDO start point) */
TimeLineID ThisTimeLineID; /* current TLI */
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextFullXid; /* next free full transaction ID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
Oid oldestXidDB; /* database with minimum datfrozenxid */
MultiXactId oldestMulti; /* cluster-wide minimum datminmxid */
Oid oldestMultiDB; /* database with minimum datminmxid */
pg_time_t time; /* time stamp of checkpoint */
TransactionId oldestCommitTsXid; /* oldest Xid with valid commit
* timestamp */
TransactionId newestCommitTsXid; /* newest Xid with valid commit
* timestamp */
uint64 system_identifier;
/*
* Oldest XID still running. This is only needed to initialize hot standby
* mode from an online checkpoint, so we only bother calculating this for
* online checkpoints and only when wal_level is replica. Otherwise it's
* set to InvalidTransactionId.
*/
TransactionId oldestActiveXid;
} CheckPoint12;
#endif
uint32 pg_control_version; /* PG_CONTROL_VERSION */
uint32 catalog_version_no; /* see catversion.h */
DBState state; /* see enum above */
pg_time_t time; /* time stamp of last pg_control update */
XLogRecPtr checkPoint; /* last check point record ptr */
XLogRecPtr prevCheckPoint; /* previous check point record ptr */
CheckPoint93 checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
XLogRecPtr minRecoveryPoint;
TimeLineID minRecoveryPointTLI;
XLogRecPtr backupStartPoint;
XLogRecPtr backupEndPoint;
bool backupEndRequired;
int wal_level;
int MaxConnections;
int max_prepared_xacts;
int max_locks_per_xact;
uint32 maxAlign; /* alignment requirement for tuples */
double floatFormat; /* constant 1234567.0 */
uint32 blcksz; /* data block size for this DB */
uint32 relseg_size; /* blocks per segment of large relation */
uint32 xlog_blcksz; /* block size within WAL files */
uint32 xlog_seg_size; /* size of each WAL segment */
uint32 nameDataLen; /* catalog name field width */
uint32 indexMaxKeys; /* max number of columns in an index */
uint32 toast_max_chunk_size; /* chunk size in TOAST tables */
/* flag indicating internal format of timestamp, interval, time */
bool enableIntTimes; /* int64 storage enabled? */
/* flags indicating pass-by-value status of various types */
bool float4ByVal; /* float4 pass-by-value? */
bool float8ByVal; /* float8, int8, etc pass-by-value? */
/* Are data pages protected by checksums? Zero if no checksum version */
uint32 data_checksum_version;
} ControlFileData93;
/*
* Following fields added since 9.3:
*
* int max_worker_processes;
* int max_prepared_xacts;
* int max_locks_per_xact;
*
*/
typedef struct ControlFileData94
{
uint64 system_identifier;
@@ -139,7 +153,7 @@ typedef struct ControlFileData94
XLogRecPtr checkPoint; /* last check point record ptr */
XLogRecPtr prevCheckPoint; /* previous check point record ptr */
CheckPoint94 checkPointCopy; /* copy of last check point record */
CheckPoint93 checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
@@ -317,73 +331,11 @@ typedef struct ControlFileData11
} ControlFileData11;
#if PG_ACTUAL_VERSION_NUM >= 120000
/*
* Following field added in Pg12:
*
* int max_wal_senders;
*/
typedef struct ControlFileData12
{
uint64 system_identifier;
uint32 pg_control_version; /* PG_CONTROL_VERSION */
uint32 catalog_version_no; /* see catversion.h */
DBState state; /* see enum above */
pg_time_t time; /* time stamp of last pg_control update */
XLogRecPtr checkPoint; /* last check point record ptr */
CheckPoint12 checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
XLogRecPtr minRecoveryPoint;
TimeLineID minRecoveryPointTLI;
XLogRecPtr backupStartPoint;
XLogRecPtr backupEndPoint;
bool backupEndRequired;
int wal_level;
bool wal_log_hints;
int MaxConnections;
int max_worker_processes;
int max_wal_senders;
int max_prepared_xacts;
int max_locks_per_xact;
bool track_commit_timestamp;
uint32 maxAlign; /* alignment requirement for tuples */
double floatFormat; /* constant 1234567.0 */
uint32 blcksz; /* data block size for this DB */
uint32 relseg_size; /* blocks per segment of large relation */
uint32 xlog_blcksz; /* block size within WAL files */
uint32 xlog_seg_size; /* size of each WAL segment */
uint32 nameDataLen; /* catalog name field width */
uint32 indexMaxKeys; /* max number of columns in an index */
uint32 toast_max_chunk_size; /* chunk size in TOAST tables */
uint32 loblksize; /* chunk size in pg_largeobject */
bool float4ByVal; /* float4 pass-by-value? */
bool float8ByVal; /* float8, int8, etc pass-by-value? */
uint32 data_checksum_version;
} ControlFileData12;
#endif
extern int get_pg_version(const char *data_directory, char *version_string);
extern bool get_db_state(const char *data_directory, DBState *state);
extern DBState get_db_state(const char *data_directory);
extern const char *describe_db_state(DBState state);
extern int get_data_checksum_version(const char *data_directory);
extern uint64 get_system_identifier(const char *data_directory);
extern XLogRecPtr get_latest_checkpoint_location(const char *data_directory);
extern TimeLineID get_timeline(const char *data_directory);
extern TimeLineID get_min_recovery_end_timeline(const char *data_directory);
extern XLogRecPtr get_min_recovery_location(const char *data_directory);
#endif /* _CONTROLDATA_H_ */
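
Reviewer note on the `CheckPoint12` comment above: it says the separate `nextXidEpoch` and `nextXid` fields were replaced by a single `nextFullXid`. The following is a minimal sketch of how the two 32-bit values relate to one 64-bit value, assuming the epoch occupies the upper 32 bits. The helper names are hypothetical and are not part of repmgr or PostgreSQL.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not repmgr API): combine a 32-bit epoch and a
 * 32-bit transaction ID into one 64-bit value, and split it again.
 * Assumption: the epoch is stored in the upper 32 bits.
 */
static uint64_t
full_xid_from_epoch_and_xid(uint32_t epoch, uint32_t xid)
{
    return ((uint64_t) epoch << 32) | xid;
}

static uint32_t
epoch_from_full_xid(uint64_t full_xid)
{
    return (uint32_t) (full_xid >> 32);
}

static uint32_t
xid_from_full_xid(uint64_t full_xid)
{
    /* truncation keeps the lower 32 bits */
    return (uint32_t) full_xid;
}
```

Code reading a pre-12 control file could use the first helper to present the old field pair in the newer combined form.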

3066
dbutils.c

File diff suppressed because it is too large

283
dbutils.h

@@ -1,7 +1,7 @@
/*
* dbutils.h
*
* Copyright (c) EnterpriseDB Corporation, 2010-2021
* Copyright (c) 2ndQuadrant, 2010-2018
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -20,7 +20,6 @@
#ifndef _REPMGR_DBUTILS_H_
#define _REPMGR_DBUTILS_H_
#include "access/timeline.h"
#include "access/xlogdefs.h"
#include "pqexpbuffer.h"
#include "portability/instr_time.h"
@@ -29,35 +28,9 @@
#include "strutil.h"
#include "voting.h"
#define REPMGR_NODES_COLUMNS \
"n.node_id, " \
"n.type, " \
"n.upstream_node_id, " \
"n.node_name, " \
"n.conninfo, " \
"n.repluser, " \
"n.slot_name, " \
"n.location, " \
"n.priority, " \
"n.active, " \
"n.config_file, " \
"'' AS upstream_node_name, " \
"NULL AS attached "
#define REPMGR_NODES_COLUMNS_WITH_UPSTREAM \
"n.node_id, " \
"n.type, " \
"n.upstream_node_id, " \
"n.node_name, " \
"n.conninfo, " \
"n.repluser, " \
"n.slot_name, " \
"n.location, " \
"n.priority, " \
"n.active, "\
"n.config_file, " \
"un.node_name AS upstream_node_name, " \
"NULL AS attached "
#define REPMGR_NODES_COLUMNS "n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo, n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file, '' AS upstream_node_name "
#define BDR2_NODES_COLUMNS "node_sysid, node_timeline, node_dboid, node_name, node_local_dsn, ''"
#define BDR3_NODES_COLUMNS "ns.node_id, 0, 0, ns.node_name, ns.interface_connstr, ns.peer_state_name"
#define ERRBUFF_SIZE 512
@@ -67,13 +40,13 @@ typedef enum
UNKNOWN = 0,
PRIMARY,
STANDBY,
WITNESS
WITNESS,
BDR
} t_server_type;
typedef enum
{
REPMGR_INSTALLED = 0,
REPMGR_OLD_VERSION_INSTALLED,
REPMGR_AVAILABLE,
REPMGR_UNAVAILABLE,
REPMGR_UNKNOWN
@@ -105,8 +78,7 @@ typedef enum
NODE_STATUS_UP,
NODE_STATUS_SHUTTING_DOWN,
NODE_STATUS_DOWN,
NODE_STATUS_UNCLEAN_SHUTDOWN,
NODE_STATUS_REJECTED
NODE_STATUS_UNCLEAN_SHUTDOWN
} NodeStatus;
typedef enum
@@ -117,23 +89,9 @@ typedef enum
CONN_ERROR
} ConnectionStatus;
typedef enum
{
/* unable to query "pg_stat_replication" or other error */
NODE_ATTACHED_UNKNOWN = -1,
/* node has record in "pg_stat_replication" and state is "streaming" */
NODE_ATTACHED,
/* node has record in "pg_stat_replication" but state is not "streaming" */
NODE_NOT_ATTACHED,
/* node has no record in "pg_stat_replication" */
NODE_DETACHED
} NodeAttached;
typedef enum
{
SLOT_UNKNOWN = -1,
SLOT_NOT_FOUND,
SLOT_NOT_PHYSICAL,
SLOT_INACTIVE,
SLOT_ACTIVE
} ReplSlotStatus;
@@ -146,48 +104,8 @@ typedef enum
} BackupState;
/*
* Struct to store extension version information
*/
typedef struct s_extension_versions {
char default_version[8];
int default_version_num;
char installed_version[8];
int installed_version_num;
} t_extension_versions;
#define T_EXTENSION_VERSIONS_INITIALIZER { \
"", \
UNKNOWN_SERVER_VERSION_NUM, \
"", \
UNKNOWN_SERVER_VERSION_NUM \
}
typedef struct
{
char current_timestamp[MAXLEN];
bool in_recovery;
TimeLineID timeline_id;
char timeline_id_str[MAXLEN];
XLogRecPtr last_wal_receive_lsn;
XLogRecPtr last_wal_replay_lsn;
char last_xact_replay_timestamp[MAXLEN];
int replication_lag_time;
bool receiving_streamed_wal;
bool wal_replay_paused;
int upstream_last_seen;
int upstream_node_id;
} ReplInfo;
/*
* Struct to store node information.
*
* The first section represents the contents of the "repmgr.nodes"
* table; subsequent sections contain information collated in
* various contexts.
* Struct to store node information
*/
typedef struct s_node_info
{
@@ -195,8 +113,8 @@ typedef struct s_node_info
int node_id;
int upstream_node_id;
t_server_type type;
char node_name[NAMEDATALEN];
char upstream_node_name[NAMEDATALEN];
char node_name[MAXLEN];
char upstream_node_name[MAXLEN];
char conninfo[MAXLEN];
char repluser[NAMEDATALEN];
char location[MAXLEN];
@@ -213,7 +131,7 @@ typedef struct s_node_info
/* for ad-hoc use e.g. when working with a list of nodes */
char details[MAXLEN];
bool reachable;
NodeAttached attached;
bool attached;
/* various statistics */
int max_wal_senders;
int attached_wal_receivers;
@@ -221,8 +139,6 @@ typedef struct s_node_info
int total_replication_slots;
int active_replication_slots;
int inactive_replication_slots;
/* replication info */
ReplInfo *replication_info;
} t_node_info;
@@ -247,10 +163,9 @@ typedef struct s_node_info
MS_NORMAL, \
NULL, \
/* for ad-hoc use e.g. when working with a list of nodes */ \
"", true, true, \
"", true, true \
/* various statistics */ \
-1, -1, -1, -1, -1, -1, \
NULL \
-1, -1, -1, -1, -1, -1 \
}
@@ -326,6 +241,62 @@ typedef struct s_connection_user
#define T_CONNECTION_USER_INITIALIZER { "", false }
/* represents an entry in bdr.bdr_nodes */
typedef struct s_bdr_node_info
{
char node_sysid[MAXLEN];
uint32 node_timeline;
uint32 node_dboid;
char node_name[MAXLEN];
char node_local_dsn[MAXLEN];
char peer_state_name[MAXLEN];
} t_bdr_node_info;
#define T_BDR_NODE_INFO_INITIALIZER { \
"", InvalidOid, InvalidOid, \
"", "", "" \
}
/* structs to store a list of BDR node records */
typedef struct BdrNodeInfoListCell
{
struct BdrNodeInfoListCell *next;
t_bdr_node_info *node_info;
} BdrNodeInfoListCell;
typedef struct BdrNodeInfoList
{
BdrNodeInfoListCell *head;
BdrNodeInfoListCell *tail;
int node_count;
} BdrNodeInfoList;
#define T_BDR_NODE_INFO_LIST_INITIALIZER { \
NULL, \
NULL, \
0 \
}
typedef struct
{
char current_timestamp[MAXLEN];
uint64 last_wal_receive_lsn;
uint64 last_wal_replay_lsn;
char last_xact_replay_timestamp[MAXLEN];
int replication_lag_time;
bool receiving_streamed_wal;
} ReplInfo;
#define T_REPLINFO_INTIALIZER { \
"", \
InvalidXLogRecPtr, \
InvalidXLogRecPtr, \
"", \
0 \
}
typedef struct
{
char filepath[MAXPGPATH];
@@ -335,7 +306,6 @@ typedef struct
#define T_CONFIGFILE_INFO_INITIALIZER { "", "", false }
typedef struct
{
int size;
@@ -345,7 +315,6 @@ typedef struct
#define T_CONFIGFILE_LIST_INITIALIZER { 0, 0, NULL }
typedef struct
{
uint64 system_identifier;
@@ -367,16 +336,16 @@ typedef struct RepmgrdInfo {
char pid_file[MAXLEN];
bool pg_running;
char pg_running_text[MAXLEN];
RecoveryType recovery_type;
bool running;
char repmgrd_running[MAXLEN];
bool paused;
bool wal_paused_pending_wal;
int upstream_last_seen;
char upstream_last_seen_text[MAXLEN];
} RepmgrdInfo;
/* global variables */
extern int server_version_num;
/* macros */
#define is_streaming_replication(x) (x == PRIMARY || x == STANDBY)
@@ -385,27 +354,26 @@ typedef struct RepmgrdInfo {
/* utility functions */
XLogRecPtr parse_lsn(const char *str);
extern void
wrap_ddl_query(PQExpBufferData *query_buf, int replication_type, const char *fmt,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
bool atobool(const char *value);
/* connection functions */
PGconn *establish_db_connection(const char *conninfo,
PGconn *establish_db_connection(const char *conninfo,
const bool exit_on_error);
PGconn *establish_db_connection_quiet(const char *conninfo);
PGconn *establish_db_connection_by_params(t_conninfo_param_list *param_list,
const bool exit_on_error);
PGconn *establish_db_connection_with_replacement_param(const char *conninfo,
const char *param,
const char *value,
const bool exit_on_error);
PGconn *establish_replication_connection_from_conn(PGconn *conn, const char *repluser);
PGconn *establish_replication_connection_from_conninfo(const char *conninfo, const char *repluser);
PGconn *establish_primary_db_connection(PGconn *conn,
PGconn *establish_db_connection_by_params(t_conninfo_param_list *param_list,
const bool exit_on_error);
PGconn *establish_primary_db_connection(PGconn *conn,
const bool exit_on_error);
PGconn *get_primary_connection(PGconn *standby_conn, int *primary_id, char *primary_conninfo_out);
PGconn *get_primary_connection_quiet(PGconn *standby_conn, int *primary_id, char *primary_conninfo_out);
PGconn *duplicate_connection(PGconn *conn, const char *user, bool replication);
bool is_superuser_connection(PGconn *conn, t_connection_user *userinfo);
void close_connection(PGconn **conn);
/* conninfo manipulation functions */
@@ -418,10 +386,8 @@ void conn_to_param_list(PGconn *conn, t_conninfo_param_list *param_list);
void param_set(t_conninfo_param_list *param_list, const char *param, const char *value);
void param_set_ine(t_conninfo_param_list *param_list, const char *param, const char *value);
char *param_get(t_conninfo_param_list *param_list, const char *param);
bool validate_conninfo_string(const char *conninfo_str, char **errmsg);
bool parse_conninfo_string(const char *conninfo_str, t_conninfo_param_list *param_list, char **errmsg, bool ignore_local_params);
char *param_list_to_string(t_conninfo_param_list *param_list);
char *normalize_conninfo_string(const char *conninfo_str);
bool has_passfile(void);
@@ -429,65 +395,44 @@ bool has_passfile(void);
bool begin_transaction(PGconn *conn);
bool commit_transaction(PGconn *conn);
bool rollback_transaction(PGconn *conn);
bool check_cluster_schema(PGconn *conn);
/* GUC manipulation functions */
bool set_config(PGconn *conn, const char *config_param, const char *config_value);
bool set_config_bool(PGconn *conn, const char *config_param, bool state);
int guc_set(PGconn *conn, const char *parameter, const char *op, const char *value);
int guc_set_typed(PGconn *conn, const char *parameter, const char *op, const char *value, const char *datatype);
bool get_pg_setting(PGconn *conn, const char *setting, char *output);
bool get_pg_setting_bool(PGconn *conn, const char *setting, bool *output);
bool get_pg_setting_int(PGconn *conn, const char *setting, int *output);
bool alter_system_int(PGconn *conn, const char *name, int value);
bool pg_reload_conf(PGconn *conn);
/* server information functions */
bool get_cluster_size(PGconn *conn, char *size);
int get_server_version(PGconn *conn, char *server_version_buf);
int get_server_version(PGconn *conn, char *server_version);
RecoveryType get_recovery_type(PGconn *conn);
int get_primary_node_id(PGconn *conn);
int get_ready_archive_files(PGconn *conn, const char *data_directory);
bool identify_system(PGconn *repl_conn, t_system_identification *identification);
uint64 system_identifier(PGconn *conn);
TimeLineHistoryEntry *get_timeline_history(PGconn *repl_conn, TimeLineID tli);
pid_t get_wal_receiver_pid(PGconn *conn);
/* user/role information functions */
bool can_execute_pg_promote(PGconn *conn);
bool can_disable_walsender(PGconn *conn);
bool connection_has_pg_monitor_role(PGconn *conn, const char *subrole);
bool is_replication_role(PGconn *conn, char *rolname);
bool is_superuser_connection(PGconn *conn, t_connection_user *userinfo);
/* repmgrd shared memory functions */
bool repmgrd_set_local_node_id(PGconn *conn, int local_node_id);
int repmgrd_get_local_node_id(PGconn *conn);
bool repmgrd_check_local_node_id(PGconn *conn);
BackupState server_in_exclusive_backup_mode(PGconn *conn);
void repmgrd_set_pid(PGconn *conn, pid_t repmgrd_pid, const char *pidfile);
pid_t repmgrd_get_pid(PGconn *conn);
bool repmgrd_is_running(PGconn *conn);
bool repmgrd_is_paused(PGconn *conn);
bool repmgrd_pause(PGconn *conn, bool pause);
int repmgrd_get_upstream_node_id(PGconn *conn);
bool repmgrd_set_upstream_node_id(PGconn *conn, int node_id);
/* extension functions */
ExtensionStatus get_repmgr_extension_status(PGconn *conn, t_extension_versions *extversions);
ExtensionStatus get_repmgr_extension_status(PGconn *conn);
/* node management functions */
void checkpoint(PGconn *conn);
bool vacuum_table(PGconn *conn, const char *table);
bool promote_standby(PGconn *conn, bool wait, int wait_seconds);
bool resume_wal_replay(PGconn *conn);
/* node record functions */
t_server_type parse_node_type(const char *type);
const char *get_node_type_string(t_server_type type);
RecordStatus get_node_record(PGconn *conn, int node_id, t_node_info *node_info);
RecordStatus refresh_node_record(PGconn *conn, int node_id, t_node_info *node_info);
RecordStatus get_node_record_with_upstream(PGconn *conn, int node_id, t_node_info *node_info);
RecordStatus get_node_record_by_name(PGconn *conn, const char *node_name, t_node_info *node_info);
@@ -497,10 +442,8 @@ bool get_local_node_record(PGconn *conn, int node_id, t_node_info *node_info);
bool get_primary_node_record(PGconn *conn, t_node_info *node_info);
bool get_all_node_records(PGconn *conn, NodeInfoList *node_list);
bool get_all_nodes_count(PGconn *conn, int *count);
void get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *nodes);
void get_active_sibling_node_records(PGconn *conn, int node_id, int upstream_node_id, NodeInfoList *node_list);
bool get_child_nodes(PGconn *conn, int node_id, NodeInfoList *node_list);
void get_node_records_by_priority(PGconn *conn, NodeInfoList *node_list);
bool get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list);
bool get_downstream_nodes_with_missing_slot(PGconn *conn, int this_node_id, NodeInfoList *noede_list);
@@ -536,14 +479,10 @@ PGresult *get_event_records(PGconn *conn, int node_id, const char *node_name,
/* replication slot functions */
void create_slot_name(char *slot_name, int node_id);
bool create_replication_slot_sql(PGconn *conn, char *slot_name, PQExpBufferData *error_msg);
bool create_replication_slot_replprot(PGconn *conn, PGconn *repl_conn, char *slot_name, PQExpBufferData *error_msg);
bool drop_replication_slot_sql(PGconn *conn, char *slot_name);
bool drop_replication_slot_replprot(PGconn *repl_conn, char *slot_name);
bool create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg);
bool drop_replication_slot(PGconn *conn, char *slot_name);
RecordStatus get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record);
int get_free_replication_slot_count(PGconn *conn, int *max_replication_slots);
int get_free_replication_slot_count(PGconn *conn);
int get_inactive_replication_slots(PGconn *conn, KeyValueList *list);
/* tablespace functions */
@@ -551,14 +490,12 @@ bool get_tablespace_name_by_location(PGconn *conn, const char *location, char *
/* asynchronous query functions */
bool cancel_query(PGconn *conn, int timeout);
int wait_connection_availability(PGconn *conn, int timeout);
int wait_connection_availability(PGconn *conn, long long timeout);
/* node availability functions */
bool is_server_available(const char *conninfo);
bool is_server_available_quiet(const char *conninfo);
bool is_server_available_params(t_conninfo_param_list *param_list);
ExecStatusType connection_ping(PGconn *conn);
ExecStatusType connection_ping_reconnect(PGconn *conn);
/* monitoring functions */
void
@@ -589,23 +526,33 @@ bool get_new_primary(PGconn *conn, int *primary_node_id);
void reset_voting_status(PGconn *conn);
/* replication status functions */
XLogRecPtr get_primary_current_lsn(PGconn *conn);
XLogRecPtr get_node_current_lsn(PGconn *conn);
XLogRecPtr get_current_wal_lsn(PGconn *conn);
XLogRecPtr get_last_wal_receive_location(PGconn *conn);
void init_replication_info(ReplInfo *replication_info);
bool get_replication_info(PGconn *conn, t_server_type node_type, ReplInfo *replication_info);
bool get_replication_info(PGconn *conn, ReplInfo *replication_info);
int get_replication_lag_seconds(PGconn *conn);
TimeLineID get_node_timeline(PGconn *conn, char *timeline_id_str);
void get_node_replication_stats(PGconn *conn, t_node_info *node_info);
NodeAttached is_downstream_node_attached(PGconn *conn, char *node_name, char **node_state);
NodeAttached is_downstream_node_attached_quiet(PGconn *conn, char *node_name, char **node_state);
void set_upstream_last_seen(PGconn *conn, int upstream_node_id);
int get_upstream_last_seen(PGconn *conn, t_server_type node_type);
void get_node_replication_stats(PGconn *conn, int server_version_num, t_node_info *node_info);
bool is_downstream_node_attached(PGconn *conn, char *node_name);
bool is_wal_replay_paused(PGconn *conn, bool check_pending_wal);
/* BDR functions */
int get_bdr_version_num(void);
void get_all_bdr_node_records(PGconn *conn, BdrNodeInfoList *node_list);
RecordStatus get_bdr_node_record_by_name(PGconn *conn, const char *node_name, t_bdr_node_info *node_info);
bool is_bdr_db(PGconn *conn, PQExpBufferData *output);
bool is_bdr_db_quiet(PGconn *conn);
bool is_active_bdr_node(PGconn *conn, const char *node_name);
bool is_bdr_repmgr(PGconn *conn);
char *get_default_bdr_replication_set(PGconn *conn);
bool is_table_in_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
bool add_table_to_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
void add_extension_tables_to_bdr_replication_set(PGconn *conn);
bool bdr_node_name_matches(PGconn *conn, const char *node_name, PQExpBufferData *bdr_local_node_name);
ReplSlotStatus get_bdr_node_replication_slot_status(PGconn *conn, const char *node_name);
void get_bdr_other_node_name(PGconn *conn, int node_id, char *name_buf);
/* repmgrd status functions */
CheckStatus get_repmgrd_status(PGconn *conn);
bool am_bdr_failover_handler(PGconn *conn, int node_id);
void unset_bdr_failover_handler(PGconn *conn);
bool bdr_node_has_repmgr_set(PGconn *conn, const char *node_name);
bool bdr_node_set_repmgr_set(PGconn *conn, const char *node_name);
/* miscellaneous debugging functions */
const char *print_node_status(NodeStatus node_status);
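
Reviewer note on the `parse_lsn()` prototype above: the header only declares the function, and the implementation in dbutils.c is suppressed in this view. The following is a minimal sketch of one way such a parser could work. It assumes the usual textual LSN form of two hexadecimal numbers separated by a slash, for example `0/3000060`. The function name and the zero "invalid" return value are assumptions for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: 0 stands in for an invalid or unparseable LSN */
#define INVALID_LSN_SKETCH ((uint64_t) 0)

/*
 * Hypothetical stand-in for parse_lsn(): convert "hi/lo" (both hex)
 * into a single 64-bit position, with "hi" in the upper 32 bits.
 */
static uint64_t
parse_lsn_sketch(const char *str)
{
    unsigned int hi = 0;
    unsigned int lo = 0;

    /* both halves must be present, otherwise treat the input as invalid */
    if (str == NULL || sscanf(str, "%X/%X", &hi, &lo) != 2)
        return INVALID_LSN_SKETCH;

    return ((uint64_t) hi << 32) | lo;
}
```

Two positions parsed this way can be compared with ordinary integer comparison, which is what makes a 64-bit representation convenient for replication lag checks.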

125
dirutil.c

@@ -3,7 +3,7 @@
* dirmod.c
* directory handling functions
*
* Copyright (c) EnterpriseDB Corporation, 2010-2021
* Copyright (c) 2ndQuadrant, 2010-2018
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -50,7 +50,7 @@ typedef long pgpid_t;
* and tablespace directories.
*/
DataDirState
check_dir(const char *path)
check_dir(char *path)
{
DIR *chkdir = NULL;
struct dirent *file = NULL;
@@ -91,17 +91,12 @@ check_dir(const char *path)
* Create directory with error log message when failing
*/
bool
create_dir(const char *path)
create_dir(char *path)
{
char create_dir_path[MAXPGPATH];
/* mkdir_p() may modify the supplied path */
strncpy(create_dir_path, path, MAXPGPATH);
if (mkdir_p(create_dir_path, 0700) == 0)
if (mkdir_p(path, 0700) == 0)
return true;
log_error(_("unable to create directory \"%s\""), create_dir_path);
log_error(_("unable to create directory \"%s\""), path);
log_detail("%s", strerror(errno));
return false;
@@ -109,59 +104,13 @@ create_dir(const char *path)
bool
set_dir_permissions(const char *path, int server_version_num)
set_dir_permissions(char *path)
{
struct stat stat_buf;
bool no_group_access =
(server_version_num != UNKNOWN_SERVER_VERSION_NUM) &&
(server_version_num < 110000);
/*
* At this point the path should exist, so this check is very
* much just-in-case.
*/
if (stat(path, &stat_buf) != 0)
{
if (errno == ENOENT)
{
log_warning(_("directory \"%s\" does not exist"), path);
}
else
{
log_warning(_("could not read permissions of directory \"%s\""),
path);
log_detail("%s", strerror(errno));
}
return false;
}
/*
* If mode is not 0700 or 0750, attempt to change.
*/
if ((no_group_access == true && (stat_buf.st_mode & (S_IRWXG | S_IRWXO)))
|| (no_group_access == false && (stat_buf.st_mode & (S_IWGRP | S_IRWXO))))
{
/*
* Currently we default to 0700.
* There is no facility to override this directly,
* but the user can manually create the directory with
* the desired permissions.
*/
if (chmod(path, 0700) != 0) {
log_error(_("unable to change permissions of directory \"%s\""), path);
log_detail("%s", strerror(errno));
return false;
}
return true;
}
/* Leave as-is */
return true;
return (chmod(path, 0700) != 0) ? false : true;
}
/* function from initdb.c */
/* source adapted from FreeBSD /src/bin/mkdir/mkdir.c */
@@ -205,7 +154,7 @@ mkdir_p(char *path, mode_t omode)
/*
* POSIX 1003.2: For each dir operand that does not name an
* existing directory, effects equivalent to those caused by the
* following command shall occur:
* following command shall occcur:
*
* mkdir -p -m $(umask -S),u+wx $(dirname dir) && mkdir [-m mode]
* dir
@@ -249,9 +198,9 @@ mkdir_p(char *path, mode_t omode)
bool
is_pg_dir(const char *path)
is_pg_dir(char *path)
{
char dirpath[MAXPGPATH] = "";
char dirpath[MAXPGPATH];
struct stat sb;
/* test pgdata */
@@ -274,7 +223,7 @@ is_pg_dir(const char *path)
* any further useful progress can be made.
*/
PgDirState
is_pg_running(const char *path)
is_pg_running(char *path)
{
long pid;
FILE *pidf;
@@ -289,7 +238,7 @@ is_pg_running(const char *path)
{
/*
* No PID file - PostgreSQL shouldn't be running. From 9.3 (the
* earliest version we care about) removal of the PID file will
* earliesty version we care about) removal of the PID file will
* cause the postmaster to shut down, so it's highly unlikely
* that PostgreSQL will still be running.
*/
@@ -323,8 +272,6 @@ is_pg_running(const char *path)
log_warning(_("invalid data in PostgreSQL PID file \"%s\""), path);
}
fclose(pidf);
return PG_DIR_NOT_RUNNING;
}
@@ -344,13 +291,13 @@ is_pg_running(const char *path)
bool
create_pg_dir(const char *path, bool force)
create_pg_dir(char *path, bool force)
{
/* Check this directory can be used as a PGDATA dir */
switch (check_dir(path))
{
case DIR_NOENT:
/* Directory does not exist, attempt to create it. */
/* directory does not exist, attempt to create it */
log_info(_("creating directory \"%s\"..."), path);
if (!create_dir(path))
@@ -361,23 +308,14 @@ create_pg_dir(const char *path, bool force)
}
break;
case DIR_EMPTY:
/*
* Directory exists but empty, fix permissions and use it.
*
* Note that at this point the caller might not know the server
* version number, so in this case "set_dir_permissions()" will
* accept 0750 as a valid setting. As this is invalid in Pg10 and
* earlier, the caller should call "set_dir_permissions()" again
* when it has the number.
*
* We need to do the permissions check here in any case to catch
* fatal permissions early.
*/
/* exists but empty, fix permissions and use it */
log_info(_("checking and correcting permissions on existing directory \"%s\""),
path);
if (!set_dir_permissions(path, UNKNOWN_SERVER_VERSION_NUM))
if (!set_dir_permissions(path))
{
log_error(_("unable to change permissions of directory \"%s\""), path);
log_detail("%s", strerror(errno));
return false;
}
break;
@@ -392,15 +330,6 @@ create_pg_dir(const char *path, bool force)
{
log_notice(_("-F/--force provided - deleting existing data directory \"%s\""), path);
nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
/* recreate the directory ourselves to ensure permissions are correct */
if (!create_dir(path))
{
log_error(_("unable to create directory \"%s\"..."),
path);
return false;
}
return true;
}
@@ -412,24 +341,14 @@ create_pg_dir(const char *path, bool force)
{
log_notice(_("deleting existing directory \"%s\""), path);
nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
/* recreate the directory ourselves to ensure permissions are correct */
if (!create_dir(path))
{
log_error(_("unable to create directory \"%s\"..."),
path);
return false;
}
return true;
}
return false;
}
break;
case DIR_ERROR:
log_error(_("could not access directory \"%s\"")
, path);
log_detail("%s", strerror(errno));
log_error(_("could not access directory \"%s\": %s"),
path, strerror(errno));
return false;
}
@@ -439,7 +358,7 @@ create_pg_dir(const char *path, bool force)
int
rmdir_recursive(const char *path)
rmdir_recursive(char *path)
{
return nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
}

dirutil.h

@@ -1,6 +1,6 @@
/*
* dirutil.h
* Copyright (c) EnterpriseDB Corporation, 2010-2021
* Copyright (c) 2ndQuadrant, 2010-2018
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -35,13 +35,13 @@ typedef enum
} PgDirState;
extern int mkdir_p(char *path, mode_t omode);
extern bool set_dir_permissions(const char *path, int server_version_num);
extern bool set_dir_permissions(char *path);
extern DataDirState check_dir(const char *path);
extern bool create_dir(const char *path);
extern bool is_pg_dir(const char *path);
extern PgDirState is_pg_running(const char *path);
extern bool create_pg_dir(const char *path, bool force);
extern int rmdir_recursive(const char *path);
extern DataDirState check_dir(char *path);
extern bool create_dir(char *path);
extern bool is_pg_dir(char *path);
extern PgDirState is_pg_running(char *path);
extern bool create_pg_dir(char *path, bool force);
extern int rmdir_recursive(char *path);
#endif

8
doc/.gitignore vendored

@@ -1,9 +1,7 @@
HTML.index
bookindex.xml
bookindex.sgml
html-stamp
html/
nochunks.dsl
repmgr.html
version.xml
*.fo
*.pdf
*.sgml
version.sgml


@@ -1,104 +0,0 @@
# Make "html" the default target, since that is what most people tend
# to want to use.
html:
all: html
subdir = doc
repmgr_top_builddir = ..
include $(repmgr_top_builddir)/Makefile.global
XMLINCLUDE = --path .
ifndef XMLLINT
XMLLINT = $(missing) xmllint
endif
ifndef XSLTPROC
XSLTPROC = $(missing) xsltproc
endif
ifndef FOP
FOP = $(missing) fop
endif
override XSLTPROCFLAGS += --stringparam repmgr.version '$(REPMGR_VERSION)'
GENERATED_XML = version.xml
ALLXML := $(wildcard $(srcdir)/*.xml) $(GENERATED_XML)
version.xml: $(repmgr_top_builddir)/repmgr_version.h
{ \
echo "<!ENTITY repmgrversion \"$(REPMGR_VERSION)\">"; \
echo "<!ENTITY releasedate \"$(REPMGR_RELEASE_DATE)\">"; \
} > $@
##
## HTML
##
html: html-stamp
html-stamp: stylesheet.xsl repmgr.xml $(ALLXML)
$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) $(wordlist 1,2,$^)
cp $(srcdir)/stylesheet.css $(srcdir)/website-docs.css html/
touch $@
# single-page HTML
repmgr.html: stylesheet-html-nochunk.xsl repmgr.xml $(ALLXML)
$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) -o $@ $(wordlist 1,2,$^)
zip: html
cp -r html repmgr-docs-$(REPMGR_VERSION)
zip -r repmgr-docs-$(REPMGR_VERSION).zip repmgr-docs-$(REPMGR_VERSION)
rm -rf repmgr-docs-$(REPMGR_VERSION)
##
## Print
##
repmgr.pdf:
$(error Invalid target; use repmgr-A4.pdf or repmgr-US.pdf as targets)
# Standard paper size
repmgr-A4.fo: stylesheet-fo.xsl repmgr.xml $(ALLXML)
$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --stringparam paper.type A4 -o $@ $(wordlist 1,2,$^)
repmgr-A4.pdf: repmgr-A4.fo
$(FOP) -fo $< -pdf $@
# North American paper size
repmgr-US.fo: stylesheet-fo.xsl repmgr.xml $(ALLXML)
$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --stringparam paper.type USletter -o $@ $(wordlist 1,2,$^)
repmgr-US.pdf: repmgr-US.fo
$(FOP) -fo $< -pdf $@
install: html
@$(MKDIR_P) $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
@$(INSTALL_DATA) $(wildcard html/*.html) $(wildcard html/*.css) $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
@echo Installed docs to $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
clean:
rm -f html-stamp
rm -f HTML.index $(GENERATED_XML)
rm -f repmgr.html
rm -f repmgr-A4.pdf
rm -f repmgr-US.pdf
rm -f *.fo
rm -f html/*
maintainer-clean:
rm -rf html
.PHONY: html

76
doc/Makefile.in Normal file
View File

@@ -0,0 +1,76 @@
repmgr_subdir = doc
repmgr_top_builddir = ..
include $(repmgr_top_builddir)/Makefile.global
ifndef JADE
JADE = $(missing) jade
endif
SGMLINCLUDE = -D . -D ${srcdir}
SPFLAGS += -wall -wno-unused-param -wno-empty -wfully-tagged
JADE.html.call = $(JADE) $(JADEFLAGS) $(SPFLAGS) $(SGMLINCLUDE) $(CATALOG) -t sgml -i output-html
ALLSGML := $(wildcard $(srcdir)/*.sgml)
# to build bookindex
ALMOSTALLSGML := $(filter-out %bookindex.sgml,$(ALLSGML))
GENERATED_SGML = version.sgml bookindex.sgml
Makefile: Makefile.in
cd $(repmgr_top_builddir) && ./config.status doc/Makefile
all: html
html: html-stamp
html-stamp: repmgr.sgml $(ALLSGML) $(GENERATED_SGML) stylesheet.dsl website-docs.css
$(MKDIR_P) html
$(JADE.html.call) -d stylesheet.dsl -i include-index $<
cp $(srcdir)/stylesheet.css $(srcdir)/website-docs.css html/
touch $@
repmgr.html: repmgr.sgml $(ALLSGML) $(GENERATED_SGML) stylesheet.dsl website-docs.css
sed '/html-index-filename/a\
(define nochunks #t)' <stylesheet.dsl >nochunks.dsl
$(JADE.html.call) -d nochunks.dsl -i include-index $< >repmgr.html
version.sgml: ${repmgr_top_builddir}/repmgr_version.h
{ \
echo "<!ENTITY repmgrversion \"$(REPMGR_VERSION)\">"; \
} > $@
HTML.index: repmgr.sgml $(ALMOSTALLSGML) stylesheet.dsl
@$(MKDIR_P) html
$(JADE.html.call) -d stylesheet.dsl -V html-index $<
website-docs.css:
@$(MKDIR_P) html
curl http://www.postgresql.org/media/css/docs.css > ${srcdir}/website-docs.css
bookindex.sgml: HTML.index
ifdef COLLATEINDEX
LC_ALL=C $(PERL) $(COLLATEINDEX) -f -g -i 'bookindex' -o $@ $<
else
@$(missing) collateindex.pl $< $@
endif
clean:
rm -f html-stamp
rm -f HTML.index $(GENERATED_SGML)
maintainer-clean:
rm -rf html
rm -rf Makefile
zip: html
cp -r html repmgr-docs-$(REPMGR_VERSION)
zip -r repmgr-docs-$(REPMGR_VERSION).zip repmgr-docs-$(REPMGR_VERSION)
rm -rf repmgr-docs-$(REPMGR_VERSION)
install: html
@$(MKDIR_P) $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
@$(INSTALL_DATA) $(wildcard html/*.html) $(wildcard html/*.css) $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
@echo Installed docs to $(DESTDIR)$(docdir)/$(docmoduledir)/repmgr
.PHONY: html all

357
doc/appendix-faq.sgml Normal file

@@ -0,0 +1,357 @@
<appendix id="appendix-faq" xreflabel="FAQ">
<indexterm>
<primary>FAQ (Frequently Asked Questions)</primary>
</indexterm>
<title>FAQ (Frequently Asked Questions)</title>
<sect1 id="faq-general" xreflabel="General">
<title>General</title>
<sect2 id="faq-xrepmgr-version-diff" xreflabel="Version differences">
<title>What's the difference between the repmgr versions?</title>
<para>
&repmgr; 4 is a complete rewrite of the previous &repmgr; code base
and implements &repmgr; as a PostgreSQL extension. It
supports all PostgreSQL versions from 9.3 (although some &repmgr;
features are not available for PostgreSQL 9.3 and 9.4).
</para>
<para>
&repmgr; 3.x builds on the improved replication facilities added
in PostgreSQL 9.3, as well as improved automated failover support
via <application>repmgrd</application>, and is not compatible with PostgreSQL 9.2
and earlier. We recommend upgrading to &repmgr; 4, as the &repmgr; 3.x
series will no longer be actively maintained.
</para>
<para>
&repmgr; 2.x supports PostgreSQL 9.0 ~ 9.3. While it is compatible
with PostgreSQL 9.3, we recommend using repmgr 4.x. &repmgr; 2.x is
no longer maintained.
</para>
</sect2>
<sect2 id="faq-replication-slots-advantage" xreflabel="Advantages of replication slots">
<title>What's the advantage of using replication slots?</title>
<para>
Replication slots, introduced in PostgreSQL 9.4, ensure that the
primary server will retain WAL files until they have been consumed
by all standby servers. This makes WAL file management much easier,
and if they are used, &repmgr; will no longer insist on a fixed minimum number
(default: 5000) of WAL files being retained.
</para>
<para>
However this does mean that if a standby is no longer connected to the
primary, the presence of the replication slot will cause WAL files
to be retained indefinitely.
</para>
</sect2>
<sect2 id="faq-replication-slots-number" xreflabel="Number of replication slots">
<title>How many replication slots should I define in <varname>max_replication_slots</varname>?</title>
<para>
Normally at least the same number as the number of standbys which will connect
to the node. Note that changes to <varname>max_replication_slots</varname> require a server
restart to take effect, and as there is no particular penalty for unused
replication slots, setting a higher figure will make adding new nodes
easier.
</para>
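<para>
As an illustrative sketch only, a node expected to serve two standbys could be configured in
<filename>postgresql.conf</filename> with some spare slots for future nodes; the figure
<literal>5</literal> is an assumption chosen for the example, not a recommendation:
</para>
<programlisting>
# postgresql.conf on the node the standbys connect to (a change requires a server restart)
# 5 is an illustrative value: two standbys plus headroom for new nodes
max_replication_slots = 5</programlisting>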
</sect2>
<sect2 id="faq-hash-index" xreflabel="Hash indexes">
<title>Does &repmgr; support hash indexes?</title>
<para>
Before PostgreSQL 10, hash indexes were not WAL logged and are therefore not suitable
for use in streaming replication in PostgreSQL 9.6 and earlier. See the
<ulink url="https://www.postgresql.org/docs/9.6/static/sql-createindex.html#AEN80279">PostgreSQL documentation</ulink>
for details.
</para>
<para>
From PostgreSQL 10, this restriction has been lifted and hash indexes can be used
in a streaming replication cluster.
</para>
</sect2>
<sect2 id="faq-upgrades" xreflabel="Upgrading PostgreSQL with repmgr">
<title>Can &repmgr; assist with upgrading a PostgreSQL cluster?</title>
<para>
For <emphasis>minor</emphasis> version upgrades, e.g. from 9.6.7 to 9.6.8, a common
approach is to upgrade a standby to the latest version, perform a
<link linkend="performing-switchover">switchover</link> promoting it to a primary,
then upgrade the former primary.
</para>
<para>
For <emphasis>major</emphasis> version upgrades (e.g. from PostgreSQL 9.6 to PostgreSQL 10),
the traditional approach is to "reseed" a cluster by upgrading a single
node with <ulink url="https://www.postgresql.org/docs/current/static/pgupgrade.html">pg_upgrade</ulink>
and recloning standbys from this.
</para>
<para>
To minimize downtime during major upgrades, for more recent PostgreSQL
versions (PostgreSQL 9.4 and later),
<ulink url="https://www.2ndquadrant.com/en/resources/pglogical/">pglogical</ulink>
can be used to set up a parallel cluster using the newer PostgreSQL version,
which can be kept in sync with the existing production cluster until the
new cluster is ready to be put into production.
</para>
</sect2>
<sect2 id="faq-libdir-repmgr-error">
<title>What does this error mean: <literal>ERROR: could not access file "$libdir/repmgr"</literal>?</title>
<para>
It means the &repmgr; extension code is not installed in the
PostgreSQL application directory. This typically happens when using PostgreSQL
packages provided by a third-party vendor, which often have different
filesystem layouts.
</para>
<para>
Use PostgreSQL packages provided by the community or 2ndQuadrant; if this
is not possible, contact your vendor for assistance.
</para>
</sect2>
<sect2 id="faq-old-packages">
<title>How can I obtain old versions of &repmgr; packages?</title>
<para>
See appendix <xref linkend="packages-old-versions"> for details.
</para>
</sect2>
</sect1>
<sect1 id="faq-repmgr" xreflabel="repmgr">
<title><command>repmgr</command></title>
<sect2 id="faq-register-existing-node" xreflabel="registering an existing node">
<title>Can I register an existing PostgreSQL server with repmgr?</title>
<para>
Yes, any existing PostgreSQL server which is part of the same replication
cluster can be registered with &repmgr;. There's no requirement for a
standby to have been cloned using &repmgr;.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-other-source" >
<title>Can I use a standby not cloned by &repmgr; as a &repmgr; node?</title>
<para>
For a standby which has been manually cloned or recovered from an external
backup manager such as Barman, the command
<command><link linkend="repmgr-standby-clone">repmgr standby clone --recovery-conf-only</link></command>
can be used to create the correct <filename>recovery.conf</filename> file for
use with &repmgr; (and will create a replication slot if required). Once this has been done,
<link linkend="repmgr-standby-register">register the node</link> as usual.
</para>
</sect2>
<sect2 id="faq-repmgr-recovery-conf" >
<title>What does &repmgr; write in <filename>recovery.conf</filename>, and what options can be set there?</title>
<para>
See section <link linkend="repmgr-standby-clone-recovery-conf">Customising recovery.conf</link>.
</para>
</sect2>
<sect2 id="faq-repmgr-failed-primary-standby" xreflabel="Reintegrate a failed primary as a standby">
<title>How can a failed primary be re-added as a standby?</title>
<para>
This is a two-stage process. First, the failed primary's data directory
must be re-synced with the current primary; secondly the failed primary
needs to be re-registered as a standby.
</para>
<para>
It's possible to use <command>pg_rewind</command> to re-synchronise the existing data
directory, which will usually be much
faster than re-cloning the server. However <command>pg_rewind</command> can only
be used if PostgreSQL either has <varname>wal_log_hints</varname> enabled, or
data checksums were enabled when the cluster was initialized.
</para>
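<para>
A minimal sketch of the first prerequisite, assuming data checksums were not enabled when the
cluster was initialized. The setting needs to be in place on any node which may later have to be
rewound, and changing it requires a server restart:
</para>
<programlisting>
# postgresql.conf
wal_log_hints = on</programlisting>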
<para>
Note that <command>pg_rewind</command> is available as part of the core PostgreSQL
distribution from PostgreSQL 9.5, and as a third-party utility for PostgreSQL 9.3 and 9.4.
</para>
<para>
&repmgr; provides the command <command>repmgr node rejoin</command> which can
optionally execute <command>pg_rewind</command>; see the <xref linkend="repmgr-node-rejoin">
documentation for details, in particular the section <xref linkend="repmgr-node-rejoin-pg-rewind">.
</para>
<para>
If <command>pg_rewind</command> cannot be used, then the data directory will need
to be re-cloned from scratch.
</para>
</sect2>
<sect2 id="faq-repmgr-check-configuration" xreflabel="Check PostgreSQL configuration">
<title>Is there an easy way to check my primary server is correctly configured for use with &repmgr;?</title>
<para>
Execute <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>
with the <literal>--dry-run</literal> option; this will report any configuration problems
which need to be rectified.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-skip-config-files" xreflabel="">
<title>When cloning a standby, how can I get &repmgr; to copy
<filename>postgresql.conf</filename> and <filename>pg_hba.conf</filename> from the PostgreSQL configuration
directory in <filename>/etc</filename>?</title>
<para>
Use the command line option <literal>--copy-external-config-files</literal>. For more details
see <xref linkend="repmgr-standby-clone-config-file-copying">.
</para>
</sect2>
<sect2 id="faq-repmgr-shared-preload-libaries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
<title>Do I need to include <literal>shared_preload_libraries = 'repmgr'</literal>
in <filename>postgresql.conf</filename> if I'm not using <application>repmgrd</application>?</title>
<para>
No, the <literal>repmgr</literal> shared library is only needed when running <application>repmgrd</application>.
If you later decide to run <application>repmgrd</application>, you just need to add
<literal>shared_preload_libraries = 'repmgr'</literal> and restart PostgreSQL.
</para>
</sect2>
<sect2 id="faq-repmgr-permissions" xreflabel="Replication permission problems">
<title>I've provided replication permission for the <literal>repmgr</literal> user in <filename>pg_hba.conf</filename>
but <command>repmgr</command>/<application>repmgrd</application> complains it can't connect to the server... Why?</title>
<para>
<command>repmgr</command> and <application>repmgrd</application> need to be able to connect to the repmgr database
with a normal connection to query metadata. The <literal>replication</literal> connection
permission is for PostgreSQL's streaming replication (and doesn't necessarily need to be the <literal>repmgr</literal> user).
</para>
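<para>
To illustrate the distinction, <filename>pg_hba.conf</filename> would typically need both kinds
of entry. The following is a sketch only: the subnet <literal>192.168.1.0/24</literal>, the
<literal>md5</literal> authentication method and the database name <literal>repmgr</literal>
are assumptions which must be adapted to the local environment:
</para>
<programlisting>
# TYPE  DATABASE     USER    ADDRESS          METHOD
# streaming replication connections
host    replication  repmgr  192.168.1.0/24   md5
# normal connections to the repmgr database (metadata queries)
host    repmgr       repmgr  192.168.1.0/24   md5</programlisting>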
</sect2>
<sect2 id="faq-repmgr-clone-provide-primary-conninfo" xreflabel="Providing primary connection parameters">
<title>When cloning a standby, why do I need to provide the connection parameters
for the primary server on the command line, not in the configuration file?</title>
<para>
Cloning a standby is a one-time action; the role of the server being cloned
from could change, so fixing it in the configuration file would create
confusion. If &repmgr; needs to establish a connection to the primary
server, it can retrieve this from the <literal>repmgr.nodes</literal> table on the local
node, and if necessary scan the replication cluster until it locates the active primary.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-waldir-xlogdir" xreflabel="Providing a custom WAL directory">
<title>When cloning a standby, how do I ensure the WAL files are placed in a custom directory?</title>
<para>
Provide the option <literal>--waldir</literal> (<literal>--xlogdir</literal> in PostgreSQL 9.6
and earlier) with the absolute path to the WAL directory in <varname>pg_basebackup_options</varname>.
For more details see <xref linkend="cloning-advanced-pg-basebackup-options">.
</para>
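<para>
For example, on PostgreSQL 10 or later the relevant line in <filename>repmgr.conf</filename>
might look like the following sketch, where the path <filename>/var/lib/pgsql/wal</filename>
is a hypothetical location chosen for illustration:
</para>
<programlisting>
# repmgr.conf (use --xlogdir instead of --waldir for PostgreSQL 9.6 and earlier)
pg_basebackup_options='--waldir=/var/lib/pgsql/wal'</programlisting>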
</sect2>
<sect2 id="faq-repmgr-events-no-fkey" xreflabel="No foreign key on node_id in repmgr.events">
<title>Why is there no foreign key on the <literal>node_id</literal> column in the <literal>repmgr.events</literal>
table?</title>
<para>
Under some circumstances event notifications can be generated for servers
which have not yet been registered; it's also useful to retain a record
of events which includes servers removed from the replication cluster
which no longer have an entry in the <literal>repmgr.nodes</literal> table.
</para>
</sect2>
<sect2 id="faq-repmgr-recovery-conf-quoted-values" xreflabel="Quoted values in recovery.conf">
<title>Why are some values in <filename>recovery.conf</filename> surrounded by pairs of single quotes?</title>
<para>
This is to ensure that user-supplied values which are written as parameter values in <filename>recovery.conf</filename>
are escaped correctly and do not cause errors when <filename>recovery.conf</filename> is parsed.
</para>
<para>
The escaping is performed by an internal PostgreSQL routine, which leaves strings consisting
of digits and alphabetical characters only as-is, but wraps everything else in pairs of single quotes,
even if the string does not contain any characters which need escaping.
</para>
</sect2>
</sect1>
<sect1 id="faq-repmgrd" xreflabel="repmgrd">
<title><application>repmgrd</application></title>
<sect2 id="faq-repmgrd-prevent-promotion" xreflabel="Prevent standby from being promoted to primary">
<title>How can I prevent a node from ever being promoted to primary?</title>
<para>
In <filename>repmgr.conf</filename>, set its priority to a value of <literal>0</literal>; apply the changed setting with
<command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>.
</para>
<para>
Additionally, if <varname>failover</varname> is set to <literal>manual</literal>, the node will never
be considered as a promotion candidate.
</para>
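<para>
A sketch of the relevant <filename>repmgr.conf</filename> lines on the node which should not be
promoted; the commented-out line shows the alternative described above:
</para>
<programlisting>
# repmgr.conf
priority=0
# alternatively, exclude the node by disabling automatic failover for it:
#failover=manual</programlisting>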
</sect2>
<sect2 id="faq-repmgrd-delayed-standby" xreflabel="Delayed standby support">
<title>Does <application>repmgrd</application> support delayed standbys?</title>
<para>
<application>repmgrd</application> can monitor delayed standbys - those set up with
<varname>recovery_min_apply_delay</varname> set to a non-zero value
in <filename>recovery.conf</filename> - but as it's not currently possible
to directly examine the value applied to the standby, <application>repmgrd</application>
may not be able to properly evaluate the node as a promotion candidate.
</para>
<para>
We recommend that delayed standbys are explicitly excluded from promotion
by setting <varname>priority</varname> to <literal>0</literal> in
<filename>repmgr.conf</filename>.
</para>
<para>
Note that after registering a delayed standby, <application>repmgrd</application> will only start
once the metadata added in the primary node has been replicated.
</para>
</sect2>
<sect2 id="faq-repmgrd-logfile-rotate" xreflabel="repmgrd logfile rotation">
<title>How can I get <application>repmgrd</application> to rotate its logfile?</title>
<para>
Configure your system's <literal>logrotate</literal> service to do this; see <xref linkend="repmgrd-log-rotation">.
</para>
</sect2>
<sect2 id="faq-repmgrd-recloned-no-start" xreflabel="repmgrd not restarting after node cloned">
<title>I've recloned a failed primary as a standby, but <application>repmgrd</application> refuses to start?</title>
<para>
Check you registered the standby after recloning. If unregistered, the standby
cannot be considered as a promotion candidate even if <varname>failover</varname> is set to
<literal>automatic</literal>, which is probably not what you want. <application>repmgrd</application> will start if
<varname>failover</varname> is set to <literal>manual</literal> so the node's replication status can still
be monitored, if desired.
</para>
</sect2>
<sect2 id="faq-repmgrd-pg-bindir" xreflabel="repmgrd does not apply pg_bindir to promote_command or follow_command">
<title>
<application>repmgrd</application> ignores pg_bindir when executing <varname>promote_command</varname> or <varname>follow_command</varname>
</title>
<para>
<varname>promote_command</varname> and <varname>follow_command</varname> can be user-defined scripts,
so &repmgr; will not apply <option>pg_bindir</option> even if executing &repmgr;. Always provide the full
path; see <xref linkend="repmgrd-automatic-failover-configuration"> for more details.
</para>
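<para>
As a sketch, assuming the &repmgr; binary is installed in <filename>/usr/bin</filename> and the
configuration file is <filename>/etc/repmgr.conf</filename> (both locations are assumptions and
vary between packages), the settings in <filename>repmgr.conf</filename> might look like this:
</para>
<programlisting>
# repmgr.conf: full paths given explicitly, as pg_bindir is not applied here
promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'</programlisting>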
</sect2>
<sect2 id="faq-repmgrd-startup-no-upstream" xreflabel="repmgrd does not start if upstream node is not running">
<title>
<application>repmgrd</application> aborts startup with the error "<literal>upstream node must be running before repmgrd can start</literal>"
</title>
<para>
<application>repmgrd</application> does this to avoid starting up on a replication cluster
which is not in a healthy state. If the upstream is unavailable, <application>repmgrd</application>
may initiate a failover immediately after starting up, which could have unintended side-effects,
particularly if <application>repmgrd</application> is not running on other nodes.
</para>
<para>
In particular, it's possible that the node's local copy of the <literal>repmgr.nodes</literal> table
is out-of-date, which may lead to incorrect failover behaviour.
</para>
<para>
The onus is therefore on the administrator to manually set the cluster to a stable, healthy state before
starting <application>repmgrd</application>.
</para>
</sect2>
</sect1>
</appendix>


@@ -1,488 +0,0 @@
<appendix id="appendix-faq" xreflabel="FAQ">
<title>FAQ (Frequently Asked Questions)</title>
<indexterm>
<primary>FAQ (Frequently Asked Questions)</primary>
</indexterm>
<sect1 id="faq-general" xreflabel="General">
<title>General</title>
<sect2 id="faq-xrepmgr-version-diff" xreflabel="Version differences">
<title>What's the difference between the repmgr versions?</title>
<para>
&repmgr; 4 is a complete rewrite of the previous &repmgr; code base
and implements &repmgr; as a PostgreSQL extension. It
supports all PostgreSQL versions from 9.3 (although some &repmgr;
features are not available for PostgreSQL 9.3 and 9.4).
</para>
<note>
<para>
&repmgr; 5 is fundamentally the same code base as &repmgr; 4, but provides
support for the revised replication configuration mechanism in PostgreSQL 12.
</para>
<para>
Support for PostgreSQL 9.3 is no longer available from &repmgr; 5.2.
</para>
</note>
<para>
&repmgr; 3.x builds on the improved replication facilities added
in PostgreSQL 9.3, as well as improved automated failover support
via &repmgrd;, and is not compatible with PostgreSQL 9.2
and earlier. We recommend upgrading to &repmgr; 4, as the &repmgr; 3.x
series is no longer maintained.
</para>
<para>
&repmgr; 2.x supports PostgreSQL 9.0 ~ 9.3. While it is compatible
with PostgreSQL 9.3, we recommend using repmgr 4.x. &repmgr; 2.x is
no longer maintained.
</para>
<para>
See also <link linkend="install-compatibility-matrix">&repmgr; compatibility matrix</link>
and <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
</para>
</sect2>
<sect2 id="faq-replication-slots-advantage" xreflabel="Advantages of replication slots">
<title>What's the advantage of using replication slots?</title>
<para>
Replication slots, introduced in PostgreSQL 9.4, ensure that the
primary server will retain WAL files until they have been consumed
by all standby servers. This means standby servers should never
fail due to not being able to retrieve required WAL files from the
primary.
</para>
<para>
However this does mean that if a standby is no longer connected to the
primary, the presence of the replication slot will cause WAL files
to be retained indefinitely, and eventually lead to disk space
exhaustion.
</para>
<tip>
<para>
Our recommended configuration is to configure
<ulink url="https://www.pgbarman.org/">Barman</ulink> as a fallback
source of WAL files, rather than maintain replication slots for
each standby. See also: <link linkend="cloning-from-barman-restore-command">Using Barman as a WAL file source</link>.
</para>
</tip>
</sect2>
<sect2 id="faq-replication-slots-number" xreflabel="Number of replication slots">
<title>How many replication slots should I define in <varname>max_replication_slots</varname>?</title>
<para>
Normally at least the same number as the number of standbys which will connect
to the node. Note that changes to <varname>max_replication_slots</varname> require a server
restart to take effect, and as there is no particular penalty for unused
replication slots, setting a higher figure will make adding new nodes
easier.
</para>
</sect2>
<sect2 id="faq-hash-index" xreflabel="Hash indexes">
<title>Does &repmgr; support hash indexes?</title>
<para>
Before PostgreSQL 10, hash indexes were not WAL logged and are therefore not suitable
for use in streaming replication in PostgreSQL 9.6 and earlier. See the
<ulink url="https://www.postgresql.org/docs/9.6/sql-createindex.html#AEN80279">PostgreSQL documentation</ulink>
for details.
</para>
<para>
From PostgreSQL 10, this restriction has been lifted and hash indexes can be used
in a streaming replication cluster.
</para>
</sect2>
<sect2 id="faq-upgrades" xreflabel="Upgrading PostgreSQL with repmgr">
<title>Can &repmgr; assist with upgrading a PostgreSQL cluster?</title>
<para>
For <emphasis>minor</emphasis> version upgrades, e.g. from 9.6.7 to 9.6.8, a common
approach is to upgrade a standby to the latest version, perform a
<link linkend="performing-switchover">switchover</link> promoting it to a primary,
then upgrade the former primary.
</para>
<para>
For <emphasis>major</emphasis> version upgrades (e.g. from PostgreSQL 9.6 to PostgreSQL 10),
the traditional approach is to "reseed" a cluster by upgrading a single
node with <ulink url="https://www.postgresql.org/docs/current/pgupgrade.html">pg_upgrade</ulink>
and recloning standbys from this.
</para>
<para>
To minimize downtime during major upgrades from PostgreSQL 9.4 and later,
<ulink url="https://www.2ndquadrant.com/en/resources/pglogical/">pglogical</ulink>
can be used to set up a parallel cluster using the newer PostgreSQL version,
which can be kept in sync with the existing production cluster until the
new cluster is ready to be put into production.
</para>
</sect2>
<sect2 id="faq-libdir-repmgr-error">
<title>What does this error mean: <literal>ERROR: could not access file "$libdir/repmgr"</literal>?</title>
<para>
It means the &repmgr; extension code is not installed in the
PostgreSQL application directory. This typically happens when using PostgreSQL
packages provided by a third-party vendor, which often have different
filesystem layouts.
</para>
<para>
Use PostgreSQL packages provided by the community or EnterpriseDB; if this
is not possible, contact your vendor for assistance.
</para>
</sect2>
<sect2 id="faq-old-packages">
<title>How can I obtain old versions of &repmgr; packages?</title>
<para>
See appendix <xref linkend="packages-old-versions"/> for details.
</para>
</sect2>
<sect2 id="faq-repmgr-required-for-replication">
<title>Is &repmgr; required for streaming replication?</title>
<para>
No.
</para>
<para>
&repmgr; (together with &repmgrd;) assists with
<emphasis>managing</emphasis> replication. It does not actually perform replication, which
is part of the core PostgreSQL functionality.
</para>
</sect2>
<sect2 id="faq-what-if-repmgr-uninstalled">
<title>Will replication stop working if &repmgr; is uninstalled?</title>
<para>
No. See preceding question.
</para>
</sect2>
<sect2 id="faq-version-mix">
<title>Does it matter if different &repmgr; versions are present in the replication cluster?</title>
<para>
Yes. If different &quot;major&quot; &repmgr; versions (e.g. 3.3.x and 4.1.x) are present,
&repmgr; (in particular &repmgrd;)
may not run, or run properly, or in the worst case (if different &repmgrd;
versions are running and there are differences in the failover implementation) break
your replication cluster.
</para>
<para>
If different &quot;minor&quot; &repmgr; versions (e.g. 4.1.1 and 4.1.6) are installed,
&repmgr; will function, but we strongly recommend always running the same version
to ensure there are no surprises, e.g. a newer version behaving slightly
differently to the older version.
</para>
<para>
See also <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
</para>
</sect2>
<sect2 id="faq-upgrade-repmgr">
<title>Should I upgrade &repmgr;?</title>
<para>
Yes.
</para>
<para>
We don't release new versions for fun, you know. Upgrading may require a little effort,
but running an older &repmgr; version with bugs which have since been fixed may end up
costing you more effort. The same applies to PostgreSQL itself.
</para>
</sect2>
<sect2 id="faq-repmgr-conf-data-directory">
<title>Why do I need to specify the data directory location in repmgr.conf?</title>
<para>
In some circumstances &repmgr; may need to access a PostgreSQL data
directory while the PostgreSQL server is not running, e.g. to confirm
it shut down cleanly during a <link linkend="performing-switchover">switchover</link>.
</para>
<para>
Additionally, this provides support when using &repmgr; on PostgreSQL 9.6 and
earlier, where the <literal>repmgr</literal> user is not a superuser; in that
case the <literal>repmgr</literal> user will not be able to access the
<literal>data_directory</literal> configuration setting, access to which is restricted
to superusers.
</para>
<para>
In PostgreSQL 10 and later, non-superusers can be added to the
<ulink url="https://www.postgresql.org/docs/current/default-roles.html">default role</ulink>
<option>pg_read_all_settings</option> (or the meta-role <option>pg_monitor</option>)
which will enable them to read this setting.
</para>
</sect2>
<sect2 id="faq-third-party-packages" xreflabel="Compatibility with third party vendor packages">
<title>Are &repmgr; packages compatible with <literal>$third_party_vendor</literal>'s packages?</title>
<para>
&repmgr; packages provided by EnterpriseDB are compatible with the community-provided PostgreSQL
packages and specified software provided by EnterpriseDB.
</para>
<para>
A number of other vendors provide their own versions of PostgreSQL packages, often with different
package naming schemes and/or file locations.
</para>
<para>
We cannot guarantee that &repmgr; packages will be compatible with these packages.
It may be possible to override package dependencies (e.g. <literal>rpm --nodeps</literal>
for CentOS-based systems or <literal>dpkg --force-depends</literal> for Debian-based systems).
</para>
</sect2>
</sect1>
<sect1 id="faq-repmgr" xreflabel="repmgr">
<title><command>repmgr</command></title>
<sect2 id="faq-register-existing-node" xreflabel="registering an existing node">
<title>Can I register an existing PostgreSQL server with repmgr?</title>
<para>
Yes, any existing PostgreSQL server which is part of the same replication
cluster can be registered with &repmgr;. There's no requirement for a
standby to have been cloned using &repmgr;.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-other-source" >
<title>Can I use a standby not cloned by &repmgr; as a &repmgr; node?</title>
<para>
For a standby which has been manually cloned or recovered from an external
backup manager such as Barman, the command
<command><link linkend="repmgr-standby-clone">repmgr standby clone --replication-conf-only</link></command>
can be used to create the correct replication configuration file for
use with &repmgr; (and will create a replication slot if required). Once this has been done,
<link linkend="repmgr-standby-register">register the node</link> as usual.
</para>
</sect2>
<sect2 id="faq-repmgr-recovery-conf" >
<title>What does &repmgr; write in the replication configuration, and what options can be set there?</title>
<para>
See section <link linkend="repmgr-standby-clone-recovery-conf">Customising replication configuration</link>.
</para>
</sect2>
<sect2 id="faq-repmgr-failed-primary-standby" xreflabel="Reintegrate a failed primary as a standby">
<title>How can a failed primary be re-added as a standby?</title>
<para>
This is a two-stage process. First, the failed primary's data directory
must be re-synced with the current primary; secondly the failed primary
needs to be re-registered as a standby.
</para>
<para>
It's possible to use <command>pg_rewind</command> to re-synchronise the existing data
directory, which will usually be much
faster than re-cloning the server. However <command>pg_rewind</command> can only
be used if PostgreSQL either has <varname>wal_log_hints</varname> enabled, or
data checksums were enabled when the cluster was initialized.
</para>
<para>
Note that <command>pg_rewind</command> is available as part of the core PostgreSQL
distribution from PostgreSQL 9.5, and as a third-party utility for PostgreSQL 9.3 and 9.4.
</para>
<para>
&repmgr; provides the command <command>repmgr node rejoin</command> which can
optionally execute <command>pg_rewind</command>; see the <xref linkend="repmgr-node-rejoin"/>
documentation for details, in particular the section <xref linkend="repmgr-node-rejoin-pg-rewind"/>.
</para>
<para>
If <command>pg_rewind</command> cannot be used, then the data directory will need
to be re-cloned from scratch.
</para>
</sect2>
<sect2 id="faq-repmgr-check-configuration" xreflabel="Check PostgreSQL configuration">
<title>Is there an easy way to check my primary server is correctly configured for use with &repmgr;?</title>
<para>
Execute <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>
with the <literal>--dry-run</literal> option; this will report any configuration problems
which need to be rectified.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-skip-config-files" xreflabel="">
<title>When cloning a standby, how can I get &repmgr; to copy
<filename>postgresql.conf</filename> and <filename>pg_hba.conf</filename> from the PostgreSQL configuration
directory in <filename>/etc</filename>?</title>
<para>
Use the command line option <literal>--copy-external-config-files</literal>. For more details
see <xref linkend="repmgr-standby-clone-config-file-copying"/>.
</para>
</sect2>
<sect2 id="faq-repmgr-shared-preload-libraries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
<title>Do I need to include <literal>shared_preload_libraries = 'repmgr'</literal>
in <filename>postgresql.conf</filename> if I'm not using &repmgrd;?</title>
<para>
No, the <literal>repmgr</literal> shared library is only needed when running &repmgrd;.
If you later decide to run &repmgrd;, you just need to add
<literal>shared_preload_libraries = 'repmgr'</literal> and restart PostgreSQL.
</para>
</sect2>
<sect2 id="faq-repmgr-permissions" xreflabel="Replication permission problems">
<title>I've provided replication permission for the <literal>repmgr</literal> user in <filename>pg_hba.conf</filename>
but <command>repmgr</command>/&repmgrd; complains it can't connect to the server... Why?</title>
<para>
<command>repmgr</command> and &repmgrd; need to be able to connect to the repmgr database
with a normal connection to query metadata. The <literal>replication</literal> connection
permission is for PostgreSQL's streaming replication (and doesn't necessarily need to be the <literal>repmgr</literal> user).
</para>
</sect2>
<sect2 id="faq-repmgr-clone-provide-primary-conninfo" xreflabel="Providing primary connection parameters">
<title>When cloning a standby, why do I need to provide the connection parameters
for the primary server on the command line, not in the configuration file?</title>
<para>
Cloning a standby is a one-time action; the role of the server being cloned
from could change, so fixing it in the configuration file would create
confusion. If &repmgr; needs to establish a connection to the primary
server, it can retrieve this from the <literal>repmgr.nodes</literal> table on the local
node, and if necessary scan the replication cluster until it locates the active primary.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-waldir-xlogdir" xreflabel="Providing a custom WAL directory">
<title>When cloning a standby, how do I ensure the WAL files are placed in a custom directory?</title>
<para>
Provide the option <literal>--waldir</literal> (<literal>--xlogdir</literal> in PostgreSQL 9.6
and earlier) with the absolute path to the WAL directory in <varname>pg_basebackup_options</varname>.
For more details see <xref linkend="cloning-advanced-pg-basebackup-options"/>.
</para>
<para>
In &repmgr; 5.2 and later, this setting will also be honoured when cloning from Barman.
</para>
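<para>
As an illustrative sketch, the corresponding <filename>repmgr.conf</filename> entry might look like
the following. The directory shown is a hypothetical path and should be replaced with the actual
WAL location; for PostgreSQL 9.6 and earlier, substitute <literal>--xlogdir</literal>:
<programlisting>
pg_basebackup_options='--waldir=/var/lib/pgsql/wal'</programlisting>
</para>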
</sect2>
<sect2 id="faq-repmgr-events-no-fkey" xreflabel="No foreign key on node_id in repmgr.events">
<title>Why is there no foreign key on the <literal>node_id</literal> column in the <literal>repmgr.events</literal>
table?</title>
<para>
Under some circumstances event notifications can be generated for servers
which have not yet been registered; it's also useful to retain a record
of events which includes servers removed from the replication cluster
which no longer have an entry in the <literal>repmgr.nodes</literal> table.
</para>
</sect2>
<sect2 id="faq-repmgr-recovery-conf-quoted-values" xreflabel="Quoted values in recovery.conf">
<title>Why are some values in <filename>recovery.conf</filename> (PostgreSQL 11 and earlier) surrounded by pairs of single quotes?</title>
<para>
This is to ensure that user-supplied values which are written as parameter values in <filename>recovery.conf</filename>
are escaped correctly and do not cause errors when the file is parsed.
</para>
<para>
The escaping is performed by an internal PostgreSQL routine, which leaves strings consisting
of digits and alphabetical characters only as-is, but wraps everything else in pairs of single quotes,
even if the string does not contain any characters which need escaping.
</para>
</sect2>
<sect2 id="faq-repmgr-exclude-metadata-from-dump" xreflabel="Excluding repmgr metadata from pg_dump output">
<title>How can I exclude &repmgr; metadata from <application>pg_dump</application> output?</title>
<para>
Beginning with &repmgr; 5.2, the metadata tables associated with the &repmgr; extension
(<literal>repmgr.nodes</literal>, <literal>repmgr.events</literal> and <literal>repmgr.monitoring_history</literal>)
have been marked as dumpable as they contain configuration and user-generated data.
</para>
<para>
To exclude these from <application>pg_dump</application> output, add the flag <option>--exclude-schema=repmgr</option>.
</para>
<para>
To exclude individual &repmgr; metadata tables from <application>pg_dump</application> output, add a flag
such as <option>--exclude-table=repmgr.monitoring_history</option>. This flag can be provided multiple times
to exclude several tables.
</para>
</sect2>
</sect1>
<sect1 id="faq-repmgrd" xreflabel="repmgrd">
<title>&repmgrd;</title>
<sect2 id="faq-repmgrd-prevent-promotion" xreflabel="Prevent standby from being promoted to primary">
<title>How can I prevent a node from ever being promoted to primary?</title>
<para>
In <filename>repmgr.conf</filename>, set its priority to a value of <literal>0</literal>; apply the changed setting with
<command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>.
</para>
<para>
Additionally, if <varname>failover</varname> is set to <literal>manual</literal>, the node will never
be considered as a promotion candidate.
</para>
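<para>
A minimal sketch of the relevant <filename>repmgr.conf</filename> entry on the node which should
never be promoted (all other settings omitted):
<programlisting>
priority=0</programlisting>
The changed setting could then be applied as follows, assuming the configuration file is located at
<filename>/etc/repmgr.conf</filename> (a hypothetical path):
<programlisting>
repmgr -f /etc/repmgr.conf standby register --force</programlisting>
</para>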
</sect2>
<sect2 id="faq-repmgrd-delayed-standby" xreflabel="Delayed standby support">
<title>Does &repmgrd; support delayed standbys?</title>
<para>
&repmgrd; can monitor delayed standbys - those set up with
<varname>recovery_min_apply_delay</varname> set to a non-zero value
in the replication configuration. However &repmgrd; does not currently
consider this setting, and therefore may not be able to properly evaluate
the node as a promotion candidate.
</para>
<para>
We recommend that delayed standbys are explicitly excluded from promotion
by setting <varname>priority</varname> to <literal>0</literal> in
<filename>repmgr.conf</filename>.
</para>
<para>
Note that after registering a delayed standby, &repmgrd; will only start
once the metadata added on the primary node has been replicated to the standby.
</para>
</sect2>
<sect2 id="faq-repmgrd-logfile-rotate" xreflabel="repmgrd logfile rotation">
<title>How can I get &repmgrd; to rotate its logfile?</title>
<para>
Configure your system's <literal>logrotate</literal> service to do this; see <xref linkend="repmgrd-log-rotation"/>.
</para>
</sect2>
<sect2 id="faq-repmgrd-recloned-no-start" xreflabel="repmgrd not restarting after node cloned">
<title>I've recloned a failed primary as a standby, but &repmgrd; refuses to start?</title>
<para>
Check you registered the standby after recloning. If unregistered, the standby
cannot be considered as a promotion candidate even if <varname>failover</varname> is set to
<literal>automatic</literal>, which is probably not what you want. &repmgrd; will start if
<varname>failover</varname> is set to <literal>manual</literal> so the node's replication status can still
be monitored, if desired.
</para>
</sect2>
<sect2 id="faq-repmgrd-pg-bindir" xreflabel="repmgrd does not apply pg_bindir to promote_command or follow_command">
<title>
&repmgrd; ignores pg_bindir when executing <varname>promote_command</varname> or <varname>follow_command</varname>
</title>
<para>
<varname>promote_command</varname> or <varname>follow_command</varname> can be user-defined scripts,
so &repmgr; will not apply <option>pg_bindir</option> even if executing &repmgr;. Always provide the full
path; see <xref linkend="repmgrd-automatic-failover-configuration"/> for more details.
</para>
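<para>
For example, assuming the &repmgr; binary is installed as <filename>/usr/bin/repmgr</filename>
and the configuration file is <filename>/etc/repmgr.conf</filename> (both paths are illustrative
assumptions and will differ between installations), the entries in <filename>repmgr.conf</filename>
might look like this:
<programlisting>
promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'</programlisting>
</para>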
</sect2>
<sect2 id="faq-repmgrd-startup-no-upstream" xreflabel="repmgrd does not start if upstream node is not running">
<title>
&repmgrd; aborts startup with the error "<literal>upstream node must be running before repmgrd can start</literal>"
</title>
<para>
&repmgrd; does this to avoid starting up on a replication cluster
which is not in a healthy state. If the upstream is unavailable, &repmgrd;
may initiate a failover immediately after starting up, which could have unintended side-effects,
particularly if &repmgrd; is not running on other nodes.
</para>
<para>
In particular, it's possible that the node's local copy of the <literal>repmgr.nodes</literal> table
is out-of-date, which may lead to incorrect failover behaviour.
</para>
<para>
The onus is therefore on the administrator to manually set the cluster to a stable, healthy state before
starting &repmgrd;.
</para>
</sect2>
</sect1>
</appendix>


@@ -1,11 +1,9 @@
<appendix id="appendix-packages" xreflabel="Package details">
<title>&repmgr; package details</title>
<indexterm>
<primary>packages</primary>
</indexterm>
<title>&repmgr; package details</title>
<para>
This section provides technical details about various &repmgr; binary
packages, such as location of the installed binaries and
@@ -50,22 +48,23 @@
<title>CentOS repositories</title>
<para>
&repmgr; packages are available from the public EDB repository, and also the
PostgreSQL community repository. The EDB repository is updated immediately
after each &repmgr; release.
&repmgr; packages are available from the public 2ndQuadrant repository, and also the
PostgreSQL community repository. The 2ndQuadrant repository is updated immediately
after each
&repmgr; release.
</para>
<table id="centos-2ndquadrant-repository">
<title>EDB public repository</title>
<title>2ndQuadrant public repository</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://dl.enterprisedb.com/">https://dl.enterprisedb.com/</ulink></entry>
<entry><ulink url="https://dl.2ndquadrant.com/">https://dl.2ndquadrant.com/</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
<entry><ulink url="https://repmgr.org/docs/current/installation-packages.html#INSTALLATION-PACKAGES-REDHAT-2NDQ">https://repmgr.org/docs/current/installation-packages.html#INSTALLATION-PACKAGES-REDHAT-2NDQ</ulink></entry>
<entry><ulink url="https://repmgr.org/docs/4.1/installation-packages.html#INSTALLATION-PACKAGES-REDHAT-2NDQ">https://repmgr.org/docs/4.1/installation-packages.html#INSTALLATION-PACKAGES-REDHAT-2NDQ</ulink></entry>
</row>
</tbody>
</tgroup>
@@ -121,7 +120,7 @@
<row>
<entry>Package name example:</entry>
<entry><filename>repmgr11-4.4.0-1.rhel7.x86_64</filename></entry>
<entry><filename>repmgr10-4.0.4-1.rhel7.x86_64</filename></entry>
</row>
<row>
@@ -131,12 +130,12 @@
<row>
<entry>Installation command:</entry>
<entry><literal>yum install repmgr11</literal></entry>
<entry><literal>yum install repmgr10</literal></entry>
</row>
<row>
<entry>Binary location:</entry>
<entry><filename>/usr/pgsql-11/bin</filename></entry>
<entry><filename>/usr/pgsql-10/bin</filename></entry>
</row>
<row>
@@ -146,22 +145,22 @@
<row>
<entry>Configuration file location:</entry>
<entry><filename>/etc/repmgr/11/repmgr.conf</filename></entry>
<entry><filename>/etc/repmgr/10/repmgr.conf</filename></entry>
</row>
<row>
<entry>Data directory:</entry>
<entry><filename>/var/lib/pgsql/11/data</filename></entry>
<entry><filename>/var/lib/pgsql/10/data</filename></entry>
</row>
<row>
<entry>repmgrd service command:</entry>
<entry><command>systemctl [start|stop|restart|reload] repmgr11</command></entry>
<entry><command>systemctl [start|stop|restart|reload] repmgr10</command></entry>
</row>
<row>
<entry>repmgrd service file location:</entry>
<entry><filename>/usr/lib/systemd/system/repmgr11.service</filename></entry>
<entry><filename>/usr/lib/systemd/system/repmgr10.service</filename></entry>
</row>
<row>
@@ -252,26 +251,32 @@
</indexterm>
<para>
&repmgr; <literal>.deb</literal> packages are provided by EDB as well as the
&repmgr; <literal>.deb</literal> packages are provided via the
PostgreSQL Community APT repository, and are available for each community-supported
PostgreSQL version, currently supported Debian releases, and currently supported
Ubuntu LTS releases.
</para>
<sect2 id="packages-apt-repository">
<title>APT repositories</title>
<title>APT repository</title>
<para>
&repmgr; packages are available from the PostgreSQL Community APT repository,
which is updated immediately after each &repmgr; release.
</para>
<table id="apt-2ndquadrant-repository">
<title>EDB public repository</title>
<title>2ndQuadrant public repository</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://dl.enterprisedb.com/">https://dl.enterprisedb.com/</ulink></entry>
<entry><ulink url="https://dl.2ndquadrant.com/">https://dl.2ndquadrant.com/</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
<entry><ulink url="https://repmgr.org/docs/current/installation-packages.html#INSTALLATION-PACKAGES-DEBIAN">https://repmgr.org/docs/current/installation-packages.html#INSTALLATION-PACKAGES-DEBIAN</ulink></entry>
<entry><ulink url="https://repmgr.org/docs/4.1/installation-packages.html#INSTALLATION-PACKAGES-DEBIAN">https://repmgr.org/docs/4.1/installation-packages.html#INSTALLATION-PACKAGES-DEBIAN</ulink></entry>
</row>
</tbody>
</tgroup>
@@ -284,11 +289,11 @@
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://apt.postgresql.org/">https://apt.postgresql.org/</ulink></entry>
<entry><ulink url="http://apt.postgresql.org/">http://apt.postgresql.org/</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
<entry><ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink></entry>
<entry><ulink url="https://wiki.postgresql.org/wiki/Apt)">https://wiki.postgresql.org/wiki/Apt)</ulink></entry>
</row>
</tbody>
</tgroup>
@@ -304,8 +309,8 @@
version number for your installation.
</para>
<para>
See also <xref linkend="repmgrd-configuration-debian-ubuntu"/> for some specifics related
to configuring the &repmgrd; daemon.
See also <xref linkend="repmgrd-configuration-debian-ubuntu"> for some specifics related
to configuring the <application>repmgrd</application> daemon.
</para>
<table id="debian-9-packages">
@@ -316,7 +321,7 @@
<row>
<entry>Package name example:</entry>
<entry><filename>postgresql-11-repmgr</filename></entry>
<entry><filename>postgresql-10-repmgr</filename></entry>
</row>
<row>
@@ -326,12 +331,12 @@
<row>
<entry>Installation command:</entry>
<entry><literal>apt-get install postgresql-11-repmgr</literal></entry>
<entry><literal>apt-get install postgresql-10-repmgr</literal></entry>
</row>
<row>
<entry>Binary location:</entry>
<entry><filename>/usr/lib/postgresql/11/bin</filename></entry>
<entry><filename>/usr/lib/postgresql/10/bin</filename></entry>
</row>
<row>
@@ -346,12 +351,12 @@
<row>
<entry>Data directory:</entry>
<entry><filename>/var/lib/postgresql/11/main</filename></entry>
<entry><filename>/var/lib/postgresql/10/main</filename></entry>
</row>
<row>
<entry>PostgreSQL service command:</entry>
<entry><command>systemctl [start|stop|restart|reload] postgresql@11-main</command></entry>
<entry><command>systemctl [start|stop|restart|reload] postgresql@10-main</command></entry>
</row>
@@ -375,11 +380,11 @@
</table>
<note>
<para>
When using Debian packages, instead of using the <application>systemd</application> service
command directly, it's recommended to execute <command>pg_ctlcluster</command>
(as <literal>root</literal>, either directly or via <command>sudo</command>), e.g.:
Instead of using the <application>systemd</application> service command directly,
it's recommended to execute <command>pg_ctlcluster</command> (as <literal>root</literal>,
either directly or via <command>sudo</command>), e.g.:
<programlisting>
<command>pg_ctlcluster 11 main [start|stop|restart|reload]</command></programlisting>
<command>pg_ctlcluster 10 main [start|stop|restart|reload]</command></programlisting>
</para>
<para>
For pre-<application>systemd</application> systems, <command>pg_ctlcluster</command>
@@ -397,11 +402,11 @@
</indexterm>
<indexterm>
<primary>packages</primary>
<secondary>snapshots</secondary>
<secondary>snaphots</secondary>
</indexterm>
<para>
For testing new features and bug fixes, from time to time EDB provides
For testing new features and bug fixes, from time to time 2ndQuadrant provides
so-called &quot;snapshot packages&quot; via its public repository. These packages
are built from the &repmgr; source at a particular point in time, and are not formal
releases.
@@ -413,22 +418,22 @@
</para>
</note>
<para>
To install a snapshot package, it's necessary to install the EDB public snapshot repository,
following the instructions here: <ulink url="https://dl.enterprisedb.com/default/release/site/">https://dl.enterprisedb.com/default/release/site/</ulink> but replace <literal>release</literal> with <literal>snapshot</literal>
To install a snapshot package, it's necessary to install the 2ndQuadrant public snapshot repository,
following the instructions here: <ulink url="https://dl.2ndquadrant.com/default/release/site/">https://dl.2ndquadrant.com/default/release/site/</ulink> but replace <literal>release</literal> with <literal>snapshot</literal>
in the appropriate URL.
</para>
<para>
For example, to install the snapshot RPM repository for PostgreSQL 9.6, execute (as <literal>root</literal>):
<programlisting>
curl https://dl.enterprisedb.com/default/snapshot/get/9.6/rpm | bash</programlisting>
curl https://dl.2ndquadrant.com/default/snapshot/get/9.6/rpm | bash</programlisting>
or as a normal user with root sudo access:
<programlisting>
curl https://dl.enterprisedb.com/default/snapshot/get/9.6/rpm | sudo bash</programlisting>
curl https://dl.2ndquadrant.com/default/snapshot/get/9.6/rpm | sudo bash</programlisting>
</para>
<para>
Alternatively you can browse the repository here:
<ulink url="https://dl.enterprisedb.com/default/snapshot/browse/">https://dl.enterprisedb.com/default/snapshot/browse/</ulink>.
<ulink url="https://dl.2ndquadrant.com/default/snapshot/browse/">https://dl.2ndquadrant.com/default/snapshot/browse/</ulink>.
</para>
<para>
Once the repository is installed, installing or updating &repmgr; will result in the latest snapshot
@@ -438,7 +443,7 @@ curl https://dl.enterprisedb.com/default/snapshot/get/9.6/rpm | sudo bash</progr
The package name will be formatted like this:
<programlisting>
repmgr96-4.1.1-0.0git320.g5113ab0.1.el7.x86_64.rpm</programlisting>
containing the snapshot build number (here: <literal>320</literal>) and the hash
containg the snapshot build number (here: <literal>320</literal>) and the hash
of the <application>git</application> commit it was built from (here: <literal>g5113ab0</literal>).
</para>
@@ -451,37 +456,52 @@ repmgr96-4.1.1-0.0git320.g5113ab0.1.el7.x86_64.rpm</programlisting>
<sect1 id="packages-old-versions" xreflabel="Installing old package versions">
<title>Installing old package versions</title>
<indexterm>
<primary>old packages</primary>
</indexterm>
<indexterm>
<primary>packages</primary>
<secondary>old versions</secondary>
</indexterm>
<indexterm>
<primary>installation</primary>
<secondary>old package versions</secondary>
</indexterm>
<sect2 id="packages-old-versions-debian" xreflabel="old Debian package versions">
<title>Debian/Ubuntu</title>
<para>
An archive of old packages (<literal>3.3.2</literal> and later) for Debian/Ubuntu-based systems is available here:
<ulink url="https://apt-archive.postgresql.org/">https://apt-archive.postgresql.org/</ulink>
<ulink url="http://atalia.postgresql.org/morgue/r/repmgr/">http://atalia.postgresql.org/morgue/r/repmgr/</ulink>
</para>
</sect2>
<sect2 id="packages-old-versions-rhel-centos" xreflabel="old RHEL/CentOS package versions">
<title>RHEL/CentOS</title>
<para>
Old RPM packages (<literal>3.2</literal> and later) can be retrieved from the
(deprecated) 2ndQuadrant repository at
<ulink url="http://packages.2ndquadrant.com/">http://packages.2ndquadrant.com/</ulink>
by installing the appropriate repository RPM:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<ulink url="http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-fedora-1.0-1.noarch.rpm">http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-fedora-1.0-1.noarch.rpm</ulink>
</simpara>
</listitem>
<listitem>
<simpara>
<ulink url="http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm">http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm</ulink>
</simpara>
</listitem>
</itemizedlist>
<para>
Old versions can be located with e.g.:
<programlisting>
yum --showduplicates list repmgr96</programlisting>
(substitute the appropriate package name; see <xref linkend="packages-centos"/>) and installed with:
(substitute the appropriate package name; see <xref linkend="packages-centos">) and installed with:
<programlisting>
yum install {package_name}-{version}</programlisting>
where <literal>{package_name}</literal> is the base package name (e.g. <literal>repmgr96</literal>)
@@ -519,13 +539,13 @@ repmgr96-4.1.1-0.0git320.g5113ab0.1.el7.x86_64.rpm</programlisting>
char package_conf_file[MAXPGPATH] = "";</programlisting>
</para>
<para>
See also: <xref linkend="configuration-file"/>
See also: <xref linkend="configuration-file">
</para>
</listitem>
<listitem>
<para>
PID file location: the default &repmgrd; PID file
PID file location: the default <application>repmgrd</application> PID file
location can be hard-coded by patching <varname>package_pid_file</varname>
in <filename>repmgrd.c</filename>:
<programlisting>
@@ -533,7 +553,7 @@ repmgr96-4.1.1-0.0git320.g5113ab0.1.el7.x86_64.rpm</programlisting>
char package_pid_file[MAXPGPATH] = "";</programlisting>
</para>
<para>
See also: <xref linkend="repmgrd-pid-file"/>
See also: <xref linkend="repmgrd-pid-file">
</para>
</listitem>


@@ -1,114 +0,0 @@
<appendix id="appendix-support" xreflabel="repmgr support">
<title>&repmgr; support</title>
<indexterm>
<primary>support</primary>
</indexterm>
<para>
<ulink url="https://www.enterprisedb.com/">EDB</ulink> provides 24x7
production support for &repmgr; and other PostgreSQL
products, including configuration assistance, installation
verification and training for running a robust replication cluster.
</para>
<para>
For further details see: <ulink url="https://www.enterprisedb.com/support/postgresql-support-overview-get-the-most-out-of-postgresql">Support Center</ulink>
</para>
<para>
A mailing list/forum is provided via Google groups to discuss contributions or issues: <ulink url="https://groups.google.com/group/repmgr">https://groups.google.com/group/repmgr</ulink>.
</para>
<para>
Please report bugs and other issues to: <ulink url="https://github.com/EnterpriseDB/repmgr">https://github.com/EnterpriseDB/repmgr</ulink>.
</para>
<important>
<para>
Please read the <link linkend="appendix-support-reporting-issues">following section</link> before submitting questions or issue reports.
</para>
</important>
<sect1 id="appendix-support-reporting-issues" xreflabel="Reporting Issues">
<title>Reporting Issues</title>
<indexterm>
<primary>support</primary>
<secondary>reporting issues</secondary>
</indexterm>
<para>
When asking questions or reporting issues, it is extremely helpful if the following information is included:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
PostgreSQL version
</simpara>
</listitem>
<listitem>
<simpara>
&repmgr; version
</simpara>
</listitem>
<listitem>
<simpara>
How was &repmgr; installed? From source? From packages? If
so from which repository?
</simpara>
</listitem>
<listitem>
<simpara>
<filename>repmgr.conf</filename> files (suitably anonymized if necessary)
</simpara>
</listitem>
<listitem>
<simpara>
Contents of the <literal>repmgr.nodes</literal> table (suitably anonymized if necessary)
</simpara>
</listitem>
<listitem>
<simpara>
PostgreSQL 11 and earlier: contents of the <filename>recovery.conf</filename> file
(suitably anonymized if necessary).
</simpara>
</listitem>
<listitem>
<simpara>
PostgreSQL 12 and later: contents of the <filename>postgresql.auto.conf</filename> file
(suitably anonymized if necessary), and whether or not the PostgreSQL data directory
contains the files <filename>standby.signal</filename> and/or <filename>recovery.signal</filename>.
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
If issues are encountered with a &repmgr; client command, please provide
the output of that command executed with the options
<option>-LDEBUG --verbose</option>, which will ensure &repmgr; emits
the maximum level of logging output.
</para>
<para>
If issues are encountered with &repmgrd;,
please provide relevant extracts from the &repmgr; log files
and if possible the PostgreSQL log itself. Please ensure these
logs do not contain any confidential data.
</para>
<para>
In all cases it is <emphasis>extremely</emphasis> useful to receive
as much detail as possible on how to reliably reproduce
an issue.
</para>
</sect1>
</appendix>

doc/bdr-failover.md

@@ -0,0 +1,8 @@
BDR failover with repmgrd
=========================
This document has been integrated into the main `repmgr` documentation
and is now located here:
> [BDR failover with repmgrd](https://repmgr.org/docs/4.0/repmgrd-bdr.html)


@@ -4,4 +4,4 @@ Changes in repmgr 4
This document has been integrated into the main `repmgr` documentation
and is now located here:
> [Release notes](https://repmgr.org/docs/current/release-4.0.html)
> [Release notes](https://repmgr.org/docs/4.0/release-4.0.html)


@@ -2,8 +2,6 @@
<title>Cloning standbys</title>
<sect1 id="cloning-from-barman" xreflabel="Cloning from Barman">
<title>Cloning a standby from Barman</title>
<indexterm>
<primary>cloning</primary>
<secondary>from Barman</secondary>
@@ -13,9 +11,10 @@
<secondary>cloning a standby</secondary>
</indexterm>
<title>Cloning a standby from Barman</title>
<para>
<xref linkend="repmgr-standby-clone"/> can use
<ulink url="https://www.enterprisedb.com/">EDB</ulink>'s
<xref linkend="repmgr-standby-clone"> can use
<ulink url="https://www.2ndquadrant.com/">2ndQuadrant</ulink>'s
<ulink url="https://www.pgbarman.org/">Barman</ulink> application
to clone a standby (and also as a fallback source for WAL files).
</para>
@@ -46,32 +45,12 @@
<para>
WAL management on the primary becomes much easier as there's no need
to use replication slots, and <varname>wal_keep_segments</varname>
(PostgreSQL 13 and later: <varname>wal_keep_size</varname>)
does not need to be set.
</para>
</listitem>
</itemizedlist>
</para>
<note>
<para>
Currently &repmgr;'s support for cloning from Barman is implemented by using
<productname>rsync</productname> to clone from the Barman server.
</para>
<para>
It is therefore not able to make use of Barman's parallel restore facility, which
is executed on the Barman server and clones to the target server.
</para>
<para>
Barman's parallel restore facility can be used by executing it manually on
the Barman server and configuring replication on the resulting cloned
standby using
<command><link linkend="repmgr-standby-clone">repmgr standby clone --replication-conf-only</link></command>.
</para>
</note>
<sect2 id="cloning-from-barman-prerequisites">
<title>Prerequisites for cloning from Barman</title>
<para>
@@ -80,7 +59,8 @@
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
the Barman catalogue must include at least one valid backup for this server;
the <varname>barman_server</varname> setting in <filename>repmgr.conf</filename> is the same as the
server configured in Barman;
</para>
</listitem>
<listitem>
@@ -91,90 +71,19 @@
</listitem>
<listitem>
<para>
the <varname>barman_server</varname> setting in <filename>repmgr.conf</filename> is the same as the
server configured in Barman.
the <varname>restore_command</varname> setting in <filename>repmgr.conf</filename> is configured to
use a copy of the <command>barman-wal-restore</command> script shipped with the
<literal>barman-cli</literal> package (see section <xref linkend="cloning-from-barman-restore-command">
below).
</para>
</listitem>
<listitem>
<para>
the Barman catalogue includes at least one valid backup for this server.
</para>
</listitem>
</itemizedlist>
</para>
<para>
For example, assuming Barman is located on the host &quot;<literal>barmansrv</literal>&quot;
under the &quot;<literal>barman</literal>&quot; user account,
<filename>repmgr.conf</filename> should contain the following entries:
<programlisting>
barman_host='barman@barmansrv'
barman_server='pg'</programlisting>
</para>
<para>
Here <literal>pg</literal> corresponds to a section in Barman's configuration file for a specific
server backup configuration, which would look something like:
<programlisting>
[pg]
description = "Main cluster"
...
</programlisting>
</para>
<para>
More details on Barman configuration can be found in the
<ulink url="https://docs.pgbarman.org/">Barman documentation</ulink>'s
<ulink url="https://docs.pgbarman.org/#configuration">configuration section</ulink>.
</para>
<note>
<para>
To use a non-default Barman configuration file on the Barman server,
specify this in <filename>repmgr.conf</filename> with <filename>barman_config</filename>:
<programlisting>
barman_config='/path/to/barman.conf'</programlisting>
</para>
</note>
<para>
We also recommend configuring the <varname>restore_command</varname> setting in <filename>repmgr.conf</filename>
to use the <command>barman-wal-restore</command> script
(see section <xref linkend="cloning-from-barman-restore-command"/> below).
</para>
<tip>
<simpara>
If you have a non-default SSH configuration on the Barman
server, e.g. using a port other than 22, then you can set those
parameters in a dedicated Host section in <filename>~/.ssh/config</filename>
corresponding to the value of <varname>barman_host</varname> in
<filename>repmgr.conf</filename>. See the <literal>Host</literal>
section in <command>man 5 ssh_config</command> for more details.
</simpara>
</tip>
<para>
If you wish to place WAL files in a location outside the main
PostgreSQL data directory, set <option>--waldir</option>
(PostgreSQL 9.6 and earlier: <option>--xlogdir</option>) in
<option>pg_basebackup_options</option> to the target directory
(must be an absolute filepath). &repmgr; will create and
symlink to this directory in exactly the same way
<application>pg_basebackup</application> would.
</para>
<para>
It's now possible to clone a standby from Barman, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf -h node1 -U repmgr -d repmgr standby clone
NOTICE: destination directory "/var/lib/postgresql/data" provided
INFO: connecting to Barman server to verify backup for "test_cluster"
INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/data"
INFO: creating directory "/var/lib/postgresql/data/repmgr"...
INFO: connecting to Barman server to fetch server parameters
INFO: connecting to source node
DETAIL: current installation size is 30 MB
NOTICE: retrieving backup from Barman...
(...)
NOTICE: standby clone (from Barman) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /var/lib/postgresql/data start</programlisting>
</para>
<note>
<simpara>
Barman support is automatically enabled if <varname>barman_server</varname>
@@ -184,155 +93,86 @@ description = "Main cluster"
command line option.
</simpara>
</note>
<tip>
<simpara>
If you have a non-default SSH configuration on the Barman
server, e.g. using a port other than 22, then you can set those
parameters in a dedicated Host section in <filename>~/.ssh/config</filename>
corresponding to the value of<varname>barman_host</varname> in
<filename>repmgr.conf</filename>. See the <literal>Host</literal>
section in <command>man 5 ssh_config</command> for more details.
</simpara>
</tip>
<para>
It's now possible to clone a standby from Barman, e.g.:
<programlisting>
NOTICE: using configuration file "/etc/repmgr.conf"
NOTICE: destination directory "/var/lib/postgresql/data" provided
INFO: connecting to Barman server to verify backup for test_cluster
INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/data"
INFO: creating directory "/var/lib/postgresql/data/repmgr"...
INFO: connecting to Barman server to fetch server parameters
INFO: connecting to upstream node
INFO: connected to source node, checking its state
INFO: successfully connected to source node
DETAIL: current installation size is 29 MB
NOTICE: retrieving backup from Barman...
receiving file list ...
(...)
NOTICE: standby clone (from Barman) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /var/lib/postgresql/data start</programlisting>
</para>
</sect2>
<sect2 id="cloning-from-barman-restore-command" xreflabel="Using Barman as a WAL file source">
<title>Using Barman as a WAL file source</title>
<indexterm>
<indexterm>
<primary>Barman</primary>
<secondary>fetching archived WAL</secondary>
</indexterm>
<title>Using Barman as a WAL file source</title>
<para>
As a fallback in case streaming replication is interrupted, PostgreSQL can optionally
retrieve WAL files from an archive, such as that provided by Barman. This is done by
setting <varname>restore_command</varname> in the replication configuration to
setting <varname>restore_command</varname> in <filename>recovery.conf</filename> to
a valid shell command which can retrieve a specified WAL file from the archive.
</para>
<para>
<command>barman-wal-restore</command> is a Python script provided as part of the <literal>barman-cli</literal>
package (Barman 2.0 ~ 2.7) or as part of the core Barman distribution (Barman 2.8 and later).
package (Barman 2.0 and later; for Barman 1.x the script is provided separately as
<command>barman-wal-restore.py</command>) which performs this function for Barman.
</para>
<para>
To use <command>barman-wal-restore</command> with &repmgr;,
assuming Barman is located on the host &quot;<literal>barmansrv</literal>&quot;
under the &quot;<literal>barman</literal>&quot; user account,
To use <command>barman-wal-restore</command> with &repmgr;
and assuming Barman is located on the <literal>barmansrv</literal> host
and that <command>barman-wal-restore</command> is located as an executable at
<filename>/usr/bin/barman-wal-restore</filename>,
<filename>repmgr.conf</filename> should include the following lines:
<programlisting>
barman_host='barman@barmansrv'
barman_server='pg'
restore_command='/usr/bin/barman-wal-restore barmansrv pg %f %p'</programlisting>
barman_host=barmansrv
barman_server=somedb
restore_command=/usr/bin/barman-wal-restore barmansrv somedb %f %p</programlisting>
</para>
<note>
<simpara>
<command>barman-wal-restore</command> supports command line switches to
control parallelism (<literal>--parallel=N</literal>) and compression
(<literal>--bzip2</literal>, <literal>--gzip</literal>).
control parallelism (<literal>--parallel=N</literal>) and compression (
<literal>--bzip2</literal>, <literal>--gzip</literal>).
</simpara>
</note>
<note>
<para>
To use a non-default Barman configuration file on the Barman server,
specify this in <filename>repmgr.conf</filename> with <filename>barman_config</filename>:
<programlisting>
barman_config=/path/to/barman.conf</programlisting>
</para>
</note>
</sect2>
<sect2 id="cloning-from-barman-pg_backupapi-mode" xreflabel="Using Barman through its API (pg-backup-api)">
<title>Using Barman through its API (pg-backup-api)</title>
<indexterm>
<primary>cloning</primary>
<secondary>pg-backup-api</secondary>
</indexterm>
<para>
You can find information on how to install and setup pg-backup-api in
<ulink url="https://www.enterprisedb.com/docs/supported-open-source/barman/pg-backup-api/">the pg-backup-api
documentation</ulink>.
</para>
<para>
This mode (<literal>pg-backupapi</literal>) was introduced in v5.4.0 as a way to integrate further with Barman by letting Barman
handle the restore. It also reduces the number of SSH keys that need to be shared between the backup and PostgreSQL nodes.
As long as you can reach the API service over HTTP, you can perform recoveries right away:
you just need to tell Barman, through the API, which backup you need and on which node it
should be restored.
</para>
<para>
To enable <literal>pg_backupapi</literal> mode for <command>repmgr standby clone</command>,
set the following parameters in <filename>repmgr.conf</filename>:
<itemizedlist spacing="compact" mark="bullet">
<listitem><para>pg_backupapi_host: Where pg-backup-api is hosted</para></listitem>
<listitem><para>pg_backupapi_node_name: Name of the server as understood by Barman</para></listitem>
<listitem><para>pg_backupapi_remote_ssh_command: SSH command Barman will use to connect to the node</para></listitem>
<listitem><para>pg_backupapi_backup_id: ID of the existing backup you need to restore</para></listitem>
</itemizedlist>
This is an example of what <filename>repmgr.conf</filename> might look like:
<programlisting>
pg_backupapi_host = '192.168.122.154'
pg_backupapi_node_name = 'burrito'
pg_backupapi_remote_ssh_command = 'ssh john_doe@192.168.122.1'
pg_backupapi_backup_id = '20230223T093201'
</programlisting>
</para>
<para>
<literal>pg_backupapi_host</literal> is the parameter which enables this mode; when it is set,
all of the other parameters listed above are required. Also, remember that this service is just an interface
between Barman and repmgr, so if something fails during a recovery, check Barman's logs to find out
why the process couldn't finish properly.
</para>
<note>
<simpara>
Although Barman lets you use shortcuts such as "latest" or "oldest", these are not currently
supported by pg-backup-api. These shortcuts will be supported in a future release.
</simpara>
</note>
<para>
This is a real example of repmgr's output when cloning with the API. Note that during this operation we stopped
the service for a little while and repmgr had to retry, but that did not affect the final outcome. The primary
is listening on localhost's port 6001:
<programlisting>
$ repmgr -f ~/nodes/node_3/repmgr.conf standby clone -U repmgr -p 6001 -h localhost
NOTICE: destination directory "/home/mario/nodes/node_3/data" provided
INFO: Attempting to use `pg_backupapi` new restore mode
INFO: connecting to source node
DETAIL: connection string is: user=repmgr port=6001 host=localhost
DETAIL: current installation size is 8541 MB
DEBUG: 1 node records returned by source node
DEBUG: connecting to: "user=repmgr dbname=repmgr host=localhost port=6001 connect_timeout=2 fallback_application_name=repmgr options=-csearch_path="
DEBUG: upstream_node_id determined as 1
INFO: Attempting to use `pg_backupapi` new restore mode
INFO: replication slot usage not requested; no replication slot will be set up for this standby
NOTICE: starting backup (using pg_backupapi)...
INFO: Success creating the task: operation id '20230309T150647'
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
Incorrect reply received for that operation ID.
INFO: Retrying...
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status IN_PROGRESS
INFO: status DONE
NOTICE: standby clone (from pg_backupapi) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /home/mario/nodes/node_3/data start
HINT: after starting the server, you need to register this standby with "repmgr standby register"
</programlisting>
</para>
</sect2> <!--END cloning-from-barman-pg_backupapi-mode !-->
</sect1>
<sect1 id="cloning-replication-slots" xreflabel="Cloning and replication slots">
<title>Cloning and replication slots</title>
<sect1 id="cloning-replication-slots" xreflabel="Cloning and replication slots">
<indexterm>
<primary>cloning</primary>
<secondary>replication slots</secondary>
@@ -342,13 +182,13 @@ HINT: after starting the server, you need to register this standby with "repmgr
<primary>replication slots</primary>
<secondary>cloning</secondary>
</indexterm>
<title>Cloning and replication slots</title>
<para>
Replication slots were introduced with PostgreSQL 9.4 and are designed to ensure
that any standby connected to the primary using a replication slot will always
be able to retrieve the required WAL files. This removes the need to manually
manage WAL file retention by estimating the number of WAL files that need to
be maintained on the primary using <varname>wal_keep_segments</varname>
(PostgreSQL 13 and later: <varname>wal_keep_size</varname>).
be maintained on the primary using <varname>wal_keep_segments</varname>.
Do however be aware that if a standby is disconnected, WAL will continue to
accumulate on the primary until either the standby reconnects or the replication
slot is dropped.
@@ -402,43 +242,41 @@ HINT: after starting the server, you need to register this standby with "repmgr
build up indefinitely, possibly leading to server failure.
</simpara>
<simpara>
As an alternative we recommend using EDB's <ulink url="https://www.pgbarman.org/">Barman</ulink>,
which offloads WAL management to a separate server, removing the requirement to use a replication
slot for each individual standby to reserve WAL. See section <xref linkend="cloning-from-barman"/>
As an alternative we recommend using 2ndQuadrant's <ulink url="https://www.pgbarman.org/">Barman</ulink>,
which offloads WAL management to a separate server, negating the need to use replication
slots to reserve WAL. See section <xref linkend="cloning-from-barman">
for more details on using &repmgr; together with Barman.
</simpara>
</tip>
</sect1>
<sect1 id="cloning-cascading" xreflabel="Cloning and cascading replication">
<title>Cloning and cascading replication</title>
<indexterm>
<primary>cloning</primary>
<secondary>cascading replication</secondary>
</indexterm>
<title>Cloning and cascading replication</title>
<para>
Cascading replication, introduced with PostgreSQL 9.2, enables a standby server
to replicate from another standby server rather than directly from the primary,
meaning replication changes "cascade" down through a hierarchy of servers. This
can be used to reduce load on the primary and minimize bandwidth usage between
can be used to reduce load on the primary and minimize bandwith usage between
sites. For more details, see the
<ulink url="https://www.postgresql.org/docs/current/warm-standby.html#CASCADING-REPLICATION">
<ulink url="https://www.postgresql.org/docs/current/static/warm-standby.html#CASCADING-REPLICATION">
PostgreSQL cascading replication documentation</ulink>.
</para>
<para>
&repmgr; supports cascading replication. When cloning a standby,
set the command-line parameter <literal>--upstream-node-id</literal> to the
<varname>node_id</varname> of the server the standby should connect to, and
&repmgr; will create a replication configuration file to point to it. Note
&repmgr; will create <filename>recovery.conf</filename> to point to it. Note
that if <literal>--upstream-node-id</literal> is not explicitly provided,
&repmgr; will set the standby's replication configuration to
&repmgr; will set the standby's <filename>recovery.conf</filename> to
point to the primary node.
</para>
<para>
To demonstrate cascading replication, first ensure you have a primary and standby
set up as shown in the <xref linkend="quickstart"/>.
set up as shown in the <xref linkend="quickstart">.
Then create an additional standby server with <filename>repmgr.conf</filename> looking
like this:
<programlisting>
@@ -494,18 +332,18 @@ HINT: after starting the server, you need to register this standby with "repmgr
cluster, you may wish to clone a downstream standby whose upstream node
does not yet exist. In this case you can clone from the primary (or
another upstream node); provide the parameter <literal>--upstream-conninfo</literal>
to explicitly set the upstream's <varname>primary_conninfo</varname> string
in the replication configuration.
to explictly set the upstream's <varname>primary_conninfo</varname> string
in <filename>recovery.conf</filename>.
</simpara>
</tip>
</sect1>
<sect1 id="cloning-advanced" xreflabel="Advanced cloning options">
<title>Advanced cloning options</title>
<indexterm>
<primary>cloning</primary>
<secondary>advanced options</secondary>
</indexterm>
<title>Advanced cloning options</title>
<sect2 id="cloning-advanced-pg-basebackup-options" xreflabel="pg_basebackup options when cloning a standby">
<title>pg_basebackup options when cloning a standby</title>
@@ -527,7 +365,7 @@ HINT: after starting the server, you need to register this standby with "repmgr
<simpara>
If <application>Barman</application> is set up for the cluster, it's possible to
clone the standby directly from Barman, without any impact on the server the standby
is being cloned from. For more details see <xref linkend="cloning-from-barman"/>.
is being cloned from. For more details see <xref linkend="cloning-from-barman">.
</simpara>
</tip>
<para>
@@ -552,15 +390,8 @@ HINT: after starting the server, you need to register this standby with "repmgr
WAL directory. Any WALs generated during the cloning process will be copied here, and
a symlink will automatically be created from the main data directory.
</para>
<tip>
<para>
The <literal>--waldir</literal> (<literal>--xlogdir</literal>) option,
if present in <varname>pg_basebackup_options</varname>, will be honoured by &repmgr;
when cloning from Barman (&repmgr; 5.2 and later).
</para>
</tip>
<para>
See the <ulink url="https://www.postgresql.org/docs/current/app-pgbasebackup.html">PostgreSQL pg_basebackup documentation</ulink>
See the <ulink url="https://www.postgresql.org/docs/current/static/app-pgbasebackup.html">PostgreSQL pg_basebackup documentation</ulink>
for more details of available options.
</para>
</sect2>
@@ -579,8 +410,10 @@ HINT: after starting the server, you need to register this standby with "repmgr
<para>
The recommended way to do this is to store the password in the <literal>postgres</literal> system
user's <filename>~/.pgpass</filename> file. For more information on using the password file, see
the documentation section <xref linkend="configuration-password-file"/>.
user's <filename>~/.pgpass</filename> file. It's also possible to store the password in the
environment variable <varname>PGPASSWORD</varname>, however this is not recommended for
security reasons. For more details see the
<ulink url="https://www.postgresql.org/docs/current/static/libpq-pgpass.html">PostgreSQL password file documentation</ulink>.
</para>
<note>
@@ -594,13 +427,26 @@ HINT: after starting the server, you need to register this standby with "repmgr
</note>
<para>
If, for whatever reason, you wish to include the password in the replication configuration file,
If, for whatever reason, you wish to include the password in <filename>recovery.conf</filename>,
set <varname>use_primary_conninfo_password</varname> to <literal>true</literal> in
<filename>repmgr.conf</filename>. This will read a password set in <varname>PGPASSWORD</varname>
(but not <filename>~/.pgpass</filename>) and place it into the <varname>primary_conninfo</varname>
string in the replication configuration. Note that <varname>PGPASSWORD</varname>
will need to be set during any action which causes the replication configuration file to be
rewritten, e.g. <xref linkend="repmgr-standby-follow"/>.
string in <filename>recovery.conf</filename>. Note that <varname>PGPASSWORD</varname>
will need to be set during any action which causes <filename>recovery.conf</filename> to be
rewritten, e.g. <xref linkend="repmgr-standby-follow">.
</para>
<para>
It is of course also possible to include the password value in the <varname>conninfo</varname>
string for each node, but this is obviously a security risk and should be avoided.
</para>
<para>
From PostgreSQL 9.6, <application>libpq</application> supports the <varname>passfile</varname>
parameter in connection strings, which can be used to specify a password file other than
the default <filename>~/.pgpass</filename>.
</para>
<para>
To have &repmgr; write a custom password file in <varname>primary_conninfo</varname>,
specify its location in <varname>passfile</varname> in <filename>repmgr.conf</filename>.
</para>
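<para>
For example, a minimal sketch, assuming the custom password file is kept at the
hypothetical path <filename>/var/lib/postgresql/.pgpass_repmgr</filename>:
<programlisting>
passfile='/var/lib/postgresql/.pgpass_repmgr'</programlisting>
With this set, the <varname>passfile</varname> value should be expected to appear in the generated
<varname>primary_conninfo</varname> string; the path itself is only an illustration.
</para>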
</sect2>
@@ -611,40 +457,12 @@ HINT: after starting the server, you need to register this standby with "repmgr
user (in addition to the user who manages the &repmgr; metadata). In this case,
the replication user should be set in <filename>repmgr.conf</filename> via the parameter
<varname>replication_user</varname>; &repmgr; will use this value when making
replication connections and generating the replication configuration. This
replication connections and generating <filename>recovery.conf</filename>. This
value will also be stored in the <literal>repmgr.nodes</literal>
table for each node; it no longer needs to be explicitly specified when
cloning a node or executing <xref linkend="repmgr-standby-follow"/>.
cloning a node or executing <xref linkend="repmgr-standby-follow">.
</para>
</sect2>
<sect2 id="cloning-advanced-tablespace-mapping" xreflabel="Tablespace mapping">
<title>Tablespace mapping</title>
<indexterm>
<primary>tablespace mapping</primary>
</indexterm>
<para>
&repmgr; provides a <option>tablespace_mapping</option> configuration
file option, which makes it possible to map a tablespace on the source node to
a different location on the local node.
</para>
<para>
To use this, add <option>tablespace_mapping</option> to <filename>repmgr.conf</filename>
like this:
<programlisting>
tablespace_mapping='/var/lib/pgsql/tblspc1=/data/pgsql/tblspc1'
</programlisting>
</para>
<para>
where the left-hand value represents the tablespace on the source node,
and the right-hand value represents the tablespace on the standby to be cloned.
</para>
<para>
This parameter can be provided multiple times.
</para>
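<para>
As a sketch of providing the parameter more than once, assuming two tablespaces on the source node
(both pairs of paths are purely illustrative), <filename>repmgr.conf</filename> might contain:
<programlisting>
tablespace_mapping='/var/lib/pgsql/tblspc1=/data/pgsql/tblspc1'
tablespace_mapping='/var/lib/pgsql/tblspc2=/data/pgsql/tblspc2'</programlisting>
</para>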
</sect2>
</sect1>


@@ -1,6 +1,4 @@
<sect1 id="configuration-file-log-settings" xreflabel="log settings">
<title>Log settings</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>log settings</secondary>
@@ -9,9 +7,10 @@
<primary>log settings</primary>
<secondary>configuration in repmgr.conf</secondary>
</indexterm>
<title>Log settings</title>
<para>
By default, &repmgr; and &repmgrd; write log output to
By default, &repmgr; and <application>repmgrd</application> write log output to
<literal>STDERR</literal>. An alternative log destination can be specified
(either a file or <literal>syslog</literal>).
</para>
@@ -25,7 +24,7 @@
<para>
This behaviour can be overridden with the command line option <option>--log-to-file</option>,
which will redirect all logging output to the configured log destination. This is recommended
when &repmgr; is executed by another application, particularly &repmgrd;,
when &repmgr; is executed by another application, particularly <application>repmgrd</application>,
to enable log output generated by the &repmgr; application to be stored for later reference.
</para>
</note>
@@ -33,11 +32,12 @@
<variablelist>
<varlistentry id="repmgr-conf-log-level" xreflabel="log_level">
<term><varname>log_level</varname> (<type>string</type>)</term>
<listitem>
<term><varname>log_level</varname> (<type>string</type>)
<indexterm>
<primary><varname>log_level</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
One of <option>DEBUG</option>, <option>INFO</option>, <option>NOTICE</option>,
<option>WARNING</option>, <option>ERROR</option>, <option>ALERT</option>, <option>CRIT</option>
@@ -76,11 +76,11 @@
</term>
<listitem>
<para>
If <xref linkend="repmgr-conf-log-facility"/> is set to <option>STDERR</option>, log output
If <xref linkend="repmgr-conf-log-facility"> is set to <option>STDERR</option>, log output
can be redirected to the specified file.
</para>
<para>
See <xref linkend="repmgrd-log-rotation"/> for information on configuring log rotation.
See <xref linkend="repmgrd-log-rotation"> for information on configuring log rotation.
</para>
</listitem>
</varlistentry>
@@ -93,12 +93,12 @@
</term>
<listitem>
<para>
This setting causes &repmgrd; to emit a status log
This setting causes <application>repmgrd</application> to emit a status log
line at the specified interval (in seconds, default <literal>300</literal>)
describing &repmgrd;'s current state, e.g.:
describing <application>repmgrd</application>'s current state, e.g.:
</para>
<programlisting>
[2018-07-12 00:47:32] [INFO] monitoring connection to upstream node "node1" (ID: 1)</programlisting>
[2018-07-12 00:47:32] [INFO] monitoring connection to upstream node "node1" (node ID: 1)</programlisting>
</listitem>
</varlistentry>


@@ -1,189 +0,0 @@
<sect1 id="configuration-file-optional-settings" xreflabel="optional configuration file settings">
<title>Optional configuration file settings</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>optional settings</secondary>
</indexterm>
<note>
<simpara>
This section documents a subset of optional configuration settings; for a full
and annotated view of all configuration options see the
<ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/master/repmgr.conf.sample">sample repmgr.conf file</ulink>.
</simpara>
</note>
<variablelist>
<varlistentry id="repmgr-conf-config-directory" xreflabel="config_directory">
<term><varname>config_directory</varname> (<type>string</type>)
<indexterm>
<primary><varname>config_directory</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
If PostgreSQL configuration files are located outside the data
directory, specify the directory where the main
<filename>postgresql.conf</filename> file is located.
</para>
<para>
This enables explicit provision of an external configuration file
directory, which if set will be passed to <command>pg_ctl</command> as the
<option>-D</option> parameter. Otherwise <command>pg_ctl</command> will
default to using the data directory, which will cause some operations
to fail if the configuration files are not present there.
</para>
<note>
<para>
This is implemented primarily for feature completeness and for
development/testing purposes. Users who have installed &repmgr; from
a package should <emphasis>not</emphasis> rely on <command>pg_ctl</command> to stop/start/restart PostgreSQL;
instead they should set the appropriate <option>service_..._command</option>
for their operating system. For more details see
<xref linkend="configuration-file-service-commands"/>.
</para>
</note>
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-replication-user" xreflabel="replication_user">
<term><varname>replication_user</varname> (<type>string</type>)
<indexterm>
<primary><varname>replication_user</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
PostgreSQL user to make replication connections with.
If not set, defaults to the user defined in <xref linkend="repmgr-conf-conninfo"/>.
</para>
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-replication-type" xreflabel="replication_type">
<term><varname>replication_type</varname> (<type>string</type>)
<indexterm>
<primary><varname>replication_type</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Must be <literal>physical</literal> (the default).
</para>
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-location" xreflabel="location">
<term><varname>location</varname> (<type>string</type>)
<indexterm>
<primary><varname>location</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
An arbitrary string defining the location of the node; this
is used during failover to check visibility of the
current primary node.
</para>
<para>
For more details see <xref linkend="repmgrd-network-split"/>.
</para>
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-use-replication-slots" xreflabel="use_replication_slots">
<term><varname>use_replication_slots</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>use_replication_slots</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Whether to use physical replication slots.
</para>
<note>
<para>
When using replication slots,
<varname>max_replication_slots</varname> should be configured for
at least the number of standbys which will connect
to the primary.
</para>
</note>
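<para>
As an illustrative sketch, assuming a primary to which up to five standbys will connect
(the figure is an assumption chosen for this example), the two settings might look like this:
<programlisting>
# repmgr.conf
use_replication_slots=true

# postgresql.conf on the primary
max_replication_slots=5</programlisting>
</para>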
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-ssh-options" xreflabel="ssh_options">
<term><varname>ssh_options</varname> (<type>string</type>)
<indexterm>
<primary><varname>ssh_options</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Options to append to the <command>ssh</command> command when executed
by &repmgr;.
</para>
<para>
We recommend adding <literal>-q</literal> to suppress any superfluous
SSH chatter such as login banners, and also an explicit
<option>ConnectTimeout</option> value,
e.g.:
<programlisting>
ssh_options='-q -o ConnectTimeout=10'</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-pg-bindir" xreflabel="pg_bindir">
<term><varname>pg_bindir</varname> (<type>string</type>)
<indexterm>
<primary><varname>pg_bindir</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Path to the PostgreSQL binary directory (location of <application>pg_ctl</application>,
<application>pg_basebackup</application> etc.). Only required
if these are not in the system <varname>PATH</varname>.
</para>
<tip>
<para>
When &repmgr; is executed via <application>SSH</application> (e.g. when running
<command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>,
<command><link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link></command> or
<command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command>,
or if it is executed as cronjob), a login shell will not be used and only the
default system <varname>PATH</varname> will be set. Therefore it's recommended to set
<varname>pg_bindir</varname> so &repmgr; can correctly invoke binaries on a remote
system and avoid potential path issues.
</para>
</tip>
<para>
Debian/Ubuntu users: you will probably need to set this to the directory where
<application>pg_ctl</application> is located, e.g. <filename>/usr/lib/postgresql/9.6/bin/</filename>.
</para>
<para>
<emphasis>NOTE</emphasis>: <varname>pg_bindir</varname> is only used when &repmgr; directly
executes PostgreSQL binaries; any user-defined scripts
<emphasis>must</emphasis> be specified with the full path.
</para>
</listitem>
</varlistentry>
</variablelist>
<tip>
<simpara>
See the <ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/master/repmgr.conf.sample">sample repmgr.conf file</ulink>
for a full and annotated view of all configuration options.
</simpara>
</tip>
</sect1>


@@ -1,12 +1,10 @@
<sect1 id="configuration-file-settings" xreflabel="required configuration file settings">
<title>Required configuration file settings</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>required settings</secondary>
</indexterm>
<title>Required configuration file settings</title>
<para>
Each <filename>repmgr.conf</filename> file must contain the following parameters:
</para>
@@ -41,10 +39,6 @@
called <varname>standby1</varname> (for example), things will be confusing
to say the least.
</para>
<para>
The string's maximum length is 63 characters and it should
contain only printable ASCII characters.
</para>
</listitem>
</varlistentry>
@@ -62,7 +56,7 @@
</para>
<para>
For details on conninfo strings, see section <ulink
url="https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING">Connection Strings</ulink>
url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING">Connection Strings</>
in the PostgreSQL documentation.
</para>
<para>
@@ -70,19 +64,19 @@
<varname>connect_timeout</varname> in the <varname>conninfo</varname>
string to determine the length of time which elapses before a network
connection attempt is abandoned; for details see <ulink
url="https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNECT-CONNECT-TIMEOUT">
the PostgreSQL documentation</ulink>.
url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNECT-CONNECT-TIMEOUT">
the PostgreSQL documentation</>.
</para>
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-data-directory" xreflabel="data_directory">
<term><varname>data_directory</varname> (<type>string</type>)</term>
<term><varname>data_directory</varname> (<type>string</type>)
<indexterm>
<primary><varname>data_directory</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<indexterm>
<primary><varname>data_directory</varname> configuration file parameter</primary>
</indexterm>
<para>
The node's data directory. This is needed by repmgr
when performing operations when the PostgreSQL instance
@@ -96,9 +90,33 @@
</variablelist>
</para>
<para>
See <xref linkend="configuration-file-optional-settings"/> for further configuration options.
</para>
<para>
For a full list of annotated configuration items, see the file
<ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>.
</para>
<para>
For <application>repmgrd</application>-specific settings, see <xref linkend="repmgrd-configuration">.
</para>
<note>
<para>
The following parameters in the configuration file can be overridden with
command line options:
<itemizedlist>
<listitem>
<simpara>
<literal>-L/--log-level</literal> overrides <literal>log_level</literal> in
<filename>repmgr.conf</filename>
</simpara>
</listitem>
<listitem>
<simpara>
<literal>-b/--pg_bindir</literal> overrides <literal>pg_bindir</literal> in
<filename>repmgr.conf</filename>
</simpara>
</listitem>
</itemizedlist>
</para>
</note>
</sect1>


@@ -1,6 +1,4 @@
<sect1 id="configuration-file-service-commands" xreflabel="service command settings">
<title>Service command settings</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>service command settings</secondary>
@@ -9,9 +7,10 @@
<primary>service command settings</primary>
<secondary>configuration in repmgr.conf</secondary>
</indexterm>
<title>Service command settings</title>
<para>
In some circumstances, &repmgr; (and &repmgrd;) need to
In some circumstances, &repmgr; (and <application>repmgrd</application>) need to
be able to stop, start or restart PostgreSQL. &repmgr; commands which need to do this
include <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>,
<link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link> and
@@ -27,9 +26,7 @@
<note>
<para>
If using <application>systemd</application>, ensure you have <varname>RemoveIPC</varname> set to <literal>off</literal>.
See the <ulink url="https://www.postgresql.org/docs/current/index.html">PostgreSQL documentation</ulink> section
<ulink url="https://www.postgresql.org/docs/current/kernel-resources.html#SYSTEMD-REMOVEIPC">systemd RemoveIPC</ulink>
and also the <ulink url="https://wiki.postgresql.org/wiki/Systemd">systemd</ulink>
See the <ulink url="https://wiki.postgresql.org/wiki/Systemd">systemd</ulink>
entry in the <ulink url="https://wiki.postgresql.org/wiki/Main_Page">PostgreSQL wiki</ulink> for details.
</para>
</note>
@@ -71,18 +68,18 @@
</para>
<para>
Do not confuse this with <varname>promote_command</varname>, which is used
by &repmgrd; to execute <xref linkend="repmgr-standby-promote"/>.
by <application>repmgrd</application> to execute <xref linkend="repmgr-standby-promote">.
</para>
</note>
<para>
To confirm which command &repmgr; will execute for each action, use
<command><link linkend="repmgr-node-service">repmgr node service --list-actions --action=...</link></command>, e.g.:
<command>repmgr node service --list --action=...</command>, e.g.:
<programlisting>
repmgr -f /etc/repmgr.conf node service --list-actions --action=stop
repmgr -f /etc/repmgr.conf node service --list-actions --action=start
repmgr -f /etc/repmgr.conf node service --list-actions --action=restart
repmgr -f /etc/repmgr.conf node service --list-actions --action=reload</programlisting>
repmgr -f /etc/repmgr.conf node service --list --action=stop
repmgr -f /etc/repmgr.conf node service --list --action=start
repmgr -f /etc/repmgr.conf node service --list --action=restart
repmgr -f /etc/repmgr.conf node service --list --action=reload</programlisting>
</para>
<para>


@@ -0,0 +1,69 @@
<sect1 id="configuration-file" xreflabel="configuration file location">
<indexterm>
<primary>repmgr.conf</primary>
<secondary>location</secondary>
</indexterm>
<indexterm>
<primary>configuration</primary>
<secondary>repmgr.conf location</secondary>
</indexterm>
<title>Configuration file location</title>
<para>
<application>repmgr</application> and <application>repmgrd</application>
use a common configuration file, by default called
<filename>repmgr.conf</filename> (although any name can be used if explicitly specified).
<filename>repmgr.conf</filename> must contain a number of required parameters, including
the database connection string for the local node and the location
of its data directory; other values will be inferred from defaults if
not explicitly supplied. See section <xref linkend="configuration-file-settings">
for more details.
</para>
<para>
The configuration file will be searched for in the following locations:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>a configuration file specified by the <literal>-f/--config-file</literal> command line option</para>
</listitem>
<listitem>
<para>
a location specified by the package maintainer (if <application>repmgr</application>
was installed from a package and the package maintainer has specified the configuration
file location)
</para>
</listitem>
<listitem>
<para><filename>repmgr.conf</filename> in the local directory</para>
</listitem>
<listitem>
<para><filename>/etc/repmgr.conf</filename></para>
</listitem>
<listitem>
<para>the directory reported by <application>pg_config --sysconfdir</application></para>
</listitem>
</itemizedlist>
</para>
<para>
Note that if a file is explicitly specified with <literal>-f/--config-file</literal>,
an error will be raised if it is not found or not readable, and no attempt will be made to
check default locations; this is to prevent <application>repmgr</application> unexpectedly
reading the wrong configuration file.
</para>
<note>
<para>
If providing the configuration file location with <literal>-f/--config-file</literal>,
avoid using a relative path, particularly when executing <xref linkend="repmgr-primary-register">
and <xref linkend="repmgr-standby-register">, as &repmgr; stores the configuration file location
in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
<xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert
a relative path into an absolute one, but this may not be the same as the path you
would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
<filename>/path/to/repmgr.conf</filename>).
</para>
</note>
</sect1>


@@ -1,315 +0,0 @@
<sect1 id="configuration-file" xreflabel="configuration file">
<title>Configuration file</title>
<indexterm>
<primary>repmgr.conf</primary>
</indexterm>
<indexterm>
<primary>configuration</primary>
<secondary>repmgr.conf</secondary>
</indexterm>
<para>
<application>repmgr</application> and &repmgrd;
use a common configuration file, by default called
<filename>repmgr.conf</filename> (although any name can be used if explicitly specified).
<filename>repmgr.conf</filename> must contain a number of required parameters, including
the database connection string for the local node and the location
of its data directory; other values will be inferred from defaults if
not explicitly supplied. See section <xref linkend="configuration-file-settings"/>
for more details.
</para>
<sect2 id="configuration-file-format" xreflabel="configuration file format">
<title>Configuration file format</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>format</secondary>
</indexterm>
<para>
<filename>repmgr.conf</filename> is a plain text file with one parameter/value
combination per line.
</para>
<para>
Whitespace is insignificant (except within a quoted parameter value) and blank lines are ignored.
Hash marks (<literal>#</literal>) designate the remainder of the line as a comment.
Parameter values that are not simple identifiers or numbers should be single-quoted.
</para>
<para>
To embed a single quote in a parameter value, write either two quotes (preferred) or backslash-quote.
</para>
<para>
Example of a valid <filename>repmgr.conf</filename> file:
<programlisting>
# repmgr.conf
node_id=1
node_name= node1
conninfo ='host=node1 dbname=repmgr user=repmgr connect_timeout=2'
data_directory = '/var/lib/pgsql/12/data'</programlisting>
</para>
<note>
<para>
Beginning with <link linkend="release-5.0">repmgr 5.0</link>, configuration
file parsing has been tightened up and now matches the way PostgreSQL
itself parses configuration files.
</para>
<para>
This means <filename>repmgr.conf</filename> files used with earlier &repmgr;
versions may need slight modification before they can be used with &repmgr; 5
and later.
</para>
<para>
The main change is that &repmgr; requires most string values to be
enclosed in single quotes. For example, this was previously valid:
<programlisting>
conninfo=host=node1 user=repmgr dbname=repmgr connect_timeout=2</programlisting>
but must now be changed to:
<programlisting>
conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlisting>
</para>
</note>
<sect3 id="configuration-file-include-directives" xreflabel="configuration file include directives">
<title>Configuration file include directives</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>include directives</secondary>
</indexterm>
<para>
From &repmgr; 5.2, the configuration file can contain the following include directives:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>include</option>: include the specified file,
either as an absolute path or path relative to the current file
</simpara>
</listitem>
<listitem>
<simpara>
<option>include_if_exists</option>: include the specified file.
The file is specified as an absolute path or path relative to the current file.
However, if it does not exist, an error will not be raised.
</simpara>
</listitem>
<listitem>
<simpara>
<option>include_dir</option>: include files in the specified directory
which have the <filename>.conf</filename> suffix.
The directory is specified either as an absolute path or path
relative to the current file
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
These behave in exactly the same way as the PostgreSQL configuration file processing;
see the <ulink url="https://www.postgresql.org/docs/current/config-setting.html#CONFIG-INCLUDES">PostgreSQL documentation</ulink>
for additional details.
</para>
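<para>
As a minimal sketch, assuming the directive syntax matches that of PostgreSQL as described above,
a <filename>repmgr.conf</filename> file using all three directives might look like this
(the included file and directory names are hypothetical):
<programlisting>
# repmgr.conf
node_id=1
node_name='node1'

# hypothetical file and directory names, given relative to this file;
# the remaining required settings are assumed to be provided by the included files
include 'repmgr.common.conf'
include_if_exists 'repmgr.local.conf'
include_dir 'conf.d'</programlisting>
</para>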
</sect3>
</sect2>
<sect2 id="configuration-file-items" xreflabel="configuration file items">
<title>Configuration file items</title>
<para>
The following sections document groups of configuration file settings:
<itemizedlist>
<listitem>
<simpara>
<xref linkend="configuration-file-settings"/>
</simpara>
</listitem>
<listitem>
<simpara>
<xref linkend="configuration-file-optional-settings"/>
</simpara>
</listitem>
<listitem>
<simpara>
<xref linkend="configuration-file-log-settings"/>
</simpara>
</listitem>
<listitem>
<simpara>
<xref linkend="configuration-file-service-commands"/>
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
For a full list of annotated configuration items, see the file
<ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>.
</para>
<para>
For &repmgrd;-specific settings, see <xref linkend="repmgrd-configuration"/>.
</para>
<note>
<para>
The following parameters in the configuration file can be overridden with
command line options:
<itemizedlist>
<listitem>
<simpara>
<literal>-L/--log-level</literal> overrides <literal>log_level</literal> in
<filename>repmgr.conf</filename>
</simpara>
</listitem>
<listitem>
<simpara>
<literal>-b/--pg_bindir</literal> overrides <literal>pg_bindir</literal> in
<filename>repmgr.conf</filename>
</simpara>
</listitem>
</itemizedlist>
</para>
</note>
</sect2>
<sect2 id="configuration-file-location" xreflabel="configuration file location">
<title>Configuration file location</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>location</secondary>
</indexterm>
<para>
The configuration file will be searched for in the following locations:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>a configuration file specified by the <literal>-f/--config-file</literal> command line option</para>
</listitem>
<listitem>
<para>
a location specified by the package maintainer (if <application>repmgr</application>
was installed from a package and the package maintainer has specified the configuration
file location)
</para>
</listitem>
<listitem>
<para><filename>repmgr.conf</filename> in the local directory</para>
</listitem>
<listitem>
<para><filename>/etc/repmgr.conf</filename></para>
</listitem>
<listitem>
<para>the directory reported by <application>pg_config --sysconfdir</application></para>
</listitem>
</itemizedlist>
</para>
<para>
In examples provided in this documentation, it is assumed the configuration file is located
at <filename>/etc/repmgr.conf</filename>. If &repmgr; is installed from a package, the
configuration file will probably be located at another location specified by the packager;
see appendix <xref linkend="appendix-packages"/> for configuration file locations in
different packaging systems.
</para>
<para>
Note that if a file is explicitly specified with <literal>-f/--config-file</literal>,
an error will be raised if it is not found or not readable, and no attempt will be made to
check default locations; this is to prevent <application>repmgr</application> unexpectedly
reading the wrong configuration file.
</para>
<note>
<para>
If providing the configuration file location with <literal>-f/--config-file</literal>,
avoid using a relative path, particularly when executing <xref linkend="repmgr-primary-register"/>
and <xref linkend="repmgr-standby-register"/>, as &repmgr; stores the configuration file location
in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
<xref linkend="repmgr-standby-switchover"/>). &repmgr; will attempt to convert
a relative path into an absolute one, but this may not be the same as the path you
would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
<filename>/path/to/repmgr.conf</filename>).
</para>
</note>
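<para>
For example, assuming the configuration file is located at <filename>/etc/repmgr.conf</filename>
(as in the other examples in this documentation), the absolute path would be supplied
when registering a standby like this:
<programlisting>
repmgr -f /etc/repmgr.conf standby register</programlisting>
</para>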
</sect2>
<sect2 id="configuration-file-postgresql-major-upgrades" xreflabel="configuration file and PostgreSQL major version upgrades">
<title>Configuration file and PostgreSQL major version upgrades</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>PostgreSQL major version upgrades</secondary>
</indexterm>
<para>
When upgrading the PostgreSQL cluster to a new major version, <filename>repmgr.conf</filename>
will probably need to be updated.
</para>
<para>
Usually <option>pg_bindir</option> and <option>data_directory</option> will need to be modified,
particularly if the default package locations are used, as these usually change.
</para>
<para>
It's also possible the location of <filename>repmgr.conf</filename> itself will change
(e.g. from <filename>/etc/repmgr/11/repmgr.conf</filename> to <filename>/etc/repmgr/12/repmgr.conf</filename>).
This is stored as part of the &repmgr; metadata and is used by &repmgr; to execute &repmgr; remotely
(e.g. during a <link linkend="performing-switchover">switchover operation</link>).
</para>
<para>
If the content and/or location of <filename>repmgr.conf</filename> has changed, the &repmgr; metadata
needs to be updated to reflect this. The &repmgr; metadata can be updated on each node with:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<link linkend="repmgr-primary-register">
<command>repmgr primary register --force -f /path/to/repmgr.conf</command>
</link>
</simpara>
</listitem>
<listitem>
<simpara>
<link linkend="repmgr-standby-register">
<command>repmgr standby register --force -f /path/to/repmgr.conf</command>
</link>
</simpara>
</listitem>
<listitem>
<simpara>
<link linkend="repmgr-witness-register">
<command>repmgr witness register --force -f /path/to/repmgr.conf -h primary_host</command>
</link>
</simpara>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>


@@ -1,175 +0,0 @@
<sect1 id="configuration-password-management" xreflabel="password management">
<title>Password Management</title>
<indexterm>
<primary>passwords</primary>
</indexterm>
<sect2 id="configuration-password-management-options" xreflabel="password management options">
<title>Password Management Options</title>
<indexterm>
<primary>passwords</primary>
<secondary>options for managing</secondary>
</indexterm>
<para>
For security purposes it's desirable to protect database access using a password.
</para>
<para>
PostgreSQL has three ways of providing a password:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
including the password in the <option>conninfo</option> string
(e.g. &quot;<literal>host=node1 dbname=repmgr user=repmgr password=foo</literal>&quot;)
</simpara>
</listitem>
<listitem>
<simpara>
exporting the password as an environment variable (<envar>PGPASSWORD</envar>)
</simpara>
</listitem>
<listitem>
<simpara>
storing the password in a dedicated password file
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
We strongly advise against including the password in the <option>conninfo</option> string, as
this will result in the database password being exposed in various places, including in the
<filename>repmgr.conf</filename> file, the <literal>repmgr.nodes</literal> table, any output
generated by &repmgr; which lists the node <option>conninfo</option> strings (e.g.
<link linkend="repmgr-cluster-show">repmgr cluster show</link>) and in the &repmgr; log file,
particularly at <option>log_level=DEBUG</option>.
</para>
<note>
<para>
Currently &repmgr; does not fully support use of the <option>password</option> option in the
<option>conninfo</option> string.
</para>
</note>
<para>
Exporting the password as an environment variable (<envar>PGPASSWORD</envar>) is considered
less insecure, but the PostgreSQL documentation explicitly recommends against doing this:
<blockquote>
<attribution><ulink url="https://www.postgresql.org/docs/current/libpq-envars.html">Environment Variables</ulink></attribution>
<para>
<envar>PGPASSWORD</envar> behaves the same as the <option>password</option>
connection parameter. Use of this environment variable
is not recommended for security reasons, as some operating systems
allow non-root users to see process environment variables via
<application>ps</application>; instead consider using a password file.
</para>
</blockquote>
</para>
<para>
The most secure option for managing passwords is to use a dedicated password file; see the following
section for more details.
</para>
</sect2>
<sect2 id="configuration-password-file" xreflabel="password file">
<title>Using a password file</title>
<indexterm>
<primary>pgpass</primary>
</indexterm>
<indexterm>
<primary>.pgpass</primary>
</indexterm>
<indexterm>
<primary>passwords</primary>
<secondary>using a password file</secondary>
</indexterm>
<para>
The most secure way of storing passwords is in a password file,
which by default is <filename>~/.pgpass</filename>. This file
can only be read by the system user who owns the file, and
PostgreSQL will refuse to use the file unless read/write
permissions are restricted to the file owner. The password(s)
contained in the file will not be directly accessed by
&repmgr; (or any other libpq-based client software such as <application>psql</application>).
</para>
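<para>
As a sketch of the permission requirement described above, the file can be restricted
to its owner like this (assuming the default location, and that the command is executed
as the system user who owns the file):
<programlisting>
chmod 0600 ~/.pgpass</programlisting>
</para>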
<para>
For full details see the
<ulink url="https://www.postgresql.org/docs/current/libpq-pgpass.html">PostgreSQL password file documentation</ulink>.
</para>
<para>
For use with &repmgr;, the <filename>~/.pgpass</filename> file must contain two entries for each
node in the replication cluster: one for the &repmgr; user who accesses the &repmgr; metadatabase,
and one for replication connections (regardless of whether a dedicated replication user is used).
The file must be present on each node in the replication cluster.
</para>
<para>
A <filename>~/.pgpass</filename> file for a 3-node cluster where the <literal>repmgr</literal> database user
is used both for accessing the &repmgr; metadatabase and for replication connections would look like this:
<programlisting>
node1:5432:repmgr:repmgr:foo
node1:5432:replication:repmgr:foo
node2:5432:repmgr:repmgr:foo
node2:5432:replication:repmgr:foo
node3:5432:repmgr:repmgr:foo
node3:5432:replication:repmgr:foo</programlisting>
If a dedicated replication user (here: <literal>repluser</literal>) is in use, the file would look like this:
<programlisting>
node1:5432:repmgr:repmgr:foo
node1:5432:replication:repluser:foo
node2:5432:repmgr:repmgr:foo
node2:5432:replication:repluser:foo
node3:5432:repmgr:repmgr:foo
node3:5432:replication:repluser:foo</programlisting>
If you are planning to use the <option>-S</option>/<option>--superuser</option> option,
there must also be an entry enabling the superuser to connect to the &repmgr; database.
Assuming the superuser is <literal>postgres</literal>, the file would look like this:
<programlisting>
node1:5432:repmgr:repmgr:foo
node1:5432:repmgr:postgres:foo
node1:5432:replication:repluser:foo
node2:5432:repmgr:repmgr:foo
node2:5432:repmgr:postgres:foo
node2:5432:replication:repluser:foo
node3:5432:repmgr:repmgr:foo
node3:5432:repmgr:postgres:foo
node3:5432:replication:repluser:foo</programlisting>
</para>
<para>
The <filename>~/.pgpass</filename> file can be simplified with the use of wildcards if
there is no requirement to restrict provision of passwords to particular hosts, ports
or databases. The preceding file could then be formatted like this:
<programlisting>
*:*:*:repmgr:foo
*:*:*:postgres:foo
</programlisting>
</para>
<note>
<para>
It's possible to specify an alternative location for the <filename>~/.pgpass</filename> file, either via
the environment variable <envar>PGPASSFILE</envar>, or (from PostgreSQL 9.6) using the
<varname>passfile</varname> parameter in connection strings.
</para>
<para>
If using the <varname>passfile</varname> parameter, it's essential to ensure the file is in the same
location on all nodes, as when connecting to a remote node, the file referenced is the one on the
local node.
</para>
<para>
Additionally, you <emphasis>must</emphasis> specify the passfile location in <filename>repmgr.conf</filename>
with the <option>passfile</option> option so &repmgr; can write the correct path when creating the
<option>primary_conninfo</option> parameter for replication configuration on standbys.
</para>
</note>
</sect2>
</sect1>


@@ -1,194 +0,0 @@
<sect1 id="configuration-permissions" xreflabel="Database user permissions">
<title>repmgr database user permissions</title>
<indexterm>
<primary>configuration</primary>
<secondary>database user permissions</secondary>
</indexterm>
<para>
If the &repmgr; database user (the PostgreSQL user defined in the
<varname>conninfo</varname> setting) is a superuser, no further user permissions need
to be granted.
</para>
<sect2 id="configuration-permissions-no-superuser" xreflabel="Non-super user permissions">
<title>repmgr user as a non-superuser</title>
<para>
In principle the &repmgr; database user does not need to be a superuser.
In this case the &repmgr; user will need to be granted execution permissions on certain
functions, and membership of certain roles. However, be aware that &repmgr; does
expect to be able to execute certain commands which are restricted to superusers;
in this case either a superuser must be specified with the <option>-S</option>/<option>--superuser</option>
option (where available), or the corresponding action should be executed manually as a superuser.
</para>
<para>
The following sections describe the actions needed to use &repmgr; with a non-superuser,
and relevant caveats.
</para>
<sect3 id="configuration-permissions-replication" xreflabel="Replication role">
<title>Replication role</title>
<para>
&repmgr; requires a database user with the <literal>REPLICATION</literal> role
to be able to create a replication connection and (if configured) to administer
replication slots.
</para>
<para>
By default this is the database user defined in the <varname>conninfo</varname>
setting. This user can be:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
a superuser
</simpara>
</listitem>
<listitem>
<simpara>
a non-superuser with the <literal>REPLICATION</literal> role
</simpara>
</listitem>
<listitem>
<simpara>
another user defined in the <filename>repmgr.conf</filename> parameter <varname>replication_user</varname> with the <literal>REPLICATION</literal> role
</simpara>
</listitem>
</itemizedlist>
</para>
</sect3>
<sect3 id="configuration-permissions-roles" xreflabel="Database roles for non-superusers">
<title>Database roles</title>
<para>
A non-superuser &repmgr; database user should be a member of the following
<ulink url="https://www.postgresql.org/docs/current/predefined-roles.html">predefined roles</ulink>
(PostgreSQL 10 and later):
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<varname>pg_read_all_stats</varname>
(to read the <varname>status</varname> column of <literal>pg_stat_replication</literal>
and execute <function>pg_database_size()</function> on all databases)
</simpara>
</listitem>
<listitem>
<simpara>
<varname>pg_read_all_settings</varname> (to access the <varname>data_directory</varname> setting)
</simpara>
</listitem>
</itemizedlist>
Alternatively the meta-role <varname>pg_monitor</varname> can be granted, which includes membership
of the above predefined roles.
</para>
<para>
Membership of these roles can be granted with e.g. <command>GRANT pg_read_all_stats TO repmgr</command>.
</para>
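<para>
As a sketch, assuming the &repmgr; database user is named <literal>repmgr</literal>
and does not yet exist, a non-superuser with the replication attribute and the role
memberships described above could be set up like this:
<programlisting>
-- the role name "repmgr" is an assumption; use the user defined in conninfo
CREATE ROLE repmgr WITH LOGIN REPLICATION;
GRANT pg_read_all_stats, pg_read_all_settings TO repmgr;
-- alternatively: GRANT pg_monitor TO repmgr;</programlisting>
</para>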
<para>
Users of PostgreSQL 9.6 or earlier should upgrade to a supported PostgreSQL version, or provide
the <option>-S</option>/<option>--superuser</option> option where available.
</para>
</sect3>
<sect3 id="configuration-permissions-extension" xreflabel="Extension creation">
<title>Extension creation</title>
<para>
&repmgr; requires that the database defined in the <varname>conninfo</varname>
setting contains the <literal>repmgr</literal> extension. The database user defined in the
<varname>conninfo</varname> setting must be able to access this database and
the database objects contained within the extension.
</para>
<para>
The <literal>repmgr</literal> extension can only be installed by a superuser.
If the &repmgr; user is a superuser, &repmgr; will create the extension automatically.
</para>
<para>
Alternatively, the extension can be created manually by a superuser
(with &quot;<command>CREATE EXTENSION repmgr</command>&quot;) before executing
<link linkend="repmgr-primary-register">repmgr primary register</link>.
</para>
</sect3>
<sect3 id="configuration-permissions-functions" xreflabel="Function permissions for non-superusers">
<title>Function permissions</title>
<para>
If the &repmgr; database user is not a superuser, <literal>EXECUTE</literal> permission should be
granted on the following functions:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<function>pg_wal_replay_resume()</function> (required by &repmgrd; during failover operations;
if permission is not granted, the failover process may not function reliably if a node
has WAL replay paused)
</simpara>
</listitem>
<listitem>
<simpara>
<function>pg_promote()</function> (PostgreSQL 12 and later; if permission is not granted,
&repmgr; will fall back to <command>pg_ctl promote</command>)
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
<literal>EXECUTE</literal> permission on functions can be granted with e.g.:
<command>GRANT EXECUTE ON FUNCTION pg_catalog.pg_wal_replay_resume() TO repmgr</command>.
</para>
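<para>
For PostgreSQL 12 and later, the corresponding grant for <function>pg_promote()</function>
might look like this (a sketch: the argument types are those of the function as defined
in PostgreSQL core, and the role name <literal>repmgr</literal> is an assumption):
<programlisting>
GRANT EXECUTE ON FUNCTION pg_catalog.pg_promote(boolean, integer) TO repmgr;</programlisting>
</para>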
</sect3>
<sect3 id="configuration-permissions-superuser-required" xreflabel="repmgr actions requiring a superuser">
<title>repmgr actions requiring a superuser</title>
<para>
In some circumstances, &repmgr; may need to perform an operation which cannot be delegated to a
non-superuser.
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
The <command>CHECKPOINT</command> command is executed by
<link linkend="repmgr-standby-switchover">repmgr standby switchover</link>. This can only
be executed by a superuser; if the &repmgr; user is not a superuser,
the <option>-S</option>/<option>--superuser</option> option should be used.
</simpara>
<simpara>
If &repmgr; is not able to execute <command>CHECKPOINT</command>,
there is a risk that the demotion candidate may not be able to shut down as smoothly as might otherwise
have been the case.
</simpara>
</listitem>
<listitem>
<simpara>
The <command>ALTER SYSTEM</command> command is executed by &repmgrd; if
<varname>standby_disconnect_on_failover</varname> is set to <literal>true</literal> in
<filename>repmgr.conf</filename>. Up to and including PostgreSQL 14, <command>ALTER SYSTEM</command> can only be executed by
a superuser; if the &repmgr; user is not a superuser, this functionality will not be available.
From PostgreSQL 15, a specific <command>ALTER SYSTEM</command> privilege can be granted with e.g.
<command>GRANT ALTER SYSTEM ON PARAMETER wal_retrieve_retry_interval TO repmgr</command>.
</simpara>
</listitem>
</itemizedlist>
</para>
</sect3>
<sect3 id="configuration-permissions-superuser-option" xreflabel="repmgr commands with --superuser option">
<title>repmgr commands with --superuser option</title>
<para>
The following repmgr commands provide the <option>-S</option>/<option>--superuser</option> option:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><link linkend="repmgr-standby-clone">repmgr standby clone</link> (to be able to copy configuration files outside of the data directory if <option>--copy-external-config-files</option> provided)</simpara>
</listitem>
<listitem>
<simpara><link linkend="repmgr-standby-switchover">repmgr standby switchover</link> (to execute <command>CHECKPOINT</command>)</simpara>
</listitem>
<listitem>
<simpara><link linkend="repmgr-node-check">repmgr node check</link> (to execute <command>repmgr node check --data-directory-config</command>; note this is also called by <link linkend="repmgr-standby-switchover">repmgr standby switchover</link>)</simpara>
</listitem>
<listitem>
<simpara><link linkend="repmgr-node-service">repmgr node service</link> (to execute <command>CHECKPOINT</command> via the <option>--checkpoint</option> option; note this is also called by <link linkend="repmgr-standby-switchover">repmgr standby switchover</link>)</simpara>
</listitem>
</itemizedlist>
</para>
</sect3>
</sect2>
</sect1>

26
doc/configuration.sgml Normal file

@@ -0,0 +1,26 @@
<chapter id="configuration" xreflabel="Configuration">
<title>repmgr configuration</title>
&configuration-file;
&configuration-file-required-settings;
&configuration-file-log-settings;
&configuration-file-service-commands;
<sect1 id="configuration-permissions" xreflabel="Database user permissions">
<indexterm>
<primary>configuration</primary>
<secondary>database user permissions</secondary>
</indexterm>
<title>repmgr database user permissions</title>
<para>
&repmgr; will create an extension database containing objects
for administering &repmgr; metadata. The user defined in the <varname>conninfo</varname>
setting must be able to access all objects. Additionally, superuser permissions
are required to install the &repmgr; extension. The easiest way to do this
is to create the &repmgr; user as a superuser; however, if this is not
desirable, the &repmgr; user can be created as a normal user and a
superuser specified with <literal>--superuser</literal> when registering a &repmgr; node.
</para>
</sect1>
</chapter>


@@ -1,335 +0,0 @@
<chapter id="configuration" xreflabel="Configuration">
<title>repmgr configuration</title>
<sect1 id="configuration-prerequisites" xreflabel="Prerequisites for configuration">
<title>Prerequisites for configuration</title>
<indexterm>
<primary>configuration</primary>
<secondary>prerequisites</secondary>
</indexterm>
<indexterm>
<primary>configuration</primary>
<secondary>ssh</secondary>
</indexterm>
<para>
The following software must be installed on all servers:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><application>PostgreSQL</application></simpara>
</listitem>
<listitem>
<simpara>
<application>repmgr</application>
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
At network level, connections to the PostgreSQL port (default: <literal>5432</literal>)
must be possible between all nodes.
</para>
<para>
Passwordless <command>SSH</command> connectivity between all servers in the replication cluster
is not required, but is necessary in the following cases:
<itemizedlist>
<listitem>
<simpara>if you need &repmgr; to copy configuration files from outside the PostgreSQL
data directory (as is the case with e.g. <link linkend="packages-debian-ubuntu">Debian packages</link>);
in this case <command>rsync</command> must also be installed on all servers.
</simpara>
</listitem>
<listitem>
<simpara>to perform <link linkend="performing-switchover">switchover operations</link></simpara>
</listitem>
<listitem>
<simpara>
when executing <command><link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link></command>
and <command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command>
</simpara>
</listitem>
</itemizedlist>
</para>
<tip>
<simpara>
Consider setting <varname>ConnectTimeout</varname> to a low value in your SSH configuration.
This will make it faster to detect any SSH connection errors.
</simpara>
</tip>
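<para>
For example, a minimal sketch of a per-user SSH client configuration (typically
<filename>~/.ssh/config</filename>); the host names and the timeout value are assumptions:
<programlisting>
# hypothetical node names; ConnectTimeout is specified in seconds
Host node1 node2 node3
    ConnectTimeout 5</programlisting>
</para>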
<sect2 id="configuration-postgresql" xreflabel="PostgreSQL configuration">
<title>PostgreSQL configuration for &repmgr;</title>
<indexterm>
<primary>configuration</primary>
<secondary>PostgreSQL</secondary>
</indexterm>
<indexterm>
<primary>PostgreSQL configuration</primary>
</indexterm>
<para>
The following PostgreSQL configuration parameters may need to be changed in order
for &repmgr; (and replication itself) to function correctly.
</para>
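<para>
As an illustrative sketch, the settings described below might be combined in
<filename>postgresql.conf</filename> as follows; the numeric values are assumptions
and should be sized for the actual replication cluster:
<programlisting>
hot_standby = on
wal_level = replica
max_wal_senders = 10          # illustrative value
max_replication_slots = 10    # illustrative value; only needed if replication slots are used
wal_log_hints = on            # consider if using pg_rewind without data checksums
archive_mode = on
archive_command = '/bin/true'</programlisting>
</para>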
<variablelist>
<varlistentry>
<term><option>hot_standby</option></term>
<listitem>
<indexterm>
<primary>hot_standby</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<para>
<option>hot_standby</option> must always be set to <literal>on</literal>, as &repmgr; needs
to be able to connect to each server it manages.
</para>
<para>
Note that <option>hot_standby</option> defaults to <literal>on</literal> from PostgreSQL 10
and later; in PostgreSQL 9.6 and earlier, the default was <literal>off</literal>.
</para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-HOT-STANDBY">hot_standby</ulink>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>wal_level</option></term>
<listitem>
<indexterm>
<primary>wal_level</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<para>
<option>wal_level</option> must be one of <option>replica</option> or <option>logical</option>
(PostgreSQL 9.5 and earlier: one of <option>hot_standby</option> or <option>logical</option>).
</para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-LEVEL">wal_level</ulink>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>max_wal_senders</option></term>
<listitem>
<indexterm>
<primary>max_wal_senders</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<para>
<option>max_wal_senders</option> must be set to a value of <literal>2</literal> or greater.
In general you will need one WAL sender for each standby which will attach to the PostgreSQL
instance; additionally &repmgr; will require two free WAL senders in order to clone further
standbys.
</para>
<para>
<option>max_wal_senders</option> should be set to an appropriate value on all PostgreSQL
instances in the replication cluster which may potentially become a primary server or
(in cascading replication) the upstream server of a standby.
</para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-WAL-SENDERS">max_wal_senders</ulink>.
</para>
<note>
<para>
From <productname>PostgreSQL 12</productname>, <option>max_wal_senders</option> on a standby
<emphasis>must</emphasis> be set to a value equal to or higher than that on the primary node
(at the time the node was cloned), otherwise the standby will refuse
to start (unless <option>hot_standby</option> is set to <literal>off</literal>, which
will prevent the node from accepting queries).
</para>
</note>
</listitem>
</varlistentry>
<varlistentry>
<term><option>max_replication_slots</option></term>
<listitem>
<indexterm>
<primary>max_replication_slots</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<para>
If you are intending to use replication slots, <option>max_replication_slots</option>
must be set to a non-zero value.
</para>
<para>
<option>max_replication_slots</option> should be set to an appropriate value on all PostgreSQL
instances in the replication cluster which may potentially become a primary server or
(in cascading replication) the upstream server of a standby.
</para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-REPLICATION-SLOTS">max_replication_slots</ulink>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>wal_log_hints</option></term>
<listitem>
<indexterm>
<primary>wal_log_hints</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<para>If you are intending to use <application>pg_rewind</application>,
and the cluster was not initialised using data checksums, you may want to consider enabling
<option>wal_log_hints</option>.
</para>
<para>
For more details see <xref linkend="repmgr-node-rejoin-pg-rewind"/>.
</para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-LOG-HINTS">wal_log_hints</ulink>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>archive_mode</option></term>
<listitem>
<indexterm>
<primary>archive_mode</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<para>
We suggest setting <option>archive_mode</option> to <literal>on</literal> (and
<option>archive_command</option> to <literal>/bin/true</literal>; see below)
even if you are currently not planning to use WAL file archiving.
</para>
<para>
This will make it simpler to set up WAL file archiving if it is ever required,
as changes to <option>archive_mode</option> require a full PostgreSQL server
restart, while <option>archive_command</option> changes can be applied via a normal
configuration reload.
</para>
<para>
However, &repmgr; itself does not require WAL file archiving.
</para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-MODE">archive_mode</ulink>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>archive_command</option></term>
<listitem>
<indexterm>
<primary>archive_command</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<para>
If you have set <option>archive_mode</option> to <literal>on</literal> but are not currently planning
to use WAL file archiving, set <option>archive_command</option> to a command which does nothing but returns
<literal>true</literal>, such as <command>/bin/true</command>. See above for details.
</para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-COMMAND">archive_command</ulink>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>wal_keep_segments</option> / <option>wal_keep_size</option></term>
<listitem>
<indexterm>
<primary>wal_keep_segments</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<indexterm>
<primary>wal_keep_size</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<para>
Normally there is no need to set <option>wal_keep_segments</option>
(PostgreSQL 13 and later: <varname>wal_keep_size</varname>; default: <literal>0</literal>),
as it is <emphasis>not</emphasis> a reliable way of ensuring that all required WAL
segments are available to standbys. Replication slots and/or an archiving solution
such as Barman are recommended to ensure standbys have a reliable
source of WAL segments at all times.
</para>
<para>
The only reason ever to set <option>wal_keep_segments</option> / <option>wal_keep_size</option>
is if you have configured <option>pg_basebackup_options</option>
in <filename>repmgr.conf</filename> to include the setting <literal>--wal-method=fetch</literal>
(PostgreSQL 9.6 and earlier: <literal>--xlog-method=fetch</literal>)
<emphasis>and</emphasis> you have <emphasis>not</emphasis> set <option>restore_command</option>
in <filename>repmgr.conf</filename> to fetch WAL files from a reliable source such as Barman,
in which case you'll need to set <option>wal_keep_segments</option>
to a sufficiently high number to ensure that all WAL files required by the standby
are retained. However, we do not recommend retaining WAL in this way.
</para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-WAL-KEEP-SEGMENTS">wal_keep_segments</ulink>.
<!--
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-WAL-KEEP-SIZE">wal_keep_size</ulink>.
-->
</para>
</listitem>
</varlistentry>
</variablelist>
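<para>
As a minimal sketch, the settings described above might be combined in
<filename>postgresql.conf</filename> as follows. The numeric values are
illustrative assumptions rather than recommendations, and should be sized
for the cluster in question:
<programlisting>
# illustrative values only; adjust for your environment
max_wal_senders = 10           # at least 2; one per attached standby, plus spare senders for cloning
max_replication_slots = 10     # a non-zero value is only needed if replication slots are used
wal_log_hints = on             # relevant for pg_rewind if data checksums are not enabled
archive_mode = on              # changing this later requires a server restart
archive_command = '/bin/true'  # placeholder which does nothing and returns true</programlisting>
</para>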
<para>
See also the <link linkend="quickstart-postgresql-configuration">PostgreSQL configuration</link> section in the
<link linkend="quickstart">Quick-start guide</link>.
</para>
</sect2>
</sect1>
&configuration-file;
&configuration-file-required-settings;
&configuration-file-optional-settings;
&configuration-file-log-settings;
&configuration-file-service-commands;
&configuration-permissions;
&configuration-password-management;
</chapter>

@@ -0,0 +1,93 @@
<chapter id="using-witness-server">
<indexterm>
<primary>witness server</primary>
<seealso>Using a witness server with repmgrd</seealso>
</indexterm>
<title>Using a witness server</title>
<para>
A <xref linkend="witness-server"> is a normal PostgreSQL instance which
is not part of the streaming replication cluster; its purpose is, if a
failover situation occurs, to provide proof that the primary server
itself is unavailable.
</para>
<para>
A typical use case for a witness server is a two-node streaming replication
setup, where the primary and standby are in different locations (data centres).
By creating a witness server in the same location (data centre) as the primary,
if the primary becomes unavailable it's possible for the standby to decide whether
it can promote itself without risking a "split brain" scenario: if it can't see either the
witness or the primary server, it's likely there's a network-level interruption
and it should not promote itself. If it can see the witness but not the primary,
this proves there is no network interruption and the primary itself is unavailable,
and it can therefore promote itself (and ideally take action to fence the
former primary).
</para>
<note>
<para>
<emphasis>Never</emphasis> install a witness server on the same physical host
as another node in the replication cluster managed by &repmgr; - it's essential
the witness is not affected in any way by failure of another node.
</para>
</note>
<para>
For more complex replication scenarios, e.g. with multiple datacentres, it may
be preferable to use location-based failover, which ensures that only nodes
in the same location as the primary will ever be promotion candidates;
see <xref linkend="repmgrd-network-split"> for more details.
</para>
<note>
<simpara>
A witness server will only be useful if <application>repmgrd</application>
is in use.
</simpara>
</note>
<sect1 id="creating-witness-server">
<title>Creating a witness server</title>
<para>
To create a witness server, set up a normal PostgreSQL instance on a server
in the same physical location as the cluster's primary server.
</para>
<para>
This instance should <emphasis>not</emphasis> be on the same physical host as the primary server,
as otherwise if the primary server fails due to hardware issues, the witness
server will be lost too.
</para>
<note>
<simpara>
&repmgr; 3.3 and earlier provided a <command>repmgr create witness</command>
command, which would automatically create a PostgreSQL instance. However
this often resulted in an unsatisfactory, hard-to-customise instance.
</simpara>
</note>
<para>
The witness server should be configured in the same way as a normal
&repmgr; node; see section <xref linkend="configuration">.
</para>
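<para>
As a rough sketch, a witness server's <filename>repmgr.conf</filename> might
contain the same basic settings as any other node. The node ID, node name,
host name and data directory shown here are hypothetical and need to be
replaced with values matching your environment:
<programlisting>
# hypothetical values for illustration only
node_id=3
node_name='witness1'
conninfo='host=witness1 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'</programlisting>
</para>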
<para>
Register the witness server with <xref linkend="repmgr-witness-register">.
This will create the &repmgr; extension on the witness server, and make
a copy of the &repmgr; metadata.
</para>
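<para>
For illustration, assuming the cluster's primary server is reachable under the
hypothetical host name <literal>node1</literal> and the configuration file is
located at <filename>/etc/repmgr.conf</filename>, the registration command
executed on the witness server might look like this:
<programlisting>
$ repmgr -f /etc/repmgr.conf witness register -h node1</programlisting>
</para>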
<note>
<simpara>
As the witness server is not part of the replication cluster, further
changes to the &repmgr; metadata will be synchronised by
<application>repmgrd</application>.
</simpara>
</note>
<para>
Once the witness server has been configured, <application>repmgrd</application>
should be started; for more details see <xref linkend="repmgrd-witness-server">.
</para>
<para>
To unregister a witness server, use <xref linkend="repmgr-witness-unregister">.
</para>
</sect1>
</chapter>

@@ -1,12 +1,12 @@
<chapter id="event-notifications" xreflabel="event notifications">
<title>Event Notifications</title>
<indexterm>
<primary>event notifications</primary>
</indexterm>
<title>Event Notifications</title>
<para>
Each time &repmgr; or &repmgrd; perform a significant event, a record
Each time &repmgr; or <application>repmgrd</application> perform a significant event, a record
of that event is written into the <literal>repmgr.events</literal> table together with
a timestamp, an indication of failure or success, and further details
if appropriate. This is useful for gaining an overview of events
@@ -27,7 +27,7 @@
(3 rows)</programlisting>
</para>
<para>
Alternatively, use <xref linkend="repmgr-cluster-event"/> to output a
Alternatively, use <xref linkend="repmgr-cluster-event"> to output a
formatted list of events.
</para>
<para>
@@ -88,15 +88,15 @@
<para>
The values provided for <literal>%t</literal> and <literal>%d</literal>
may contain spaces, so should be quoted in the provided command
will probably contain spaces, so should be quoted in the provided command
configuration, e.g.:
<programlisting>
event_notification_command='/path/to/some/script %n %e %s "%t" "%d"'</programlisting>
event_notification_command='/path/to/some/script %n %e %s "%t" "%d"'
</programlisting>
</para>
<para>
The following parameters are provided for a subset of event notifications; their meaning may
change according to context:
The following parameters are provided for a subset of event notifications:
</para>
<variablelist>
@@ -104,13 +104,10 @@
<term><option>%p</option></term>
<listitem>
<para>
node ID of the current primary (<xref linkend="repmgr-standby-register"/> and <xref linkend="repmgr-standby-follow"/>)
node ID of the current primary (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
</para>
<para>
node ID of the demoted primary (<xref linkend="repmgr-standby-switchover"/> only)
</para>
<para>
node ID of the former primary (<literal>repmgrd_failover_promote</literal> only)
node ID of the demoted primary (<xref linkend="repmgr-standby-switchover"> only)
</para>
</listitem>
</varlistentry>
@@ -119,7 +116,11 @@
<listitem>
<para>
<literal>conninfo</literal> string of the primary node
(<xref linkend="repmgr-standby-register"/> and <xref linkend="repmgr-standby-follow"/>)
(<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
</para>
<para>
<literal>conninfo</literal> string of the next available node
(<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
@@ -128,7 +129,10 @@
<term><option>%a</option></term>
<listitem>
<para>
name of the current primary node (<xref linkend="repmgr-standby-register"/> and <xref linkend="repmgr-standby-follow"/>)
name of the current primary node (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
</para>
<para>
name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
@@ -137,16 +141,13 @@
<para>
The values provided for <literal>%c</literal> and <literal>%a</literal>
may contain spaces, so should always be quoted.
will probably contain spaces, so should always be quoted.
</para>
<para>
By default, all notification types will be passed to the designated script;
the notification types can be filtered to explicitly named ones using the
<varname>event_notifications</varname> parameter, e.g.:
<programlisting>
event_notifications='primary_register,standby_register,witness_register'</programlisting>
<varname>event_notifications</varname> parameter.
</para>
<para>
@@ -204,7 +205,7 @@
</para>
<para>
Events generated by &repmgrd; (streaming replication mode):
Events generated by <application>repmgrd</application> (streaming replication mode):
<itemizedlist spacing="compact" mark="bullet">
<listitem>
@@ -254,20 +255,29 @@
<simpara><literal>standby_recovery</literal></simpara>
</listitem>
</itemizedlist>
</para>
<para>
Events generated by <application>repmgrd</application> (BDR mode):
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal><link linkend="repmgrd-primary-child-disconnection-events">child_node_disconnect</link></literal></simpara>
<simpara><literal>bdr_failover</literal></simpara>
</listitem>
<listitem>
<simpara><literal><link linkend="repmgrd-primary-child-disconnection-events">child_node_reconnect</link></literal></simpara>
<simpara><literal>bdr_reconnect</literal></simpara>
</listitem>
<listitem>
<simpara><literal><link linkend="repmgrd-primary-child-disconnection-events">child_node_new_connect</link></literal></simpara>
<simpara><literal>bdr_recovery</literal></simpara>
</listitem>
<listitem>
<simpara><literal><link linkend="repmgrd-primary-child-disconnection-events">child_nodes_disconnect_command</link></literal></simpara>
<simpara><literal>bdr_register</literal></simpara>
</listitem>
<listitem>
<simpara><literal>bdr_unregister</literal></simpara>
</listitem>
</itemizedlist>
</itemizedlist>
</para>
<para>

92
doc/filelist.sgml Normal file

@@ -0,0 +1,92 @@
<!-- doc/filelist.sgml -->
<!ENTITY legal SYSTEM "legal.sgml">
<!ENTITY bookindex SYSTEM "bookindex.sgml">
<!--
Some parts of the documentation are also source for some plain-text
files used during installation. To selectively ignore or include
some parts (e.g., external xref's) when generating these files we use
these parameter entities. See also standalone-install.sgml.
-->
<!ENTITY % standalone-ignore "INCLUDE">
<!ENTITY % standalone-include "IGNORE">
<!-- doc/filelist.sgml -->
<!--
By default, no index is included. Use -i include-index on the command line
to include it.
-->
<!ENTITY % include-index "IGNORE">
<!--
Create empty index element for processing by XSLT stylesheet.
-->
<!ENTITY % include-xslt-index "IGNORE">
<!--
Include external documentation sections
-->
<!ENTITY overview SYSTEM "overview.sgml">
<!ENTITY install SYSTEM "install.sgml">
<!ENTITY install-requirements SYSTEM "install-requirements.sgml">
<!ENTITY install-packages SYSTEM "install-packages.sgml">
<!ENTITY install-source SYSTEM "install-source.sgml">
<!ENTITY quickstart SYSTEM "quickstart.sgml">
<!ENTITY configuration SYSTEM "configuration.sgml">
<!ENTITY configuration-file SYSTEM "configuration-file.sgml">
<!ENTITY configuration-file-required-settings SYSTEM "configuration-file-required-settings.sgml">
<!ENTITY configuration-file-log-settings SYSTEM "configuration-file-log-settings.sgml">
<!ENTITY configuration-file-service-commands SYSTEM "configuration-file-service-commands.sgml">
<!ENTITY cloning-standbys SYSTEM "cloning-standbys.sgml">
<!ENTITY promoting-standby SYSTEM "promoting-standby.sgml">
<!ENTITY follow-new-primary SYSTEM "follow-new-primary.sgml">
<!ENTITY switchover SYSTEM "switchover.sgml">
<!ENTITY configuring-witness-server SYSTEM "configuring-witness-server.sgml">
<!ENTITY event-notifications SYSTEM "event-notifications.sgml">
<!ENTITY upgrading-repmgr SYSTEM "upgrading-repmgr.sgml">
<!ENTITY repmgrd-automatic-failover SYSTEM "repmgrd-automatic-failover.sgml">
<!ENTITY repmgrd-configuration SYSTEM "repmgrd-configuration.sgml">
<!ENTITY repmgrd-demonstration SYSTEM "repmgrd-demonstration.sgml">
<!ENTITY repmgrd-monitoring SYSTEM "repmgrd-monitoring.sgml">
<!ENTITY repmgrd-degraded-monitoring SYSTEM "repmgrd-degraded-monitoring.sgml">
<!ENTITY repmgrd-cascading-replication SYSTEM "repmgrd-cascading-replication.sgml">
<!ENTITY repmgrd-network-split SYSTEM "repmgrd-network-split.sgml">
<!ENTITY repmgrd-witness-server SYSTEM "repmgrd-witness-server.sgml">
<!ENTITY repmgrd-pausing SYSTEM "repmgrd-pausing.sgml">
<!ENTITY repmgrd-bdr SYSTEM "repmgrd-bdr.sgml">
<!ENTITY repmgr-primary-register SYSTEM "repmgr-primary-register.sgml">
<!ENTITY repmgr-primary-unregister SYSTEM "repmgr-primary-unregister.sgml">
<!ENTITY repmgr-standby-clone SYSTEM "repmgr-standby-clone.sgml">
<!ENTITY repmgr-standby-register SYSTEM "repmgr-standby-register.sgml">
<!ENTITY repmgr-standby-unregister SYSTEM "repmgr-standby-unregister.sgml">
<!ENTITY repmgr-standby-promote SYSTEM "repmgr-standby-promote.sgml">
<!ENTITY repmgr-standby-follow SYSTEM "repmgr-standby-follow.sgml">
<!ENTITY repmgr-standby-switchover SYSTEM "repmgr-standby-switchover.sgml">
<!ENTITY repmgr-witness-register SYSTEM "repmgr-witness-register.sgml">
<!ENTITY repmgr-witness-unregister SYSTEM "repmgr-witness-unregister.sgml">
<!ENTITY repmgr-node-status SYSTEM "repmgr-node-status.sgml">
<!ENTITY repmgr-node-check SYSTEM "repmgr-node-check.sgml">
<!ENTITY repmgr-node-rejoin SYSTEM "repmgr-node-rejoin.sgml">
<!ENTITY repmgr-cluster-show SYSTEM "repmgr-cluster-show.sgml">
<!ENTITY repmgr-cluster-matrix SYSTEM "repmgr-cluster-matrix.sgml">
<!ENTITY repmgr-cluster-crosscheck SYSTEM "repmgr-cluster-crosscheck.sgml">
<!ENTITY repmgr-cluster-event SYSTEM "repmgr-cluster-event.sgml">
<!ENTITY repmgr-cluster-cleanup SYSTEM "repmgr-cluster-cleanup.sgml">
<!ENTITY repmgr-daemon-status SYSTEM "repmgr-daemon-status.sgml">
<!ENTITY repmgr-daemon-pause SYSTEM "repmgr-daemon-pause.sgml">
<!ENTITY repmgr-daemon-unpause SYSTEM "repmgr-daemon-unpause.sgml">
<!ENTITY appendix-release-notes SYSTEM "appendix-release-notes.sgml">
<!ENTITY appendix-faq SYSTEM "appendix-faq.sgml">
<!ENTITY appendix-signatures SYSTEM "appendix-signatures.sgml">
<!ENTITY appendix-packages SYSTEM "appendix-packages.sgml">
<!ENTITY bookindex SYSTEM "bookindex.sgml">

@@ -1,71 +0,0 @@
<!-- doc/filelist.xml -->
<!ENTITY legal SYSTEM "legal.xml">
<!ENTITY bookindex SYSTEM "bookindex.xml">
<!--
Include external documentation sections
-->
<!ENTITY overview SYSTEM "overview.xml">
<!ENTITY install SYSTEM "install.xml">
<!ENTITY install-requirements SYSTEM "install-requirements.xml">
<!ENTITY install-packages SYSTEM "install-packages.xml">
<!ENTITY install-source SYSTEM "install-source.xml">
<!ENTITY quickstart SYSTEM "quickstart.xml">
<!ENTITY configuration SYSTEM "configuration.xml">
<!ENTITY configuration-file SYSTEM "configuration-file.xml">
<!ENTITY configuration-file-required-settings SYSTEM "configuration-file-required-settings.xml">
<!ENTITY configuration-file-optional-settings SYSTEM "configuration-file-optional-settings.xml">
<!ENTITY configuration-file-log-settings SYSTEM "configuration-file-log-settings.xml">
<!ENTITY configuration-file-service-commands SYSTEM "configuration-file-service-commands.xml">
<!ENTITY configuration-permissions SYSTEM "configuration-permissions.xml">
<!ENTITY configuration-password-management SYSTEM "configuration-password-management.xml">
<!ENTITY cloning-standbys SYSTEM "cloning-standbys.xml">
<!ENTITY promoting-standby SYSTEM "promoting-standby.xml">
<!ENTITY follow-new-primary SYSTEM "follow-new-primary.xml">
<!ENTITY switchover SYSTEM "switchover.xml">
<!ENTITY event-notifications SYSTEM "event-notifications.xml">
<!ENTITY upgrading-repmgr SYSTEM "upgrading-repmgr.xml">
<!ENTITY repmgrd-overview SYSTEM "repmgrd-overview.xml">
<!ENTITY repmgrd-automatic-failover SYSTEM "repmgrd-automatic-failover.xml">
<!ENTITY repmgrd-configuration SYSTEM "repmgrd-configuration.xml">
<!ENTITY repmgrd-operation SYSTEM "repmgrd-operation.xml">
<!ENTITY repmgr-primary-register SYSTEM "repmgr-primary-register.xml">
<!ENTITY repmgr-primary-unregister SYSTEM "repmgr-primary-unregister.xml">
<!ENTITY repmgr-standby-clone SYSTEM "repmgr-standby-clone.xml">
<!ENTITY repmgr-standby-register SYSTEM "repmgr-standby-register.xml">
<!ENTITY repmgr-standby-unregister SYSTEM "repmgr-standby-unregister.xml">
<!ENTITY repmgr-standby-promote SYSTEM "repmgr-standby-promote.xml">
<!ENTITY repmgr-standby-follow SYSTEM "repmgr-standby-follow.xml">
<!ENTITY repmgr-standby-switchover SYSTEM "repmgr-standby-switchover.xml">
<!ENTITY repmgr-witness-register SYSTEM "repmgr-witness-register.xml">
<!ENTITY repmgr-witness-unregister SYSTEM "repmgr-witness-unregister.xml">
<!ENTITY repmgr-node-status SYSTEM "repmgr-node-status.xml">
<!ENTITY repmgr-node-check SYSTEM "repmgr-node-check.xml">
<!ENTITY repmgr-node-rejoin SYSTEM "repmgr-node-rejoin.xml">
<!ENTITY repmgr-node-service SYSTEM "repmgr-node-service.xml">
<!ENTITY repmgr-cluster-show SYSTEM "repmgr-cluster-show.xml">
<!ENTITY repmgr-cluster-matrix SYSTEM "repmgr-cluster-matrix.xml">
<!ENTITY repmgr-cluster-crosscheck SYSTEM "repmgr-cluster-crosscheck.xml">
<!ENTITY repmgr-cluster-event SYSTEM "repmgr-cluster-event.xml">
<!ENTITY repmgr-cluster-cleanup SYSTEM "repmgr-cluster-cleanup.xml">
<!ENTITY repmgr-service-status SYSTEM "repmgr-service-status.xml">
<!ENTITY repmgr-service-pause SYSTEM "repmgr-service-pause.xml">
<!ENTITY repmgr-service-unpause SYSTEM "repmgr-service-unpause.xml">
<!ENTITY repmgr-daemon-start SYSTEM "repmgr-daemon-start.xml">
<!ENTITY repmgr-daemon-stop SYSTEM "repmgr-daemon-stop.xml">
<!ENTITY appendix-release-notes SYSTEM "appendix-release-notes.xml">
<!ENTITY appendix-faq SYSTEM "appendix-faq.xml">
<!ENTITY appendix-signatures SYSTEM "appendix-signatures.xml">
<!ENTITY appendix-packages SYSTEM "appendix-packages.xml">
<!ENTITY appendix-support SYSTEM "appendix-support.xml">
<!ENTITY bookindex SYSTEM "bookindex.xml">

@@ -1,22 +1,21 @@
<chapter id="follow-new-primary">
<title>Following a new primary</title>
<indexterm>
<primary>Following a new primary</primary>
<seealso>repmgr standby follow</seealso>
</indexterm>
<title>Following a new primary</title>
<para>
Following the failure or removal of the replication cluster's existing primary
server, <xref linkend="repmgr-standby-follow"/> can be used to make &quot;orphaned&quot; standbys
server, <xref linkend="repmgr-standby-follow"> can be used to make 'orphaned' standbys
follow the new primary and catch up to its current state.
</para>
<para>
To demonstrate this, assuming a replication cluster in the same state as the
end of the preceding section (<xref linkend="promoting-standby"/>),
end of the preceding section (<xref linkend="promoting-standby">),
execute this:
<programlisting>
$ repmgr -f /etc/repmgr.conf standby follow
$ repmgr -f /etc/repmgr.conf repmgr standby follow
INFO: changing node 3's primary to node 2
NOTICE: restarting server using "pg_ctl -l /var/log/postgresql/startup.log -w -D '/var/lib/postgresql/data' restart"
waiting for server to shut down......... done

228
doc/install-packages.sgml Normal file

@@ -0,0 +1,228 @@
<sect1 id="installation-packages" xreflabel="Installing from packages">
<title>Installing &repmgr; from packages</title>
<para>
We recommend installing &repmgr; using the available packages for your
system.
</para>
<sect2 id="installation-packages-redhat" xreflabel="Installing from packages on RHEL, CentOS and Fedora">
<indexterm>
<primary>installation</primary>
<secondary>on Red Hat/CentOS/Fedora etc.</secondary>
</indexterm>
<title>RedHat/CentOS/Fedora</title>
<para>
&repmgr; RPM packages for RedHat/CentOS variants and Fedora are available from the
<ulink url="https://2ndquadrant.com">2ndQuadrant</ulink>
<ulink url="https://dl.2ndquadrant.com/">public repository</ulink>; see following
section for details.
</para>
<para>
RPM packages for &repmgr; are also available via Yum through
the PostgreSQL Global Development Group RPM repository
(<ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink>).
Follow the instructions for your distribution (RedHat, CentOS,
Fedora, etc.) and architecture as detailed there. Note that it can take some days
for new &repmgr; packages to become available via this repository.
</para>
<note>
<para>
&repmgr; packages are designed to be compatible with the community-provided PostgreSQL packages.
They may not work with vendor-specific packages such as those provided by RedHat for RHEL
customers, as the filesystem layout may be different to the community RPMs.
Please contact your support vendor for assistance.
</para>
</note>
<para>
For more information on the package contents, including details of installation
paths and relevant <link linkend="configuration-file-service-commands">service commands</link>,
see the appendix section <xref linkend="packages-centos">.
</para>
<sect3 id="installation-packages-redhat-2ndq">
<title>2ndQuadrant public RPM yum repository</title>
<para>
Beginning with <ulink url="https://repmgr.org/docs/4.1/release-4.0.5.html">repmgr 4.0.5</ulink>,
<ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides a dedicated <literal>yum</literal>
<ulink url="https://dl.2ndquadrant.com/">public repository</ulink> for 2ndQuadrant software,
including &repmgr;. We recommend using this for all future &repmgr; releases.
</para>
<para>
General instructions for using this repository can be found on its
<ulink url="https://dl.2ndquadrant.com/">homepage</ulink>. Specific instructions
for installing &repmgr; follow below.
</para>
<para>
<emphasis>Installation</emphasis>
<itemizedlist>
<listitem>
<para>
Locate the repository RPM for your PostgreSQL version from the list at:
<ulink url="https://dl.2ndquadrant.com/">https://dl.2ndquadrant.com/</ulink>
</para>
</listitem>
<listitem>
<para>
Install the repository definition for your distribution and PostgreSQL version
(this enables the 2ndQuadrant repository as a source of &repmgr; packages).
</para>
<para>
For example, for PostgreSQL 10 on CentOS, execute:
<programlisting>
curl https://dl.2ndquadrant.com/default/release/get/10/rpm | sudo bash</programlisting>
</para>
<para>
Verify that the repository is installed with:
<programlisting>
sudo yum repolist</programlisting>
The output should contain two entries like this:
<programlisting>
2ndquadrant-dl-default-release-pg10/7/x86_64 2ndQuadrant packages (PG10) for 7 - x86_64 4
2ndquadrant-dl-default-release-pg10-debug/7/x86_64 2ndQuadrant packages (PG10) for 7 - x86_64 - Debug 3</programlisting>
</para>
</listitem>
<listitem>
<para>
Install the &repmgr; version appropriate for your PostgreSQL version (e.g. <literal>repmgr10</literal>):
<programlisting>
$ yum install repmgr10</programlisting>
</para>
</listitem>
</itemizedlist>
</para>
<para>
<emphasis>Compatibility with PGDG Repositories</emphasis>
</para>
<para>
The 2ndQuadrant &repmgr; yum repository packages use the same definitions and file system layout as the
main PGDG repository.
</para>
<para>
Normally <application>yum</application> will prioritize the repository with the most recent &repmgr; version.
Once the PGDG repository has been updated, it doesn't matter which repository
the packages are installed from.
</para>
<para>
To ensure the 2ndQuadrant repository is always prioritised, install <literal>yum-plugin-priorities</literal>
and set the repository priorities accordingly.
</para>
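<para>
As a rough sketch of how this might be done, install the plugin with:
<programlisting>
$ sudo yum install yum-plugin-priorities</programlisting>
and add a <literal>priority</literal> line to the existing section for the
2ndQuadrant repository. The file path below is an assumption and may differ on
your system; the section name is taken from the <command>yum repolist</command>
output shown earlier. With this plugin, a lower number means a higher priority:
<programlisting>
# hypothetical path: /etc/yum.repos.d/2ndquadrant-dl-default-release-pg10.repo
# add the following line to the existing section, leaving its other settings unchanged
[2ndquadrant-dl-default-release-pg10]
priority=1</programlisting>
</para>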
<para>
<emphasis>Installing a specific package version</emphasis>
</para>
<para>
To install a specific package version, execute <command>yum --showduplicates list</command>
for the package in question:
<programlisting>
[root@localhost ~]# yum --showduplicates list repmgr10
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: ftp.iij.ad.jp
* extras: ftp.iij.ad.jp
* updates: ftp.iij.ad.jp
Available Packages
repmgr10.x86_64 4.0.3-1.rhel7 pgdg10
repmgr10.x86_64 4.0.4-1.rhel7 pgdg10
repmgr10.x86_64 4.0.5-1.el7 2ndquadrant-repo-10</programlisting>
then append the appropriate version number to the package name with a hyphen, e.g.:
<programlisting>
[root@localhost ~]# yum install repmgr10-4.0.3-1.rhel7</programlisting>
</para>
</sect3>
</sect2>
<sect2 id="installation-packages-debian" xreflabel="Installing from packages on Debian or Ubuntu">
<indexterm>
<primary>installation</primary>
<secondary>on Debian/Ubuntu etc.</secondary>
</indexterm>
<title>Debian/Ubuntu</title>
<para>.deb packages for &repmgr; are available from the
PostgreSQL Community APT repository (<ulink url="http://apt.postgresql.org/">http://apt.postgresql.org/</ulink>).
Instructions can be found in the APT section of the PostgreSQL Wiki
(<ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink>).
</para>
<para>
For more information on the package contents, including details of installation
paths and relevant <link linkend="configuration-file-service-commands">service commands</link>,
see the appendix section <xref linkend="packages-debian-ubuntu">.
</para>
<sect3 id="installation-packages-debian-ubuntu-2ndq">
<title>2ndQuadrant public apt repository for Debian/Ubuntu</title>
<para>
Beginning with <ulink url="https://repmgr.org/docs/4.0/release-4.0.5.html">repmgr 4.0.5</ulink>,
<ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides a
<ulink url="https://dl.2ndquadrant.com/">public apt repository</ulink> for 2ndQuadrant software,
including &repmgr;.
</para>
<para>
General instructions for using this repository can be found on its
<ulink url="https://dl.2ndquadrant.com/">homepage</ulink>. Specific instructions
for installing &repmgr; follow below.
</para>
<para>
<emphasis>Installation</emphasis>
<itemizedlist>
<listitem>
<para>
Install the repository definition for your distribution and PostgreSQL version
(this enables the 2ndQuadrant repository as a source of &repmgr; packages) by executing:
<programlisting>
curl https://dl.2ndquadrant.com/default/release/get/deb | sudo bash</programlisting>
</para>
<note>
<para>
This will automatically install the following additional packages, if not already present:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal>lsb-release</literal></simpara>
</listitem>
<listitem>
<simpara><literal>apt-transport-https</literal></simpara>
</listitem>
</itemizedlist>
</para>
</note>
</listitem>
<listitem>
<para>
Install the &repmgr; version appropriate for your PostgreSQL version (e.g. <literal>postgresql-10-repmgr</literal>):
<programlisting>
$ apt-get install postgresql-10-repmgr</programlisting>
</para>
<note>
<para>
For packages for PostgreSQL 9.6 and earlier, the package name includes
a period between major and minor version numbers, e.g.
<literal>postgresql-9.6-repmgr</literal>.
</para>
</note>
</listitem>
</itemizedlist>
</para>
</sect3>
</sect2>
</sect1>

@@ -1,285 +0,0 @@
<sect1 id="installation-packages" xreflabel="Installing from packages">
<title>Installing &repmgr; from packages</title>
<indexterm>
<primary>installation</primary>
<secondary>from packages</secondary>
</indexterm>
<para>
We recommend installing &repmgr; using the available packages for your
system.
</para>
<sect2 id="installation-packages-redhat" xreflabel="Installing from packages on RHEL, CentOS and Fedora">
<title>RedHat/CentOS/Fedora</title>
<indexterm>
<primary>installation</primary>
<secondary>on Red Hat/CentOS/Fedora etc.</secondary>
</indexterm>
<para>
&repmgr; RPM packages for RedHat/CentOS variants and Fedora are available from the
<ulink url="https://www.enterprisedb.com">EDB</ulink>
<ulink url="https://dl.enterprisedb.com/">public repository</ulink>; see following
section for details.
</para>
<note>
<para>
Currently the <ulink url="https://www.enterprisedb.com">EDB</ulink>
<ulink url="https://dl.enterprisedb.com/">public repository</ulink> provides
support for RedHat/CentOS versions 6, 7 and 8.
</para>
</note>
<para>
RPM packages for &repmgr; are also available via Yum through
the PostgreSQL Global Development Group (PGDG) RPM repository
(<ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink>).
Follow the instructions for your distribution (RedHat, CentOS,
Fedora, etc.) and architecture as detailed there. Note that it can take some days
for new &repmgr; packages to become available via this repository.
</para>
<note>
<para>
&repmgr; RPM packages are designed to be compatible with the community-provided PostgreSQL packages
and EDB's PostgreSQL Extended Server (formerly 2ndQPostgres).
They may not work with vendor-specific packages such as those provided by RedHat for RHEL
customers, as the PostgreSQL filesystem layout may be different to the community RPMs.
Please contact your support vendor for assistance.
</para>
<para>
See also <link linkend="appendix-faq">FAQ</link> entry
<xref linkend="faq-third-party-packages"/>.
</para>
</note>
<para>
For more information on the package contents, including details of installation
paths and relevant <link linkend="configuration-file-service-commands">service commands</link>,
see the appendix section <xref linkend="packages-centos"/>.
</para>
<sect3 id="installation-packages-redhat-2ndq">
<title>EDB public RPM yum repository</title>
<para>
<ulink url="https://www.enterprisedb.com/">EDB</ulink> provides a dedicated <literal>yum</literal>
<ulink url="https://dl.enterprisedb.com/">public repository</ulink> for EDB software,
including &repmgr;. We recommend using this for all future &repmgr; releases.
</para>
<para>
General instructions for using this repository can be found on its
<ulink url="https://dl.enterprisedb.com/">homepage</ulink>. Specific instructions
for installing &repmgr; follow below.
</para>
<para>
<emphasis>Installation</emphasis>
<itemizedlist>
<listitem>
<para>
Locate the repository RPM for your PostgreSQL version from the list at:
<ulink url="https://dl.enterprisedb.com/">https://dl.enterprisedb.com/</ulink>
</para>
</listitem>
<listitem>
<para>
Install the repository definition for your distribution and PostgreSQL version
(this enables the EDB repository as a source of &repmgr; packages).
</para>
<para>
For example, for PostgreSQL 14 on Rocky Linux 8, execute:
<programlisting>
curl https://dl.enterprisedb.com/default/release/get/14/rpm | sudo bash</programlisting>
</para>
<para>
Verify that the repository is installed with:
<programlisting>
sudo dnf repolist</programlisting>
The output should contain two entries like this:
<programlisting>
2ndquadrant-dl-default-release-pg14 2ndQuadrant packages (PG14) for 8 - x86_64
2ndquadrant-dl-default-release-pg14-debug 2ndQuadrant packages (PG14) for 8 - x86_64 - Debug</programlisting>
</para>
</listitem>
<listitem>
<para>
Install the &repmgr; version appropriate for your PostgreSQL version (e.g. <literal>repmgr14</literal>):
<programlisting>
sudo dnf install repmgr14</programlisting>
</para>
<tip>
<para>
To determine the names of available packages, execute:
<programlisting>
dnf search repmgr</programlisting>
</para>
<para>
In CentOS 7 and earlier, use <literal>yum</literal> instead of <literal>dnf</literal>.
</para>
</tip>
</listitem>
</itemizedlist>
</para>
<para>
<emphasis>Compatibility with PGDG Repositories</emphasis>
</para>
<para>
The EDB &repmgr; yum repository packages use the same definitions and file system layout as the
main PGDG repository.
</para>
<para>
Normally <application>yum</application> will prioritize the repository with the most recent &repmgr; version.
Once the PGDG repository has been updated, it doesn't matter which repository
the packages are installed from.
</para>
<para>
To ensure the EDB repository is always prioritised, set the <literal>priority</literal> option
in the repository configuration file (e.g. <filename>/etc/yum.repos.d/2ndquadrant-dl-default-release-pg14.repo</filename>)
accordingly.
</para>
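<para>
As a rough sketch only, a repository definition with an explicit priority might look like
the fragment below. The section name is assumed to match the repository ID reported by
<command>dnf repolist</command>, and the value <literal>1</literal> is an illustrative choice.
Lower numbers take precedence, and repositories without an explicit setting default to
<literal>99</literal>. Other keys already present in the file are left unchanged and are omitted here.
<programlisting>
# /etc/yum.repos.d/2ndquadrant-dl-default-release-pg14.repo (excerpt; section name assumed)
[2ndquadrant-dl-default-release-pg14]
priority=1</programlisting>
</para>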
<note>
<para>
With CentOS 7 and earlier, the package <literal>yum-plugin-priorities</literal> must be installed
to be able to set the repository priority.
</para>
</note>
<para>
<emphasis>Installing a specific package version</emphasis>
</para>
<para>
To install a specific package version, execute <command>dnf --showduplicates list</command>
for the package in question:
<programlisting>
[root@localhost ~]# dnf --showduplicates list repmgr10
Last metadata expiration check: 0:09:15 ago on Fri 11 Mar 2022 01:09:19 AM UTC.
Installed Packages
repmgr10.x86_64 5.3.1-1.el8 @2ndquadrant-dl-default-release-pg10
Available Packages
repmgr10.x86_64 5.0.0-1.rhel8 pgdg10
repmgr10.x86_64 5.1.0-1.el8 2ndquadrant-dl-default-release-pg10
repmgr10.x86_64 5.1.0-1.rhel8 pgdg10
repmgr10.x86_64 5.1.0-2.el8 2ndquadrant-dl-default-release-pg10
repmgr10.x86_64 5.2.0-1.el8 2ndquadrant-dl-default-release-pg10
repmgr10.x86_64 5.2.0-1.rhel8 pgdg10
repmgr10.x86_64 5.2.1-1.el8 2ndquadrant-dl-default-release-pg10
repmgr10.x86_64 5.3.0-1.el8 2ndquadrant-dl-default-release-pg10
repmgr10.x86_64 5.3.1-1.el8 2ndquadrant-dl-default-release-pg10</programlisting>
then append the appropriate version number to the package name with a hyphen, e.g.:
<programlisting>
[root@localhost ~]# dnf install repmgr10-5.3.0-1.el8</programlisting>
</para>
<para>
<emphasis>Installing old packages</emphasis>
</para>
<para>
See appendix <link linkend="packages-old-versions-rhel-centos">Installing old package versions</link>
for details on how to retrieve older package versions.
</para>
</sect3>
</sect2>
<sect2 id="installation-packages-debian" xreflabel="Installing from packages on Debian or Ubuntu">
<title>Debian/Ubuntu</title>
<indexterm>
<primary>installation</primary>
<secondary>on Debian/Ubuntu etc.</secondary>
</indexterm>
<para>.deb packages for &repmgr; are available from the
PostgreSQL Community APT repository (<ulink url="https://apt.postgresql.org/">https://apt.postgresql.org/</ulink>).
Instructions can be found in the APT section of the PostgreSQL Wiki
(<ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink>).
</para>
<para>
For more information on the package contents, including details of installation
paths and relevant <link linkend="configuration-file-service-commands">service commands</link>,
see the appendix section <xref linkend="packages-debian-ubuntu"/>.
</para>
<sect3 id="installation-packages-debian-ubuntu-2ndq">
<title>EDB public apt repository for Debian/Ubuntu</title>
<para>
<ulink url="https://www.enterprisedb.com/">EDB</ulink> provides a
<ulink url="https://dl.enterprisedb.com/">public apt repository</ulink> for EDB software,
including &repmgr;.
</para>
<para>
General instructions for using this repository can be found on its
<ulink url="https://dl.enterprisedb.com/">homepage</ulink>. Specific instructions
for installing &repmgr; follow below.
</para>
<para>
<emphasis>Installation</emphasis>
<itemizedlist>
<listitem>
<para>
Install the repository definition for your distribution and PostgreSQL version
(this enables the EDB repository as a source of &repmgr; packages) by executing:
<programlisting>
curl https://dl.enterprisedb.com/default/release/get/deb | sudo bash</programlisting>
</para>
<note>
<para>
This will automatically install the following additional packages, if not already present:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal>lsb-release</literal></simpara>
</listitem>
<listitem>
<simpara><literal>apt-transport-https</literal></simpara>
</listitem>
</itemizedlist>
</para>
</note>
</listitem>
<listitem>
<para>
Install the &repmgr; version appropriate for your PostgreSQL version (e.g. <literal>repmgr11</literal>):
<programlisting>
sudo apt-get install postgresql-11-repmgr</programlisting>
</para>
<note>
<para>
For packages for PostgreSQL 9.6 and earlier, the package name includes
a period between major and minor version numbers, e.g.
<literal>postgresql-9.6-repmgr</literal>.
</para>
</note>
</listitem>
</itemizedlist>
</para>
<para>
<emphasis>Installing old packages</emphasis>
</para>
<para>
See appendix <link linkend="packages-old-versions-debian">Installing old package versions</link>
for details on how to retrieve older package versions.
</para>
</sect3>
</sect2>
</sect1>


@@ -0,0 +1,79 @@
<sect1 id="install-requirements" xreflabel="installation requirements">
<indexterm>
<primary>installation</primary>
<secondary>requirements</secondary>
</indexterm>
<title>Requirements for installing repmgr</title>
<para>
repmgr is developed and tested on Linux and OS X, but should work on any
UNIX-like system supported by PostgreSQL itself. There is no support for
Microsoft Windows.
</para>
<para>
From version 4.0, repmgr is compatible with all PostgreSQL versions from 9.3, including PostgreSQL 10.
Note that some &repmgr; functionality is not available in PostgreSQL 9.3 and PostgreSQL 9.4.
</para>
<note>
<simpara>
If upgrading from &repmgr; 3.x, please see the section <xref linkend="upgrading-from-repmgr-3">.
</simpara>
</note>
<para>
All servers in the replication cluster must be running the same major version of
PostgreSQL, and we recommend that they also run the same minor version.
</para>
<para>
&repmgr; must be installed on each server in the replication cluster.
If installing repmgr from packages, the package version must match the PostgreSQL
version. If installing from source, repmgr must be compiled against the same
major version.
</para>
<para>
A dedicated system user for &repmgr; is <emphasis>not</emphasis> required; as many &repmgr; and
<application>repmgrd</application> actions require direct access to the PostgreSQL data directory,
these commands should be executed by the <literal>postgres</literal> user.
</para>
<para>
Passwordless <command>ssh</command> connectivity between all servers in the replication cluster
is not required, but is necessary in the following cases:
<itemizedlist>
<listitem>
<simpara>if you need &repmgr; to copy configuration files from outside the PostgreSQL
data directory (in which case <command>rsync</command> is also required)</simpara>
</listitem>
<listitem>
<simpara>to perform <link linkend="performing-switchover">switchover operations</link></simpara>
</listitem>
<listitem>
<simpara>
when executing <command><link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link></command>
and <command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command>
</simpara>
</listitem>
</itemizedlist>
</para>
<tip>
<simpara>
Consider setting <varname>ConnectTimeout</varname> to a low value in your SSH configuration.
This will make it faster to detect any SSH connection errors.
</simpara>
</tip>
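<para>
For illustration, a client-side SSH configuration entry along the following lines sets a
short connection timeout. This is a sketch under assumptions: the <literal>Host</literal>
pattern and the five-second value are placeholders rather than recommended settings, and
should be adapted to the nodes in your replication cluster.
<programlisting>
# ~/.ssh/config of the postgres user (host pattern and value are placeholders)
Host node*.example.com
    ConnectTimeout 5</programlisting>
<varname>ConnectTimeout</varname> is specified in seconds.
</para>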
<tip>
<simpara>
We recommend using a session multiplexer utility such as <command>screen</command> or
<command>tmux</command> when performing long-running actions (such as cloning a database)
on a remote server - this will ensure the &repmgr; action won't be prematurely
terminated if your <command>ssh</command> session to the server is interrupted or closed.
</simpara>
</tip>
</sect1>


@@ -1,329 +0,0 @@
<sect1 id="install-requirements" xreflabel="installation requirements">
<title>Requirements for installing repmgr</title>
<indexterm>
<primary>installation</primary>
<secondary>requirements</secondary>
</indexterm>
<para>
repmgr is developed and tested on Linux and OS X, but should work on any
UNIX-like system supported by PostgreSQL itself. There is no support for
Microsoft Windows.
</para>
<para>
&repmgr; &repmgrversion; is compatible with all PostgreSQL versions from 9.4. See
section <link linkend="install-compatibility-matrix">&repmgr; compatibility matrix</link>
for an overview of version compatibility.
</para>
<note>
<simpara>
If upgrading from &repmgr; 3.x, please see the section <xref linkend="upgrading-from-repmgr-3"/>.
</simpara>
</note>
<para>
All servers in the replication cluster must be running the same major version of
PostgreSQL, and we recommend that they also run the same minor version.
</para>
<para>
&repmgr; must be installed on each server in the replication cluster.
If installing repmgr from packages, the package version must match the PostgreSQL
version. If installing from source, &repmgr; must be compiled against the same
major version.
</para>
<note>
<simpara>
The same &quot;major&quot; &repmgr; version (e.g. <literal>&repmgrversion;.x</literal>) <emphasis>must</emphasis>
be installed on all nodes in the replication cluster. We strongly recommend keeping all
nodes on the same (preferably latest) &quot;minor&quot; &repmgr; version to minimize the risk
of incompatibilities.
</simpara>
<simpara>
If different &quot;major&quot; &repmgr; versions (e.g. 4.1.x and &repmgrversion;.x)
are installed on different nodes, in the best case &repmgr; (in particular &repmgrd;)
will not run. In the worst case, you will end up with a broken cluster.
</simpara>
</note>
<para>
A dedicated system user for &repmgr; is <emphasis>not</emphasis> required; as many &repmgr; and
&repmgrd; actions require direct access to the PostgreSQL data directory,
these commands should be executed by the <literal>postgres</literal> user.
</para>
<para>
See also <link linkend="configuration-prerequisites">Prerequisites for configuration</link>
for information on networking requirements.
</para>
<tip>
<simpara>
We recommend using a session multiplexer utility such as <command>screen</command> or
<command>tmux</command> when performing long-running actions (such as cloning a database)
on a remote server - this will ensure the &repmgr; action won't be prematurely
terminated if your <command>ssh</command> session to the server is interrupted or closed.
</simpara>
</tip>
<sect2 id="install-compatibility-matrix">
<title>&repmgr; compatibility matrix</title>
<indexterm>
<primary>repmgr</primary>
<secondary>compatibility matrix</secondary>
</indexterm>
<indexterm>
<primary>compatibility matrix</primary>
</indexterm>
<para>
The following table provides an overview of which &repmgr; version supports
which PostgreSQL version.
</para>
<table id="repmgr-compatibility-matrix">
<title>&repmgr; compatibility matrix</title>
<tgroup cols="5">
<thead>
<row>
<entry>
&repmgr; version
</entry>
<entry>
Supported?
</entry>
<entry>
Latest release
</entry>
<entry>
Supported PostgreSQL versions
</entry>
<entry>
Notes
</entry>
</row>
</thead>
<tbody>
<row>
<entry>
&repmgr; 5.4
</entry>
<entry>
(dev)
</entry>
<entry>
<link linkend="release-current">&repmgrversion;</link> (&releasedate;)
</entry>
<entry>
9.4, 9.5, 9.6, 10, 11, 12, 13, 15
</entry>
<entry>
&nbsp;
</entry>
</row>
<row>
<entry>
&repmgr; 5.3
</entry>
<entry>
YES
</entry>
<entry>
<link linkend="release-current">&repmgrversion;</link> (&releasedate;)
</entry>
<entry>
9.4, 9.5, 9.6, 10, 11, 12, 13, 14, 15
</entry>
<entry>
PostgreSQL 15 supported from &repmgr; 5.3.3
</entry>
</row>
<row>
<entry>
&repmgr; 5.2
</entry>
<entry>
NO
</entry>
<entry>
<link linkend="release-5.2.1">5.2.1</link> (2020-12-07)
</entry>
<entry>
9.4, 9.5, 9.6, 10, 11, 12, 13
</entry>
<entry>
&nbsp;
</entry>
</row>
<row>
<entry>
&repmgr; 5.1
</entry>
<entry>
NO
</entry>
<entry>
<link linkend="release-5.1.0">5.1.0</link> (2020-04-13)
</entry>
<entry>
9.3, 9.4, 9.5, 9.6, 10, 11, 12
</entry>
<entry>
&nbsp;
</entry>
</row>
<row>
<entry>
&repmgr; 5.0
</entry>
<entry>
NO
</entry>
<entry>
<link linkend="release-5.0">5.0</link> (2019-10-15)
</entry>
<entry>
9.3, 9.4, 9.5, 9.6, 10, 11, 12
</entry>
<entry>
&nbsp;
</entry>
</row>
<row>
<entry>
&repmgr; 4.x
</entry>
<entry>
NO
</entry>
<entry>
<link linkend="release-4.4">4.4</link> (2019-06-27)
</entry>
<entry>
9.3, 9.4, 9.5, 9.6, 10, 11
</entry>
<entry>
&nbsp;
</entry>
</row>
<row>
<entry>
&repmgr; 3.x
</entry>
<entry>
NO
</entry>
<entry>
<ulink url="https://repmgr.org/release-notes-3.3.2.html">3.3.2</ulink> (2017-05-30)
</entry>
<entry>
9.3, 9.4, 9.5, 9.6
</entry>
<entry>
&nbsp;
</entry>
</row>
<row>
<entry>
&repmgr; 2.x
</entry>
<entry>
NO
</entry>
<entry>
<ulink url="https://repmgr.org/release-notes-2.0.3.html">2.0.3</ulink> (2015-04-16)
</entry>
<entry>
9.0, 9.1, 9.2, 9.3, 9.4
</entry>
<entry>
&nbsp;
</entry>
</row>
</tbody>
</tgroup>
</table>
<important>
<para>
The &repmgr; 2.x and 3.x series are no longer maintained or supported.
We strongly recommend upgrading to the latest &repmgr; version.
</para>
<para>
Following the release of &repmgr; 5.0, there will be no further releases of
the &repmgr; 4.x series. Note that &repmgr; 5.x is an incremental development
of the 4.x series and &repmgr; 4.x users should upgrade to this as soon as possible.
</para>
</important>
</sect2>
<sect2 id="install-postgresql-93-94">
<title>PostgreSQL 9.4 support</title>
<indexterm>
<primary>PostgreSQL 9.4</primary>
<secondary>repmgr support</secondary>
</indexterm>
<para>
Note that some &repmgr; functionality is not available in PostgreSQL 9.4:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
In PostgreSQL 9.4, <command>pg_rewind</command> is not part of the core
distribution. <command>pg_rewind</command> will need to be compiled separately to be able
to use any &repmgr; functionality which takes advantage of it.
</para>
</listitem>
</itemizedlist>
<warning>
<para>
PostgreSQL 9.3 has reached the end of its community support period (final release was
<ulink url="https://www.postgresql.org/docs/9.3/release-9-3-25.html">9.3.25</ulink>
in November 2018) and will no longer be updated with security or bugfixes.
</para>
<para>
Beginning with &repmgr; 5.2, &repmgr; no longer supports PostgreSQL 9.3.
</para>
<para>
PostgreSQL 9.4 has reached the end of its community support period (final release was
<ulink url="https://www.postgresql.org/docs/9.4/release-9-4-26.html">9.4.26</ulink>
in February 2020) and will no longer be updated with security or bugfixes.
</para>
<para>
We recommend that users of these versions migrate to a supported PostgreSQL version
as soon as possible.
</para>
<para>
For further details, see the <ulink url="https://www.postgresql.org/support/versioning/">PostgreSQL Versioning Policy</ulink>.
</para>
</warning>
</sect2>
</sect1>

175
doc/install-source.sgml Normal file

@@ -0,0 +1,175 @@
<sect1 id="installation-source" xreflabel="Installing from source code">
<indexterm>
<primary>installation</primary>
<secondary>from source</secondary>
</indexterm>
<title>Installing &repmgr; from source</title>
<sect2 id="installation-source-prereqs">
<title>Prerequisites for installing from source</title>
<para>
To install &repmgr; the prerequisites for compiling
&postgres; must be installed. These are described in &postgres;'s
documentation
on <ulink url="https://www.postgresql.org/docs/current/static/install-requirements.html">build requirements</ulink>
and <ulink url="https://www.postgresql.org/docs/current/static/docguide-toolsets.html">build requirements for documentation</ulink>.
</para>
<para>
Most mainstream Linux distributions and other UNIX variants provide simple
ways to install the prerequisites from packages.
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
<literal>Debian</literal> and <literal>Ubuntu</literal>: First
add the <ulink
url="http://apt.postgresql.org/">apt.postgresql.org</ulink>
repository to your <filename>sources.list</filename> if you
have not already done so. Then install the prerequisites for
building PostgreSQL with:
<programlisting>
sudo apt-get update
sudo apt-get build-dep postgresql-9.6</programlisting>
</para>
</listitem>
<listitem>
<para>
<literal>RHEL or CentOS 6.x or 7.x</literal>: install the appropriate repository RPM
for your system from <ulink url="https://yum.postgresql.org/repopackages.php">
yum.postgresql.org</ulink>. Then install the prerequisites for building
PostgreSQL with:
<programlisting>
sudo yum check-update
sudo yum groupinstall "Development Tools"
sudo yum install yum-utils openjade docbook-dtds docbook-style-dsssl docbook-style-xsl
sudo yum-builddep postgresql96</programlisting>
</para>
</listitem>
</itemizedlist>
</para>
<note>
<simpara>
Select the appropriate PostgreSQL versions for your target repmgr version.
</simpara>
</note>
</sect2>
<sect2 id="installation-get-source">
<title>Getting &repmgr; source code</title>
<para>
There are two ways to get the &repmgr; source code: with git, or by downloading tarballs of released versions.
</para>
<sect3>
<title>Using <application>git</application> to get the &repmgr; sources</title>
<para>
Use <application><ulink url="https://git-scm.com">git</ulink></application> if you expect
to update often, you want to keep track of development or if you want to contribute
changes to &repmgr;. There is no reason <emphasis>not</emphasis> to use <application>git</application>
if you're familiar with it.
</para>
<para>
The source for &repmgr; is maintained at
<ulink url="https://github.com/2ndQuadrant/repmgr">https://github.com/2ndQuadrant/repmgr</ulink>.
</para>
<para>
There are also tags for each &repmgr; release, e.g. <filename>4.0.5</filename>.
</para>
<para>
Clone the source code using <application>git</application>:
<programlisting>
git clone https://github.com/2ndQuadrant/repmgr</programlisting>
</para>
<para>
For more information on using <application>git</application> see
<ulink url="https://git-scm.com/">git-scm.com</ulink>.
</para>
</sect3>
<sect3>
<title>Downloading release source tarballs</title>
<para>
Official release source code is uploaded as tarballs to the
&repmgr; website along with a tarball checksum and a matching GnuPG
signature. See
<ulink url="http://repmgr.org/">http://repmgr.org/</ulink>
for the download information. See <xref linkend="appendix-signatures">
for information on verifying digital signatures.
</para>
<para>
You will need to download the repmgr source, e.g. <filename>repmgr-4.0.tar.gz</filename>.
You may optionally verify the package checksums from the
<literal>.md5</literal> files and/or verify the GnuPG signatures
per <xref linkend="appendix-signatures">.
</para>
<para>
After you unpack the source code archives using <literal>tar xf</literal>
the installation process is the same as if you were installing from a git
clone.
</para>
</sect3>
</sect2>
<sect2 id="installation-repmgr-source">
<title>Installation of &repmgr; from source</title>
<para>
To install &repmgr; from source, simply execute:
<programlisting>
./configure && make install</programlisting>
Ensure <command>pg_config</command> for the target PostgreSQL version is in
<varname>$PATH</varname>.
</para>
</sect2>
<sect2 id="installation-build-repmgr-docs">
<title>Building &repmgr; documentation</title>
<para>
The &repmgr; documentation is (like the main PostgreSQL project)
written in DocBook format. To build it locally as HTML, you'll need to
install the required packages as described in the
<ulink url="https://www.postgresql.org/docs/9.6/static/docguide-toolsets.html">
PostgreSQL documentation</ulink> then execute:
<programlisting>
./configure && make install-doc</programlisting>
</para>
<para>
The generated HTML files will be placed in the <filename>doc/html</filename>
subdirectory of your source tree.
</para>
<para>
To build the documentation as a single HTML file, execute:
<programlisting>
cd doc/ && make repmgr.html</programlisting>
</para>
<note>
<simpara>
Due to changes in PostgreSQL's documentation build system from PostgreSQL 10,
the documentation can currently only be built against PostgreSQL 9.6 or earlier.
This limitation will be fixed when time and resources permit.
</simpara>
</note>
</sect2>
</sect1>


@@ -1,293 +0,0 @@
<sect1 id="installation-source" xreflabel="Installing from source code">
<title>Installing &repmgr; from source</title>
<indexterm>
<primary>installation</primary>
<secondary>from source</secondary>
</indexterm>
<sect2 id="installation-source-prereqs">
<title>Prerequisites for installing from source</title>
<para>
To install &repmgr; the prerequisites for compiling
&postgres; must be installed. These are described in &postgres;'s
documentation
on <ulink url="https://www.postgresql.org/docs/current/install-requirements.html">build requirements</ulink>
and <ulink url="https://www.postgresql.org/docs/current/docguide-toolsets.html">build requirements for documentation</ulink>.
</para>
<para>
Most mainstream Linux distributions and other UNIX variants provide simple
ways to install the prerequisites from packages.
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
<literal>Debian</literal> and <literal>Ubuntu</literal>: First
add the <ulink url="https://apt.postgresql.org/">apt.postgresql.org</ulink>
repository to your <filename>sources.list</filename> if you
have not already done so, and ensure the source repository is enabled.
</para>
<tip>
<para>
If not configured, the source repository can be added by including
a <literal>deb-src</literal> line as a copy of the existing <literal>deb</literal>
line in the repository file, which is usually
<filename>/etc/apt/sources.list.d/pgdg.list</filename>, e.g.:
<programlisting>
deb https://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main
deb-src https://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main</programlisting>
</para>
</tip>
<para>
Then install the prerequisites for
building PostgreSQL with e.g.:
<programlisting>
sudo apt-get update
sudo apt-get build-dep postgresql-9.6</programlisting>
</para>
<important>
<simpara>
Select the appropriate PostgreSQL version for your target repmgr version.
</simpara>
</important>
<note>
<para>
If using <command>apt-get build-dep</command> is not possible, the
following packages may need to be installed manually:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal>flex</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libedit-dev</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libkrb5-dev</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libpam0g-dev</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libreadline-dev</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libselinux1-dev</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libssl-dev</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libxml2-dev</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libxslt1-dev</literal></simpara>
</listitem>
</itemizedlist>
</para>
</note>
</listitem>
<listitem>
<para>
<literal>RHEL or CentOS 6.x or 7.x</literal>: install the appropriate repository RPM
for your system from <ulink url="https://yum.postgresql.org/repopackages.php">
yum.postgresql.org</ulink>. Then install the prerequisites for building
PostgreSQL with:
<programlisting>
sudo yum check-update
sudo yum groupinstall "Development Tools"
sudo yum install yum-utils openjade docbook-dtds docbook-style-dsssl docbook-style-xsl
sudo yum-builddep postgresql96</programlisting>
</para>
<important>
<simpara>
Select the appropriate PostgreSQL version for your target repmgr version.
</simpara>
</important>
<note>
<para>
If using <command>yum-builddep</command> is not possible, the
following packages may need to be installed manually:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal>flex</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libselinux-devel</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libxml2-devel</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libxslt-devel</literal></simpara>
</listitem>
<listitem>
<simpara><literal>openssl-devel</literal></simpara>
</listitem>
<listitem>
<simpara><literal>pam-devel</literal></simpara>
</listitem>
<listitem>
<simpara><literal>readline-devel</literal></simpara>
</listitem>
</itemizedlist>
</para>
</note>
<tip>
<para>
If building against PostgreSQL 11 or later configured with the <option>--with-llvm</option> option
(this is the case with the PGDG-provided packages) you'll also need to install the
<literal>llvm-toolset-7-clang</literal> package. This is available via the
<ulink url="https://wiki.centos.org/AdditionalResources/Repositories/SCL">Software Collections (SCL) Repository</ulink>.
</para>
</tip>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2 id="installation-get-source">
<title>Getting &repmgr; source code</title>
<para>
There are two ways to get the &repmgr; source code: with git, or by downloading tarballs of released versions.
</para>
<sect3>
<title>Using <application>git</application> to get the &repmgr; sources</title>
<para>
Use <application><ulink url="https://git-scm.com">git</ulink></application> if you expect
to update often, you want to keep track of development or if you want to contribute
changes to &repmgr;. There is no reason <emphasis>not</emphasis> to use <application>git</application>
if you're familiar with it.
</para>
<para>
The source for &repmgr; is maintained at
<ulink url="https://github.com/EnterpriseDB/repmgr">https://github.com/EnterpriseDB/repmgr</ulink>.
</para>
<para>
There are also tags for each <ulink url="https://github.com/EnterpriseDB/repmgr/releases">&repmgr; release</ulink>, e.g.
<literal><ulink url="https://github.com/EnterpriseDB/repmgr/releases/tag/v4.4.0">v4.4.0</ulink></literal>.
</para>
<para>
Clone the source code using <application>git</application>:
<programlisting>
git clone https://github.com/EnterpriseDB/repmgr</programlisting>
</para>
<para>
For more information on using <application>git</application> see
<ulink url="https://git-scm.com/">git-scm.com</ulink>.
</para>
</sect3>
<sect3>
<title>Downloading release source tarballs</title>
<para>
Official release source code is uploaded as tarballs to the
&repmgr; website along with a tarball checksum and a matching GnuPG
signature. See
<ulink url="http://repmgr.org/">http://repmgr.org/</ulink>
for the download information. See <xref linkend="appendix-signatures"/>
for information on verifying digital signatures.
</para>
<para>
You will need to download the repmgr source, e.g. <filename>repmgr-4.0.tar.gz</filename>.
You may optionally verify the package checksums from the
<literal>.md5</literal> files and/or verify the GnuPG signatures
per <xref linkend="appendix-signatures"/>.
</para>
<para>
After you unpack the source code archives using <command>tar xf</command>
the installation process is the same as if you were installing from a git
clone.
</para>
</sect3>
</sect2>
<sect2 id="installation-repmgr-source">
<title>Installation of &repmgr; from source</title>
<para>
To install &repmgr; from source, simply execute:
<programlisting>
./configure &amp;&amp; make install</programlisting>
Ensure <command>pg_config</command> for the target PostgreSQL version is in
<varname>$PATH</varname>.
</para>
</sect2>
<sect2 id="installation-build-repmgr-docs" xreflabel="Building repmgr documentation">
<title>Building &repmgr; documentation</title>
<para>
The &repmgr; documentation is (like the main PostgreSQL project)
written in DocBook XML format. To build it locally as HTML, you'll need to
install the required packages as described in the
<ulink url="https://www.postgresql.org/docs/current/docguide-toolsets.html">PostgreSQL documentation</ulink>.
</para>
<para>
The minimum PostgreSQL version for building the &repmgr; documentation is
PostgreSQL 9.5.
</para>
<note>
<simpara>
In &repmgr; 4.3 and earlier, the documentation can only be built against
PostgreSQL 9.6 or earlier.
</simpara>
</note>
<para>
To build the documentation as HTML, execute:
<programlisting>
./configure &amp;&amp; make doc</programlisting>
</para>
<para>
The generated HTML files will be placed in the <filename>doc/html</filename>
subdirectory of your source tree.
</para>
<para>
To build the documentation as a single HTML file, after configuring and building
the main &repmgr; source as described above, execute:
<programlisting>
./configure &amp;&amp; make doc-repmgr.html</programlisting>
</para>
<para>
To build the documentation as a PDF file, after configuring and building
the main &repmgr; source as described above, execute:
<programlisting>
./configure &amp;&amp; make doc-repmgr-A4.pdf</programlisting>
</para>
</sect2>
</sect1>


@@ -1,11 +1,10 @@
<chapter id="installation" xreflabel="Installation">
<title>Installation</title>
<indexterm>
<primary>installation</primary>
</indexterm>
<title>Installation</title>
<para>
&repmgr; can be installed from binary packages provided by your operating
system's packaging system, or from source.
@@ -19,7 +18,7 @@
only option if there are no packages for your operating system yet.
</para>
<para>
Before installing &repmgr; make sure you satisfy the <xref linkend="install-requirements"/>.
Before installing &repmgr; make sure you satisfy the <xref linkend="install-requirements">.
</para>
&install-requirements;

View File

@@ -1,18 +1,18 @@
<!-- doc/legal.xml -->
<!-- doc/legal.sgml -->
<date>2022</date>
<date>2017</date>
<copyright>
<year>2010-2022</year>
<holder>EDB</holder>
<year>2010-2018</year>
<holder>2ndQuadrant, Ltd.</holder>
</copyright>
<legalnotice id="legalnotice">
<title>Legal Notice</title>
<para>
<productname>repmgr</productname> is Copyright &copy; 2010-2022
by EDB All rights reserved.
<productname>repmgr</productname> is Copyright &copy; 2010-2018
by 2ndQuadrant, Ltd. All rights reserved.
</para>
<para>

View File

@@ -7,18 +7,18 @@
</para>
<sect1 id="repmgr-concepts" xreflabel="Concepts">
<title>Concepts</title>
<indexterm>
<primary>concepts</primary>
</indexterm>
<title>Concepts</title>
<para>
This guide assumes that you are familiar with PostgreSQL administration and
streaming replication concepts. For further details on streaming
replication, see the PostgreSQL documentation section on <ulink
url="https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION">
streaming replication</ulink>.
url="https://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION">
streaming replication</>.
</para>
<para>
The following terms are used throughout the &repmgr; documentation.
@@ -58,7 +58,7 @@
<listitem>
<simpara>
This is the action which occurs if a primary server fails and a suitable standby
is promoted as the new primary. The &repmgrd; daemon supports automatic failover
is promoted as the new primary. The <application>repmgrd</application> daemon supports automatic failover
to minimise downtime.
</simpara>
</listitem>
@@ -107,7 +107,7 @@
promotes a (local) standby.
</para>
<para>
A witness server only needs to be created if &repmgrd;
A witness server only needs to be created if <application>repmgrd</application>
is in use.
</para>
</listitem>
@@ -198,7 +198,7 @@
</listitem>
<listitem>
<simpara><literal>repmgr.monitoring_history</literal>: historical standby monitoring information
written by &repmgrd;</simpara>
written by <application>repmgrd</application></simpara>
</listitem>
</itemizedlist>
</para>
@@ -214,7 +214,7 @@
name of the server's upstream node</simpara>
</listitem>
<listitem>
<simpara>repmgr.replication_status: when &repmgrd;'s monitoring is enabled, shows
<simpara>repmgr.replication_status: when <application>repmgrd</application>'s monitoring is enabled, shows
current monitoring status for each standby.</simpara>
</listitem>
</itemizedlist>

View File

@@ -1,13 +1,13 @@
<chapter id="promoting-standby" xreflabel="Promoting a standby">
<title>Promoting a standby server with repmgr</title>
<indexterm>
<primary>promoting a standby</primary>
<seealso>repmgr standby promote</seealso>
</indexterm>
<title>Promoting a standby server with repmgr</title>
<para>
If a primary server fails or needs to be removed from the replication cluster,
a new primary server must be designated, to ensure the cluster continues
to function correctly. This can be done with <xref linkend="repmgr-standby-promote"/>,
to function correctly. This can be done with <xref linkend="repmgr-standby-promote">,
which promotes the standby on the current server to primary.
</para>
@@ -31,7 +31,7 @@
At this point the replication cluster will be in a partially disabled state, with
both standbys accepting read-only connections while attempting to connect to the
stopped primary. Note that the &repmgr; metadata table will not yet have been updated;
executing <xref linkend="repmgr-cluster-show"/> will note the discrepancy:
executing <xref linkend="repmgr-cluster-show"> will note the discrepancy:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Connection string
@@ -60,7 +60,7 @@
DETAIL: node 2 was successfully promoted to primary</programlisting>
</para>
<para>
Executing <xref linkend="repmgr-cluster-show"/> will show the current state; as there is now an
Executing <xref linkend="repmgr-cluster-show"> will show the current state; as there is now an
active primary, the previous warning will not be displayed:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
@@ -72,8 +72,8 @@
</para>
<para>
However the sole remaining standby (<literal>node3</literal>) is still trying to replicate from the failed
primary; <xref linkend="repmgr-standby-follow"/> must now be executed to rectify this situation
(see <xref linkend="follow-new-primary"/> for example).
primary; <xref linkend="repmgr-standby-follow"> must now be executed to rectify this situation
(see <xref linkend="follow-new-primary"> for example).
</para>
</chapter>

View File

@@ -1,10 +1,6 @@
<chapter id="quickstart" xreflabel="Quick-start guide">
<title>Quick-start guide</title>
<indexterm>
<primary>quickstart</primary>
</indexterm>
<para>
This section gives a quick introduction to &repmgr;, including setting up a
sample &repmgr; installation and a basic replication cluster.
@@ -17,7 +13,7 @@
<note>
<simpara>
To upgrade an existing &repmgr; 3.x installation, see section
<xref linkend="upgrading-from-repmgr-3"/>.
<xref linkend="upgrading-from-repmgr-3">.
</simpara>
</note>
@@ -54,8 +50,7 @@
</para>
<para>
If you want <application>repmgr</application> to copy configuration files which are
located outside the PostgreSQL data directory, and/or to test
<command><link linkend="repmgr-standby-switchover">switchover</link></command>
located outside the PostgreSQL data directory, and/or to test <command>switchover</command>
functionality, you will also need passwordless SSH connections between both servers, and
<application>rsync</application> should be installed.
</para>
@@ -68,7 +63,7 @@
</tip>
</sect1>
<sect1 id="quickstart-postgresql-configuration" xreflabel="PostgreSQL configuration">
<sect1 id="quickstart-postgresql-configuration">
<title>PostgreSQL configuration</title>
<para>
On the primary server, a PostgreSQL instance must be initialised and running.
@@ -76,26 +71,13 @@
</para>
<programlisting>
# Enable replication connections; set this value to at least one more
# Enable replication connections; set this figure to at least one more
# than the number of standbys which will connect to this server
# (note that repmgr will execute "pg_basebackup" in WAL streaming mode,
# which requires two free WAL senders).
#
# See: https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-WAL-SENDERS
# (note that repmgr will execute `pg_basebackup` in WAL streaming mode,
# which requires two free WAL senders)
max_wal_senders = 10
# If using replication slots, set this value to at least one more
# than the number of standbys which will connect to this server.
# Note that repmgr will only make use of replication slots if
# "use_replication_slots" is set to "true" in "repmgr.conf".
# (If you are not intending to use replication slots, this value
# can be set to "0").
#
# See: https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-REPLICATION-SLOTS
max_replication_slots = 10
# Ensure WAL files contain enough information to enable read-only queries
# on the standby.
#
@@ -103,48 +85,47 @@
# PostgreSQL 9.6 and later: one of 'replica' or 'logical'
# ('hot_standby' will still be accepted as an alias for 'replica')
#
# See: https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-LEVEL
# See: https://www.postgresql.org/docs/current/static/runtime-config-wal.html#GUC-WAL-LEVEL
wal_level = 'hot_standby'
# Enable read-only queries on a standby
# (Note: this will be ignored on a primary but we recommend including
# it anyway, in case the primary later becomes a standby)
#
# See: https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-HOT-STANDBY
# it anyway)
hot_standby = on
# Enable WAL file archiving
#
# See: https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-MODE
archive_mode = on
# Set archive command to a dummy command; this can later be changed without
# needing to restart the PostgreSQL instance.
#
# See: https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-COMMAND
# Set archive command to a script or application that will safely store
# your WALs in a secure place. /bin/true is an example of a command that
# ignores archiving. Use something more sensible.
archive_command = '/bin/true'
# If you have configured "pg_basebackup_options"
# in "repmgr.conf" to include the setting "--xlog-method=fetch" (from
# PostgreSQL 10 "--wal-method=fetch"), *and* you have not set
# "restore_command" in "repmgr.conf" to fetch WAL files from another
# source such as Barman, you'll need to set "wal_keep_segments" to a
# high enough value to ensure that all WAL files generated while
# the standby is being cloned are retained until the standby starts up.
#
# wal_keep_segments = 5000
</programlisting>
<tip>
<simpara>
Rather than editing these settings in the default <filename>postgresql.conf</filename>
file, create a separate file such as <filename>postgresql.replication.conf</filename> and
file, create a separate file such as <filename>postgresql.replication.conf</filename> and
include it from the end of the main configuration file with:
<command>include 'postgresql.replication.conf'</command>.
<command>include 'postgresql.replication.conf</command>.
</simpara>
</tip>
<para>
Additionally, if you are intending to use <application>pg_rewind</application>,
and the cluster was not initialised using data checksums, you may want to consider enabling
<varname>wal_log_hints</varname>; for more details see <xref linkend="repmgr-node-rejoin-pg-rewind"/>.
<varname>wal_log_hints</varname>; for more details see <xref linkend="repmgr-node-rejoin-pg-rewind">.
</para>
<para>
See also the <link linkend="configuration-postgresql">PostgreSQL configuration</link> section in the
<link linkend="configuration">repmgr configuration guide</link>.
</para>
</sect1>
<sect1 id="quickstart-repmgr-user-database">
@@ -167,7 +148,7 @@
For the sake of simplicity, the <literal>repmgr</literal> user is created
as a superuser. If desired, it's possible to create the <literal>repmgr</literal>
user as a normal user. However for certain operations superuser permissions
are required; in this case the command line option <command>--superuser</command>
are requiredl; in this case the command line option <command>--superuser</command>
can be provided to specify a superuser.
</para>
<para>
@@ -215,20 +196,11 @@
<sect1 id="quickstart-standby-preparation">
<title>Preparing the standby</title>
<para>
On the standby, do <emphasis>not</emphasis> create a PostgreSQL instance (i.e.
do not execute <application>initdb</application> or any database creation
scripts provided by packages), but do ensure the destination
On the standby, do not create a PostgreSQL instance, but do ensure the destination
data directory (and any other directories which you want PostgreSQL to use)
exist and are owned by the <literal>postgres</literal> system user. Permissions
must be set to <literal>0700</literal> (<literal>drwx------</literal>).
</para>
<tip>
<simpara>
&repmgr; will place a copy of the primary's database files in this directory.
It will however refuse to run if a PostgreSQL instance has already been
created there.
</simpara>
</tip>
<para>
Check the primary database is reachable from the standby using <application>psql</application>:
</para>
@@ -238,7 +210,7 @@
<note>
<para>
&repmgr; stores connection information as <ulink
url="https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING">libpq
url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING">libpq
connection strings</ulink> throughout. This documentation refers to them as <literal>conninfo</literal>
strings; an alternative name is <literal>DSN</literal> (<literal>data source name</literal>).
We'll use these in place of the <command>-h hostname -d databasename -U username</command> syntax.
@@ -254,7 +226,7 @@
</para>
<programlisting>
node_id=1
node_name='node1'
node_name=node1
conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'
</programlisting>
@@ -262,9 +234,10 @@
<para>
<filename>repmgr.conf</filename> should not be stored inside the PostgreSQL data directory,
as it could be overwritten when setting up or reinitialising the PostgreSQL
server. See sections <xref linkend="configuration"/> and <xref linkend="configuration-file"/>
server. See sections <xref linkend="configuration"> and <xref linkend="configuration-file">
for further details about <filename>repmgr.conf</filename>.
</para>
<note>
<para>
&repmgr; only uses <option>pg_bindir</option> when it executes
@@ -284,7 +257,7 @@
<tip>
<simpara>
For Debian-based distributions we recommend explicitly setting
For Debian-based distributions we recommend explictly setting
<option>pg_bindir</option> to the directory where <command>pg_ctl</command> and other binaries
not in the standard path are located. For PostgreSQL 9.6 this would be <filename>/usr/lib/postgresql/9.6/bin/</filename>.
</simpara>
@@ -302,7 +275,7 @@
<para>
See the file
<ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>
<ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</>
for details of all available configuration parameters.
</para>
@@ -351,7 +324,7 @@
slot_name |
config_file | /etc/repmgr.conf</programlisting>
<para>
Each server in the replication cluster will have its own record. If &repmgrd;
Each server in the replication cluster will have its own record. If <application>repmgrd</application>
is in use, the fields <literal>upstream_node_id</literal>, <literal>active</literal> and
<literal>type</literal> will be updated when the node's status or role changes.
</para>
@@ -367,7 +340,7 @@
</para>
<programlisting>
node_id=2
node_name='node2'
node_name=node2
conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'</programlisting>
<para>
@@ -405,10 +378,9 @@
</programlisting>
<para>
This has cloned the PostgreSQL data directory files from the primary <literal>node1</literal>
using PostgreSQL's <command>pg_basebackup</command> utility. Replication configuration
containing the correct parameters to start streaming from this primary server will be
automatically appended to <filename>postgresql.auto.conf</filename>. (In PostgreSQL 11
and earlier the file <filename>recovery.conf</filename> will be created).
using PostgreSQL's <command>pg_basebackup</command> utility. A <filename>recovery.conf</filename>
file containing the correct parameters to start streaming from this primary server will be created
automatically.
</para>
<note>
<simpara>
@@ -460,7 +432,7 @@
</para>
<para>
From PostgreSQL 9.6 you can also use the view
<ulink url="https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-WAL-RECEIVER-VIEW">
<ulink url="https://www.postgresql.org/docs/current/static/monitoring-stats.html#PG-STAT-WAL-RECEIVER-VIEW">
<literal>pg_stat_wal_receiver</literal></ulink> to check the replication status from the standby.
<programlisting>
@@ -478,14 +450,11 @@
latest_end_lsn | 0/7000538
latest_end_time | 2017-08-28 15:20:56.418735+09
slot_name |
sender_host | node1
sender_port | 5432
conninfo | user=repmgr dbname=replication host=node1 application_name=node2
</programlisting>
Note that the <varname>conninfo</varname> value is that generated in <filename>postgresql.auto.conf</filename>
(PostgreSQL 11 and earlier: <filename>recovery.conf</filename>) and will differ slightly from the primary's
<varname>conninfo</varname> as set in <filename>repmgr.conf</filename> - among others it will contain the
connecting node's name as <varname>application_name</varname>.
Note that the <varname>conninfo</varname> value is that generated in <filename>recovery.conf</filename>
and will differ slightly from the primary's <varname>conninfo</varname> as set in <filename>repmgr.conf</filename> -
among others it will contain the connecting node's name as <varname>application_name</varname>.
</para>
</sect1>
@@ -500,12 +469,11 @@
<para>
Check the node is registered by executing <command>repmgr cluster show</command> on the standby:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | host=node1 dbname=repmgr user=repmgr
2 | node2 | standby | running | node1 | default | 100 | 1 | host=node2 dbname=repmgr user=repmgr</programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Connection string
----+-------+---------+-----------+----------+----------+--------------------------------------
1 | node1 | primary | * running | | default | host=node1 dbname=repmgr user=repmgr
2 | node2 | standby | running | node1 | default | host=node2 dbname=repmgr user=repmgr</programlisting>
</para>
<para>
Both nodes are now registered with &repmgr; and the records have been copied to the standby server.

View File

@@ -38,7 +38,7 @@
<title>Notes</title>
<para>
Monitoring history will only be written if &repmgrd; is active, and
Monitoring history will only be written if <application>repmgrd</application> is active, and
<varname>monitoring_history</varname> is set to <literal>true</literal> in
<filename>repmgr.conf</filename>.
</para>
@@ -69,8 +69,8 @@
<refsect1>
<title>See also</title>
<para>
For more details see the sections <xref linkend="repmgrd-monitoring"/> and
<xref linkend="repmgrd-monitoring-configuration"/>.
For more details see the sections <xref linkend="repmgrd-monitoring"> and
<xref linkend="repmgrd-monitoring-configuration">.
</para>
</refsect1>

View File

@@ -16,9 +16,9 @@
<refsect1>
<title>Description</title>
<para>
<command>repmgr cluster crosscheck</command> is similar to <xref linkend="repmgr-cluster-matrix"/>,
<command>repmgr cluster crosscheck</command> is similar to <xref linkend="repmgr-cluster-matrix">,
but cross-checks connections between each combination of nodes. In "Example 3" in
<xref linkend="repmgr-cluster-matrix"/> we have no information about the state of <literal>node3</literal>.
<xref linkend="repmgr-cluster-matrix"> we have no information about the state of <literal>node3</literal>.
However by running <command>repmgr cluster crosscheck</command> it's possible to get a better
overview of the cluster situation:
<programlisting>
@@ -42,7 +42,7 @@
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr cluster crosscheck</command>:
Following exit codes can be emitted by <command>repmgr cluster crosscheck</command>:
</para>
<variablelist>

View File

@@ -40,12 +40,12 @@
<simpara><literal>--node-name</literal>: restrict entries to node with this name</simpara>
</listitem>
<listitem>
<simpara><literal>--event</literal>: filter specific event (see <xref linkend="event-notifications"/> for a full list)</simpara>
<simpara><literal>--event</literal>: filter specific event (see <xref linkend="event-notifications"> for a full list)</simpara>
</listitem>
</itemizedlist>
</para>
<para>
The &quot;Details&quot; column can be omitted by providing <literal>--compact</literal>.
The "Details" column can be omitted by providing <literal>--terse</literal>.
</para>
</refsect1>
@@ -71,9 +71,9 @@
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster event --event=standby_register
Node ID | Name | Event | OK | Timestamp | Details
---------+-------+------------------+----+---------------------+-------------------------------------------------------
3 | node3 | standby_register | t | 2019-04-16 10:59:59 | standby registration succeeded; upstream node ID is 1
2 | node2 | standby_register | t | 2019-04-16 10:59:57 | standby registration succeeded; upstream node ID is 1</programlisting>
---------+-------+------------------+----+---------------------+--------------------------------
3 | node3 | standby_register | t | 2017-08-17 10:28:55 | standby registration succeeded
2 | node2 | standby_register | t | 2017-08-17 10:28:53 | standby registration succeeded</programlisting>
</para>
</refsect1>
</refentry>

View File

@@ -93,7 +93,7 @@
connection from <literal>node3</literal>.
</para>
<para>
In this case, the <xref linkend="repmgr-cluster-crosscheck"/> command will produce a more
In this case, the <xref linkend="repmgr-cluster-crosscheck"> command will produce a more
useful result.
</para>
</refsect1>
@@ -102,7 +102,7 @@
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr cluster matrix</command>:
Following exit codes can be emitted by <command>repmgr cluster matrix</command>:
</para>
<variablelist>

View File

@@ -0,0 +1,172 @@
<refentry id="repmgr-cluster-show">
<indexterm>
<primary>repmgr cluster show</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr cluster show</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr cluster show</refname>
<refpurpose>display information about each registered node in the replication cluster</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Displays information about each registered node in the replication cluster. This
command polls each registered server and shows its role (<literal>primary</literal> /
<literal>standby</literal> / <literal>bdr</literal>) and status. It polls each server
directly and can be run on any node in the cluster; this is also useful when analyzing
connectivity from a particular node.
</para>
</refsect1>
<refsect1>
<title>Execution</title>
<para>
This command requires either a valid <filename>repmgr.conf</filename> file or a database
connection string to one of the registered nodes; no additional arguments are needed.
</para>
<para>
To show database connection errors when polling nodes, run the command in
<literal>--verbose</literal> mode.
</para>
</refsect1>
<refsect1>
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Connection string
----+-------+---------+-----------+----------+----------+-----------------------------------------
1 | node1 | primary | * running | | default | host=db_node1 dbname=repmgr user=repmgr
2 | node2 | standby | running | node1 | default | host=db_node2 dbname=repmgr user=repmgr
3 | node3 | standby | running | node1 | default | host=db_node3 dbname=repmgr user=repmgr</programlisting>
</para>
</refsect1>
<refsect1>
<title>Notes</title>
<para>
The column <literal>Role</literal> shows the expected server role according to the
&repmgr; metadata. <literal>Status</literal> shows whether the server is running or unreachable.
If the node has an unexpected role not reflected in the &repmgr; metadata, e.g. a node was manually
promoted to primary, this will be highlighted with an exclamation mark, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Connection string
----+-------+---------+----------------------+----------+----------+-----------------------------------------
1 | node1 | primary | ? unreachable | | default | host=db_node1 dbname=repmgr user=repmgr
2 | node2 | standby | ! running as primary | node1 | default | host=db_node2 dbname=repmgr user=repmgr
3 | node3 | standby | running | node1 | default | host=db_node3 dbname=repmgr user=repmgr
WARNING: following issues were detected
node "node1" (ID: 1) is registered as an active primary but is unreachable
node "node2" (ID: 2) is registered as standby but running as primary</programlisting>
</para>
<para>
Node availability is tested by connecting from the node where
<command>repmgr cluster show</command> is executed, and does not necessarily imply the node
is down. See <xref linkend="repmgr-cluster-matrix"> and <xref linkend="repmgr-cluster-crosscheck"> to get
a better overview of connections between nodes.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--csv</option></term>
<listitem>
<para>
<command>repmgr cluster show</command> accepts an optional parameter <literal>--csv</literal>, which
outputs the replication cluster's status in a simple CSV format, suitable for
parsing by scripts, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show --csv
1,-1,-1
2,0,0
3,0,1</programlisting>
</para>
<para>
The columns have the following meanings:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
node ID
</simpara>
</listitem>
<listitem>
<simpara>
availability (0 = available, -1 = unavailable)
</simpara>
</listitem>
<listitem>
<simpara>
recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown)
</simpara>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--verbose</option></term>
<listitem>
<para>
Display the full text of any database connection error messages
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
The following exit codes can be emitted by <command>repmgr cluster show</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
No issues were detected.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NODE_STATUS (25)</option></term>
<listitem>
<para>
One or more issues were detected.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-node-status">, <xref linkend="repmgr-node-check">, <xref linkend="repmgr-daemon-status">
</para>
</refsect1>
</refentry>
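
The `--csv` output documented in this reference page is meant to be consumed by scripts. A minimal Python sketch of such a consumer follows, assuming the three-column layout described above (node ID, availability, recovery state). The helper and field names are hypothetical and not part of repmgr, and other repmgr versions may emit a different set of columns.

```python
from typing import List, NamedTuple, Optional

# Third column as documented: 0 = not in recovery, 1 = in recovery, -1 = unknown
_RECOVERY = {0: False, 1: True, -1: None}


class NodeStatus(NamedTuple):
    node_id: int
    available: bool              # second column: 0 = available, -1 = unavailable
    in_recovery: Optional[bool]  # None when the recovery state is unknown


def parse_cluster_show_csv(output: str) -> List[NodeStatus]:
    """Parse text in the format printed by `repmgr cluster show --csv` (hypothetical helper)."""
    nodes = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumes exactly three integer fields per line, as in the documented example.
        node_id, availability, recovery = (int(field) for field in line.split(","))
        nodes.append(NodeStatus(node_id, availability == 0, _RECOVERY[recovery]))
    return nodes


# Sample taken from the documentation example above.
sample = "1,-1,-1\n2,0,0\n3,0,1"
for node in parse_cluster_show_csv(sample):
    print(node)
```

In the documented sample, node 1 is unavailable with an unknown recovery state, node 2 is available and not in recovery, and node 3 is available and in recovery.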

View File

@@ -1,245 +0,0 @@
<refentry id="repmgr-cluster-show">
<indexterm>
<primary>repmgr cluster show</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr cluster show</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr cluster show</refname>
<refpurpose>display information about each registered node in the replication cluster</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Displays information about each registered node in the replication cluster. This
command polls each registered server and shows its role (<literal>primary</literal> /
<literal>standby</literal>) and status. It polls each server
directly and can be run on any node in the cluster; this is also useful when analyzing
connectivity from a particular node.
</para>
<para>
For PostgreSQL 9.6 and later, the output will also contain the node's current timeline ID.
</para>
<para>
Node availability is tested by connecting from the node where
<command>repmgr cluster show</command> is executed, and does not necessarily imply the node
is down. See <xref linkend="repmgr-cluster-matrix"/> and <xref linkend="repmgr-cluster-crosscheck"/> to get
better overviews of connections between nodes.
</para>
</refsect1>
<refsect1>
<title>Execution</title>
<para>
This command requires either a valid <filename>repmgr.conf</filename> file or a database
connection string to one of the registered nodes; no additional arguments are needed.
</para>
<para>
To show database connection errors when polling nodes, run the command in
<literal>--verbose</literal> mode.
</para>
</refsect1>
<refsect1>
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | host=db_node1 dbname=repmgr user=repmgr
2 | node2 | standby | running | node1 | default | 100 | 1 | host=db_node2 dbname=repmgr user=repmgr
3 | node3 | standby | running | node1 | default | 100 | 1 | host=db_node3 dbname=repmgr user=repmgr
4 | node4 | standby | running | node1 | default | 100 | 1 | host=db_node4 dbname=repmgr user=repmgr
5 | node5 | witness | * running | node1 | default | 0 | n/a | host=db_node5 dbname=repmgr user=repmgr</programlisting>
</para>
</refsect1>
<refsect1>
<title>Notes</title>
<para>
The column <literal>Role</literal> shows the expected server role according to the
&repmgr; metadata.
</para>
<para>
<literal>Status</literal> shows whether the server is running or unreachable.
If the node has an unexpected role not reflected in the &repmgr; metadata, e.g. a node was manually
promoted to primary, this will be highlighted with an exclamation mark.
If a connection to the node cannot be made, this will be highlighted with a question mark.
Note that the node will only be shown as <literal>? unreachable</literal>
if a connection is not possible at network level; if the PostgreSQL instance on the
node is pingable but not accepting connections, it will be shown as <literal>? running</literal>.
</para>
<para>
In the following example, executed on <literal>node3</literal>, <literal>node1</literal> is not reachable
at network level and assumed to be down; <literal>node2</literal> has been promoted to primary
(but <literal>node3</literal> is not attached to it, and its metadata has not yet been updated);
<literal>node4</literal> is running but rejecting connections (from <literal>node3</literal> at least).
<programlisting>
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+----------------------+----------+----------+----------+----------+----------------------------------------------------
1 | node1 | primary | ? unreachable | | default | 100 | | host=db_node1 dbname=repmgr user=repmgr
2 | node2 | standby | ! running as primary | ? node1 | default | 100 | 2 | host=db_node2 dbname=repmgr user=repmgr
3 | node3 | standby | running | ? node1 | default | 100 | 1 | host=db_node3 dbname=repmgr user=repmgr
4 | node4 | standby | ? running | ? node1 | default | 100 | | host=db_node4 dbname=repmgr user=repmgr
WARNING: following issues were detected
- unable to connect to node "node1" (ID: 1)
- node "node1" (ID: 1) is registered as an active primary but is unreachable
- node "node2" (ID: 2) is registered as standby but running as primary
- unable to connect to node "node2" (ID: 2)'s upstream node "node1" (ID: 1)
- unable to determine if node "node2" (ID: 2) is attached to its upstream node "node1" (ID: 1)
- unable to connect to node "node3" (ID: 3)'s upstream node "node1" (ID: 1)
- unable to determine if node "node3" (ID: 3) is attached to its upstream node "node1" (ID: 1)
- unable to connect to node "node4" (ID: 4)
HINT: execute with --verbose option to see connection error messages</programlisting>
</para>
<para>
To diagnose connection issues, execute <command>repmgr cluster show</command>
with the <option>--verbose</option> option; this will display the error message
for each failed connection attempt.
</para>
<tip>
<para>
Use <xref linkend="repmgr-cluster-matrix"/> and <xref linkend="repmgr-cluster-crosscheck"/>
to diagnose connection issues across the whole replication cluster.
</para>
</tip>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--csv</option></term>
<listitem>
<para>
<command>repmgr cluster show</command> accepts an optional parameter <literal>--csv</literal>, which
outputs the replication cluster's status in a simple CSV format, suitable for
parsing by scripts, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show --csv
1,-1,-1
2,0,0
3,0,1</programlisting>
</para>
<para>
The columns have the following meanings:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
node ID
</simpara>
</listitem>
<listitem>
<simpara>
availability (0 = available, -1 = unavailable)
</simpara>
</listitem>
<listitem>
<simpara>
recovery state (0 = not in recovery, 1 = in recovery, -1 = unknown)
</simpara>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--compact</option></term>
<listitem>
<para>
Suppress display of the <literal>conninfo</literal> column.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--terse</option></term>
<listitem>
<para>
Suppress warnings about connection issues.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--verbose</option></term>
<listitem>
<para>
Display the full text of any database connection error messages.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr cluster show</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
No issues were detected.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_BAD_CONFIG (1)</option></term>
<listitem>
<para>
An issue was encountered while attempting to retrieve
&repmgr; metadata.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_DB_CONN (6)</option></term>
<listitem>
<para>
&repmgr; was unable to connect to the local PostgreSQL instance.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NODE_STATUS (25)</option></term>
<listitem>
<para>
One or more issues were detected with the replication configuration,
e.g. a node was not in its expected state.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-node-status"/>, <xref linkend="repmgr-node-check"/>, <xref linkend="repmgr-service-status"/>
</para>
</refsect1>
</refentry>
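
As a rough illustration of how a script might consume the `--csv` output of `repmgr cluster show` described above, the following Python sketch parses the three documented columns (node ID, availability, recovery state). The names `NodeState` and `parse_cluster_show_csv` are hypothetical helpers and not part of repmgr; the sample input is the example output from the documentation.

```python
import csv
import io
from typing import List, NamedTuple


class NodeState(NamedTuple):
    """One row of `repmgr cluster show --csv` (hypothetical container)."""
    node_id: int
    available: bool  # column 2: 0 = available, -1 = unavailable
    recovery: str    # column 3, translated via _RECOVERY below


# Meaning of the third column, as listed in the documentation.
_RECOVERY = {0: "not in recovery", 1: "in recovery", -1: "unknown"}


def parse_cluster_show_csv(text: str) -> List[NodeState]:
    """Turn the CSV text into a list of NodeState tuples."""
    states = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # ignore blank lines
            continue
        node_id, availability, recovery = (int(field) for field in row)
        states.append(NodeState(node_id, availability == 0, _RECOVERY[recovery]))
    return states


# Example output copied from the documentation above.
SAMPLE = "1,-1,-1\n2,0,0\n3,0,1\n"

if __name__ == "__main__":
    for state in parse_cluster_show_csv(SAMPLE):
        print(state)
```

In practice the text would come from running the command itself; the sketch keeps the parsing separate so it can be exercised without a running cluster.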


@@ -1,60 +1,46 @@
<refentry id="repmgr-service-pause">
<refentry id="repmgr-daemon-pause">
<indexterm>
<primary>repmgr service pause</primary>
</indexterm>
<indexterm>
<primary>repmgrd</primary>
<secondary>pausing</secondary>
<primary>repmgr daemon pause</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr service pause</refentrytitle>
<refentrytitle>repmgr daemon pause</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr service pause</refname>
<refpurpose>Instruct all &repmgrd; instances in the replication cluster to pause failover operations</refpurpose>
<refname>repmgr daemon pause</refname>
<refpurpose>Instruct all <application>repmgrd</application> instances in the replication cluster to pause failover operations</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command can be run on any active node in the replication cluster to instruct all
running &repmgrd; instances to &quot;pause&quot; themselves, i.e. take no
running <application>repmgrd</application> instances to &quot;pause&quot; themselves, i.e. take no
action (such as promoting themselves or following a new primary) if a failover event is detected.
</para>
<para>
This functionality is useful for performing maintenance operations, such as switchovers
or upgrades, which might otherwise trigger a failover if &repmgrd;
or upgrades, which might otherwise trigger a failover if <application>repmgrd</application>
is running normally.
</para>
<note>
<para>
It's important to wait a few seconds after restarting PostgreSQL on any node before running
<command>repmgr service pause</command>, as the &repmgrd; instance
<command>repmgr daemon pause</command>, as the <application>repmgrd</application> instance
on the restarted node will take a second or two before it has updated its status.
</para>
</note>
<para>
<xref linkend="repmgr-service-unpause"/> will instruct all previously paused &repmgrd;
<xref linkend="repmgr-daemon-unpause"> will instruct all previously paused <application>repmgrd</application>
instances to resume normal failover operation.
</para>
</refsect1>
<refsect1>
<title>Prerequisites</title>
<para>
PostgreSQL must be accessible on all nodes (using the <varname>conninfo</varname> string shown by
<link linkend="repmgr-cluster-show"><command>repmgr cluster show</command></link>)
from the node where <command>repmgr service pause</command> is executed.
</para>
</refsect1>
<refsect1>
<title>Execution</title>
<para>
<command>repmgr service pause</command> can be executed on any active node in the
<command>repmgr daemon pause</command> can be executed on any active node in the
replication cluster. A valid <filename>repmgr.conf</filename> file is required.
It will have no effect on previously paused nodes.
</para>
@@ -64,7 +50,7 @@
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf service pause
$ repmgr -f /etc/repmgr.conf daemon pause
NOTICE: node 1 (node1) paused
NOTICE: node 2 (node2) paused
NOTICE: node 3 (node3) paused</programlisting>
@@ -78,7 +64,7 @@ NOTICE: node 3 (node3) paused</programlisting>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check if nodes are reachable but don't pause &repmgrd;.
Check if nodes are reachable but don't pause <application>repmgrd</application>.
</para>
</listitem>
</varlistentry>
@@ -88,7 +74,7 @@ NOTICE: node 3 (node3) paused</programlisting>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr service pause</command>:
The following exit codes can be emitted by <command>repmgr daemon pause</command>:
</para>
<variablelist>
@@ -96,7 +82,7 @@ NOTICE: node 3 (node3) paused</programlisting>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
&repmgrd; could be paused on all nodes.
<application>repmgrd</application> could be paused on all nodes.
</para>
</listitem>
</varlistentry>
@@ -105,7 +91,7 @@ NOTICE: node 3 (node3) paused</programlisting>
<term><option>ERR_REPMGRD_PAUSE (26)</option></term>
<listitem>
<para>
&repmgrd; could not be paused on one or more nodes.
<application>repmgrd</application> could not be paused on one or more nodes.
</para>
</listitem>
</varlistentry>
@@ -116,11 +102,7 @@ NOTICE: node 3 (node3) paused</programlisting>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-service-unpause"/>,
<xref linkend="repmgr-service-status"/>,
<xref linkend="repmgrd-pausing"/>,
<xref linkend="repmgr-daemon-start"/>,
<xref linkend="repmgr-daemon-stop"/>
<xref linkend="repmgr-daemon-unpause">, <xref linkend="repmgr-daemon-status">
</para>
</refsect1>
</refentry>
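
The maintenance use case described above (pause, perform the operation, unpause) could be wrapped roughly as follows. This is a minimal sketch under stated assumptions: `repmgrd_paused` is a hypothetical helper, the configuration path is the one used in the documentation's examples, and the `group` parameter selects between the `daemon` and `service` spellings of the command that appear in this diff. The `runner` argument exists only so the sketch can be tried without repmgr installed.

```python
import subprocess
from contextlib import contextmanager

ERR_REPMGRD_PAUSE = 26  # documented exit code: pause/unpause failed on a node


@contextmanager
def repmgrd_paused(config="/etc/repmgr.conf", group="daemon", runner=subprocess.run):
    """Pause repmgrd on all nodes for the duration of a `with` block.

    Hypothetical helper: `runner` must accept an argument list and return
    an object with a `returncode` attribute, as subprocess.run does.
    """
    base = ["repmgr", "-f", config, group]
    pause = runner(base + ["pause"])
    try:
        if pause.returncode != 0:
            raise RuntimeError(f"pause failed with exit code {pause.returncode}")
        yield
    finally:
        # Unpausing has no effect on nodes which are not paused, so it is
        # attempted even when the pause step did not fully succeed.
        runner(base + ["unpause"])
```

A caller would place the maintenance work inside `with repmgrd_paused(): ...`. The exit code of the final unpause step is ignored here; a production script would probably want to check it as well.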


@@ -1,208 +0,0 @@
<refentry id="repmgr-daemon-start">
<indexterm>
<primary>repmgr daemon start</primary>
</indexterm>
<indexterm>
<primary>repmgrd</primary>
<secondary>starting</secondary>
</indexterm>
<refmeta>
<refentrytitle>repmgr daemon start</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr daemon start</refname>
<refpurpose>Start the &repmgrd; daemon on the local node</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command starts the &repmgrd; service on the
local node.
</para>
<para>
By default, &repmgr; will wait for up to 15 seconds to confirm that &repmgrd;
started. This behaviour can be overridden by specifying a different value using the <option>--wait</option>
option, or disabled altogether with the <option>--no-wait</option> option.
</para>
<important>
<para>
The <filename>repmgr.conf</filename> parameter <varname>repmgrd_service_start_command</varname>
must be set for <command>repmgr daemon start</command> to work; see section
<xref linkend="repmgr-daemon-start-configuration"/> for details.
</para>
</important>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually attempt to start &repmgrd;.
</para>
<para>
This action will output the command which would be executed.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-w</option></term>
<term><option>--wait</option></term>
<listitem>
<para>
Wait for the specified number of seconds to confirm that &repmgrd;
started successfully.
</para>
<para>
Note that providing <option>--wait=0</option> is the equivalent of <option>--no-wait</option>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--no-wait</option></term>
<listitem>
<para>
Don't wait to confirm that &repmgrd;
started successfully.
</para>
<para>
This is equivalent to providing <option>--wait=0</option>.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1 id="repmgr-daemon-start-configuration" xreflabel="repmgr daemon start configuration">
<title>Configuration file settings</title>
<para>
The following parameter in <filename>repmgr.conf</filename> is relevant
to <command>repmgr daemon start</command>:
</para>
<variablelist>
<varlistentry>
<term><option>repmgrd_service_start_command</option></term>
<listitem>
<indexterm>
<primary>repmgrd_service_start_command</primary>
<secondary>with &quot;repmgr daemon start&quot;</secondary>
</indexterm>
<para>
<command>repmgr daemon start</command> will execute the command defined by the
<varname>repmgrd_service_start_command</varname> parameter in <filename>repmgr.conf</filename>.
This must be set to a shell command which will start &repmgrd;;
if &repmgr; was installed from a package, this will be the service command defined by the
package. For more details see <link linkend="appendix-packages">Appendix: &repmgr; package details</link>.
</para>
<important>
<para>
If &repmgr; was installed from a system package, and you do not configure
<varname>repmgrd_service_start_command</varname> to an appropriate service command, this may
result in the system becoming confused about the state of the &repmgrd;
service; this is particularly the case with <literal>systemd</literal>.
</para>
</important>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr daemon start</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The &repmgrd; start command (defined in
<varname>repmgrd_service_start_command</varname>) was successfully executed.
</para>
<para>
If the <option>--wait</option> option was provided, &repmgr; will confirm that
&repmgrd; has actually started up.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_BAD_CONFIG (1)</option></term>
<listitem>
<para>
<varname>repmgrd_service_start_command</varname> is not defined in
<filename>repmgr.conf</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_DB_CONN (6)</option></term>
<listitem>
<para>
&repmgr; was unable to connect to the local PostgreSQL node.
</para>
<para>
PostgreSQL must be running before &repmgrd;
can be started. Additionally, unless the <option>--no-wait</option> option was
provided, &repmgr; needs to be able to connect to the local PostgreSQL node
to determine the state of &repmgrd;.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_REPMGRD_SERVICE (27)</option></term>
<listitem>
<para>
The &repmgrd; start command (defined in
<varname>repmgrd_service_start_command</varname>) was not successfully executed.
</para>
<para>
This can also mean that &repmgr; was unable to confirm whether &repmgrd;
successfully started (unless the <option>--no-wait</option> option was provided).
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-daemon-stop"/>,
<xref linkend="repmgrd-daemon"/>,
<xref linkend="repmgr-service-status"/>,
<xref linkend="repmgr-service-pause"/>,
<xref linkend="repmgr-service-unpause"/>
</para>
</refsect1>
</refentry>


@@ -0,0 +1,165 @@
<refentry id="repmgr-daemon-status">
<indexterm>
<primary>repmgr daemon status</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr daemon status</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr daemon status</refname>
<refpurpose>display information about the status of <application>repmgrd</application> on each node in the cluster</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command provides an overview of all active nodes in the cluster and the state
of each node's <application>repmgrd</application> instance. It can be used to check
the result of <xref linkend="repmgr-daemon-pause"> and <xref linkend="repmgr-daemon-unpause">
operations.
</para>
</refsect1>
<refsect1>
<title>Execution</title>
<para>
<command>repmgr daemon status</command> can be executed on any active node in the
replication cluster. A valid <filename>repmgr.conf</filename> file is required.
</para>
<note>
<para>
After restarting PostgreSQL on any node, the <application>repmgrd</application> instance
will take a second or two before it is able to update its status. Until then,
<application>repmgrd</application> will be shown as not running.
</para>
</note>
</refsect1>
<refsect1>
<title>Examples</title>
<para>
<application>repmgrd</application> running normally on all nodes:
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | repmgrd | PID | Paused?
----+-------+---------+---------+---------+------+---------
1 | node1 | primary | running | running | 7851 | no
2 | node2 | standby | running | running | 7889 | no
3 | node3 | standby | running | running | 7918 | no</programlisting>
</para>
<para>
<application>repmgrd</application> paused on all nodes (using <xref linkend="repmgr-daemon-pause">):
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | repmgrd | PID | Paused?
----+-------+---------+---------+---------+------+---------
1 | node1 | primary | running | running | 7851 | yes
2 | node2 | standby | running | running | 7889 | yes
3 | node3 | standby | running | running | 7918 | yes</programlisting>
</para>
<para>
<application>repmgrd</application> not running on one node:
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | repmgrd | PID | Paused?
----+-------+---------+---------+-------------+------+---------
1 | node1 | primary | running | running | 7851 | yes
2 | node2 | standby | running | not running | n/a | n/a
3 | node3 | standby | running | running | 7918 | yes</programlisting>
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--csv</option></term>
<listitem>
<para>
<command>repmgr daemon status</command> accepts an optional parameter <literal>--csv</literal>, which
outputs the replication cluster's status in a simple CSV format, suitable for
parsing by scripts, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf daemon status --csv
1,node1,primary,1,1,10204,1
2,node2,standby,1,0,-1,1
3,node3,standby,1,1,10225,1</programlisting>
</para>
<para>
The columns have the following meanings:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
node ID
</simpara>
</listitem>
<listitem>
<simpara>
node name
</simpara>
</listitem>
<listitem>
<simpara>
node type (primary or standby)
</simpara>
</listitem>
<listitem>
<simpara>
PostgreSQL server running
</simpara>
</listitem>
<listitem>
<simpara>
<application>repmgrd</application> running (1 = running, 0 = not running)
</simpara>
</listitem>
<listitem>
<simpara>
<application>repmgrd</application> PID (-1 if not running)
</simpara>
</listitem>
<listitem>
<simpara>
<application>repmgrd</application> paused (1 = paused, 0 = not paused)
</simpara>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--verbose</option></term>
<listitem>
<para>
Display the full text of any database connection error messages.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-daemon-pause">, <xref linkend="repmgr-daemon-unpause">, <xref linkend="repmgr-cluster-show">
</para>
</refsect1>
</refentry>
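
The seven-column `--csv` output of `repmgr daemon status` described above could be parsed along these lines. This is a sketch only: `RepmgrdStatus`, `parse_daemon_status_csv` and `nodes_without_repmgrd` are hypothetical names, and treating `1` in the "PostgreSQL server running" column as "running" is an assumption, since the documentation does not spell out that column's encoding.

```python
import csv
import io
from typing import List, NamedTuple, Optional


class RepmgrdStatus(NamedTuple):
    """One row of `repmgr daemon status --csv` (hypothetical container)."""
    node_id: int
    name: str
    node_type: str           # "primary" or "standby"
    postgres_running: bool   # assumption: 1 = running
    repmgrd_running: bool    # documented: 1 = running, 0 = not running
    pid: Optional[int]       # documented: -1 if not running (mapped to None)
    paused: bool             # documented: 1 = paused, 0 = not paused


def parse_daemon_status_csv(text: str) -> List[RepmgrdStatus]:
    """Turn the CSV text into a list of RepmgrdStatus tuples."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # ignore blank lines
            continue
        node_id, name, node_type, pg, repmgrd, pid, paused = row
        rows.append(RepmgrdStatus(
            node_id=int(node_id),
            name=name,
            node_type=node_type,
            postgres_running=int(pg) == 1,
            repmgrd_running=int(repmgrd) == 1,
            pid=None if int(pid) == -1 else int(pid),
            paused=int(paused) == 1,
        ))
    return rows


def nodes_without_repmgrd(rows: List[RepmgrdStatus]) -> List[str]:
    """Names of nodes where repmgrd is reported as not running."""
    return [row.name for row in rows if not row.repmgrd_running]


# Example output copied from the documentation above.
SAMPLE = (
    "1,node1,primary,1,1,10204,1\n"
    "2,node2,standby,1,0,-1,1\n"
    "3,node3,standby,1,1,10225,1\n"
)
```

With the sample input, `nodes_without_repmgrd(parse_daemon_status_csv(SAMPLE))` picks out `node2`, which matches the "not running" row in the tabular example.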


@@ -1,205 +0,0 @@
<refentry id="repmgr-daemon-stop">
<indexterm>
<primary>repmgr daemon stop</primary>
</indexterm>
<indexterm>
<primary>repmgrd</primary>
<secondary>stopping</secondary>
</indexterm>
<refmeta>
<refentrytitle>repmgr daemon stop</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr daemon stop</refname>
<refpurpose>Stop the &repmgrd; daemon on the local node</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command stops the &repmgrd; daemon on the
local node.
</para>
<para>
By default, &repmgr; will wait for up to 15 seconds to confirm that &repmgrd;
stopped. This behaviour can be overridden by specifying a different value using the <option>--wait</option>
option, or disabled altogether with the <option>--no-wait</option> option.
</para>
<note>
<para>
If PostgreSQL is not running on the local node, under some circumstances &repmgr; may not
be able to confirm if &repmgrd; has actually stopped.
</para>
</note>
<important>
<para>
The <filename>repmgr.conf</filename> parameter <varname>repmgrd_service_stop_command</varname>
must be set for <command>repmgr daemon stop</command> to work; see section
<xref linkend="repmgr-daemon-stop-configuration"/> for details.
</para>
</important>
</refsect1>
<refsect1>
<title>Configuration</title>
<para>
<command>repmgr daemon stop</command> will execute the command defined by the
<varname>repmgrd_service_stop_command</varname> parameter in <filename>repmgr.conf</filename>.
This must be set to a shell command which will stop &repmgrd;;
if &repmgr; was installed from a package, this will be the service command defined by the
package. For more details see <link linkend="appendix-packages">Appendix: &repmgr; package details</link>.
</para>
<important>
<para>
If &repmgr; was installed from a system package, and you do not configure
<varname>repmgrd_service_stop_command</varname> to an appropriate service command, this may
result in the system becoming confused about the state of the &repmgrd;
service; this is particularly the case with <literal>systemd</literal>.
</para>
</important>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually attempt to stop &repmgrd;.
</para>
<para>
This action will output the command which would be executed.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-w</option></term>
<term><option>--wait</option></term>
<listitem>
<para>
Wait for the specified number of seconds to confirm that &repmgrd;
stopped successfully.
</para>
<para>
Note that providing <option>--wait=0</option> is the equivalent of <option>--no-wait</option>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--no-wait</option></term>
<listitem>
<para>
Don't wait to confirm that &repmgrd;
stopped successfully.
</para>
<para>
This is equivalent to providing <option>--wait=0</option>.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1 id="repmgr-daemon-stop-configuration" xreflabel="repmgr daemon stop configuration">
<title>Configuration file settings</title>
<para>
The following parameter in <filename>repmgr.conf</filename> is relevant
to <command>repmgr daemon stop</command>:
</para>
<variablelist>
<varlistentry>
<term><option>repmgrd_service_stop_command</option></term>
<listitem>
<indexterm>
<primary>repmgrd_service_stop_command</primary>
<secondary>with &quot;repmgr daemon stop&quot;</secondary>
</indexterm>
<para>
<command>repmgr daemon stop</command> will execute the command defined by the
<varname>repmgrd_service_stop_command</varname> parameter in <filename>repmgr.conf</filename>.
This must be set to a shell command which will stop &repmgrd;;
if &repmgr; was installed from a package, this will be the service command defined by the
package. For more details see <link linkend="appendix-packages">Appendix: &repmgr; package details</link>.
</para>
<important>
<para>
If &repmgr; was installed from a system package, and you do not configure
<varname>repmgrd_service_stop_command</varname> to an appropriate service command, this may
result in the system becoming confused about the state of the &repmgrd;
service; this is particularly the case with <literal>systemd</literal>.
</para>
</important>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr daemon stop</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
&repmgrd; could be stopped.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_BAD_CONFIG (1)</option></term>
<listitem>
<para>
<varname>repmgrd_service_stop_command</varname> is not defined in
<filename>repmgr.conf</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_REPMGRD_SERVICE (27)</option></term>
<listitem>
<para>
&repmgrd; could not be stopped.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-daemon-start"/>,
<xref linkend="repmgrd-daemon"/>,
<xref linkend="repmgr-service-status"/>,
<xref linkend="repmgr-service-pause"/>,
<xref linkend="repmgr-service-unpause"/>
</para>
</refsect1>
</refentry>


@@ -1,54 +1,40 @@
<refentry id="repmgr-service-unpause">
<refentry id="repmgr-daemon-unpause">
<indexterm>
<primary>repmgr service unpause</primary>
</indexterm>
<indexterm>
<primary>repmgrd</primary>
<secondary>unpausing</secondary>
<primary>repmgr daemon unpause</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr service unpause</refentrytitle>
<refentrytitle>repmgr daemon unpause</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr service unpause</refname>
<refpurpose>Instruct all &repmgrd; instances in the replication cluster to resume failover operations</refpurpose>
<refname>repmgr daemon unpause</refname>
<refpurpose>Instruct all <application>repmgrd</application> instances in the replication cluster to resume failover operations</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command can be run on any active node in the replication cluster to instruct all
running &repmgrd; instances to &quot;unpause&quot;
(following a previous execution of <xref linkend="repmgr-service-pause"/>)
running <application>repmgrd</application> instances to &quot;unpause&quot;
(following a previous execution of <xref linkend="repmgr-daemon-pause">)
and resume normal failover/monitoring operation.
</para>
<note>
<para>
It's important to wait a few seconds after restarting PostgreSQL on any node before running
<command>repmgr service unpause</command>, as the &repmgrd; instance
<command>repmgr daemon unpause</command>, as the <application>repmgrd</application> instance
on the restarted node will take a second or two before it has updated its status.
</para>
</note>
</refsect1>
<refsect1>
<title>Prerequisites</title>
<para>
PostgreSQL must be accessible on all nodes (using the <varname>conninfo</varname> string shown by
<link linkend="repmgr-cluster-show"><command>repmgr cluster show</command></link>)
from the node where <command>repmgr service unpause</command> is executed.
</para>
</refsect1>
<refsect1>
<title>Execution</title>
<para>
<command>repmgr service unpause</command> can be executed on any active node in the
<command>repmgr daemon unpause</command> can be executed on any active node in the
replication cluster. A valid <filename>repmgr.conf</filename> file is required.
It will have no effect on nodes which are not already paused.
</para>
@@ -58,7 +44,7 @@
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf service unpause
$ repmgr -f /etc/repmgr.conf daemon unpause
NOTICE: node 1 (node1) unpaused
NOTICE: node 2 (node2) unpaused
NOTICE: node 3 (node3) unpaused</programlisting>
@@ -72,7 +58,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check if nodes are reachable but don't unpause &repmgrd;.
Check if nodes are reachable but don't unpause <application>repmgrd</application>.
</para>
</listitem>
</varlistentry>
@@ -82,7 +68,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr service unpause</command>:
The following exit codes can be emitted by <command>repmgr daemon unpause</command>:
</para>
<variablelist>
@@ -90,7 +76,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
&repmgrd; could be unpaused on all nodes.
<application>repmgrd</application> could be unpaused on all nodes.
</para>
</listitem>
</varlistentry>
@@ -99,7 +85,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<term><option>ERR_REPMGRD_PAUSE (26)</option></term>
<listitem>
<para>
&repmgrd; could not be unpaused on one or more nodes.
<application>repmgrd</application> could not be unpaused on one or more nodes.
</para>
</listitem>
</varlistentry>
@@ -110,11 +96,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-service-pause"/>,
<xref linkend="repmgr-service-status"/>,
<xref linkend="repmgrd-pausing"/>,
<xref linkend="repmgr-daemon-start"/>,
<xref linkend="repmgr-daemon-stop"/>
<xref linkend="repmgr-daemon-pause">, <xref linkend="repmgr-daemon-status">
</para>
</refsect1>
</refentry>

189
doc/repmgr-node-check.sgml Normal file

@@ -0,0 +1,189 @@
<refentry id="repmgr-node-check">
<indexterm>
<primary>repmgr node check</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr node check</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr node check</refname>
<refpurpose>performs some health checks on a node from a replication perspective</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Performs some health checks on a node from a replication perspective.
This command must be run on the local node.
</para>
</refsect1>
<refsect1>
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf node check
Node "node1":
Server role: OK (node is primary)
Replication lag: OK (N/A - node is primary)
WAL archiving: OK (0 pending files)
Downstream servers: OK (2 of 2 downstream nodes attached)
Replication slots: OK (node has no replication slots)
Missing replication slots: OK (node has no missing replication slots)</programlisting>
</para>
</refsect1>
<refsect1>
<title>Individual checks</title>
<para>
Each check can be performed individually by supplying
an additional command line parameter, e.g.:
<programlisting>
$ repmgr node check --role
OK (node is primary)</programlisting>
</para>
<para>
Parameters for individual checks are as follows:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>--role</literal>: checks if the node has the expected role
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--replication-lag</literal>: checks if the node is lagging by more than
<varname>replication_lag_warning</varname> or <varname>replication_lag_critical</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--archive-ready</literal>: checks for WAL files which have not yet been archived,
and returns <literal>WARNING</literal> or <literal>CRITICAL</literal> if the number
exceeds <varname>archive_ready_warning</varname> or <varname>archive_ready_critical</varname> respectively.
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--downstream</literal>: checks that the expected downstream nodes are attached
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--slots</literal>: checks there are no inactive replication slots
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--missing-slots</literal>: checks there are no missing replication slots
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Output format</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>--csv</literal>: generate output in CSV format (not available
for individual checks)
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--nagios</literal>: generate output in a Nagios-compatible format
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
When executing <command>repmgr node check</command> with one of the individual
checks listed above, &repmgr; will emit one of the following Nagios-style exit codes
(even if <literal>--nagios</literal> is not supplied):
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>0</literal>: OK
</simpara>
</listitem>
<listitem>
<simpara>
<literal>1</literal>: WARNING
</simpara>
</listitem>
<listitem>
<simpara>
<literal>2</literal>: ERROR
</simpara>
</listitem>
<listitem>
<simpara>
<literal>3</literal>: UNKNOWN
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
The following exit codes can be emitted by <command>repmgr node check</command>
if no individual check was specified.
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
No issues were detected.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NODE_STATUS (25)</option></term>
<listitem>
<para>
One or more issues were detected.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-node-status">, <xref linkend="repmgr-cluster-show">
</para>
</refsect1>
</refentry>
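
The Nagios-style exit codes listed above for individual checks lend themselves to a small lookup in a monitoring script. The sketch below is illustrative only: `describe_check_exit_code` and `run_individual_check` are hypothetical helpers, the status labels are the ones used in the documentation, and the `runner` parameter is there so the logic can be exercised without repmgr installed.

```python
import subprocess

# Exit codes documented for individual `repmgr node check` checks.
NAGIOS_STATUS = {0: "OK", 1: "WARNING", 2: "ERROR", 3: "UNKNOWN"}


def describe_check_exit_code(code: int) -> str:
    """Translate an individual-check exit code into its documented label."""
    return NAGIOS_STATUS.get(code, f"unexpected exit code {code}")


def run_individual_check(option, config="/etc/repmgr.conf", runner=subprocess.run):
    """Run one individual check (e.g. "--role") and return its status label.

    Hypothetical helper: `runner` must accept an argument list and return
    an object with a `returncode` attribute, as subprocess.run does.
    """
    result = runner(["repmgr", "-f", config, "node", "check", option])
    return describe_check_exit_code(result.returncode)
```

This mapping applies only when one of the individual check options is supplied; without one, the command uses the `SUCCESS (0)` and `ERR_NODE_STATUS (25)` codes described above.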


@@ -1,308 +0,0 @@
<refentry id="repmgr-node-check">
<indexterm>
<primary>repmgr node check</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr node check</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr node check</refname>
<refpurpose>performs some health checks on a node from a replication perspective</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Performs some health checks on a node from a replication perspective.
This command must be run on the local node.
</para>
<note>
<para>
Currently &repmgr; performs health checks on physical replication
slots only, with the aim of warning about streaming replication standbys which
have become detached and the associated risk of uncontrolled WAL file
growth.
</para>
</note>
</refsect1>
<refsect1>
<title>Example</title>
<para>
Execution on the primary server:
<programlisting>
$ repmgr -f /etc/repmgr.conf node check
Node "node1":
Server role: OK (node is primary)
Replication lag: OK (N/A - node is primary)
WAL archiving: OK (0 pending files)
Upstream connection: OK (N/A - is primary)
Downstream servers: OK (2 of 2 downstream nodes attached)
Replication slots: OK (node has no physical replication slots)
Missing replication slots: OK (node has no missing physical replication slots)
Configured data directory: OK (configured "data_directory" is "/var/lib/postgresql/data")</programlisting>
</para>
<para>
Execution on a standby server:
<programlisting>
$ repmgr -f /etc/repmgr.conf node check
Node "node2":
Server role: OK (node is standby)
Replication lag: OK (0 seconds)
WAL archiving: OK (0 pending archive ready files)
Upstream connection: OK (node "node2" (ID: 2) is attached to expected upstream node "node1" (ID: 1))
Downstream servers: OK (this node has no downstream nodes)
Replication slots: OK (node has no physical replication slots)
Missing physical replication slots: OK (node has no missing physical replication slots)
Configured data directory: OK (configured "data_directory" is "/var/lib/postgresql/data")</programlisting>
</para>
</refsect1>
<refsect1>
<title>Individual checks</title>
<para>
Each check can be performed individually by supplying
an additional command line parameter, e.g.:
<programlisting>
$ repmgr node check --role
OK (node is primary)</programlisting>
</para>
<para>
Parameters for individual checks are as follows:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>--role</option>: checks if the node has the expected role
</simpara>
</listitem>
<listitem>
<simpara>
<option>--replication-lag</option>: checks if the node is lagging by more than
<varname>replication_lag_warning</varname> or <varname>replication_lag_critical</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<option>--archive-ready</option>: checks for WAL files which have not yet been archived,
and returns <literal>WARNING</literal> or <literal>CRITICAL</literal> if the number
exceeds <varname>archive_ready_warning</varname> or <varname>archive_ready_critical</varname> respectively.
</simpara>
</listitem>
<listitem>
<simpara>
<option>--downstream</option>: checks that the expected downstream nodes are attached
</simpara>
</listitem>
<listitem>
<simpara>
<option>--upstream</option>: checks that the node is attached to its expected upstream
</simpara>
</listitem>
<listitem>
<simpara>
<option>--slots</option>: checks there are no inactive physical replication slots
</simpara>
</listitem>
<listitem>
<simpara>
<option>--missing-slots</option>: checks there are no missing physical replication slots
</simpara>
</listitem>
<listitem>
<simpara>
<option>--data-directory-config</option>: checks the data directory configured in
<filename>repmgr.conf</filename> matches the actual data directory.
This check is not directly related to replication, but is useful to verify &repmgr;
is correctly configured.
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>repmgrd</title>
<para>
A separate check is available to verify whether &repmgrd; is running.
This is not included in the general output, as it does not
per se constitute a check of the node's replication status.
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>--repmgrd</option>: checks whether &repmgrd; is running.
If &repmgrd; is running but paused, status <literal>1</literal>
(<literal>WARNING</literal>) is returned.
</simpara>
</listitem>
</itemizedlist>
</refsect1>
<refsect1>
<title>Additional checks</title>
<para>
Several checks are provided for diagnostic purposes and are not
included in the general output:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>--db-connection</option>: checks if &repmgr; can connect to the
database on the local node.
</simpara>
<simpara>
This option is particularly useful in combination with <command>SSH</command>, as
it can be used to troubleshoot connection issues encountered when &repmgr; is
executed remotely (e.g. during a switchover operation).
</simpara>
</listitem>
<listitem>
<simpara>
<option>--replication-config-owner</option>: checks if the file containing replication
configuration (PostgreSQL 12 and later: <filename>postgresql.auto.conf</filename>;
PostgreSQL 11 and earlier: <filename>recovery.conf</filename>) is
owned by the same user who owns the data directory.
</simpara>
<simpara>
Incorrect ownership of these files (e.g. if they are owned by <literal>root</literal>)
will cause operations which need to update the replication configuration
(e.g. <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>
or <link linkend="repmgr-standby-promote"><command>repmgr standby promote</command></link>)
to fail.
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Connection options</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>-S</option>/<option>--superuser</option>: connect as the
named superuser instead of the &repmgr; user
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Output format</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>--csv</option>: generate output in CSV format (not available
for individual checks)
</simpara>
</listitem>
<listitem>
<simpara>
<option>--nagios</option>: generate output in a Nagios-compatible format
(for individual checks only)
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
When executing <command>repmgr node check</command> with one of the individual
checks listed above, &repmgr; will emit one of the following Nagios-style exit codes
(even if <option>--nagios</option> is not supplied):
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>0</literal>: OK
</simpara>
</listitem>
<listitem>
<simpara>
<literal>1</literal>: WARNING
</simpara>
</listitem>
<listitem>
<simpara>
<literal>2</literal>: ERROR
</simpara>
</listitem>
<listitem>
<simpara>
<literal>3</literal>: UNKNOWN
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
One of the following exit codes will be emitted by <command>repmgr node check</command>
if no individual check was specified.
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
No issues were detected.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NODE_STATUS (25)</option></term>
<listitem>
<para>
One or more issues were detected.
</para>
</listitem>
</varlistentry>
</variablelist>
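<para>
As a minimal sketch, assuming a POSIX shell, the Nagios-style exit codes of an
individual check (0 to 3) could be translated back into the labels listed in this
section. The helper name <literal>check_status_label</literal> is hypothetical and
is not part of &repmgr;.
</para>

```shell
#!/bin/sh
# Hypothetical helper (not part of repmgr): translate the exit code of an
# individual "repmgr node check" option into the label documented above.
check_status_label() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "ERROR" ;;
        3) echo "UNKNOWN" ;;
        *) echo "UNEXPECTED ($1)" ;;
    esac
}

# Example usage on a node where repmgr is installed (left commented out so
# the sketch itself runs anywhere):
#   repmgr -f /etc/repmgr.conf node check --role
#   rc=$?
#   echo "role check: $(check_status_label "$rc")"
```

<para>
Note that this mapping only applies when an individual check option is supplied;
without one, the exit code is either <literal>0</literal> or <literal>25</literal>
as described above.
</para>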
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-node-status"/>, <xref linkend="repmgr-cluster-show"/>
</para>
</refsect1>
</refentry>

264
doc/repmgr-node-rejoin.sgml Normal file

@@ -0,0 +1,264 @@
<refentry id="repmgr-node-rejoin">
<indexterm>
<primary>repmgr node rejoin</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr node rejoin</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr node rejoin</refname>
<refpurpose>rejoin a dormant (stopped) node to the replication cluster</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Enables a dormant (stopped) node to be rejoined to the replication cluster.
</para>
<para>
This can optionally use <application>pg_rewind</application> to re-integrate
a node which has diverged from the rest of the cluster, typically a failed primary.
</para>
<tip>
<para>
If the node is running and needs to be attached to the current primary, use
<xref linkend="repmgr-standby-follow">.
</para>
<para>
Note <xref linkend="repmgr-standby-follow"> can only be used for standbys which have not diverged
from the rest of the cluster.
</para>
</tip>
</refsect1>
<refsect1>
<title>Usage</title>
<para>
<programlisting>
repmgr node rejoin -d '$conninfo'</programlisting>
where <literal>$conninfo</literal> is the conninfo string of any reachable node in the cluster.
<filename>repmgr.conf</filename> for the stopped node <emphasis>must</emphasis> be supplied explicitly if not
otherwise available.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually execute the rejoin.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force-rewind[=/path/to/pg_rewind]</option></term>
<listitem>
<para>
Execute <application>pg_rewind</application>.
</para>
<para>
It is only necessary to provide the <application>pg_rewind</application> path
if using PostgreSQL 9.3 or 9.4, and <application>pg_rewind</application>
is not installed in the PostgreSQL <filename>bin</filename> directory.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--config-files</option></term>
<listitem>
<para>
Comma-separated list of configuration files to retain after
executing <application>pg_rewind</application>.
</para>
<para>
Currently <application>pg_rewind</application> will overwrite
the local node's configuration files with the files from the source node,
so it's advisable to use this option to ensure they are kept.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--config-archive-dir</option></term>
<listitem>
<para>
Directory to temporarily store configuration files specified with
<option>--config-files</option>; default: <filename>/tmp</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-W/--no-wait</option></term>
<listitem>
<para>
Don't wait for the node to rejoin the cluster.
</para>
<para>
If this option is supplied, &repmgr; will restart the node but
not wait for it to connect to the primary.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Configuration file settings</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>node_rejoin_timeout</literal>:
the maximum length of time (in seconds) to wait for
the node to reconnect to the replication cluster (defaults to
the value set in <literal>standby_reconnect_timeout</literal>,
60 seconds).
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1 id="repmgr-node-rejoin-events">
<title>Event notifications</title>
<para>
A <literal>node_rejoin</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para>
</refsect1>
<refsect1>
<title>Notes</title>
<para>
Currently <command>repmgr node rejoin</command> can only be used to attach
a standby to the current primary, not another standby.
</para>
<para>
The node must have been shut down cleanly; if this was not the case, it will
need to be manually started (remove any existing <filename>recovery.conf</filename> file first)
until it has reached a consistent recovery point, then shut down cleanly.
</para>
<tip>
<para>
If <application>PostgreSQL</application> is started in single-user mode and
input is directed from <filename>/dev/null</filename>, it will perform recovery
then immediately quit, and will then be in a state suitable for use by
<application>pg_rewind</application>.
<programlisting>
rm -f /var/lib/pgsql/data/recovery.conf
postgres --single -D /var/lib/pgsql/data/ &lt; /dev/null</programlisting>
</para>
</tip>
</refsect1>
<refsect1 id="repmgr-node-rejoin-pg-rewind" xreflabel="Using pg_rewind">
<indexterm>
<primary>pg_rewind</primary>
<secondary>using with "repmgr node rejoin"</secondary>
</indexterm>
<title>Using <command>pg_rewind</command></title>
<para>
<command>repmgr node rejoin</command> can optionally use <command>pg_rewind</command> to re-integrate a
node which has diverged from the rest of the cluster, typically a failed primary.
<command>pg_rewind</command> is available in PostgreSQL 9.5 and later as part of the core distribution,
and can be installed from external sources for PostgreSQL 9.3 and 9.4.
</para>
<note>
<para>
<command>pg_rewind</command> <emphasis>requires</emphasis> that either
<varname>wal_log_hints</varname> is enabled, or that
data checksums were enabled when the cluster was initialized. See the
<ulink url="https://www.postgresql.org/docs/current/static/app-pgrewind.html"><command>pg_rewind</command> documentation</ulink> for details.
</para>
</note>
<para>
To have <command>repmgr node rejoin</command> use <command>pg_rewind</command>,
pass the command line option <literal>--force-rewind</literal>, which will tell &repmgr;
to execute <command>pg_rewind</command> to ensure the node can be rejoined successfully.
</para>
<para>
Be aware that if <command>pg_rewind</command> is executed and actually performs a
rewind operation, any configuration files in the PostgreSQL data directory will be
overwritten with those from the source server.
</para>
<para>
To prevent this happening, provide a comma-separated list of files to retain
using the <literal>--config-files</literal> command line option; the specified files
will be archived in a temporary directory (whose parent directory can be specified with
<literal>--config-archive-dir</literal>) and restored once the rewind operation is
complete.
</para>
<para>
Example, first using <literal>--dry-run</literal>, then actually executing the
<literal>node rejoin</literal> command.
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node1 dbname=repmgr user=repmgr' \
--force-rewind --config-files=postgresql.local.conf,postgresql.conf --verbose --dry-run
NOTICE: using provided configuration file "/etc/repmgr.conf"
INFO: prerequisites for using pg_rewind are met
INFO: file "postgresql.local.conf" would be copied to "/tmp/repmgr-config-archive-node1/postgresql.local.conf"
INFO: file "postgresql.conf" would be copied to "/tmp/repmgr-config-archive-node1/postgresql.conf"
INFO: 2 files would have been copied to "/tmp/repmgr-config-archive-node1"
INFO: directory "/tmp/repmgr-config-archive-node1" deleted
INFO: pg_rewind would now be executed
DETAIL: pg_rewind command is:
pg_rewind -D '/var/lib/postgresql/data' --source-server='host=node1 dbname=repmgr user=repmgr'</programlisting>
<note>
<para>
If <option>--force-rewind</option> is used with the <option>--dry-run</option> option,
this checks the prerequisites for using <application>pg_rewind</application>, but cannot
predict the outcome of actually executing <application>pg_rewind</application>.
</para>
</note>
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node1 dbname=repmgr user=repmgr' \
--force-rewind --config-files=postgresql.local.conf,postgresql.conf --verbose
NOTICE: using provided configuration file "/etc/repmgr.conf"
INFO: prerequisites for using pg_rewind are met
INFO: 2 files copied to "/tmp/repmgr-config-archive-node1"
NOTICE: executing pg_rewind
NOTICE: 2 files copied to /var/lib/pgsql/data
INFO: directory "/tmp/repmgr-config-archive-node1" deleted
INFO: deleting "recovery.done"
INFO: setting node 1's primary to node 2
NOTICE: starting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' start"
waiting for server to start.... done
server started
NOTICE: NODE REJOIN successful
DETAIL: node 1 is now attached to node 2</programlisting>
</para>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-standby-follow">
</para>
</refsect1>
</refentry>


@@ -1,505 +0,0 @@
<refentry id="repmgr-node-rejoin">
<indexterm>
<primary>repmgr node rejoin</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr node rejoin</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr node rejoin</refname>
<refpurpose>rejoin a dormant (stopped) node to the replication cluster</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Enables a dormant (stopped) node to be rejoined to the replication cluster.
</para>
<para>
This can optionally use <application>pg_rewind</application> to re-integrate
a node which has diverged from the rest of the cluster, typically a failed primary.
</para>
<para>
Note that <command>repmgr node rejoin</command> can only be used to attach
a standby to the current primary, not another standby.
</para>
<tip>
<para>
If the node is running and needs to be attached to the current primary, use
<xref linkend="repmgr-standby-follow"/>.
</para>
<para>
Note <xref linkend="repmgr-standby-follow"/> can only be used for standbys which have not diverged
from the rest of the cluster.
</para>
</tip>
</refsect1>
<refsect1>
<title>Usage</title>
<para>
<programlisting>
repmgr node rejoin -d '$conninfo'</programlisting>
where <literal>$conninfo</literal> is the PostgreSQL <literal>conninfo</literal> string of the
<emphasis>current</emphasis> primary node (or that of any reachable node in the cluster, but
<emphasis>not</emphasis> the local node). This is so that &repmgr; can fetch up-to-date information
about the current state of the cluster.
</para>
<para>
<filename>repmgr.conf</filename> for the stopped node <emphasis>must</emphasis> be supplied explicitly if not
otherwise available.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually execute the rejoin.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force-rewind</option></term>
<listitem>
<para>
Execute <application>pg_rewind</application>.
</para>
<para>
See <xref linkend="repmgr-node-rejoin-pg-rewind"/> for more details on using
<application>pg_rewind</application>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--config-files</option></term>
<listitem>
<para>
Comma-separated list of configuration files to retain after
executing <application>pg_rewind</application>.
</para>
<para>
Currently <application>pg_rewind</application> will overwrite
the local node's configuration files with the files from the source node,
so it's advisable to use this option to ensure they are kept.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--config-archive-dir</option></term>
<listitem>
<para>
Directory to temporarily store configuration files specified with
<option>--config-files</option>; default: <filename>/tmp</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-W/--no-wait</option></term>
<listitem>
<para>
Don't wait for the node to rejoin the cluster.
</para>
<para>
If this option is supplied, &repmgr; will restart the node but
not wait for it to connect to the primary.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Configuration file settings</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>node_rejoin_timeout</literal>:
the maximum length of time (in seconds) to wait for
the node to reconnect to the replication cluster (defaults to
the value set in <literal>standby_reconnect_timeout</literal>,
60 seconds).
</simpara>
<simpara>
Note that <literal>standby_reconnect_timeout</literal> must be
set to a value equal to or greater than
<literal>node_rejoin_timeout</literal>.
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1 id="repmgr-node-rejoin-events">
<title>Event notifications</title>
<para>
A <literal>node_rejoin</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr node rejoin</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The node rejoin succeeded; or if <option>--dry-run</option> was provided,
no issues were detected which would prevent the node rejoin.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_BAD_CONFIG (1)</option></term>
<listitem>
<para>
A configuration issue was detected which prevented &repmgr; from
continuing with the node rejoin.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NO_RESTART (4)</option></term>
<listitem>
<para>
The node could not be restarted.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_REJOIN_FAIL (24)</option></term>
<listitem>
<para>
The node rejoin operation failed.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Notes</title>
<para>
Currently <command>repmgr node rejoin</command> can only be used to attach
a standby to the current primary, not another standby.
</para>
<para>
The node's PostgreSQL instance must have been shut down cleanly. If this was not the
case, it will need to be started up until it has reached a consistent recovery point,
then shut down cleanly.
</para>
<para>
In PostgreSQL 13 and later, this will be done automatically
if the <option>--force-rewind</option> option is provided (even if an actual rewind
is not necessary).
</para>
<para>
With PostgreSQL 12 and earlier, PostgreSQL will need to
be started and shut down manually; see below for the best way to do this.
</para>
<tip>
<para>
If <application>PostgreSQL</application> is started in single-user mode and
input is directed from <filename>/dev/null</filename>, it will perform recovery
then immediately quit, and will then be in a state suitable for use by
<application>pg_rewind</application>.
<programlisting>
rm -f /var/lib/pgsql/data/recovery.conf
postgres --single -D /var/lib/pgsql/data/ &lt; /dev/null</programlisting>
</para>
<para>
Note that <filename>standby.signal</filename> (PostgreSQL 11 and earlier:
<filename>recovery.conf</filename>) <emphasis>must</emphasis> be removed
from the data directory for PostgreSQL to be able to start in single-user
mode.
</para>
</tip>
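<para>
The version-dependent file name mentioned in the tip can be selected with a small
shell helper. This is a minimal sketch, assuming a POSIX shell; the function name
<literal>recovery_marker_file</literal> is hypothetical and is not provided by &repmgr;
or PostgreSQL.
</para>

```shell
#!/bin/sh
# Hypothetical helper: print the name of the file which must be removed from
# the data directory before starting PostgreSQL in single-user mode.
# $1: PostgreSQL version, e.g. "9.6", "11" or "13"
recovery_marker_file() {
    major=${1%%.*}            # keep only the part before the first dot
    if [ "$major" -ge 12 ]; then
        echo "standby.signal" # PostgreSQL 12 and later
    else
        echo "recovery.conf"  # PostgreSQL 11 and earlier
    fi
}

# Example usage (the data directory path is an assumption; commented out so
# that nothing is deleted when the sketch is merely sourced):
#   rm -f "/var/lib/pgsql/data/$(recovery_marker_file 13)"
```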
</refsect1>
<refsect1 id="repmgr-node-rejoin-pg-rewind" xreflabel="Using pg_rewind">
<title>Using <command>pg_rewind</command></title>
<indexterm>
<primary>pg_rewind</primary>
<secondary>using with "repmgr node rejoin"</secondary>
</indexterm>
<para>
<command>repmgr node rejoin</command> can optionally use <command>pg_rewind</command> to re-integrate a
node which has diverged from the rest of the cluster, typically a failed primary.
</para>
<note>
<para>
<command>pg_rewind</command> <emphasis>requires</emphasis> that either
<varname>wal_log_hints</varname> is enabled, or that
data checksums were enabled when the cluster was initialized. See the
<ulink url="https://www.postgresql.org/docs/current/app-pgrewind.html"><command>pg_rewind</command> documentation</ulink> for details.
</para>
<para>
Additionally, <varname>full_page_writes</varname> must be enabled; this is the default and
normally should never be disabled.
</para>
</note>
<para>
We strongly recommend familiarizing yourself with <command>pg_rewind</command> before attempting
to use it with &repmgr;, as while it is an extremely useful tool, it is <emphasis>not</emphasis>
a &quot;magic bullet&quot; which can resolve all problematic replication situations.
</para>
<para>
A typical use-case for <command>pg_rewind</command> is when a scenario like the following
is encountered:
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node3 dbname=repmgr user=repmgr' \
--force-rewind --config-files=postgresql.local.conf,postgresql.conf --verbose --dry-run
NOTICE: rejoin target is node "node3" (node ID: 3)
INFO: replication connection to the rejoin target node was successful
INFO: local and rejoin target system identifiers match
DETAIL: system identifier is 6652184002263212600
ERROR: this node cannot attach to rejoin target node 3
DETAIL: rejoin target server's timeline 2 forked off current database system timeline 1 before current recovery point 0/610D710
HINT: use --force-rewind to execute pg_rewind</programlisting>
Here, <literal>node3</literal> was promoted to a primary while the local node was
still attached to the previous primary; this can potentially happen during e.g. a
network split. <command>pg_rewind</command> can re-sync the local node with <literal>node3</literal>,
removing the need for a full reclone.
</para>
<para>
To have <command>repmgr node rejoin</command> use <command>pg_rewind</command>,
pass the command line option <literal>--force-rewind</literal>, which will tell &repmgr;
to execute <command>pg_rewind</command> to ensure the node can be rejoined successfully.
</para>
<refsect2 id="repmgr-node-rejoin-pg-rewind-config-files" xreflabel="pg_rewind and configuration files">
<title><command>pg_rewind</command> and configuration file retention</title>
<indexterm>
<primary>pg_rewind</primary>
<secondary>configuration file retention</secondary>
</indexterm>
<para>
Be aware that if <command>pg_rewind</command> is executed and actually performs a
rewind operation, any configuration files in the PostgreSQL data directory will be
overwritten with those from the source server.
</para>
<para>
To prevent this happening, provide a comma-separated list of files to retain
using the <option>--config-files</option> command line option; the specified files
will be archived in a temporary directory (whose parent directory can be specified with
<option>--config-archive-dir</option>, default: <filename>/tmp</filename>)
and restored once the rewind operation is complete.
</para>
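<para>
When the files to retain are held as separate names in a script, the comma-separated
value for <option>--config-files</option> can be assembled as in the following sketch.
It assumes a POSIX shell; <literal>join_config_files</literal> is a hypothetical
helper and the file names are examples only.
</para>

```shell
#!/bin/sh
# Hypothetical helper: join all arguments with commas, producing a value
# suitable for passing to --config-files.
join_config_files() {
    list=""
    for f in "$@"; do
        if [ -z "$list" ]; then
            list=$f
        else
            list="$list,$f"
        fi
    done
    echo "$list"
}

# Example usage (commented out; requires repmgr and a reachable rejoin target):
#   repmgr node rejoin -f /etc/repmgr.conf -d 'host=node3 dbname=repmgr user=repmgr' \
#     --force-rewind --config-files="$(join_config_files postgresql.local.conf postgresql.conf)"
```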
</refsect2>
<refsect2 id="repmgr-node-rejoin-pg-rewind-example" xreflabel="example using repmgr node rejoin and pg_rewind">
<title>Example using <command>repmgr node rejoin</command> and <command>pg_rewind</command></title>
<indexterm>
<primary>pg_rewind</primary>
<secondary>configuration file retention</secondary>
</indexterm>
<para>
Example, first using <option>--dry-run</option>, then actually executing the
<literal>node rejoin</literal> command.
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node3 dbname=repmgr user=repmgr' \
--config-files=postgresql.local.conf,postgresql.conf --verbose --force-rewind --dry-run
NOTICE: rejoin target is node "node3" (node ID: 3)
INFO: replication connection to the rejoin target node was successful
INFO: local and rejoin target system identifiers match
DETAIL: system identifier is 6652460429293670710
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 3
DETAIL: rejoin target server's timeline 2 forked off current database system timeline 1 before current recovery point 0/610D710
INFO: prerequisites for using pg_rewind are met
INFO: file "postgresql.local.conf" would be copied to "/tmp/repmgr-config-archive-node2/postgresql.local.conf"
INFO: file "postgresql.replication-setup.conf" would be copied to "/tmp/repmgr-config-archive-node2/postgresql.replication-setup.conf"
INFO: pg_rewind would now be executed
DETAIL: pg_rewind command is:
pg_rewind -D '/var/lib/postgresql/data' --source-server='host=node3 dbname=repmgr user=repmgr'
INFO: prerequisites for executing NODE REJOIN are met</programlisting>
<note>
<para>
If <option>--force-rewind</option> is used with the <option>--dry-run</option> option,
this checks the prerequisites for using <application>pg_rewind</application>, but is
not an absolute guarantee that actually executing <application>pg_rewind</application>
will succeed. See also section <xref linkend="repmgr-node-rejoin-caveats"/> below.
</para>
</note>
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node3 dbname=repmgr user=repmgr' \
--config-files=postgresql.local.conf,postgresql.conf --verbose --force-rewind
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 3
DETAIL: rejoin target server's timeline 2 forked off current database system timeline 1 before current recovery point 0/610D710
NOTICE: executing pg_rewind
DETAIL: pg_rewind command is "pg_rewind -D '/var/lib/postgresql/data' --source-server='host=node3 dbname=repmgr user=repmgr'"
NOTICE: 2 files copied to /var/lib/postgresql/data
NOTICE: setting node 2's upstream to node 3
NOTICE: starting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' start"
NOTICE: NODE REJOIN successful
DETAIL: node 2 is now attached to node 3</programlisting>
</para>
</refsect2>
<refsect2 id="repmgr-node-rejoin-postgresql-94" xreflabel="pg_rewind and PostgreSQL 9.4">
<title><command>pg_rewind</command> and PostgreSQL 9.4</title>
<indexterm>
<primary>pg_rewind</primary>
<secondary>PostgreSQL 9.4</secondary>
</indexterm>
<para>
<application>pg_rewind</application> is available in PostgreSQL 9.5 and later as part of the core distribution.
Users of PostgreSQL 9.4 will need to manually install it; the source code is available here:
<ulink url="https://github.com/vmware/pg_rewind">https://github.com/vmware/pg_rewind</ulink>.
If the <application>pg_rewind</application>
binary is not installed in the PostgreSQL <filename>bin</filename> directory, provide
its full path on the local node with <option>--force-rewind</option>.
</para>
<para>
Note that building the 9.4 version of <application>pg_rewind</application> requires the PostgreSQL
source code.
</para>
</refsect2>
</refsect1>
<refsect1 id="repmgr-node-rejoin-caveats" xreflabel="Caveats">
<title>Caveats when using <command>repmgr node rejoin</command></title>
<indexterm>
<primary>repmgr node rejoin</primary>
<secondary>caveats</secondary>
</indexterm>
<para>
<command>repmgr node rejoin</command> attempts to determine whether it will succeed by
comparing the timelines and relative WAL positions of the local node (rejoin candidate) and primary
(rejoin target). This is particularly important if planning to use <application>pg_rewind</application>,
which currently (as of PostgreSQL 12) may appear to succeed (or indicate there is no action
needed) but potentially allow an impossible action, such as trying to rejoin a standby to a
primary which is behind the standby. &repmgr; will prevent this situation from occurring.
</para>
<para>
Currently it is <emphasis>not</emphasis> possible to detect a situation where the rejoin target
is a standby which has been &quot;promoted&quot; by removing <filename>recovery.conf</filename>
(PostgreSQL 12 and later: <filename>standby.signal</filename>) and restarting it.
In this case there will be no information about the point the rejoin target diverged
from the current standby; the rejoin operation will fail and
the current standby's PostgreSQL log will contain entries with the text
&quot;<literal>record with incorrect prev-link</literal>&quot;.
</para>
<para>
In PostgreSQL 9.5 and earlier, it is <emphasis>not</emphasis> possible to use
<application>pg_rewind</application> to attach to a target node with a lower
timeline than the local node.
</para>
<para>
We strongly recommend running <command>repmgr node rejoin</command> with the
<option>--dry-run</option> option first. Additionally it might be a good idea
to execute the <application>pg_rewind</application> command displayed by
&repmgr; with the <application>pg_rewind</application> <option>--dry-run</option>
option. Note that <application>pg_rewind</application> does not indicate that it
is running in <option>--dry-run</option> mode.
</para>
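<para>
The recommendation to run with <option>--dry-run</option> first can be scripted.
The following is a minimal sketch, assuming a POSIX shell; the function name
<literal>rejoin_with_dry_run</literal> and the variables <literal>REPMGR</literal>
and <literal>REPMGR_CONF</literal> are hypothetical and exist only for illustration.
</para>

```shell
#!/bin/sh
# Hypothetical override variables for this sketch; by default the repmgr
# binary on the PATH and /etc/repmgr.conf are used.
REPMGR=${REPMGR:-repmgr}
REPMGR_CONF=${REPMGR_CONF:-/etc/repmgr.conf}

# $1: conninfo string of the rejoin target
rejoin_with_dry_run() {
    conninfo=$1
    # First pass: only check the prerequisites.
    if ! "$REPMGR" node rejoin -f "$REPMGR_CONF" -d "$conninfo" --force-rewind --dry-run; then
        echo "dry run reported problems; not attempting the rejoin"
        return 1
    fi
    # Second pass: the same options without --dry-run.
    "$REPMGR" node rejoin -f "$REPMGR_CONF" -d "$conninfo" --force-rewind
}

# Example usage (commented out; requires repmgr and a reachable rejoin target):
#   rejoin_with_dry_run 'host=node3 dbname=repmgr user=repmgr'
```

<para>
A successful dry run is not a guarantee that the actual rejoin will succeed,
so the exit code of the second invocation still needs to be checked.
</para>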
<warning>
<para>
In all PostgreSQL versions released before February 2021, <application>pg_rewind</application>
contains a corner-case bug which affects standbys in a very specific situation.
</para>
<para>
This situation occurs when a standby was shut down <emphasis>before</emphasis> its
primary node, and an attempt is made to attach this standby to another primary
in the same cluster (following a &quot;split brain&quot; situation where the standby
was connected to the wrong primary). In this case, &repmgr; will correctly determine
that <application>pg_rewind</application> should be executed, however
<application>pg_rewind</application> incorrectly decides that no action is necessary.
</para>
<para>
In this situation, &repmgr; will report something like:
<programlisting>
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 1
DETAIL: rejoin target server's timeline 3 forked off current database system timeline 2 before current recovery point 0/7019C10</programlisting>
but when executed, <application>pg_rewind</application> will report:
<programlisting>
pg_rewind: servers diverged at WAL location 0/7015540 on timeline 2
pg_rewind: no rewind required</programlisting>
and if an attempt is made to attach the standby to the new primary, PostgreSQL logs on the standby
will contain errors like:
<programlisting>
[2020-09-07 15:01:41 UTC] LOG: 00000: replication terminated by primary server
[2020-09-07 15:01:41 UTC] DETAIL: End of WAL reached on timeline 2 at 0/7015540.
[2020-09-07 15:01:41 UTC] LOG: 00000: new timeline 3 forked off current database system timeline 2 before current recovery point 0/7019C10</programlisting>
</para>
<para>
Currently it is not possible to resolve this situation using <application>pg_rewind</application>.
A <ulink url="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2b4f3130382fe2f8705863e4d38589d4d69cd695">patch</ulink>
was submitted and is included in all PostgreSQL versions released in February 2021 or later.
</para>
<para>
As a workaround, start the primary server the standby was previously attached to,
and ensure the standby can be attached to it. If <application>pg_rewind</application> was actually executed,
it will have copied in the <filename>.history</filename> file from the target primary server; this must
be removed. <command>repmgr node rejoin</command> can then be used to attach the standby to the original
primary. Ensure any changes pending on the primary have propagated to the standby. Then shut down the primary
server <emphasis>first</emphasis>, before shutting down the standby. It should then be possible to
use <command>repmgr node rejoin</command> to attach the standby to the new primary.
</para>
</warning>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-standby-follow"/>, <xref linkend="repmgr-standby-switchover"/>
</para>
</refsect1>
</refentry>


@@ -1,165 +0,0 @@
<refentry id="repmgr-node-service">
<indexterm>
<primary>repmgr node service</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr node service</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr node service</refname>
<refpurpose>show or execute the system service command to stop/start/restart/reload/promote a node</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Shows or executes the system service command to stop/start/restart/reload/promote a node.
</para>
<para>
This command is mainly meant for internal &repmgr; usage, but is useful for
confirming the command configuration.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Log the steps which would be taken, including displaying the command which would be executed.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--action</option></term>
<listitem>
<para>
The action to perform. One of <literal>start</literal>, <literal>stop</literal>,
<literal>restart</literal>, <literal>reload</literal> or <literal>promote</literal>.
</para>
<para>
If the parameter <option>--list-actions</option> is provided together with
<option>--action</option>, the command which would be executed will be printed.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--list-actions</option></term>
<listitem>
<para>
List all configured commands.
</para>
<para>
If the parameter <option>--action</option> is provided together with
<option>--list-actions</option>, the command which would be executed for that
particular action will be printed.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--checkpoint</option></term>
<listitem>
<para>
Issue a <command>CHECKPOINT</command> before stopping or restarting the node.
</para>
<para>
Note that a superuser connection is required to be able to execute the
<command>CHECKPOINT</command> command.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-S</option>/<option>--superuser</option></term>
<listitem>
<para>
Connect as the named superuser instead of the normal &repmgr; user.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr node service</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
No issues were detected.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_LOCAL_COMMAND (5)</option></term>
<listitem>
<para>
Execution of the system service command failed.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Examples</title>
<para>
See what action would be taken for a restart:
<programlisting>
[postgres@node1 ~]$ repmgr -f /etc/repmgr/12/repmgr.conf node service --action=restart --checkpoint --dry-run
INFO: a CHECKPOINT would be issued here
INFO: would execute server command "sudo service postgresql-12 restart"</programlisting>
</para>
<para>
Restart the PostgreSQL instance:
<programlisting>
[postgres@node1 ~]$ repmgr -f /etc/repmgr/12/repmgr.conf node service --action=restart --checkpoint
NOTICE: issuing CHECKPOINT
DETAIL: executing server command "sudo service postgresql-12 restart"
Redirecting to /bin/systemctl restart postgresql-12.service</programlisting>
</para>
<para>
List all commands:
<programlisting>
[postgres@node1 ~]$ repmgr -f /etc/repmgr/12/repmgr.conf node service --list-actions
Following commands would be executed for each action:
start: "sudo service postgresql-12 start"
stop: "sudo service postgresql-12 stop"
restart: "sudo service postgresql-12 restart"
reload: "sudo service postgresql-12 reload"
promote: "/usr/pgsql-12/bin/pg_ctl -w -D '/var/lib/pgsql/12/data' promote"</programlisting>
</para>
<para>
List a single command:
<programlisting>
[postgres@node1 ~]$ repmgr -f /etc/repmgr/12/repmgr.conf node service --list-actions --action=promote
/usr/pgsql-12/bin/pg_ctl -w -D '/var/lib/pgsql/12/data' promote </programlisting>
</para>
</refsect1>
</refentry>
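The <option>--list-actions</option> output shown in the examples above is regular enough to be read by a script, for example to confirm the command configuration on each node. The following is a minimal sketch only: the helper name `parse_list_actions` is hypothetical, and the sample text is copied from the example in this page, so the exact whitespace of real output is an assumption.

```python
import re

# Sample output copied from the documentation example above; in practice this
# text would be captured from `repmgr node service --list-actions`.
SAMPLE = '''Following commands would be executed for each action:
start: "sudo service postgresql-12 start"
stop: "sudo service postgresql-12 stop"
restart: "sudo service postgresql-12 restart"
reload: "sudo service postgresql-12 reload"
promote: "/usr/pgsql-12/bin/pg_ctl -w -D '/var/lib/pgsql/12/data' promote"
'''

# Matches lines of the form:  action: "command"
# Leading whitespace is tolerated because the real output may be indented.
ACTION_LINE = re.compile(r'^\s*(\w+):\s+"(.*)"\s*$')


def parse_list_actions(text):
    """Return a dict mapping each action name to its configured command.

    Hypothetical helper for illustration; not part of repmgr.
    """
    actions = {}
    for line in text.splitlines():
        match = ACTION_LINE.match(line)
        if match:
            actions[match.group(1)] = match.group(2)
    return actions


commands = parse_list_actions(SAMPLE)
print(commands["restart"])
```

The header line is skipped because it does not contain a quoted command, so only the five action lines end up in the dictionary.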

@@ -55,7 +55,7 @@
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr node status</command>:
Following exit codes can be emitted by <command>repmgr node status</command>:
</para>
<variablelist>
@@ -84,7 +84,7 @@
<refsect1>
<title>See also</title>
<para>
See <xref linkend="repmgr-node-check"/> to diagnose issues and <xref linkend="repmgr-cluster-show"/>
See <xref linkend="repmgr-node-check"> to diagnose issues and <xref linkend="repmgr-cluster-show">
for an overview of all nodes in the cluster.
</para>
</refsect1>

@@ -21,20 +21,6 @@
installing the &repmgr; extension. This command needs to be executed before any
standby nodes are registered.
</para>
<note>
<para>
&repmgr; will attempt to install the &repmgr; extension as part of this command,
however this will fail if the <literal>repmgr</literal> user is not a superuser.
</para>
<para>
It's possible to install the &repmgr; extension manually before executing
<command>repmgr primary register</command>; in this case &repmgr; will
detect the presence of the extension and skip that step.
</para>
</note>
</refsect1>
<refsect1>
@@ -43,42 +29,25 @@
Execute with the <option>--dry-run</option> option to check what would happen without
actually registering the primary.
</para>
<note>
<para>
If providing the configuration file location with <option>-f/--config-file</option>,
avoid using a relative path, as &repmgr; stores the configuration file location
in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
<xref linkend="repmgr-standby-switchover"/>). &repmgr; will attempt to convert
a relative path into an absolute one, but this may not be the same as the path you
would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
<filename>/path/to/repmgr.conf</filename>).
</para>
</note>
<para>
<para>
<command>repmgr master register</command> can be used as an alias for
<command>repmgr primary register</command>.
</para>
<note>
<para>
If providing the configuration file location with <option>-f/--config-file</option>,
avoid using a relative path, as &repmgr; stores the configuration file location
in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
<xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert
a relative path into an absolute one, but this may not be the same as the path you
would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
<filename>/path/to/repmgr.conf</filename>).
</para>
</note>
</refsect1>
<refsect1>
<title>User permission requirements</title>
<para>
The <literal>repmgr</literal> user must be a superuser in order for &repmgr;
to be able to install the <literal>repmgr</literal> extension.
</para>
<para>
If this is not the case, the <literal>repmgr</literal> extension can be installed
manually before executing <command>repmgr primary register</command>.
</para>
<para>
A future &repmgr; release will enable the provision of a <option>--superuser</option>
name for the installation of the extension.
</para>
</refsect1>
<refsect1>
<title>Options</title>

@@ -60,17 +60,6 @@
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force</option></term>
<listitem>
<para>
Forcibly unregister the node if it is registered as an active primary,
as long as it has no registered standbys; or if it is registered as
a primary but running as a standby.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>

@@ -1,215 +0,0 @@
<refentry id="repmgr-service-status">
<indexterm>
<primary>repmgr service status</primary>
</indexterm>
<indexterm>
<primary>repmgrd</primary>
<secondary>displaying service status</secondary>
</indexterm>
<refmeta>
<refentrytitle>repmgr service status</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr service status</refname>
<refpurpose>display information about the status of &repmgrd; on each node in the cluster</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command provides an overview of all active nodes in the cluster and the state
of each node's &repmgrd; instance. It can be used to check
the result of <xref linkend="repmgr-service-pause"/> and <xref linkend="repmgr-service-unpause"/>
operations.
</para>
</refsect1>
<refsect1>
<title>Prerequisites</title>
<para>
PostgreSQL should be accessible on all nodes (using the <varname>conninfo</varname> string shown by
<link linkend="repmgr-cluster-show"><command>repmgr cluster show</command></link>)
from the node where <command>repmgr service status</command> is executed.
</para>
</refsect1>
<refsect1>
<title>Execution</title>
<para>
<command>repmgr service status</command> can be executed on any active node in the
replication cluster. A valid <filename>repmgr.conf</filename> file is required.
</para>
<para>
If a node is not accessible, or PostgreSQL itself is not running on the node,
&repmgr; will not be able to determine the status of that node's &repmgrd; instance,
and &quot;<literal>n/a</literal>&quot; will be displayed in the node's <literal>repmgrd</literal>
column.
</para>
<note>
<para>
After restarting PostgreSQL on any node, the &repmgrd; instance
will take a second or two before it is able to update its status. Until then,
&repmgrd; will be shown as not running.
</para>
</note>
</refsect1>
<refsect1>
<title>Examples</title>
<para>
&repmgrd; running normally on all nodes:
<programlisting>$ repmgr -f /etc/repmgr.conf service status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 96563 | no | n/a
2 | node2 | standby | running | node1 | running | 96572 | no | 1 second(s) ago
3 | node3 | standby | running | node1 | running | 96584 | no | 0 second(s) ago</programlisting>
</para>
<para>
&repmgrd; paused on all nodes (using <xref linkend="repmgr-service-pause"/>):
<programlisting>$ repmgr -f /etc/repmgr.conf service status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 96563 | yes | n/a
2 | node2 | standby | running | node1 | running | 96572 | yes | 1 second(s) ago
3 | node3 | standby | running | node1 | running | 96584 | yes | 0 second(s) ago</programlisting>
</para>
<para>
&repmgrd; not running on one node:
<programlisting>$ repmgr -f /etc/repmgr.conf service status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+-------------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 96563 | yes | n/a
2 | node2 | standby | running | node1 | not running | n/a | n/a | n/a
3 | node3 | standby | running | node1 | running | 96584 | yes | 0 second(s) ago</programlisting>
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--csv</option></term>
<listitem>
<para>
<command>repmgr service status</command> accepts an optional parameter <literal>--csv</literal>, which
outputs the replication cluster's status in a simple CSV format, suitable for
parsing by scripts, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf service status --csv
1,node1,primary,1,1,5722,1,100,-1,default
2,node2,standby,1,0,-1,1,100,1,default
3,node3,standby,1,1,5779,1,100,1,default</programlisting>
</para>
<para>
The columns have the following meanings:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
node ID
</simpara>
</listitem>
<listitem>
<simpara>
node name
</simpara>
</listitem>
<listitem>
<simpara>
node type (primary or standby)
</simpara>
</listitem>
<listitem>
<simpara>
PostgreSQL server running (1 = running, 0 = not running)
</simpara>
</listitem>
<listitem>
<simpara>
&repmgrd; running (1 = running, 0 = not running, -1 = unknown)
</simpara>
</listitem>
<listitem>
<simpara>
&repmgrd; PID (-1 if not running or status unknown)
</simpara>
</listitem>
<listitem>
<simpara>
&repmgrd; paused (1 = paused, 0 = not paused, -1 = unknown)
</simpara>
</listitem>
<listitem>
<simpara>
&repmgrd; node priority
</simpara>
</listitem>
<listitem>
<simpara>
interval in seconds since the node's upstream was last seen (this will be -1 if the value could not be retrieved, or the node is primary)
</simpara>
</listitem>
<listitem>
<simpara>
node location
</simpara>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--detail</option></term>
<listitem>
<para>
Display additional information (<literal>location</literal>, <literal>priority</literal>)
about the &repmgr; configuration.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--verbose</option></term>
<listitem>
<para>
Display the full text of any database connection error messages.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-service-pause"/>,
<xref linkend="repmgr-service-unpause"/>,
<xref linkend="repmgr-cluster-show"/>,
<xref linkend="repmgrd-pausing"/>,
<xref linkend="repmgr-daemon-start"/>,
<xref linkend="repmgr-daemon-stop"/>
</para>
</refsect1>
</refentry>
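The column list documented for <option>--csv</option> above maps directly onto a small parser. The sketch below is illustrative only: the field names in `COLUMNS` are chosen for this example (repmgr emits no header row), the helper `parse_service_status_csv` is hypothetical, and the sample rows are the ones shown in the documentation.

```python
import csv
import io

# Field names follow the column order documented above. The names themselves
# are assumptions made for this sketch; repmgr does not emit a header row.
COLUMNS = [
    "node_id", "node_name", "node_type", "pg_running", "repmgrd_running",
    "repmgrd_pid", "repmgrd_paused", "priority", "upstream_last_seen",
    "location",
]

# Every column except the name, type and location holds an integer.
INT_COLUMNS = [c for c in COLUMNS if c not in ("node_name", "node_type", "location")]


def parse_service_status_csv(text):
    """Parse `repmgr service status --csv` output into a list of dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        row = dict(zip(COLUMNS, record))
        for key in INT_COLUMNS:
            row[key] = int(row[key])
        rows.append(row)
    return rows


# Sample rows copied from the documentation example above.
SAMPLE = """1,node1,primary,1,1,5722,1,100,-1,default
2,node2,standby,1,0,-1,1,100,1,default
3,node3,standby,1,1,5779,1,100,1,default
"""

nodes = parse_service_status_csv(SAMPLE)
# Nodes where repmgrd is not confirmed running (0 = not running, -1 = unknown).
not_running = [n["node_name"] for n in nodes if n["repmgrd_running"] != 1]
print(not_running)
```

Treating both <literal>0</literal> and <literal>-1</literal> as "not confirmed running" is a design choice of this sketch; a monitoring script may prefer to report the unknown state separately.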

@@ -18,7 +18,7 @@
<para>
<command>repmgr standby clone</command> clones a PostgreSQL node from another
PostgreSQL node, typically the primary, but optionally from any other node in
the cluster or from Barman. It creates the replication configuration required
the cluster or from Barman. It creates the <filename>recovery.conf</filename> file required
to attach the cloned node to the primary node (or another standby, if cascading replication
is in use).
</para>
@@ -85,25 +85,27 @@
</refsect1>
<refsect1 id="repmgr-standby-clone-recovery-conf">
<title>Customising replication configuration</title>
<indexterm>
<indexterm>
<primary>recovery.conf</primary>
<secondary>customising with &quot;repmgr standby clone&quot;</secondary>
</indexterm>
<indexterm>
<primary>replication configuration</primary>
<secondary>customising with &quot;repmgr standby clone&quot;</secondary>
</indexterm>
<secondary>customising with "repmgr standby clone"</secondary>
</indexterm>
<title>Customising recovery.conf</title>
<para>
By default, &repmgr; will create a minimal replication configuration
By default, &repmgr; will create a minimal <filename>recovery.conf</filename>
containing the following parameters:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><varname>standby_mode</varname> (always <literal>'on'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>recovery_target_timeline</varname> (always <literal>'latest'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>primary_conninfo</varname></simpara>
</listitem>
@@ -114,24 +116,9 @@
</itemizedlist>
<para>
For PostgreSQL 11 and earlier, these parameters will also be set:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><varname>standby_mode</varname> (always <literal>'on'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>recovery_target_timeline</varname> (always <literal>'latest'</literal>)</simpara>
</listitem>
</itemizedlist>
<para>
The following additional parameters can be specified in <filename>repmgr.conf</filename>
for inclusion in the replication configuration:
for inclusion in <filename>recovery.conf</filename>:
</para>
<itemizedlist spacing="compact" mark="bullet">
@@ -155,7 +142,7 @@
We recommend using <ulink url="https://www.pgbarman.org/">Barman</ulink> to manage
WAL file archiving. For more details on combining &repmgr; and <application>Barman</application>,
in particular using <varname>restore_command</varname> to configure Barman as a backup source of
WAL files, see <xref linkend="cloning-from-barman"/>.
WAL files, see <xref linkend="cloning-from-barman">.
</para>
</note>
@@ -167,7 +154,7 @@
When initially cloning a standby, you will need to ensure
that all required WAL files remain available while the cloning is taking
place. To ensure this happens when using the default <command>pg_basebackup</command> method,
&repmgr; will set <command>pg_basebackup</command>'s <literal>--wal-method</literal>
&repmgr; will set <command>pg_basebackup</command>'s <literal>--xlog-method</literal>
parameter to <literal>stream</literal>,
which will ensure all WAL files generated during the cloning process are
streamed in parallel with the main backup. Note that this requires two
@@ -177,112 +164,63 @@
</para>
<para>
To override this behaviour, in <filename>repmgr.conf</filename> set
<command>pg_basebackup</command>'s <literal>--wal-method</literal>
<command>pg_basebackup</command>'s <literal>--xlog-method</literal>
parameter to <literal>fetch</literal>:
<programlisting>
pg_basebackup_options='--wal-method=fetch'</programlisting>
pg_basebackup_options='--xlog-method=fetch'</programlisting>
and ensure that <literal>wal_keep_segments</literal> (PostgreSQL 13 and later:
<literal>wal_keep_size</literal>) is set to an appropriately high value. Note
however that this is not a particularly reliable way of ensuring sufficient
WAL is retained and is not recommended.
See the <ulink url="https://www.postgresql.org/docs/current/app-pgbasebackup.html">
and ensure that <literal>wal_keep_segments</literal> is set to an appropriately high value.
See the <ulink url="https://www.postgresql.org/docs/current/static/app-pgbasebackup.html">
pg_basebackup</ulink> documentation for details.
</para>
<note>
<simpara>
If using PostgreSQL 9.6 or earlier, replace <literal>--wal-method</literal>
with <literal>--xlog-method</literal>.
From PostgreSQL 10, <command>pg_basebackup</command>'s
<literal>--xlog-method</literal> parameter has been renamed to
<literal>--wal-method</literal>.
</simpara>
</note>
</refsect1>
<refsect1 id="repmgr-standby-clone-wal-directory">
<title>Placing WAL files into a different directory</title>
<para>
To ensure that WAL files are placed in a directory outside of the main data
directory (e.g. to keep them on a separate disk for performance reasons),
specify the location with <option>--waldir</option>
(PostgreSQL 9.6 and earlier: <option>--xlogdir</option>) in
the <filename>repmgr.conf</filename> parameter <option>pg_basebackup_options</option>,
e.g.:
<programlisting>
pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
This setting will also be honored by &repmgr; when cloning from Barman
(&repmgr; 5.2 and later).
</para>
</refsect1>
<!-- don't rename this id as it may be used in external links -->
<refsect1 id="repmgr-standby-create-recovery-conf">
<title>Using a standby cloned by another method</title>
<indexterm>
<primary>replication configuration</primary>
<secondary>generating for a standby cloned by another method</secondary>
</indexterm>
<indexterm>
<primary>recovery.conf</primary>
<secondary>generating for a standby cloned by another method</secondary>
</indexterm>
<title>Using a standby cloned by another method</title>
<para>
&repmgr; supports standbys cloned by another method (e.g. using <application>barman</application>'s
<command><ulink url="https://docs.pgbarman.org/#recover">barman recover</ulink></command> command).
<command><ulink url="http://docs.pgbarman.org/release/2.4/#recover">barman recover</ulink></command> command).
</para>
<para>
To integrate the standby as a &repmgr; node, once the standby has been cloned,
ensure the <filename>repmgr.conf</filename>
To integrate the standby as a &repmgr; node, ensure the <filename>repmgr.conf</filename>
file is created for the node, and that it has been registered using
<command><link linkend="repmgr-standby-register">repmgr standby register</link></command>.
</para>
<tip>
<para>
To register a standby which is not running, execute
<link linkend="repmgr-standby-register">repmgr standby register --force</link>
and provide the connection details for the primary.
</para>
<para>
See <xref linkend="repmgr-standby-register-inactive-node"/> for more details.
</para>
</tip>
<para>
Then execute the command <command>repmgr standby clone --replication-conf-only</command>.
Then execute the command <command>repmgr standby clone --recovery-conf-only</command>.
This will create the <filename>recovery.conf</filename> file needed to attach
the node to its upstream (in PostgreSQL 12 and later: append replication configuration
to <filename>postgresql.auto.conf</filename>), and will also create a replication slot on the
the node to its upstream, and will also create a replication slot on the
upstream node if required.
</para>
<para>
The upstream node must be running so the correct replication configuration can be obtained.
</para>
<para>
If the standby is running, the replication configuration will not be written unless the
Note that the upstream node must be running. An existing
<filename>recovery.conf</filename> will not be overwritten unless the
<option>-F/--force</option> option is provided.
</para>
<tip>
<para>
Execute <command>repmgr standby clone --replication-conf-only --dry-run</command>
to check the prerequisites for creating the recovery configuration,
and display the configuration changes which would be made without actually
making any changes.
</para>
</tip>
<para>
In PostgreSQL 13 and later, the PostgreSQL configuration must be reloaded for replication
configuration changes to take effect.
</para>
<para>
In PostgreSQL 12 and earlier, the PostgreSQL instance must be restarted for replication
configuration changes to take effect.
Execute <command>repmgr standby clone --recovery-conf-only --dry-run</command>
to check the prerequisites for creating the <filename>recovery.conf</filename> file,
and display the contents of the file without actually creating it.
</para>
<note>
<para>
<option>--recovery-conf-only</option> was introduced in &repmgr; <link linkend="release-4.0.4">4.0.4</link>.
</para>
</note>
</refsect1>
@@ -308,9 +246,9 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
Check prerequisites but don't actually clone the standby.
</para>
<para>
If <option>--replication-conf-only</option> is specified, the contents of
the generated recovery configuration will be displayed
but not written.
If <option>--recovery-conf-only</option> is specified, the contents of
the generated <filename>recovery.conf</filename> file will be displayed
but the file itself not written.
</para>
</listitem>
</varlistentry>
@@ -332,12 +270,6 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
node to the same path on the standby (default) or to the
PostgreSQL data directory.
</para>
<para>
Note that to be able to use this option, the &repmgr; user must be a superuser or
member of the <literal>pg_read_all_settings</literal> predefined role.
If this is not the case, provide a valid superuser with the
<option>-S</option>/<option>--superuser</option> option.
</para>
</listitem>
</varlistentry>
@@ -350,25 +282,6 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--recovery-min-apply-delay</option></term>
<listitem>
<para>
Set PostgreSQL configuration <option>recovery_min_apply_delay</option> parameter
to the provided value.
</para>
<para>
This overrides any <option>recovery_min_apply_delay</option> provided via
<filename>repmgr.conf</filename>.
</para>
<para>
For more details on this parameter, see:
<ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-RECOVERY-MIN-APPLY-DELAY">recovery_min_apply_delay</ulink>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-R, --remote-user=USERNAME</option></term>
<listitem>
@@ -379,19 +292,11 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
</varlistentry>
<varlistentry>
<term><option>--replication-conf-only</option></term>
<term><option> --recovery-conf-only</option></term>
<listitem>
<para>
Create recovery configuration for a previously cloned instance.
Create <filename>recovery.conf</filename> file for a previously cloned instance. &repmgr; 4.0.4 and later.
</para>
<para>
In PostgreSQL 12 and later, the replication configuration will be
written to <filename>postgresql.auto.conf</filename>.
</para>
<para>
In PostgreSQL 11 and earlier, the replication configuration will be
written to <filename>recovery.conf</filename>.
</para>
</listitem>
</varlistentry>
@@ -405,15 +310,11 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
</varlistentry>
<varlistentry>
<term><option>-S</option>/<option>--superuser</option></term>
<term><option>--superuser</option></term>
<listitem>
<para>
The name of a valid PostgreSQL superuser can be provided with this option.
</para>
<para>
This is only required if the <option>--copy-external-config-files</option> was provided
and the &repmgr; user is not a superuser or member of the <literal>pg_read_all_settings</literal>
predefined role.
If the &repmgr; user is not a superuser, the name of a valid superuser must
be provided with this option.
</para>
</listitem>
</varlistentry>
@@ -423,13 +324,9 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
<term><option>--upstream-conninfo</option></term>
<listitem>
<para>
<literal>primary_conninfo</literal> value to include in the recovery configuration
<literal>primary_conninfo</literal> value to write in recovery.conf
when the intended upstream server does not yet exist.
</para>
<para>
Note that &repmgr; may modify the provided value, in particular to set the
correct <literal>application_name</literal>.
</para>
</listitem>
</varlistentry>
@@ -441,23 +338,6 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--verify-backup</option></term>
<listitem>
<para>
<!-- update link after Pg13 release -->
Verify a cloned node using the
<ulink url="https://www.postgresql.org/docs/13/app-pgverifybackup.html">pg_verifybackup</ulink>
utility (PostgreSQL 13 and later).
</para>
<para>
This option can currently only be used when cloning directly from an upstream
node.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--without-barman </option></term>
<listitem>
@@ -480,7 +360,7 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
<refsect1>
<title>See also</title>
<para>
See <xref linkend="cloning-standbys"/> for details about various aspects of cloning.
See <xref linkend="cloning-standbys"> for details about various aspects of cloning.
</para>
</refsect1>
</refentry>
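The renaming of the <command>pg_basebackup</command> options described above (<literal>--xlog-method</literal> and <literal>--xlogdir</literal> up to PostgreSQL 9.6, <literal>--wal-method</literal> and <literal>--waldir</literal> from PostgreSQL 10) is easy to get wrong when one <filename>repmgr.conf</filename> template serves several PostgreSQL versions. The sketch below shows one way to generate the <varname>pg_basebackup_options</varname> line; both function names are hypothetical helpers and not part of repmgr.

```python
def basebackup_option_names(pg_major_version):
    """Return the pg_basebackup option names valid for the given version.

    pg_major_version is a number such as 9.6, 10 or 13.
    """
    if pg_major_version >= 10:
        return {"wal_method": "--wal-method", "wal_dir": "--waldir"}
    return {"wal_method": "--xlog-method", "wal_dir": "--xlogdir"}


def pg_basebackup_options_line(pg_major_version, wal_method=None, wal_dir=None):
    """Build a repmgr.conf `pg_basebackup_options` line (illustrative only)."""
    names = basebackup_option_names(pg_major_version)
    parts = []
    if wal_method:
        parts.append(f"{names['wal_method']}={wal_method}")
    if wal_dir:
        parts.append(f"{names['wal_dir']}={wal_dir}")
    return "pg_basebackup_options='" + " ".join(parts) + "'"


print(pg_basebackup_options_line(9.6, wal_method="fetch"))
print(pg_basebackup_options_line(13, wal_dir="/path/to/wal-directory"))
```

The generated lines match the two forms shown in the documentation text: the first uses the pre-10 option name, the second the current one.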

@@ -0,0 +1,116 @@
<refentry id="repmgr-standby-follow">
<indexterm>
<primary>repmgr standby follow</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr standby follow</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr standby follow</refname>
<refpurpose>attach a standby to a new primary</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Attaches the standby to a new primary. This command requires a valid
<filename>repmgr.conf</filename> file for the standby, either specified
explicitly with <literal>-f/--config-file</literal> or located in a
default location; no additional arguments are required.
</para>
<para>
This command will force a restart of the standby server, which must be
running. It can only be used to attach an active standby to the current primary node
(and not to another standby).
</para>
<tip>
<para>
To re-add an inactive node to the replication cluster, use
<xref linkend="repmgr-node-rejoin">.
</para>
</tip>
<para>
<command>repmgr standby follow</command> will wait up to
<varname>standby_follow_timeout</varname> seconds (default: <literal>30</literal>)
to verify the standby has actually connected to the new primary.
</para>
</refsect1>
<refsect1>
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf standby follow
INFO: setting node 3's primary to node 2
NOTICE: restarting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/postgres/data' restart"
waiting for server to shut down........ done
server stopped
waiting for server to start.... done
server started
NOTICE: STANDBY FOLLOW successful
DETAIL: node 3 is now attached to node 2</programlisting>
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually follow a new primary.
</para>
<important>
<para>
This does not guarantee the standby can follow the primary; in
particular, whether the primary and standby timelines have diverged
can currently only be determined by actually attempting to
attach the standby to the primary.
</para>
</important>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-w</option></term>
<term><option>--wait</option></term>
<listitem>
<para>
Wait for a primary to appear. &repmgr; will wait for up to
<varname>primary_follow_timeout</varname> seconds
(default: 60 seconds) to verify that the standby is following the new primary.
This value can be defined in <filename>repmgr.conf</filename>.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1 id="repmgr-standby-follow-events">
<title>Event notifications</title>
<para>
A <literal>standby_follow</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para>
<para>
If provided, &repmgr; will substitute the placeholders <literal>%p</literal> with the node ID of the primary
being followed, <literal>%c</literal> with its <literal>conninfo</literal> string, and
<literal>%a</literal> with its node name.
</para>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-node-rejoin">
</para>
</refsect1>
</refentry>
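The placeholder substitution described under "Event notifications" can be pictured as follows. This is a sketch of the documented behaviour, not repmgr's implementation: the function `expand_follow_placeholders` and the script path `/usr/local/bin/notify.sh` are hypothetical, and only the three placeholders named above (<literal>%p</literal>, <literal>%c</literal>, <literal>%a</literal>) are handled.

```python
import re


def expand_follow_placeholders(template, primary_id, primary_conninfo, primary_name):
    """Substitute %p, %c and %a in a command template (illustration only)."""
    replacements = {
        "%p": str(primary_id),      # node ID of the primary being followed
        "%c": primary_conninfo,     # its conninfo string
        "%a": primary_name,         # its node name
    }
    # A single regex pass avoids re-expanding placeholders that happen to
    # appear inside a substituted value.
    return re.sub(r"%[pca]", lambda m: replacements[m.group(0)], template)


# Hypothetical notification command template.
command = expand_follow_placeholders(
    "/usr/local/bin/notify.sh %p %a '%c'",
    2,
    "host=node2 dbname=repmgr user=repmgr",
    "node2",
)
print(command)
```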

@@ -1,271 +0,0 @@
<refentry id="repmgr-standby-follow">
<indexterm>
<primary>repmgr standby follow</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr standby follow</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr standby follow</refname>
<refpurpose>attach a running standby to a new upstream node</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Attaches the standby (&quot;follow candidate&quot;) to a new upstream node
(&quot;follow target&quot;). Typically this will be the primary, but this
command can also be used to attach the standby to another standby.
</para>
<para>
This command requires a valid <filename>repmgr.conf</filename> file for the standby,
either specified explicitly with <literal>-f/--config-file</literal> or located in a
default location; no additional arguments are required.
</para>
<para>The standby node (&quot;follow candidate&quot;) <emphasis>must</emphasis>
be running. If the new upstream (&quot;follow target&quot;) is not the primary,
the cluster primary <emphasis>must</emphasis> be running and accessible from the
standby node.
</para>
<tip>
<para>
To re-add an inactive node to the replication cluster, use
<xref linkend="repmgr-node-rejoin"/>.
</para>
</tip>
<para>
By default &repmgr; will attempt to attach the standby to the current primary.
If <option>--upstream-node-id</option> is provided, &repmgr; will attempt
to attach the standby to the specified node, which can be another standby.
</para>
<para>
In PostgreSQL 12 and earlier, this command will force a restart of PostgreSQL on the standby node.
</para>
<para>
In PostgreSQL 13 and later, by default this command will signal PostgreSQL to reload its
configuration, which will cause PostgreSQL to follow the new upstream without
a restart. If this behaviour is not desired for whatever reason, the configuration
file parameter <varname>standby_follow_restart</varname> can be set <literal>true</literal>
to always force a restart.
</para>
<para>
<command>repmgr standby follow</command> will wait up to
<varname>standby_follow_timeout</varname> seconds (default: <literal>30</literal>)
to verify the standby has actually connected to the new upstream node.
</para>
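<para>
As a sketch, the two settings described above might appear in
<filename>repmgr.conf</filename> as follows. The values are illustrative only;
<literal>30</literal> is the default timeout stated above.
<programlisting>
# always restart PostgreSQL when following a new upstream (PostgreSQL 13 and later)
standby_follow_restart=true
# seconds to wait for the standby to connect to the new upstream
standby_follow_timeout=30</programlisting>
</para>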
<note>
<para>
If <option>recovery_min_apply_delay</option> is set for the standby, it
will not attach to the new upstream node until it has replayed available
WAL.
</para>
<para>
Conversely, if the standby is attached to an upstream standby
which has <option>recovery_min_apply_delay</option> set, the upstream
standby's replay state may actually be behind that of its new downstream node.
</para>
</note>
</refsect1>
<refsect1>
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf standby follow
INFO: setting node 3's primary to node 2
NOTICE: restarting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/postgres/data' restart"
waiting for server to shut down........ done
server stopped
waiting for server to start.... done
server started
NOTICE: STANDBY FOLLOW successful
DETAIL: node 3 is now attached to node 2</programlisting>
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually follow a new upstream node.
</para>
<para>
This will also verify whether the standby is capable of following the new upstream node.
</para>
<important>
<para>
If a standby was turned into a primary by removing <filename>recovery.conf</filename>
(<application>PostgreSQL 12</application> and later: <filename>standby.signal</filename>),
&repmgr; will <emphasis>not</emphasis> be able to determine whether that primary's timeline
has diverged from the timeline of the standby (&quot;follow candidate&quot;).
</para>
<para>
We recommend always using <link linkend="repmgr-standby-promote"><command>repmgr standby promote</command></link>
to promote a standby to primary, as this will ensure that the new primary
will perform a timeline switch (making it practical to check for timeline divergence)
and also that &repmgr; metadata is updated correctly.
</para>
</important>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--upstream-node-id</option></term>
<listitem>
<para>
Node ID of the new upstream node (&quot;follow target&quot;).
</para>
<para>
If not provided, &repmgr; will attempt to follow the current primary node.
</para>
<para>
Note that when using &repmgrd;, <option>--upstream-node-id</option>
should always be configured;
see <link linkend="repmgrd-automatic-failover-configuration">Automatic failover configuration</link>
for details.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-w</option></term>
<term><option>--wait</option></term>
<listitem>
<para>
Wait for a primary to appear. &repmgr; will wait for up to
<varname>primary_follow_timeout</varname> seconds
(default: 60 seconds) to verify that the standby is following the new primary.
This value can be defined in <filename>repmgr.conf</filename>.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Execution</title>
<para>
Execute with the <literal>--dry-run</literal> option to test the follow operation as
far as possible, without actually changing the status of the node.
</para>
<para>
Note that &repmgr; will first attempt to determine whether the standby
(&quot;follow candidate&quot;) is capable of following the
new upstream node (&quot;follow target&quot;).
</para>
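<para>
A sketch of such a check, assuming the configuration file is located at
<filename>/etc/repmgr.conf</filename> as in the example above. Output is omitted
here because it depends on the state of the cluster.
<programlisting>
$ repmgr -f /etc/repmgr.conf standby follow --dry-run</programlisting>
</para>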
<para>
If the new upstream node has diverged from this node's timeline,
for example because the new upstream node was promoted to primary while this node
was still attached to the original primary, it will <emphasis>not</emphasis>
be possible to follow the new upstream node, and &repmgr; will emit an error
message like this:
<programlisting>
ERROR: this node cannot attach to follow target node &quot;node3&quot; (ID 3)
DETAIL: follow target server's timeline 2 forked off current database system timeline 1 before current recovery point 0/6108880</programlisting>
</para>
<para>
In this case, it may be possible to have this node follow the new upstream
using <command><link linkend="repmgr-node-rejoin">repmgr node rejoin</link></command>
with the <option>--force-rewind</option> option to execute <command>pg_rewind</command>.
This does mean that transactions which exist on this node, but not the new upstream,
will be lost.
</para>
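<para>
As an illustration only, such a rejoin might be invoked as shown below. The
<literal>conninfo</literal> string passed with <option>-d</option> is a placeholder
and is assumed to point to a node in the cluster being rejoined (here the new upstream).
Running with <option>--dry-run</option> first reports whether the prerequisites are met.
<programlisting>
$ repmgr -f /etc/repmgr.conf node rejoin -d 'host=node3 dbname=repmgr user=repmgr' --force-rewind --dry-run
$ repmgr -f /etc/repmgr.conf node rejoin -d 'host=node3 dbname=repmgr user=repmgr' --force-rewind</programlisting>
</para>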
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr standby follow</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The follow operation succeeded; or if <option>--dry-run</option> was provided,
no issues were detected which would prevent the follow operation.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_BAD_CONFIG (1)</option></term>
<listitem>
<para>
A configuration issue was detected which prevented &repmgr; from
continuing with the follow operation.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NO_RESTART (4)</option></term>
<listitem>
<para>
The node could not be restarted.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_DB_CONN (6)</option></term>
<listitem>
<para>
&repmgr; was unable to establish a database connection to one of the nodes.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_FOLLOW_FAIL (23)</option></term>
<listitem>
<para>
&repmgr; was unable to complete the follow command.
</para>
</listitem>
</varlistentry>
</variablelist>
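<para>
A minimal sketch of how a wrapper script might act on the exit codes listed above.
The script itself and its messages are hypothetical and not part of &repmgr;.
<programlisting>
#!/bin/sh
# Hypothetical wrapper: run the follow operation and report the outcome.
repmgr -f /etc/repmgr.conf standby follow
rc=$?
case "$rc" in
    0)  echo "follow succeeded" ;;
    1)  echo "ERR_BAD_CONFIG: check repmgr.conf" ;;
    4)  echo "ERR_NO_RESTART: node could not be restarted" ;;
    6)  echo "ERR_DB_CONN: could not connect to one of the nodes" ;;
    23) echo "ERR_FOLLOW_FAIL: follow command did not complete" ;;
    *)  echo "unexpected exit code: $rc" ;;
esac
# pass the original exit code on to the caller
exit "$rc"</programlisting>
</para>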
</refsect1>
<refsect1 id="repmgr-standby-follow-events">
<title>Event notifications</title>
<para>
A <literal>standby_follow</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para>
<para>
If provided, &repmgr; will substitute the placeholders <literal>%p</literal> with the node ID of the node
being followed, <literal>%c</literal> with its <literal>conninfo</literal> string, and
<literal>%a</literal> with its node name.
</para>
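<para>
As a sketch, an event notification command using these placeholders might be
configured in <filename>repmgr.conf</filename> as follows. The script path is
hypothetical; <literal>%n</literal> (node ID) and <literal>%e</literal> (event type)
are general placeholders described in the event notifications documentation.
<programlisting>
# only pass standby_follow events to the script
event_notifications='standby_follow'
# /path/to/follow_hook.sh is a placeholder for a site-specific script
event_notification_command='/path/to/follow_hook.sh %n %e %p "%c" "%a"'</programlisting>
</para>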
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-node-rejoin"/>
</para>
</refsect1>
</refentry>


@@ -0,0 +1,60 @@
<refentry id="repmgr-standby-promote">
<indexterm>
<primary>repmgr standby promote</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr standby promote</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr standby promote</refname>
<refpurpose>promote a standby to a primary</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Promotes a standby to a primary if the current primary has failed. This
command requires a valid <filename>repmgr.conf</filename> file for the standby, either
specified explicitly with <literal>-f/--config-file</literal> or located in a
default location; no additional arguments are required.
</para>
<para>
If the standby promotion succeeds, the server will not need to be
restarted. However, any other standbys will need to follow the new server,
by using <xref linkend="repmgr-standby-follow">; if <application>repmgrd</application>
is active, it will handle this automatically.
</para>
<para>
Note that &repmgr; will wait for up to <varname>promote_check_timeout</varname> seconds
(default: 60 seconds) to verify that the standby has been promoted, and will
check the promotion every <varname>promote_check_interval</varname> seconds (default: 1 second).
Both values can be defined in <filename>repmgr.conf</filename>.
</para>
</refsect1>
<refsect1>
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf standby promote
NOTICE: promoting standby to primary
DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/postgres/data' promote"
server promoting
DEBUG: setting node 2 as primary and marking existing primary as failed
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node2" (ID: 2) was successfully promoted to primary</programlisting>
</para>
</refsect1>
<refsect1 id="repmgr-standby-promote-events">
<title>Event notifications</title>
<para>
A <literal>standby_promote</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para>
</refsect1>
</refentry>


@@ -1,345 +0,0 @@
<refentry id="repmgr-standby-promote">
<indexterm>
<primary>repmgr standby promote</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr standby promote</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr standby promote</refname>
<refpurpose>promote a standby to a primary</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Promotes a standby to a primary if the current primary has failed. This
command requires a valid <filename>repmgr.conf</filename> file for the standby, either
specified explicitly with <literal>-f/--config-file</literal> or located in a
default location; no additional arguments are required.
</para>
<important>
<para>
If &repmgrd; is active, you must execute
<command><link linkend="repmgr-service-pause">repmgr service pause</link></command>
(&repmgr; 4.2 - 4.4: <command><link linkend="repmgr-service-pause">repmgr daemon pause</link></command>)
to temporarily disable &repmgrd; while making any changes
to the replication cluster.
</para>
</important>
<para>
If the standby promotion succeeds, the server will not need to be
restarted. However, any other standbys will need to follow the new primary,
and will need to be restarted to do this.
</para>
<para>
Beginning with <link linkend="release-4.4">repmgr 4.4</link>,
the option <option>--siblings-follow</option> can be used to have
all other standbys (and a witness server, if in use)
follow the new primary.
</para>
<note>
<para>
If using &repmgrd;, when invoking
<command>repmgr standby promote</command> (either directly via
the <option>promote_command</option>, or in a script called
via <option>promote_command</option>), <option>--siblings-follow</option>
<emphasis>must not</emphasis> be included as a
command line option for <command>repmgr standby promote</command>.
</para>
</note>
<para>
In <link linkend="release-4.3">repmgr 4.3</link> and earlier,
<command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command>
must be executed on each standby individually.
</para>
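<para>
For example, when promoting manually (that is, not via &repmgrd;'s
<option>promote_command</option>) with &repmgr; 4.4 or later, the invocation
might look like this:
<programlisting>
$ repmgr -f /etc/repmgr.conf standby promote --siblings-follow</programlisting>
</para>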
<para>
&repmgr; will wait for up to <varname>promote_check_timeout</varname> seconds
(default: <literal>60</literal>) to verify that the standby has been promoted, and will
check the promotion every <varname>promote_check_interval</varname> seconds (default: 1 second).
Both values can be defined in <filename>repmgr.conf</filename>.
</para>
<warning>
<para>
In PostgreSQL 12 and earlier, if WAL replay is paused on the standby, and not all
WAL files on the standby have been replayed, &repmgr; will not attempt to promote it.
</para>
<para>
This is because if WAL replay is paused, PostgreSQL itself will not react to a promote command
until WAL replay is resumed and all pending WAL has been replayed. This means
attempting to promote PostgreSQL in this state will leave PostgreSQL in a condition where the
promotion may occur at an unpredictable point in the future.
</para>
<para>
Note that if the standby is in archive recovery, &repmgr; will not be able to determine
if more WAL is pending replay, and will abort the promotion attempt if WAL replay is paused.
</para>
<para>
This restriction does <emphasis>not</emphasis> apply to PostgreSQL 13 and later.
</para>
</warning>
</refsect1>
<refsect1>
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf standby promote
NOTICE: promoting standby to primary
DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/postgres/data' promote"
server promoting
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node2" (ID: 2) was successfully promoted to primary</programlisting>
</para>
</refsect1>
<refsect1>
<title>User permission requirements</title>
<para><emphasis>pg_promote() (PostgreSQL 12 and later)</emphasis></para>
<para>
From PostgreSQL 12, &repmgr; will attempt to use the built-in <function>pg_promote()</function>
function to promote a standby to primary.
</para>
<para>
By default, execution of <function>pg_promote()</function> is restricted to superusers.
If the <literal>repmgr</literal> user does not have permission to execute
<function>pg_promote()</function>, &repmgr; will fall back to using &quot;<command>pg_ctl promote</command>&quot;.
</para>
<tip>
<para>
Execute <command>repmgr standby promote</command> with the <option>--dry-run</option> option
to check whether the &repmgr; user has permission to execute <function>pg_promote()</function>.
</para>
<para>
If the <literal>repmgr</literal> user is not a superuser, execution permission for this
function can be granted with e.g.:
<programlisting>
GRANT EXECUTE ON FUNCTION pg_catalog.pg_promote TO repmgr</programlisting>
</para>
<para>
Note that permissions are only effective for the database they are granted in, so
this <emphasis>must</emphasis> be executed in the &repmgr; database to be effective.
</para>
</tip>
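<para>
A sketch of the steps described above, assuming the &repmgr; database is named
<literal>repmgr</literal> and that <literal>postgres</literal> is a superuser
(both names are assumptions for illustration):
<programlisting>
$ psql -U postgres -d repmgr -c 'GRANT EXECUTE ON FUNCTION pg_catalog.pg_promote TO repmgr'
$ repmgr -f /etc/repmgr.conf standby promote --dry-run</programlisting>
</para>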
<para>
For more details on <function>pg_promote()</function>, see the
<ulink url="https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-RECOVERY-CONTROL-TABLE">PostgreSQL documentation</ulink>.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check if this node can be promoted, but don't carry out the promotion.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--siblings-follow</option></term>
<listitem>
<para>
Have all sibling nodes (nodes formerly attached to the same upstream
node as the promotion candidate) follow this node after it has been promoted.
</para>
<para>
Note that a witness server, if in use, is also
counted as a &quot;sibling node&quot; as it needs to be instructed to
synchronise its metadata with the new primary.
</para>
<important>
<para>
Do <emphasis>not</emphasis> provide this option when configuring
&repmgrd;'s <option>promote_command</option>.
</para>
</important>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option></term>
<term><option>--force</option></term>
<listitem>
<para>
Ignore warnings and continue anyway.
</para>
<para>
This option is relevant in the following situations if <option>--siblings-follow</option> was specified:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
If one or more sibling nodes were not reachable via SSH, the standby will be promoted anyway.
</simpara>
</listitem>
<listitem>
<simpara>
If the promotion candidate has insufficient free walsenders to accommodate the standbys which will
be attached to it, the standby will be promoted anyway.
</simpara>
</listitem>
<listitem>
<simpara>
If replication slots are in use but the promotion candidate has insufficient free replication slots
to accommodate the standbys which will be attached to it, the standby will be promoted anyway.
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
Note that if the <option>-F</option>/<option>--force</option> option is used when any of the above
situations is encountered, the onus is on the user to manually resolve any resulting issues.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Configuration file settings</title>
<para>
The following parameters in <filename>repmgr.conf</filename> are relevant to the
promote operation:
</para>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<indexterm>
<primary>promote_check_interval</primary>
<secondary>with &quot;repmgr standby promote &quot;</secondary>
</indexterm>
<simpara>
<literal>promote_check_interval</literal>:
interval (in seconds, default: 1 second) to wait between each check
to determine whether the standby has been promoted.
</simpara>
</listitem>
<listitem>
<indexterm>
<primary>promote_check_timeout</primary>
<secondary>with &quot;repmgr standby promote &quot;</secondary>
</indexterm>
<simpara>
<literal>promote_check_timeout</literal>:
time (in seconds, default: 60 seconds) to wait to verify that the standby has been promoted
before exiting with <literal>ERR_PROMOTION_FAIL</literal>.
</simpara>
</listitem>
<listitem>
<indexterm>
<primary>service_promote_command</primary>
<secondary>with &quot;repmgr standby promote &quot;</secondary>
</indexterm>
<simpara>
<literal>service_promote_command</literal>:
a command which will be executed instead of <command>pg_ctl promote</command>
or (in PostgreSQL 12 and later) <function>pg_promote()</function>.
</simpara>
<simpara>
This is intended for systems which provide a package-level promote command,
such as Debian's <application>pg_ctlcluster</application>, to promote the
PostgreSQL instance from standby to primary.
</simpara>
</listitem>
</itemizedlist>
</para>
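<para>
As a sketch, these parameters might be set in <filename>repmgr.conf</filename>
as follows. The interval and timeout shown are the defaults stated above; the
<application>pg_ctlcluster</application> version and cluster name are placeholders
for a Debian-style installation.
<programlisting>
promote_check_interval=1
promote_check_timeout=60
# hypothetical Debian-style command; adjust version and cluster name
service_promote_command='sudo pg_ctlcluster 11 main promote'</programlisting>
</para>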
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
The following exit codes can be emitted by <command>repmgr standby promote</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The standby was successfully promoted to primary.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_DB_CONN (6)</option></term>
<listitem>
<para>
&repmgr; was unable to connect to the local PostgreSQL node.
</para>
<para>
PostgreSQL must be running before the node can be promoted.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_PROMOTION_FAIL (8)</option></term>
<listitem>
<para>
The node could not be promoted to primary for one of the following
reasons:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
there is an existing primary node in the replication cluster
</simpara>
</listitem>
<listitem>
<simpara>
the node is not a standby
</simpara>
</listitem>
<listitem>
<simpara>
WAL replay is paused on the node
</simpara>
</listitem>
<listitem>
<simpara>
execution of the PostgreSQL promote command failed
</simpara>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1 id="repmgr-standby-promote-events">
<title>Event notifications</title>
<para>
A <literal>standby_promote</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para>
</refsect1>
</refentry>


@@ -17,7 +17,7 @@
<para>
<command>repmgr standby register</command> adds a standby's information to
the &repmgr; metadata. This command needs to be executed to enable
promote/follow operations and to allow &repmgrd; to work with the node.
promote/follow operations and to allow <application>repmgrd</application> to work with the node.
An existing standby can be registered using this command. Execute with the
<literal>--dry-run</literal> option to check what would happen without actually registering the
standby.
@@ -28,7 +28,7 @@
If providing the configuration file location with <literal>-f/--config-file</literal>,
avoid using a relative path, as &repmgr; stores the configuration file location
in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
<xref linkend="repmgr-standby-switchover"/>). &repmgr; will attempt to convert the
<xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert the
relative path into an absolute one, but this may not be the same as the path you
would explicitly provide (e.g. <filename>./repmgr.conf</filename> might be converted
to <filename>/path/to/./repmgr.conf</filename>, whereas you'd normally write
@@ -59,7 +59,7 @@
<para>
Depending on your environment and workload, it may take some time for the standby's node record
to propagate from the primary to the standby. Some actions (such as starting
&repmgrd;) require that the standby's node record
<application>repmgrd</application>) require that the standby's node record
is present and up-to-date to function correctly.
</para>
<para>
@@ -75,22 +75,10 @@
<para>
Under some circumstances you may wish to register a standby which is not
yet running; this can be the case when using provisioning tools to create
a complex replication cluster, or if the node was not cloned by &repmgr;.
</para>
<para>
In this case, by using the <option>-F/--force</option>
a complex replication cluster. In this case, by using the <option>-F/--force</option>
option and providing the connection parameters to the primary server,
the standby can be registered even if it has not yet been started.
the standby can be registered.
</para>
<tip>
<para>
Connection parameters can be provided either as a <literal>conninfo</literal> string
(e.g. <option>-d 'host=node1 user=repmgr'</option>) or as individual connection parameters
(<option>-h/--host</option>, <option>-d/--dbname</option>,
<option>-U/--user</option>, <option>-p/--port</option> etc.).
</para>
</tip>
<para>
Similarly, with cascading replication it may be necessary to register
a standby whose upstream node has not yet been registered - in this case,
@@ -108,11 +96,9 @@
<title>Registering a node not cloned by repmgr</title>
<para>
If you've cloned a standby using another method (e.g. <application>barman</application>'s
<command><ulink url="https://docs.pgbarman.org/#recover">barman recover</ulink></command>
command), register the node as detailed in section
<xref linkend="repmgr-standby-register-inactive-node"/> then execute
<link linkend="repmgr-standby-create-recovery-conf">repmgr standby clone --replication-conf-only</link>
to generate the appropriate replication configuration.
<command>barman recover</command> command), first execute
<link linkend="repmgr-standby-create-recovery-conf">repmgr standby clone --recovery-conf-only</link>
to add the <filename>recovery.conf</filename> file, then register the standby as usual.
</para>
</refsect1>
@@ -133,7 +119,7 @@
<varlistentry>
<term><option>-F</option>/<option>--force</option></term>
<term><option>-F</option><option>--force</option></term>
<listitem>
<para>
Overwrite an existing node record


@@ -0,0 +1,289 @@
<refentry id="repmgr-standby-switchover">
<indexterm>
<primary>repmgr standby switchover</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr standby switchover</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr standby switchover</refname>
<refpurpose>promote a standby to primary and demote the existing primary to a standby</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Promotes a standby to primary and demotes the existing primary to a standby.
This command must be run on the standby to be promoted, and requires a
passwordless SSH connection to the current primary.
</para>
<para>
If other standbys are connected to the demotion candidate, &repmgr; can instruct
these to follow the new primary if the option <literal>--siblings-follow</literal>
is specified. This requires a passwordless SSH connection between the promotion
candidate (new primary) and the standbys attached to the demotion candidate
(existing primary).
</para>
<note>
<para>
Performing a switchover is a non-trivial operation. In particular it
relies on the current primary being able to shut down cleanly and quickly.
&repmgr; will attempt to check for potential issues but cannot guarantee
a successful switchover.
</para>
<para>
&repmgr; will refuse to perform the switchover if an exclusive backup is running on
the current primary.
</para>
</note>
<para>
For more details on performing a switchover, including preparation and configuration,
see section <xref linkend="performing-switchover">.
</para>
<note>
<para>
From <link linkend="release-4.2">repmgr 4.2</link>, &repmgr; will instruct any running
<application>repmgrd</application> instances to pause operations while the switchover
is being carried out, to prevent <application>repmgrd</application> from
unintentionally promoting a node. For more details, see <xref linkend="repmgrd-pausing">.
</para>
<para>
Users of &repmgr; versions prior to 4.2 should ensure that <application>repmgrd</application>
is not running on any nodes while a switchover is being executed.
</para>
</note>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--always-promote</option></term>
<listitem>
<para>
Promote standby to primary, even if it is behind or has diverged
from the original primary. The original primary will be shut down in any case,
and will need to be manually reintegrated into the replication cluster.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually execute a switchover.
</para>
<important>
<para>
Success of <option>--dry-run</option> does not imply the switchover will
complete successfully, only that
the prerequisites for performing the operation are met.
</para>
</important>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option></term>
<term><option>--force</option></term>
<listitem>
<para>
Ignore warnings and continue anyway.
</para>
<para>
Specifically, if a problem is encountered when shutting down the current primary,
using <option>-F/--force</option> will cause &repmgr; to continue by promoting
the standby to be the new primary, and if <option>--siblings-follow</option> is
specified, attach any other standbys to the new primary.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force-rewind[=/path/to/pg_rewind]</option></term>
<listitem>
<para>
Use <application>pg_rewind</application> to reintegrate the old primary if necessary
(and the prerequisites for using <application>pg_rewind</application> are met).
If using PostgreSQL 9.3 or 9.4, and the <application>pg_rewind</application>
binary is not installed in the PostgreSQL <filename>bin</filename> directory,
provide its full path. For more details see also <xref linkend="switchover-pg-rewind">.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-R</option></term>
<term><option>--remote-user</option></term>
<listitem>
<para>
System username for remote SSH operations (defaults to local system user).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--repmgrd-no-pause</option></term>
<listitem>
<para>
Don't pause <application>repmgrd</application> while executing a switchover.
</para>
<para>
This option should not be used unless you take steps by other means
to ensure <application>repmgrd</application> is paused or not
running on all nodes.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--siblings-follow</option></term>
<listitem>
<para>
Have standbys attached to the old primary follow the new primary.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Configuration file settings</title>
<para>
Note that the following parameters in <filename>repmgr.conf</filename> are relevant to the
switchover operation:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>replication_lag_critical</literal>:
if replication lag (in seconds) on the standby exceeds this value, the
switchover will be aborted (unless the <literal>-F/--force</literal> option
is provided)
</simpara>
</listitem>
<listitem>
<simpara>
<literal>shutdown_check_timeout</literal>: maximum number of seconds to wait for the
demotion candidate (current primary) to shut down, before aborting the switchover.
</simpara>
<simpara>
Note that this parameter is set on the node where <command>repmgr standby switchover</command>
is executed (promotion candidate); setting it on the demotion candidate (former primary) will
have no effect.
</simpara>
<note>
<para>
In versions prior to <link linkend="release-4.2">&repmgr; 4.2</link>, <command>repmgr standby switchover</command> would
use the values defined in <literal>reconnect_attempts</literal> and <literal>reconnect_interval</literal>
to determine the timeout for demotion candidate shutdown.
</para>
</note>
</listitem>
<listitem>
<simpara>
<literal>standby_reconnect_timeout</literal>:
maximum number of seconds to wait for the demotion candidate (former primary)
to reconnect to the promoted primary (default: 60 seconds)
</simpara>
</listitem>
</itemizedlist>
</para>
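<para>
A sketch of how these parameters might appear in <filename>repmgr.conf</filename>
on the promotion candidate. Apart from the <literal>60</literal> second default
stated above for <literal>standby_reconnect_timeout</literal>, the values are
illustrative only.
<programlisting>
# abort the switchover if the standby lags by more than this many seconds
replication_lag_critical=600
# seconds to wait for the current primary to shut down
shutdown_check_timeout=60
# seconds to wait for the former primary to reconnect as a standby
standby_reconnect_timeout=60</programlisting>
</para>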
</refsect1>
<refsect1>
<title>Execution</title>
<para>
Execute with the <literal>--dry-run</literal> option to test the switchover as far as
possible without actually changing the status of either node.
</para>
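<para>
For example, a switchover with sibling nodes might first be checked and then
executed as follows (run on the promotion candidate; the configuration file
path is an assumption):
<programlisting>
$ repmgr -f /etc/repmgr.conf standby switchover --siblings-follow --dry-run
$ repmgr -f /etc/repmgr.conf standby switchover --siblings-follow</programlisting>
</para>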
<important>
<para>
<application>repmgrd</application> must be shut down on all nodes while a switchover is being
executed. This restriction will be removed in a future &repmgr; version.
</para>
</important>
<para>
External database connections, e.g. from an application, should not be permitted while
the switchover is taking place. In particular, active transactions on the primary
can potentially disrupt the shutdown process.
</para>
</refsect1>
<refsect1 id="repmgr-standby-switchover-events">
<title>Event notifications</title>
<para>
<literal>standby_switchover</literal> and <literal>standby_promote</literal>
<link linkend="event-notifications">event notifications</link> will be generated for the new primary,
and a <literal>node_rejoin</literal> event notification for the former primary (new standby).
</para>
<para>
If using an event notification script, <literal>standby_switchover</literal>
will populate the placeholder parameter <literal>%p</literal> with the node ID of
the former primary.
</para>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
The following exit codes can be emitted by <command>repmgr standby switchover</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The switchover completed successfully.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_SWITCHOVER_FAIL (18)</option></term>
<listitem>
<para>
The switchover could not be executed.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_SWITCHOVER_INCOMPLETE (22)</option></term>
<listitem>
<para>
The switchover was executed but a problem was encountered.
Typically this means the former primary could not be reattached
as a standby. Check preceding log messages for more information.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
For more details see the section <xref linkend="performing-switchover">.
</para>
</refsect1>
</refentry>


@@ -1,448 +0,0 @@
<refentry id="repmgr-standby-switchover">
<indexterm>
<primary>repmgr standby switchover</primary>
</indexterm>
<refmeta>
<refentrytitle>repmgr standby switchover</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr standby switchover</refname>
<refpurpose>promote a standby to primary and demote the existing primary to a standby</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
Promotes a standby to primary and demotes the existing primary to a standby.
This command must be run on the standby to be promoted, and requires a
passwordless SSH connection to the current primary.
</para>
<para>
If other nodes are connected to the demotion candidate, &repmgr; can instruct
these to follow the new primary if the option <literal>--siblings-follow</literal>
is specified. This requires a passwordless SSH connection between the promotion
candidate (new primary) and the nodes attached to the demotion candidate
(existing primary). Note that a witness server, if in use, is also
counted as a &quot;sibling node&quot; as it needs to be instructed to
synchronise its metadata with the new primary.
</para>
<note>
<para>
Performing a switchover is a non-trivial operation. In particular it
relies on the current primary being able to shut down cleanly and quickly.
&repmgr; will attempt to check for potential issues but cannot guarantee
a successful switchover.
</para>
<para>
&repmgr; will refuse to perform the switchover if an exclusive backup is running on
the current primary, or if WAL replay is paused on the standby.
</para>
</note>
<para>
For more details on performing a switchover, including preparation and configuration,
see section <xref linkend="performing-switchover"/>.
</para>
<note>
<para>
From <link linkend="release-4.2">repmgr 4.2</link>, &repmgr; will instruct any running
&repmgrd; instances to pause operations while the switchover
is being carried out, to prevent &repmgrd; from
unintentionally promoting a node. For more details, see <xref linkend="repmgrd-pausing"/>.
</para>
<para>
Users of &repmgr; versions prior to 4.2 should ensure that &repmgrd;
is not running on any nodes while a switchover is being executed.
</para>
</note>
</refsect1>
<refsect1>
<title>User permission requirements</title>
<para><emphasis>data_directory</emphasis></para>
<para>
&repmgr; needs to be able to determine the location of the data directory on the
demotion candidate. If the &repmgr; user is not a superuser or member of the <varname>pg_read_all_settings</varname>
<ulink url="https://www.postgresql.org/docs/current/predefined-roles.html">predefined role</ulink>,
the name of a superuser should be provided with the <option>-S</option>/<option>--superuser</option> option.
</para>
<para><emphasis>CHECKPOINT</emphasis></para>
<para>
&repmgr; executes <command>CHECKPOINT</command> on the demotion candidate as part of the shutdown
process to ensure it shuts down as smoothly as possible.
</para>
<para>
Note that <command>CHECKPOINT</command> requires database superuser permissions to execute.
If the <literal>repmgr</literal> user is not a superuser, the name of a superuser should be
provided with the <option>-S</option>/<option>--superuser</option> option.
</para>
<para>
If &repmgr; is unable to execute the <command>CHECKPOINT</command> command, the switchover
can still be carried out, albeit at a greater risk that the demotion candidate may not
be able to shut down as smoothly as might otherwise have been the case.
</para>
<para><emphasis>pg_promote() (PostgreSQL 12 and later)</emphasis></para>
<para>
From PostgreSQL 12, &repmgr; defaults to using the built-in <command>pg_promote()</command> function to
promote a standby to primary.
</para>
<para>
Note that execution of <function>pg_promote()</function> is restricted to superusers or to
any user who has been granted execution permission for this function. If the &repmgr; user
is not permitted to execute <function>pg_promote()</function>, &repmgr; will fall back to using
&quot;<command>pg_ctl promote</command>&quot;. For more details see
<link linkend="repmgr-standby-promote">repmgr standby promote</link>.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--always-promote</option></term>
<listitem>
<para>
Promote standby to primary, even if it is behind or has diverged
from the original primary. The original primary will be shut down in any case,
and will need to be manually reintegrated into the replication cluster.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually execute a switchover.
</para>
<important>
<para>
Success of <option>--dry-run</option> does not imply the switchover will
complete successfully, only that
the prerequisites for performing the operation are met.
</para>
</important>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option></term>
<term><option>--force</option></term>
<listitem>
<para>
Ignore warnings and continue anyway.
</para>
<para>
Specifically, if a problem is encountered when shutting down the current primary,
using <option>-F/--force</option> will cause &repmgr; to continue by promoting
the standby to be the new primary, and if <option>--siblings-follow</option> is
specified, attach any other standbys to the new primary.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force-rewind[=/path/to/pg_rewind]</option></term>
<listitem>
<para>
Use <application>pg_rewind</application> to reintegrate the old primary if necessary
(and the prerequisites for using <application>pg_rewind</application> are met).
</para>
<para>
If using PostgreSQL 9.4, and the <application>pg_rewind</application>
binary is not installed in the PostgreSQL <filename>bin</filename> directory,
provide its full path. For more details see also <xref linkend="switchover-pg-rewind"/>
and <xref linkend="repmgr-node-rejoin-pg-rewind"/>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-R</option></term>
<term><option>--remote-user</option></term>
<listitem>
<para>
System username for remote SSH operations (defaults to local system user).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--repmgrd-no-pause</option></term>
<listitem>
<para>
Don't pause &repmgrd; while executing a switchover.
</para>
<para>
This option should not be used unless you take steps by other means
to ensure &repmgrd; is paused or not
running on all nodes.
</para>
<para>
This option cannot be used together with <option>--repmgrd-force-unpause</option>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--repmgrd-force-unpause</option></term>
<listitem>
<para>
Always unpause all &repmgrd; instances after executing a switchover. This will ensure that
any &repmgrd; instances which were paused before the switchover will be
unpaused.
</para>
<para>
This option cannot be used together with <option>--repmgrd-no-pause</option>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--siblings-follow</option></term>
<listitem>
<para>
Have nodes attached to the old primary follow the new primary.
</para>
<para>
This will also ensure that a witness node, if in use, is updated
with the new primary's data.
</para>
<note>
<para>
In a future &repmgr; release, <option>--siblings-follow</option> will be applied
by default.
</para>
</note>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-S</option>/<option>--superuser</option></term>
<listitem>
<para>
Use the named superuser instead of the normal &repmgr; user to perform
actions requiring superuser permissions.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Configuration file settings</title>
<para>
The following parameters in <filename>repmgr.conf</filename> are relevant to the
switchover operation:
</para>
<variablelist>
<varlistentry>
<term><option>replication_lag_critical</option></term>
<listitem>
<indexterm>
<primary>replication_lag_critical</primary>
<secondary>with &quot;repmgr standby switchover&quot;</secondary>
</indexterm>
<para>
If replication lag (in seconds) on the standby exceeds this value, the
switchover will be aborted (unless the <literal>-F/--force</literal> option
is provided).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>shutdown_check_timeout</option></term>
<listitem>
<indexterm>
<primary>shutdown_check_timeout</primary>
<secondary>with &quot;repmgr standby switchover&quot;</secondary>
</indexterm>
<para>
The maximum number of seconds to wait for the
demotion candidate (current primary) to shut down, before aborting the switchover.
</para>
<para>
Note that this parameter is set on the node where <command>repmgr standby switchover</command>
is executed (promotion candidate); setting it on the demotion candidate (former primary) will
have no effect.
</para>
<note>
<para>
In versions prior to <link linkend="release-4.2">&repmgr; 4.2</link>, <command>repmgr standby switchover</command> would
use the values defined in <literal>reconnect_attempts</literal> and <literal>reconnect_interval</literal>
to determine the timeout for demotion candidate shutdown.
</para>
</note>
</listitem>
</varlistentry>
<varlistentry>
<term><option>wal_receive_check_timeout</option></term>
<listitem>
<indexterm>
<primary>wal_receive_check_timeout</primary>
<secondary>with &quot;repmgr standby switchover&quot;</secondary>
</indexterm>
<para>
After the primary has shut down, the maximum number of seconds to wait for the
walreceiver on the standby to flush WAL to disk before comparing WAL receive location
with the primary's shutdown location.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>standby_reconnect_timeout</option></term>
<listitem>
<indexterm>
<primary>standby_reconnect_timeout</primary>
<secondary>with &quot;repmgr standby switchover&quot;</secondary>
</indexterm>
<para>
The maximum number of seconds to wait for the demotion candidate (former primary)
to reconnect to the promoted primary (default: 60 seconds).
</para>
<para>
Note that this parameter is set on the node where <command>repmgr standby switchover</command>
is executed (promotion candidate); setting it on the demotion candidate (former primary) will
have no effect.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>node_rejoin_timeout</option></term>
<listitem>
<indexterm>
<primary>node_rejoin_timeout</primary>
<secondary>with &quot;repmgr standby switchover&quot;</secondary>
</indexterm>
<para>
The maximum number of seconds to wait for the demotion candidate (former primary)
to reconnect to the promoted primary (default: 60 seconds).
</para>
<para>
Note that this parameter is set on the demotion candidate (former primary);
setting it on the node where <command>repmgr standby switchover</command> is
executed will have no effect.
</para>
<para>
However, this value <emphasis>must</emphasis> be less than <option>standby_reconnect_timeout</option> on the
promotion candidate (the node where <command>repmgr standby switchover</command> is executed).
</para>
</listitem>
</varlistentry>
</variablelist>
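<para>
As an illustrative sketch of how these timeouts relate, the fragments below show
one possible arrangement. The values are assumptions chosen only to satisfy the
requirement that <option>node_rejoin_timeout</option> on the demotion candidate is
less than <option>standby_reconnect_timeout</option> on the promotion candidate;
they are not recommendations.
<programlisting>
# repmgr.conf on the promotion candidate (node where the switchover is executed)
shutdown_check_timeout=60
standby_reconnect_timeout=60

# repmgr.conf on the demotion candidate (current primary)
# illustrative value, kept below standby_reconnect_timeout above
node_rejoin_timeout=50</programlisting>
</para>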
</refsect1>
<refsect1>
<title>Execution</title>
<para>
Execute with the <literal>--dry-run</literal> option to test the switchover as far as
possible without actually changing the status of either node.
</para>
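<para>
For example, a dry run might be invoked on the promotion candidate as follows.
The configuration file path is an assumption and will vary by installation;
<option>--siblings-follow</option> is only needed if other nodes are attached
to the current primary.
<programlisting>$ repmgr -f /etc/repmgr.conf standby switchover --siblings-follow --dry-run</programlisting>
</para>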
<para>
External database connections, e.g. from an application, should not be permitted while
the switchover is taking place. In particular, active transactions on the primary
can potentially disrupt the shutdown process.
</para>
</refsect1>
<refsect1 id="repmgr-standby-switchover-events">
<title>Event notifications</title>
<para>
<literal>standby_switchover</literal> and <literal>standby_promote</literal>
<link linkend="event-notifications">event notifications</link> will be generated for the new primary,
and a <literal>node_rejoin</literal> event notification for the former primary (new standby).
</para>
<para>
If using an event notification script, <literal>standby_switchover</literal>
will populate the placeholder parameter <literal>%p</literal> with the node ID of
the former primary.
</para>
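<para>
A minimal sketch of a <filename>repmgr.conf</filename> fragment which passes
<literal>%p</literal> to an event notification script is shown below. The script
path is hypothetical, and the other placeholders shown are assumed to be those
described in the <link linkend="event-notifications">event notifications</link> section.
<programlisting>
# hypothetical script path; %p carries the node ID of the former primary
event_notification_command='/path/to/notify.sh %n %e %s "%t" "%d" %p'
event_notifications='standby_switchover'</programlisting>
</para>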
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr standby switchover</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The switchover completed successfully; or if <option>--dry-run</option> was provided,
no issues were detected which would prevent the switchover operation.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_SWITCHOVER_FAIL (18)</option></term>
<listitem>
<para>
The switchover could not be executed.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_SWITCHOVER_INCOMPLETE (22)</option></term>
<listitem>
<para>
The switchover was executed but a problem was encountered.
Typically this means the former primary could not be reattached
as a standby. Check preceding log messages for more information.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-standby-follow"/>, <xref linkend="repmgr-node-rejoin"/>
</para>
<para>
For more details on performing a switchover operation, see the section <xref linkend="performing-switchover"/>.
</para>
</refsect1>
</refentry>

View File

@@ -20,30 +20,17 @@
record to the &repmgr; metadata, and if necessary initialises the witness
node by installing the &repmgr; extension and copying the &repmgr; metadata
to the witness server. This command needs to be executed to enable
use of the witness server with &repmgrd;.
use of the witness server with <application>repmgrd</application>.
</para>
<para>
When executing <command>repmgr witness register</command>, database connection
information for the cluster primary server must also be provided.
When executing <command>repmgr witness register</command>, connection information
for the cluster primary server must also be provided. &repmgr; will automatically
use the <varname>user</varname> and <varname>dbname</varname> values defined
in the <varname>conninfo</varname> string defined in the witness node's
<filename>repmgr.conf</filename>, if these are not explicitly provided.
</para>
<para>
In most cases it's only necessary to provide the primary's hostname with
the <option>-h</option>/<option>--host</option> option; &repmgr; will
automatically use the <varname>user</varname> and <varname>dbname</varname>
values defined in the <varname>conninfo</varname> string defined in the
witness node's <filename>repmgr.conf</filename>, unless these are explicitly
provided as command line options.
</para>
<note>
<para>
The primary server must be registered with <command><link linkend="repmgr-primary-register">repmgr primary register</link></command> before the witness
server can be registered.
</para>
</note>
<para>
Execute with the <option>--dry-run</option> option to check what would happen
Execute with the <literal>--dry-run</literal> option to check what would happen
without actually registering the witness server.
</para>
</refsect1>
@@ -63,34 +50,6 @@
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually register the witness
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option>/<option>--force</option></term>
<listitem>
<para>
Overwrite an existing node record
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1 id="repmgr-witness-register-events">
<title>Event notifications</title>
<para>

View File

@@ -1,16 +1,14 @@
<!-- doc/repmgr.xml -->
<!-- doc/src/sgml/postgres.sgml -->
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[
<!ENTITY % version SYSTEM "version.xml">
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
<!ENTITY % version SYSTEM "version.sgml">
%version;
<!ENTITY % filelist SYSTEM "filelist.xml">
<!ENTITY % filelist SYSTEM "filelist.sgml">
%filelist;
<!ENTITY repmgr "<productname>repmgr</productname>">
<!ENTITY repmgrd "<productname>repmgrd</productname>">
<!ENTITY postgres "<productname>PostgreSQL</productname>">
]>
@@ -18,7 +16,7 @@
<title>repmgr &repmgrversion; Documentation</title>
<bookinfo>
<corpauthor>EDB</corpauthor>
<corpauthor>2ndQuadrant Ltd</corpauthor>
<productname>repmgr</productname>
<productnumber>&repmgrversion;</productnumber>
&legal;
@@ -26,31 +24,26 @@
<abstract>
<para>
This is the official documentation of &repmgr; &repmgrversion; for
use with PostgreSQL 9.4 - PostgreSQL 15.
</para>
<para>
&repmgr; is being continually developed and we strongly recommend using the
latest version. Please check the
<ulink url="https://repmgr.org/">repmgr website</ulink> for details
about the current &repmgr; version as well as the
<ulink url="https://repmgr.org/docs/current/index.html">current repmgr documentation</ulink>.
use with PostgreSQL 9.3 - PostgreSQL 10.
It describes the functionality supported by the current version of &repmgr;.
</para>
<para>
&repmgr; is developed by
<ulink url="https://www.enterprisedb.com/">EDB</ulink>
along with contributions from other individuals and organisations.
&repmgr; was developed by
<ulink url="https://2ndquadrant.com">2ndQuadrant</ulink>
along with contributions from other individuals and companies.
Contributions from the community are appreciated and welcome - get
in touch via <ulink url="https://github.com/EnterpriseDB/repmgr">github</ulink>
or <ulink url="https://groups.google.com/group/repmgr">the mailing list/forum</ulink>.
Multiple EDB customers contribute funding to make &repmgr; development possible.
in touch via <ulink url="https://github.com/2ndQuadrant/repmgr">github</>
or <ulink url="https://groups.google.com/group/repmgr">the mailing list/forum</>.
Multiple 2ndQuadrant customers contribute funding
to make repmgr development possible.
</para>
<para>
&repmgr; is fully supported by EDB's
<ulink url="https://www.enterprisedb.com/support/postgresql-support-overview-get-the-most-out-of-postgresql">24/7 Production Support</ulink>.
EDB, a Major Sponsor of the PostgreSQL project, continues to maintain &repmgr;.
We welcome participation from other organisations and individual developers.
2ndQuadrant, a Platinum sponsor of the PostgreSQL project,
continues to develop repmgr to meet internal needs and those of customers.
Other companies as well as individual developers
are welcome to participate in the efforts.
</para>
</abstract>
@@ -80,16 +73,23 @@
&promoting-standby;
&follow-new-primary;
&switchover;
&configuring-witness-server;
&event-notifications;
&upgrading-repmgr;
</part>
<part id="using-repmgrd">
<title>Using repmgrd</title>
&repmgrd-overview;
&repmgrd-automatic-failover;
&repmgrd-configuration;
&repmgrd-operation;
&repmgrd-demonstration;
&repmgrd-cascading-replication;
&repmgrd-network-split;
&repmgrd-witness-server;
&repmgrd-pausing;
&repmgrd-degraded-monitoring;
&repmgrd-monitoring;
&repmgrd-bdr;
</part>
<part id="repmgr-command-reference">
@@ -108,25 +108,22 @@
&repmgr-node-status;
&repmgr-node-check;
&repmgr-node-rejoin;
&repmgr-node-service;
&repmgr-cluster-show;
&repmgr-cluster-matrix;
&repmgr-cluster-crosscheck;
&repmgr-cluster-event;
&repmgr-cluster-cleanup;
&repmgr-service-status;
&repmgr-service-pause;
&repmgr-service-unpause;
&repmgr-daemon-start;
&repmgr-daemon-stop;
&repmgr-daemon-status;
&repmgr-daemon-pause;
&repmgr-daemon-unpause;
</part>
&appendix-release-notes;
&appendix-signatures;
&appendix-faq;
&appendix-packages;
&appendix-support;
<index id="bookindex"></index>
<![%include-index;[&bookindex;]]>
<![%include-xslt-index;[<index id="bookindex"></index>]]>
</book>

View File

@@ -0,0 +1,17 @@
<chapter id="repmgrd-automatic-failover" xreflabel="Automatic failover with repmgrd">
<indexterm>
<primary>repmgrd</primary>
<secondary>automatic failover</secondary>
</indexterm>
<title>Automatic failover with repmgrd</title>
<para>
<application>repmgrd</application> is a management and monitoring daemon which runs
on each node in a replication cluster. It can automate actions such as
failover and updating standbys to follow the new primary, as well as
providing monitoring information about the state of each standby.
</para>
</chapter>

View File

@@ -1,941 +0,0 @@
<chapter id="repmgrd-automatic-failover" xreflabel="Automatic failover with repmgrd">
<title>Automatic failover with repmgrd</title>
<indexterm>
<primary>repmgrd</primary>
<secondary>automatic failover</secondary>
</indexterm>
<para>
&repmgrd; is a management and monitoring daemon which runs
on each node in a replication cluster. It can automate actions such as
failover and updating standbys to follow the new primary, as well as
providing monitoring information about the state of each standby.
</para>
<sect1 id="repmgrd-witness-server" xreflabel="Using a witness server with repmgrd">
<title>Using a witness server</title>
<indexterm>
<primary>repmgrd</primary>
<secondary>witness server</secondary>
</indexterm>
<indexterm>
<primary>witness server</primary>
<secondary>repmgrd</secondary>
</indexterm>
<para>
A <xref linkend="witness-server"/> is a normal PostgreSQL instance which
is not part of the streaming replication cluster; its purpose is, if a
failover situation occurs, to provide proof that it is the primary server
itself which is unavailable, rather than e.g. a network split between
different physical locations.
</para>
<para>
A typical use case for a witness server is a two-node streaming replication
setup, where the primary and standby are in different locations (data centres).
By creating a witness server in the same location (data centre) as the primary,
if the primary becomes unavailable it's possible for the standby to decide whether
it can promote itself without risking a "split brain" scenario: if it can't see either the
witness or the primary server, it's likely there's a network-level interruption
and it should not promote itself. If it can see the witness but not the primary,
this proves there is no network interruption and the primary itself is unavailable,
and it can therefore promote itself (and ideally take action to fence the
former primary).
</para>
<note>
<para>
<emphasis>Never</emphasis> install a witness server on the same physical host
as another node in the replication cluster managed by &repmgr; - it's essential
the witness is not affected in any way by failure of another node.
</para>
</note>
<para>
For more complex replication scenarios, e.g. with multiple datacentres, it may
be preferable to use location-based failover, which ensures that only nodes
in the same location as the primary will ever be promotion candidates;
see <xref linkend="repmgrd-network-split"/> for more details.
</para>
<note>
<simpara>
A witness server will only be useful if &repmgrd;
is in use.
</simpara>
</note>
<sect2 id="creating-witness-server">
<title>Creating a witness server</title>
<para>
To create a witness server, set up a normal PostgreSQL instance on a server
in the same physical location as the cluster's primary server.
</para>
<para>
This instance should <emphasis>not</emphasis> be on the same physical host as the primary server,
as otherwise if the primary server fails due to hardware issues, the witness
server will be lost too.
</para>
<note>
<simpara>
A PostgreSQL instance can only accommodate a single witness server.
</simpara>
<simpara>
If you are planning to use a single server to support more than one
witness server, a separate PostgreSQL instance is required for each
witness server in use.
</simpara>
</note>
<para>
The witness server should be configured in the same way as a normal
&repmgr; node; see section <xref linkend="configuration"/>.
</para>
<para>
Register the witness server with <xref linkend="repmgr-witness-register"/>.
This will create the &repmgr; extension on the witness server, and make
a copy of the &repmgr; metadata.
</para>
<note>
<simpara>
As the witness server is not part of the replication cluster, further
changes to the &repmgr; metadata will be synchronised by
&repmgrd;.
</simpara>
</note>
<para>
Once the witness server has been configured, &repmgrd;
should be started.
</para>
<para>
To unregister a witness server, use <xref linkend="repmgr-witness-unregister"/>.
</para>
</sect2>
</sect1>
<sect1 id="repmgrd-network-split" xreflabel="Handling network splits with repmgrd">
<title>Handling network splits with repmgrd</title>
<indexterm>
<primary>repmgrd</primary>
<secondary>network splits</secondary>
</indexterm>
<indexterm>
<primary>network splits</primary>
</indexterm>
<para>
A common pattern for replication cluster setups is to spread servers over
more than one datacentre. This can provide benefits such as
geographically-distributed read replicas and DR (disaster recovery) capability. However,
this also means there is a risk of disconnection at network level between
datacentre locations, which would result in a split-brain scenario if
servers in a secondary data centre were no longer able to see the primary
in the main data centre and promoted a standby among themselves.
</para>
<para>
&repmgr; enables provision of &quot;<xref linkend="witness-server"/>&quot; to
artificially create a quorum of servers in a particular location, ensuring
that nodes in another location will not elect a new primary if they
are unable to see the majority of nodes. However this approach does not
scale well, particularly with more complex replication setups, e.g.
where the majority of nodes are located outside of the primary datacentre.
It also means the <literal>witness</literal> node needs to be managed as an
extra PostgreSQL instance outside of the main replication cluster, which
adds administrative and programming complexity.
</para>
<para>
<literal>repmgr4</literal> introduces the concept of <literal>location</literal>:
each node is associated with an arbitrary location string (default is
<literal>default</literal>); this is set in <filename>repmgr.conf</filename>, e.g.:
<programlisting>
node_id=1
node_name=node1
conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'
location='dc1'</programlisting>
</para>
<para>
In a failover situation, &repmgrd; will check if any servers in the
same location as the current primary node are visible. If not, &repmgrd;
will assume a network interruption and not promote any node in any
other location (it will however enter <link linkend="repmgrd-degraded-monitoring">degraded monitoring</link>
mode until a primary becomes visible).
</para>
</sect1>
<sect1 id="repmgrd-primary-visibility-consensus" xreflabel="Primary visibility consensus">
<title>Primary visibility consensus</title>
<indexterm>
<primary>repmgrd</primary>
<secondary>primary visibility consensus</secondary>
</indexterm>
<indexterm>
<primary>primary_visibility_consensus</primary>
</indexterm>
<para>
In more complex replication setups, particularly where replication occurs between
multiple datacentres, it's possible that some but not all standbys get cut off from the
primary (but not from the other standbys).
</para>
<para>
In this situation, normally it's not desirable for any of the standbys which have been
cut off to initiate a failover, as the primary is still functioning and standbys are
connected. Beginning with <link linkend="release-4.4">&repmgr; 4.4</link>
it is now possible for the affected standbys to build a consensus about whether
the primary is still available to some standbys (&quot;primary visibility consensus&quot;).
This is done by polling each standby (and the witness, if present) for the time it last saw the
primary; if any have seen the primary very recently, it's reasonable
to infer that the primary is still available and a failover should not be started.
</para>
<para>
The time the primary was last seen by each node can be checked by executing
<link linkend="repmgr-service-status"><command>repmgr service status</command></link>
(&repmgr; 4.2 - 4.4: <link linkend="repmgr-service-status"><command>repmgr daemon status</command></link>)
which includes this in its output, e.g.:
<programlisting>$ repmgr -f /etc/repmgr.conf service status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 27259 | no | n/a
2 | node2 | standby | running | node1 | running | 27272 | no | 1 second(s) ago
3 | node3 | standby | running | node1 | running | 27282 | no | 0 second(s) ago
4 | node4 | witness | * running | node1 | running | 27298 | no | 1 second(s) ago</programlisting>
</para>
<para>
To enable this functionality, in <filename>repmgr.conf</filename> set:
<programlisting>
primary_visibility_consensus=true</programlisting>
</para>
<note>
<para>
<option>primary_visibility_consensus</option> <emphasis>must</emphasis> be set to
<literal>true</literal> on all nodes for it to be effective.
</para>
</note>
<para>
The following sample &repmgrd; log output demonstrates the behaviour in a situation
where one of three standbys is no longer able to connect to the primary, but <emphasis>can</emphasis>
connect to the two other standbys (&quot;sibling nodes&quot;):
<programlisting>
[2019-05-17 05:36:12] [WARNING] unable to reconnect to node 1 after 3 attempts
[2019-05-17 05:36:12] [INFO] 2 active sibling nodes registered
[2019-05-17 05:36:12] [INFO] local node's last receive lsn: 0/7006E58
[2019-05-17 05:36:12] [INFO] checking state of sibling node "node3" (ID: 3)
[2019-05-17 05:36:12] [INFO] node "node3" (ID: 3) reports its upstream is node 1, last seen 1 second(s) ago
[2019-05-17 05:36:12] [NOTICE] node 3 last saw primary node 1 second(s) ago, considering primary still visible
[2019-05-17 05:36:12] [INFO] last receive LSN for sibling node "node3" (ID: 3) is: 0/7006E58
[2019-05-17 05:36:12] [INFO] node "node3" (ID: 3) has same LSN as current candidate "node2" (ID: 2)
[2019-05-17 05:36:12] [INFO] checking state of sibling node "node4" (ID: 4)
[2019-05-17 05:36:12] [INFO] node "node4" (ID: 4) reports its upstream is node 1, last seen 0 second(s) ago
[2019-05-17 05:36:12] [NOTICE] node 4 last saw primary node 0 second(s) ago, considering primary still visible
[2019-05-17 05:36:12] [INFO] last receive LSN for sibling node "node4" (ID: 4) is: 0/7006E58
[2019-05-17 05:36:12] [INFO] node "node4" (ID: 4) has same LSN as current candidate "node2" (ID: 2)
[2019-05-17 05:36:12] [INFO] 2 nodes can see the primary
[2019-05-17 05:36:12] [DETAIL] following nodes can see the primary:
- node "node3" (ID: 3): 1 second(s) ago
- node "node4" (ID: 4): 0 second(s) ago
[2019-05-17 05:36:12] [NOTICE] cancelling failover as some nodes can still see the primary
[2019-05-17 05:36:12] [NOTICE] election cancelled
[2019-05-17 05:36:14] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in degraded state</programlisting>
In this situation &repmgrd; will cancel the failover and enter degraded monitoring mode,
waiting for the primary to reappear.
</para>
</sect1>
<sect1 id="repmgrd-standby-disconnection-on-failover" xreflabel="Standby disconnection on failover">
<title>Standby disconnection on failover</title>
<indexterm>
<primary>repmgrd</primary>
<secondary>standby disconnection on failover</secondary>
</indexterm>
<indexterm>
<primary>standby disconnection on failover</primary>
</indexterm>
<para>
If <option>standby_disconnect_on_failover</option> is set to <literal>true</literal> in
<filename>repmgr.conf</filename>, in a failover situation &repmgrd; will forcibly disconnect
the local node's WAL receiver, and wait for the WAL receiver on all sibling nodes to be
disconnected, before making a failover decision.
</para>
<note>
<para>
<option>standby_disconnect_on_failover</option> is available with PostgreSQL 9.5 and later.
Until PostgreSQL 14 this requires that the <literal>repmgr</literal> database user is a superuser.
From PostgreSQL 15 a specific ALTER SYSTEM privilege can be granted to the <literal>repmgr</literal> database
user with e.g. <command>GRANT ALTER SYSTEM ON PARAMETER wal_retrieve_retry_interval TO repmgr</command>.
</para>
</note>
<para>
By doing this, it's possible to ensure that, at the point the failover decision is made, no nodes
are receiving data from the primary and their LSN location will be static.
</para>
<important>
<para>
<option>standby_disconnect_on_failover</option> <emphasis>must</emphasis> be set to the same value on
all nodes.
</para>
</important>
<para>
Note that when using <option>standby_disconnect_on_failover</option> there will be a delay of 5 seconds
plus however many seconds it takes to confirm the WAL receiver is disconnected before
&repmgrd; proceeds with the failover decision.
</para>
<para>
&repmgrd; will wait up to <option>sibling_nodes_disconnect_timeout</option> seconds (default:
<literal>30</literal>) to confirm that the WAL receiver on all sibling nodes has been
disconnected before proceeding with the failover operation. If the timeout is reached, the
failover operation will go ahead anyway.
</para>
<para>
Following the failover operation, no matter what the outcome, each node will reconnect its WAL receiver.
</para>
<para>
If using <option>standby_disconnect_on_failover</option>, we recommend that the
<option>primary_visibility_consensus</option> option is also used.
</para>
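<para>
Putting the settings described in this section together, a
<filename>repmgr.conf</filename> fragment might look like the following sketch.
The timeout shown is simply the default mentioned above, and the same values
would need to be present on all nodes.
<programlisting>
standby_disconnect_on_failover=true
# default value, shown explicitly for illustration
sibling_nodes_disconnect_timeout=30
primary_visibility_consensus=true</programlisting>
</para>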
</sect1>
<sect1 id="repmgrd-failover-validation" xreflabel="Failover validation">
<title>Failover validation</title>
<indexterm>
<primary>repmgrd</primary>
<secondary>failover validation</secondary>
</indexterm>
<indexterm>
<primary>failover validation</primary>
</indexterm>
<para>
From <link linkend="release-4.3">repmgr 4.3</link>, &repmgr; makes it possible to provide a script
to &repmgrd; which, in a failover situation,
will be executed by the promotion candidate (the node which has been selected
to be the new primary) to confirm whether the node should actually be promoted.
</para>
<para>
To use this, set <option>failover_validation_command</option> in <filename>repmgr.conf</filename>
to a script executable by the <literal>postgres</literal> system user, e.g.:
<programlisting>
failover_validation_command=/path/to/script.sh %n</programlisting>
</para>
<para>
The <literal>%n</literal> parameter will be replaced with the node ID when the script is
executed. A number of other parameters are also available, see section
&quot;<xref linkend="repmgrd-automatic-failover-configuration-optional"/>&quot; for details.
</para>
<para>
This script must return an exit code of <literal>0</literal> to indicate the node should promote itself.
Any other value will result in the promotion being aborted and the election rerun.
There is a pause of <option>election_rerun_interval</option> seconds before the election is rerun.
</para>
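<para>
A minimal sketch of what such a validation script could look like is shown below. It is not
part of &repmgr;: the function name and the "veto file" check (including the default path
<filename>/etc/repmgr/no-promote</filename>) are hypothetical, chosen only to show how the exit
code controls whether promotion goes ahead.
</para>

```shell
#!/bin/sh
# Hypothetical failover validation check: refuse promotion while an
# operator-created "veto" file exists. The default path is an assumption
# made for this example, not a repmgr convention.

validate_promotion() {
    node_id=${1:-}
    veto_file=${2:-/etc/repmgr/no-promote}

    if [ -z "$node_id" ]; then
        echo "usage: validate_promotion NODE_ID [VETO_FILE]"
        return 2
    fi

    # Anything printed here appears in the repmgrd log as command output
    echo "Node ID: $node_id"

    if [ -e "$veto_file" ]; then
        echo "veto file $veto_file present, rejecting promotion"
        return 1
    fi

    # Exit code 0 tells repmgrd the node may promote itself
    return 0
}
```

<para>
To use the sketch as a <option>failover_validation_command</option> script, it would end with a
line such as <literal>validate_promotion "$@"</literal>, so that the function's return value
becomes the script's exit code.
</para>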
<para>
Sample &repmgrd; log file output during which the failover validation
script rejects the proposed promotion candidate:
<programlisting>
[2019-03-13 21:01:30] [INFO] visible nodes: 2; total nodes: 2; no nodes have seen the primary within the last 4 seconds
[2019-03-13 21:01:30] [NOTICE] promotion candidate is "node2" (ID: 2)
[2019-03-13 21:01:30] [NOTICE] executing "failover_validation_command"
[2019-03-13 21:01:30] [DETAIL] /usr/local/bin/failover-validation.sh 2
[2019-03-13 21:01:30] [INFO] output returned by failover validation command:
Node ID: 2
[2019-03-13 21:01:30] [NOTICE] failover validation command returned a non-zero value: "1"
[2019-03-13 21:01:30] [NOTICE] promotion candidate election will be rerun
[2019-03-13 21:01:30] [INFO] 1 followers to notify
[2019-03-13 21:01:30] [NOTICE] notifying node "node3" (ID: 3) to rerun promotion candidate selection
INFO: node 3 received notification to rerun promotion candidate election
[2019-03-13 21:01:30] [NOTICE] rerunning election after 15 seconds ("election_rerun_interval")</programlisting>
</para>
</sect1>
<sect1 id="cascading-replication" xreflabel="Cascading replication">
<title>repmgrd and cascading replication</title>
<indexterm>
<primary>repmgrd</primary>
<secondary>cascading replication</secondary>
</indexterm>
<indexterm>
<primary>cascading replication</primary>
<secondary>repmgrd</secondary>
</indexterm>
<para>
Cascading replication - where a standby can connect to an upstream node and not
the primary server itself - was introduced in PostgreSQL 9.2. &repmgr; and
&repmgrd; support cascading replication by keeping track of the relationship
between standby servers - each node record is stored with the node id of its
upstream ("parent") server (except of course the primary server).
</para>
<para>
In a failover situation where the primary node fails and a top-level standby
is promoted, a standby connected to another standby will not be affected
and will continue working as normal (even if the upstream standby it's connected
to becomes the primary node). If however the node's direct upstream fails,
the &quot;cascaded standby&quot; will attempt to reconnect to that node's parent
(unless <varname>failover</varname> is set to <literal>manual</literal> in
<filename>repmgr.conf</filename>).
</para>
</sect1>
<sect1 id="repmgrd-primary-child-disconnection" xreflabel="Monitoring standby disconnections on the primary">
<title>Monitoring standby disconnections on the primary node</title>
<indexterm>
<primary>repmgrd</primary>
<secondary>standby disconnection</secondary>
</indexterm>
<indexterm>
<primary>repmgrd</primary>
<secondary>child node disconnection</secondary>
</indexterm>
<note>
<para>
This functionality is available in <link linkend="release-4.4">&repmgr; 4.4</link> and later.
</para>
</note>
<para>
When running on the primary node, &repmgrd; can
monitor connections and in particular disconnections by its attached
child nodes (standbys, and if in use, the witness server), and optionally
execute a custom command if certain criteria are met (such as the number of
attached nodes falling to zero following a failover to a new primary); this
command can be used for example to &quot;fence&quot; the node and ensure it
is isolated from any applications attempting to access the replication cluster.
</para>
<note>
<para>
Currently &repmgrd; can only detect disconnections
of streaming replication standbys and cannot determine whether a standby
has disconnected and fallen back to archive recovery.
</para>
<para>
See section <link linkend="repmgrd-primary-child-disconnection-caveats">caveats</link> below.
</para>
</note>
<sect2 id="repmgrd-primary-child-disconnection-monitoring-process">
<title>Standby disconnections monitoring process and criteria</title>
<para>
&repmgrd; monitors attached child nodes and decides
whether to invoke the user-defined command based on the following process
and criteria:
<itemizedlist>
<listitem>
<para>
Every few seconds (defined by the configuration parameter <varname>child_nodes_check_interval</varname>;
default: <literal>5</literal> seconds, a value of <literal>0</literal> disables this altogether), &repmgrd; queries
the <literal>pg_stat_replication</literal> system view and compares
the nodes present there against the list of nodes registered with &repmgr; which
should be attached to the primary.
</para>
<para>
If a witness server is in use, &repmgrd; connects to it and checks which upstream node
it is following.
</para>
</listitem>
<listitem>
<para>
If a child node (standby) is no longer present in <literal>pg_stat_replication</literal>,
&repmgrd; notes the time it detected the node's absence, and additionally generates a
<literal>child_node_disconnect</literal> event.
</para>
<para>
If a witness server is in use, and it is no longer following the primary, or not
reachable at all, &repmgrd; notes the time it detected the node's absence, and additionally generates a
<literal>child_node_disconnect</literal> event.
</para>
</listitem>
<listitem>
<para>
If a child node (standby) which was absent from <literal>pg_stat_replication</literal> reappears,
&repmgrd; clears the time it detected the node's absence, and additionally generates a
<literal>child_node_reconnect</literal> event.
</para>
<para>
If a witness server is in use and, having previously been unreachable or not following the
primary node, has become reachable and is following the primary node, &repmgrd; clears the
time it detected the node's absence, and additionally generates a
<literal>child_node_reconnect</literal> event.
</para>
</listitem>
<listitem>
<para>
If an entirely new child node (standby or witness) is detected, &repmgrd; adds it to its internal list
and additionally generates a <literal>child_node_new_connect</literal> event.
</para>
</listitem>
<listitem>
<para>
If the <varname>child_nodes_disconnect_command</varname> parameter is set in
<filename>repmgr.conf</filename>, &repmgrd; will then loop through all child nodes.
If it determines that insufficient child nodes are connected, and a
minimum of <varname>child_nodes_disconnect_timeout</varname> seconds (default: <literal>30</literal>)
has elapsed since the last node became disconnected, &repmgrd; will then execute the
<varname>child_nodes_disconnect_command</varname> script.
</para>
<para>
By default, the <varname>child_nodes_disconnect_command</varname> will only be executed
if all child nodes are disconnected. If <varname>child_nodes_connected_min_count</varname>
is set, the <varname>child_nodes_disconnect_command</varname> script will be triggered
if the number of connected child nodes falls below the specified value (e.g.
if set to <literal>2</literal>, the script will be triggered if only one child node
is connected). Alternatively, if <varname>child_nodes_disconnect_min_count</varname>
is set and more than that number of child nodes disconnect, the script will be triggered.
</para>
<note>
<para>
By default, a witness node, if in use, will <emphasis>not</emphasis> be counted as a
child node for the purposes of determining whether to execute
<varname>child_nodes_disconnect_command</varname>.
</para>
<para>
To enable the witness node to be counted as a child node, set
<varname>child_nodes_connected_include_witness</varname> in <filename>repmgr.conf</filename>
to <literal>true</literal>
(and <link linkend="repmgrd-reloading-configuration">reload the configuration</link> if &repmgrd;
is running).
</para>
</note>
</listitem>
<listitem>
<para>
Note that child nodes which are not attached when &repmgrd;
starts will <emphasis>not</emphasis> be considered as missing, as &repmgrd;
cannot know why they are not attached.
</para>
</listitem>
</itemizedlist>
</para>
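<para>
The core of the criteria above can be sketched as a small shell function. This is an
illustration of the documented behaviour, not &repmgrd;'s actual implementation: the function
name and argument order are hypothetical, and <varname>child_nodes_disconnect_min_count</varname>
and witness handling are deliberately left out.
</para>

```shell
# Returns 0 if the disconnect command should be triggered, 1 otherwise.
# Arguments:
#   $1  number of child nodes currently connected
#   $2  child_nodes_connected_min_count (0 = parameter not set)
#   $3  seconds since the most recent child node disconnection
#   $4  child_nodes_disconnect_timeout (defaults to 30)
should_trigger_disconnect_command() {
    connected=$1
    min_connected=$2
    elapsed=$3
    timeout=${4:-30}

    if [ "$min_connected" -gt 0 ]; then
        # child_nodes_connected_min_count is set: trigger only if fewer
        # child nodes than that are connected
        [ "$connected" -lt "$min_connected" ] || return 1
    else
        # default: trigger only when no child nodes are connected
        [ "$connected" -eq 0 ] || return 1
    fi

    # the timeout must have elapsed since the last disconnection
    [ "$elapsed" -ge "$timeout" ]
}
```

<para>
For example, with one of two child nodes connected and a minimum of <literal>2</literal>,
the function returns failure immediately after the disconnection and success once 30 seconds
have passed.
</para>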
</sect2>
<sect2 id="repmgrd-primary-child-disconnection-example">
<title>Standby disconnections monitoring process example</title>
<para>
This example shows typical &repmgrd; log output from a three-node cluster
(primary and two child nodes), with <varname>child_nodes_connected_min_count</varname>
set to <literal>2</literal>.
</para>
<para>
&repmgrd; on the primary has started up, while two child
nodes are being provisioned:
<programlisting>
[2019-04-24 15:25:33] [INFO] monitoring primary node "node1" (ID: 1) in normal state
[2019-04-24 15:25:35] [NOTICE] new node "node2" (ID: 2) has connected
[2019-04-24 15:25:35] [NOTICE] 1 (of 1) child nodes are connected, but at least 2 child nodes required
[2019-04-24 15:25:35] [INFO] no child nodes have detached since repmgrd startup
(...)
[2019-04-24 15:25:44] [NOTICE] new node "node3" (ID: 3) has connected
[2019-04-24 15:25:46] [INFO] monitoring primary node "node1" (ID: 1) in normal state
(...)</programlisting>
</para>
<para>
One of the child nodes has disconnected; &repmgrd;
is now waiting <varname>child_nodes_disconnect_timeout</varname> seconds
before executing <varname>child_nodes_disconnect_command</varname>:
<programlisting>
[2019-04-24 15:28:11] [INFO] monitoring primary node "node1" (ID: 1) in normal state
[2019-04-24 15:28:17] [INFO] monitoring primary node "node1" (ID: 1) in normal state
[2019-04-24 15:28:19] [NOTICE] node "node3" (ID: 3) has disconnected
[2019-04-24 15:28:19] [NOTICE] 1 (of 2) child nodes are connected, but at least 2 child nodes required
[2019-04-24 15:28:19] [INFO] most recently detached child node was 3 (ca. 0 seconds ago), not triggering "child_nodes_disconnect_command"
[2019-04-24 15:28:19] [DETAIL] "child_nodes_disconnect_timeout" set To 30 seconds
(...)</programlisting>
</para>
<para>
<varname>child_nodes_disconnect_command</varname> is executed once:
<programlisting>
[2019-04-24 15:28:49] [INFO] most recently detached child node was 3 (ca. 30 seconds ago), triggering "child_nodes_disconnect_command"
[2019-04-24 15:28:49] [INFO] "child_nodes_disconnect_command" is:
"/usr/bin/fence-all-the-things.sh"
[2019-04-24 15:28:51] [NOTICE] 1 (of 2) child nodes are connected, but at least 2 child nodes required
[2019-04-24 15:28:51] [INFO] "child_nodes_disconnect_command" was previously executed, taking no action</programlisting>
</para>
</sect2>
<sect2 id="repmgrd-primary-child-disconnection-caveats">
<title>Standby disconnections monitoring caveats</title>
<para>
The following caveats should be considered if you are intending to use this functionality.
</para>
<para>
<itemizedlist mark="bullet">
<listitem>
<para>
If a child node is configured to use archive recovery, it's possible that
the child node will disconnect from the primary node and fall back to
archive recovery. In this case &repmgrd;
will nevertheless register a node disconnection.
</para>
</listitem>
<listitem>
<para>
&repmgr; relies on <varname>application_name</varname> in the child node's
<varname>primary_conninfo</varname> string to be the same as the node name
defined in the node's <filename>repmgr.conf</filename> file. Furthermore,
this <varname>application_name</varname> must be unique across the replication
cluster.
</para>
<para>
If a custom <varname>application_name</varname> is used, or the
<varname>application_name</varname> is not unique across the replication
cluster, &repmgr; will not be able to reliably monitor child node connections.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2 id="repmgrd-primary-child-disconnection-configuration">
<title>Standby disconnections monitoring process configuration</title>
<para>
The following parameters, set in <filename>repmgr.conf</filename>,
control how child node disconnection monitoring operates.
</para>
<variablelist>
<varlistentry>
<term><varname>child_nodes_check_interval</varname></term>
<listitem>
<indexterm>
<primary>child_nodes_check_interval</primary>
<secondary>child node disconnection monitoring</secondary>
</indexterm>
<para>
Interval (in seconds) after which &repmgrd; queries the
<literal>pg_stat_replication</literal> system view and compares the nodes present
there against the list of nodes registered with repmgr which should be attached to the primary.
</para>
<para>
Default is <literal>5</literal> seconds, a value of <literal>0</literal> disables this check
altogether.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>child_nodes_disconnect_command</varname></term>
<listitem>
<indexterm>
<primary>child_nodes_disconnect_command</primary>
<secondary>child node disconnection monitoring</secondary>
</indexterm>
<para>
User-definable script to be executed when &repmgrd;
determines that an insufficient number of child nodes are connected. By default
the script is executed when no child nodes are connected, but the execution
threshold can be modified by setting one of <varname>child_nodes_connected_min_count</varname>
or <varname>child_nodes_disconnect_min_count</varname> (see below).
</para>
<para>
The <varname>child_nodes_disconnect_command</varname> script can be
any user-defined script or program. It <emphasis>must</emphasis> be able
to be executed by the system user under which the PostgreSQL server itself
runs (usually <literal>postgres</literal>).
</para>
<note>
<para>
If <varname>child_nodes_disconnect_command</varname> is not set, no action
will be taken.
</para>
</note>
<para>
If specified, the following format placeholder will be substituted when
executing <varname>child_nodes_disconnect_command</varname>:
</para>
<variablelist>
<varlistentry>
<term><option>%p</option></term>
<listitem>
<para>
ID of the node executing the <varname>child_nodes_disconnect_command</varname> script.
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
The <varname>child_nodes_disconnect_command</varname> script will only be executed once
while the criteria for its execution are met. If the criteria for its execution are no longer
met (i.e. some child nodes have reconnected), it will be executed again if
the criteria for its execution are met again.
</para>
<para>
The <varname>child_nodes_disconnect_command</varname> script will not be executed if
&repmgrd; is <link linkend="repmgrd-pausing">paused</link>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>child_nodes_disconnect_timeout</varname></term>
<listitem>
<indexterm>
<primary>child_nodes_disconnect_timeout</primary>
<secondary>child node disconnection monitoring</secondary>
</indexterm>
<para>
If &repmgrd; determines that an insufficient number of
child nodes are connected, it will wait for the specified number of seconds
before executing the <varname>child_nodes_disconnect_command</varname>.
</para>
<para>
Default: <literal>30</literal> seconds.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>child_nodes_connected_min_count</varname></term>
<listitem>
<indexterm>
<primary>child_nodes_connected_min_count</primary>
<secondary>child node disconnection monitoring</secondary>
</indexterm>
<para>
If the number of child nodes connected falls below the number specified in
this parameter, the <varname>child_nodes_disconnect_command</varname> script
will be executed.
</para>
<para>
For example, if <varname>child_nodes_connected_min_count</varname> is set
to <literal>2</literal>, the <varname>child_nodes_disconnect_command</varname>
script will be executed if one or no child nodes are connected.
</para>
<para>
Note that <varname>child_nodes_connected_min_count</varname> overrides any value
set in <varname>child_nodes_disconnect_min_count</varname>.
</para>
<para>
If neither <varname>child_nodes_connected_min_count</varname> nor
<varname>child_nodes_disconnect_min_count</varname> is set,
the <varname>child_nodes_disconnect_command</varname> script
will be executed when no child nodes are connected.
</para>
<para>
A witness node, if in use, will not be counted as a child node unless
<varname>child_nodes_connected_include_witness</varname> is set to <literal>true</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>child_nodes_disconnect_min_count</varname></term>
<listitem>
<indexterm>
<primary>child_nodes_disconnect_min_count</primary>
<secondary>child node disconnection monitoring</secondary>
</indexterm>
<para>
If the number of disconnected child nodes exceeds the number specified in
this parameter, the <varname>child_nodes_disconnect_command</varname> script
will be executed.
</para>
<para>
For example, if <varname>child_nodes_disconnect_min_count</varname> is set
to <literal>2</literal>, the <varname>child_nodes_disconnect_command</varname>
script will be executed if more than two child nodes are disconnected.
</para>
<para>
Note that any value set in <varname>child_nodes_disconnect_min_count</varname>
will be overridden by <varname>child_nodes_connected_min_count</varname>.
</para>
<para>
If neither <varname>child_nodes_connected_min_count</varname> nor
<varname>child_nodes_disconnect_min_count</varname> is set,
the <varname>child_nodes_disconnect_command</varname> script
will be executed when no child nodes are connected.
</para>
<para>
A witness node, if in use, will not be counted as a child node unless
<varname>child_nodes_connected_include_witness</varname> is set to <literal>true</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>child_nodes_connected_include_witness</varname></term>
<listitem>
<indexterm>
<primary>child_nodes_connected_include_witness</primary>
<secondary>child node disconnection monitoring</secondary>
</indexterm>
<para>
Whether to count the witness node (if in use) as a child node when
determining whether to execute <varname>child_nodes_disconnect_command</varname>.
</para>
<para>
Defaults to <literal>false</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
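<para>
As a sketch, the parameters described above might be combined in <filename>repmgr.conf</filename>
on the primary as follows. The script path is taken from the log example earlier in this section
and is only a placeholder; the remaining values are either the documented defaults or those
used in that example.
</para>

```text
# check pg_stat_replication every 5 seconds (default; 0 disables the check)
child_nodes_check_interval=5
# placeholder script; %p is replaced with the ID of the executing node
child_nodes_disconnect_command='/usr/bin/fence-all-the-things.sh %p'
# wait this many seconds after the last disconnection (default: 30)
child_nodes_disconnect_timeout=30
# trigger the command if fewer than two child nodes are connected
child_nodes_connected_min_count=2
# do not count the witness server as a child node (default)
child_nodes_connected_include_witness=false
```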
</sect2>
<sect2 id="repmgrd-primary-child-disconnection-events">
<title>Standby disconnections monitoring process event notifications</title>
<para>
The following <link linkend="event-notifications">event notifications</link> may be generated:
</para>
<variablelist>
<varlistentry>
<term><varname>child_node_disconnect</varname></term>
<listitem>
<indexterm>
<primary>child_node_disconnect</primary>
<secondary>event notification</secondary>
</indexterm>
<para>
This event is generated after &repmgrd;
detects that a child node is no longer streaming from the primary node.
</para>
<para>
Example:
<programlisting>
$ repmgr cluster event --event=child_node_disconnect
Node ID | Name | Event | OK | Timestamp | Details
---------+-------+-----------------------+----+---------------------+--------------------------------------------
1 | node1 | child_node_disconnect | t | 2019-04-24 12:41:36 | node "node3" (ID: 3) has disconnected</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>child_node_reconnect</varname></term>
<listitem>
<indexterm>
<primary>child_node_reconnect</primary>
<secondary>event notification</secondary>
</indexterm>
<para>
This event is generated after &repmgrd;
detects that a child node has resumed streaming from the primary node.
</para>
<para>
Example:
<programlisting>
$ repmgr cluster event --event=child_node_reconnect
Node ID | Name | Event | OK | Timestamp | Details
---------+-------+----------------------+----+---------------------+------------------------------------------------------------
1 | node1 | child_node_reconnect | t | 2019-04-24 12:42:19 | node "node3" (ID: 3) has reconnected after 42 seconds</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>child_node_new_connect</varname></term>
<listitem>
<indexterm>
<primary>child_node_new_connect</primary>
<secondary>event notification</secondary>
</indexterm>
<para>
This event is generated after &repmgrd;
detects that a new child node has been registered with &repmgr; and has
connected to the primary.
</para>
<para>
Example:
<programlisting>
$ repmgr cluster event --event=child_node_new_connect
Node ID | Name | Event | OK | Timestamp | Details
---------+-------+------------------------+----+---------------------+---------------------------------------------
1 | node1 | child_node_new_connect | t | 2019-04-24 12:41:30 | new node "node3" (ID: 3) has connected</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>child_nodes_disconnect_command</varname></term>
<listitem>
<indexterm>
<primary>child_nodes_disconnect_command</primary>
<secondary>event notification</secondary>
</indexterm>
<para>
This event is generated after &repmgrd; detects
that sufficient child nodes have been disconnected for a sufficient amount
of time to trigger execution of the <varname>child_nodes_disconnect_command</varname>.
</para>
<para>
Example:
<programlisting>
$ repmgr cluster event --event=child_nodes_disconnect_command
Node ID | Name | Event | OK | Timestamp | Details
---------+-------+--------------------------------+----+---------------------+--------------------------------------------------------
1 | node1 | child_nodes_disconnect_command | t | 2019-04-24 13:08:17 | "child_nodes_disconnect_command" successfully executed</programlisting>
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
</sect1>
</chapter>

415
doc/repmgrd-bdr.sgml Normal file

@@ -0,0 +1,415 @@
<chapter id="repmgrd-bdr">
<indexterm>
<primary>repmgrd</primary>
<secondary>BDR</secondary>
</indexterm>
<indexterm>
<primary>BDR</primary>
</indexterm>
<title>BDR failover with repmgrd</title>
<para>
&repmgr; 4.x provides support for monitoring BDR nodes and taking action in
case one of the nodes fails.
</para>
<note>
<simpara>
Due to the nature of BDR 1.x/2.x, it's only safe to use this solution for
a two-node scenario. Introducing additional nodes will create an inherent
risk of node desynchronisation if a node goes down without being cleanly
removed from the cluster.
</simpara>
</note>
<para>
In contrast to streaming replication, there's no concept of "promoting" a new
primary node with BDR. Instead, "failover" involves monitoring both nodes
with <application>repmgrd</application> and redirecting queries from the failed node to the remaining
active node. This can be done by using an
<link linkend="event-notifications">event notification</link> script
which is called by <application>repmgrd</application> to dynamically
reconfigure a proxy server/connection pooler such as <application>PgBouncer</application>.
</para>
<sect1 id="bdr-prerequisites" xreflabel="BDR prequisites">
<title>Prerequisites</title>
<para>
&repmgr; 4 requires PostgreSQL 9.4 or 9.6 with the BDR 2 extension
enabled and configured for a two-node BDR network. &repmgr; 4 packages
must be installed on each node before attempting to configure
<application>repmgr</application>.
</para>
<note>
<simpara>
&repmgr; 4 will refuse to install if it detects more than two BDR nodes.
</simpara>
</note>
<para>
Application database connections <emphasis>must</emphasis> be passed through a proxy server/
connection pooler such as <application>PgBouncer</application>, and it must be possible to dynamically
reconfigure that from <application>repmgrd</application>. The example demonstrated in this document
will use <application>PgBouncer</application>.
</para>
<para>
The proxy server / connection poolers must <emphasis>not</emphasis>
be installed on the database servers.
</para>
<para>
For this example, it's assumed password-less SSH connections are available
from the PostgreSQL servers to the servers where <application>PgBouncer</application>
runs, and that the user on those servers has permission to alter the
<application>PgBouncer</application> configuration files.
</para>
<para>
PostgreSQL connections must be possible between each node, and each node
must be able to connect to each PgBouncer instance.
</para>
</sect1>
<sect1 id="bdr-configuration" xreflabel="BDR configuration">
<title>Configuration</title>
<para>
A sample configuration for <filename>repmgr.conf</filename> on each
BDR node would look like this:
<programlisting>
# Node information
node_id=1
node_name='node1'
conninfo='host=node1 dbname=bdrtest user=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'
replication_type='bdr'
# Event notification configuration
event_notifications=bdr_failover
event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a" >> /tmp/bdr-failover.log 2>&1'
# repmgrd options
monitor_interval_secs=5
reconnect_attempts=6
reconnect_interval=5</programlisting>
</para>
<para>
Adjust settings as appropriate; copy and adjust for the second node (particularly
the values <varname>node_id</varname>, <varname>node_name</varname>
and <varname>conninfo</varname>).
</para>
<para>
Note that the values provided for the <varname>conninfo</varname> string
must be valid for connections from <emphasis>both</emphasis> nodes in the
replication cluster. The database must be the BDR-enabled database.
</para>
<para>
If defined, the <varname>event_notifications</varname> parameter will restrict
execution of the script defined in <varname>event_notification_command</varname>
to the specified event(s).
</para>
<note>
<simpara>
<varname>event_notification_command</varname> is the script which does the actual "heavy lifting"
of reconfiguring the proxy server/ connection pooler. It is fully
user-definable; see section <xref linkend="bdr-event-notification-command"> for a reference
implementation.
</simpara>
</note>
</sect1>
<sect1 id="bdr-repmgr-setup" xreflabel="repmgr setup with BDR">
<title>repmgr setup</title>
<para>
Register both nodes; example on <literal>node1</literal>:
<programlisting>
$ repmgr -f /etc/repmgr.conf bdr register
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: node record created for node 'node1' (ID: 1)
NOTICE: BDR node 1 registered (conninfo: host=node1 dbname=bdrtest user=repmgr)</programlisting>
</para>
<para>
and on <literal>node2</literal>:
<programlisting>
$ repmgr -f /etc/repmgr.conf bdr register
NOTICE: node record created for node 'node2' (ID: 2)
NOTICE: BDR node 2 registered (conninfo: host=node2 dbname=bdrtest user=repmgr)</programlisting>
</para>
<para>
The <literal>repmgr</literal> extension will be automatically created
when the first node is registered, and will be propagated to the second
node.
</para>
<important>
<simpara>
Ensure the &repmgr; package is available on both nodes before
attempting to register the first node.
</simpara>
</important>
<para>
At this point the meta data for both nodes has been created; executing
<xref linkend="repmgr-cluster-show"> (on either node) should produce output like this:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Connection string
----+-------+------+-----------+----------+--------------------------------------------------------
1 | node1 | bdr | * running | | default | host=node1 dbname=bdrtest user=repmgr connect_timeout=2
2 | node2 | bdr | * running | | default | host=node2 dbname=bdrtest user=repmgr connect_timeout=2</programlisting>
</para>
<para>
Additionally it's possible to display a log of significant events; executing
<xref linkend="repmgr-cluster-event"> (on either node) should produce output like this:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster event
Node ID | Event | OK | Timestamp | Details
---------+--------------+----+---------------------+----------------------------------------------
2 | bdr_register | t | 2017-07-27 17:51:48 | node record created for node 'node2' (ID: 2)
1 | bdr_register | t | 2017-07-27 17:51:00 | node record created for node 'node1' (ID: 1)
</programlisting>
</para>
<para>
At this point there will only be records for the two node registrations (displayed here
in reverse chronological order).
</para>
</sect1>
<sect1 id="bdr-event-notification-command" xreflabel="Defining the BDR failover &quot;event_notification command&quot;">
<title>Defining the BDR failover "event_notification_command"</title>
<para>
Key to "failover" execution is the <literal>event_notification_command</literal>,
which is a user-definable script specified in <filename>repmgr.conf</filename>
and which can use a &repmgr; <link linkend="event-notifications">event notification</link>
to reconfigure the proxy server / connection pooler so it points to the other, still-active node.
Details of the event will be passed as parameters to the script.
</para>
<para>
The following parameter placeholders are available for the script definition in <filename>repmgr.conf</filename>;
these will be replaced with the appropriate value when the script is executed:
</para>
<variablelist>
<varlistentry>
<term><option>%n</option></term>
<listitem>
<para>
node ID
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%e</option></term>
<listitem>
<para>
event type
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%s</option></term>
<listitem>
<para>
success (1 or 0)
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%t</option></term>
<listitem>
<para>
timestamp
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%d</option></term>
<listitem>
<para>
details
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%c</option></term>
<listitem>
<para>
conninfo string of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%a</option></term>
<listitem>
<para>
name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
Note that <literal>%c</literal> and <literal>%a</literal> are only provided with
particular failover events, in this case <varname>bdr_failover</varname>.
</para>
<para>
The provided sample script
(<literal><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/scripts/bdr-pgbouncer.sh">scripts/bdr-pgbouncer.sh</ulink></literal>)
is configured as follows:
<programlisting>
event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a"'</programlisting>
</para>
<para>
and parses the placeholder parameters like this:
<programlisting>
NODE_ID=$1
EVENT_TYPE=$2
SUCCESS=$3
NEXT_CONNINFO=$4
NEXT_NODE_NAME=$5</programlisting>
</para>
<note>
<para>
The sample script also contains some hard-coded values for the <application>PgBouncer</application>
configuration for both nodes; these will need to be adjusted for your local environment
(ideally the scripts would be maintained as templates and generated by some
kind of provisioning system).
</para>
</note>
<para>
The script performs the following steps:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>pauses <application>PgBouncer</application> on all nodes</simpara>
</listitem>
<listitem>
<simpara>recreates the <application>PgBouncer</application> configuration file on each
node using the information provided by <application>repmgrd</application>
(primarily the <varname>conninfo</varname> string) to configure
<application>PgBouncer</application></simpara>
</listitem>
<listitem>
<simpara>reloads the <application>PgBouncer</application> configuration</simpara>
</listitem>
<listitem>
<simpara>executes the <command>RESUME</command> command (in <application>PgBouncer</application>)</simpara>
</listitem>
</itemizedlist>
</para>
<para>
Following successful script execution, any connections to PgBouncer on the failed BDR node
will be redirected to the active node.
</para>
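<para>
The way such a script might branch on the placeholder values can be sketched as follows.
This is a minimal illustration, not the logic of the sample script itself: the function
name <literal>decide_action</literal> is hypothetical, and the sketch only prints the action
it would take instead of touching <application>PgBouncer</application>. It assumes the
argument order of the sample configuration and that the success flag is <literal>1</literal>
for a successful event, as in the log output for <varname>bdr_failover</varname>.
</para>

```shell
#!/bin/sh
# Hypothetical handler; decide_action is an illustrative name, not part of repmgr.
# Arguments follow the order used in the sample configuration:
#   %n (node ID), %e (event type), %s (success flag), "%c" (conninfo), "%a" (node name)
decide_action() {
    node_id=$1
    event_type=$2
    success=$3
    next_conninfo=$4
    next_node_name=$5

    # Ignore events which were not reported as successful (assumed flag value: 1)
    if [ "$success" != "1" ]; then
        echo "ignore"
        return 0
    fi

    case "$event_type" in
        bdr_failover)
            # Both %c and %a are needed to repoint PgBouncer at the next node
            if [ -z "$next_conninfo" ] || [ -z "$next_node_name" ]; then
                echo "error"
                return 1
            fi
            echo "redirect to $next_node_name"
            ;;
        bdr_recovery)
            # Only report; the node should be verified before it is re-exposed
            echo "report recovery of node $node_id"
            ;;
        *)
            echo "ignore"
            ;;
    esac
}
```

<para>
A real handler would replace the <command>echo</command> calls with the pause, reconfigure,
reload and resume steps listed above.
</para>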
</sect1>
<sect1 id="bdr-monitoring-failover" xreflabel="Node monitoring and failover">
<title>Node monitoring and failover</title>
<para>
At the intervals specified by <varname>monitor_interval_secs</varname>
in <filename>repmgr.conf</filename>, <application>repmgrd</application>
will ping each node to check if it's available. If a node isn't available,
<application>repmgrd</application> will enter failover mode and check <varname>reconnect_attempts</varname>
times at intervals of <varname>reconnect_interval</varname> to confirm the node is definitely unreachable.
This buffer period is necessary to avoid false positives caused by transient
network outages.
</para>
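<para>
As a rough guide to how long this buffer period lasts, the two settings can be multiplied.
The helper below is a hypothetical sketch, not part of &repmgr;. The optional third argument
adds a per-attempt connection timeout, which is an assumption about the worst case (a host
which does not answer at all) and not a documented formula.
</para>

```shell
# failure_confirmation_secs ATTEMPTS INTERVAL [CONNECT_TIMEOUT]
# Hypothetical helper: estimates the seconds spent confirming a node failure.
#   ATTEMPTS         value of reconnect_attempts
#   INTERVAL         value of reconnect_interval (seconds)
#   CONNECT_TIMEOUT  assumed extra wait per attempt (seconds, default 0)
failure_confirmation_secs() {
    echo $(( $1 * ($2 + ${3:-0}) ))
}
```

<para>
With five attempts at one-second intervals, <command>failure_confirmation_secs 5 1</command>
yields 5 seconds; with an assumed additional 2 seconds per attempt,
<command>failure_confirmation_secs 5 1 2</command> yields 15 seconds.
</para>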
<para>
If the node is still unavailable, <application>repmgrd</application> will enter failover mode and execute
the script defined in <varname>event_notification_command</varname>; an entry will be logged
in the <literal>repmgr.events</literal> table and <application>repmgrd</application> will
(unless otherwise configured) resume monitoring of the node in "degraded" mode until it reappears.
</para>
<para>
<application>repmgrd</application> logfile output during a failover event will look something like this
on one node (usually the node which has failed, here <literal>node2</literal>):
<programlisting>
...
[2017-07-27 21:08:39] [INFO] starting continuous BDR node monitoring
[2017-07-27 21:08:39] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
[2017-07-27 21:08:55] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
[2017-07-27 21:09:11] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
[2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
[2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
[2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
[2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
[2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
[2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
[2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
[2017-07-27 21:09:28] [NOTICE] setting node record for node 2 to inactive
[2017-07-27 21:09:28] [INFO] executing notification command for event "bdr_failover"
[2017-07-27 21:09:28] [DETAIL] command is:
/path/to/bdr-pgbouncer.sh 2 bdr_failover 1 "host=host=node1 dbname=bdrtest user=repmgr connect_timeout=2" "node1"
[2017-07-27 21:09:28] [INFO] node 'node2' (ID: 2) detected as failed; next available node is 'node1' (ID: 1)
[2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
[2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
...</programlisting>
</para>
<para>
Output on the other node (<literal>node1</literal>) during the same event will look like this:
<programlisting>
...
[2017-07-27 21:08:35] [INFO] starting continuous BDR node monitoring
[2017-07-27 21:08:35] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
[2017-07-27 21:08:51] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
[2017-07-27 21:09:07] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
[2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
[2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
[2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
[2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
[2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
[2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
[2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
[2017-07-27 21:09:28] [NOTICE] other node's repmgrd is handling failover
[2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
[2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
...</programlisting>
</para>
<para>
This assumes only the PostgreSQL instance on <literal>node2</literal> has failed. In this case the
<application>repmgrd</application> instance running on <literal>node2</literal> has performed the failover. However, if
the entire server becomes unavailable, <application>repmgrd</application> on <literal>node1</literal> will perform
the failover.
</para>
</sect1>
<sect1 id="bdr-node-recovery" xreflabel="Node recovery">
<title>Node recovery</title>
<para>
Following failure of a BDR node, if the node subsequently becomes available again,
a <varname>bdr_recovery</varname> event will be generated. This could potentially be used to
reconfigure PgBouncer automatically to bring the node back into the available pool,
however it would be prudent to manually verify the node's status before
exposing it to the application.
</para>
<para>
If the failed node comes back up and connects correctly, output similar to this
will be visible in the <application>repmgrd</application> log:
<programlisting>
[2017-07-27 21:25:30] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
[2017-07-27 21:25:46] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
[2017-07-27 21:25:46] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
[2017-07-27 21:25:55] [INFO] active replication slot for node "node1" found after 1 seconds
[2017-07-27 21:25:55] [NOTICE] node "node2" (ID: 2) has recovered after 986 seconds</programlisting>
</para>
</sect1>
<sect1 id="bdr-complete-shutdown" xreflabel="Shutdown of both nodes">
<title>Shutdown of both nodes</title>
<para>
If both PostgreSQL instances are shut down, <application>repmgrd</application> will try to handle the
situation as gracefully as possible, though with no failover candidates available
there's not much it can do. Should this case ever occur, we recommend shutting
down <application>repmgrd</application> on both nodes and restarting it once the PostgreSQL instances
are running properly.
</para>
</sect1>
</chapter>

@@ -0,0 +1,22 @@
<chapter id="repmgrd-cascading-replication">
<indexterm>
<primary>repmgrd</primary>
<secondary>cascading replication</secondary>
</indexterm>
<title>repmgrd and cascading replication</title>
<para>
Cascading replication - where a standby can connect to an upstream node and not
the primary server itself - was introduced in PostgreSQL 9.2. &repmgr; and
<application>repmgrd</application> support cascading replication by keeping track of the relationship
between standby servers - each node record is stored with the node id of its
upstream ("parent") server (except of course the primary server).
</para>
<para>
In a failover situation where the primary node fails and a top-level standby
is promoted, a standby connected to another standby will not be affected
and will continue working as normal (even if the upstream standby it's connected
to becomes the primary node). If however the node's direct upstream fails,
the "cascaded standby" will attempt to reconnect to that node's parent.
</para>
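<para>
The parent lookup described above can be modelled in a few lines of shell. This is a toy
illustration only: the node list, its <literal>node_id:upstream_node_id</literal> format and
the function names are invented for the example and do not reflect how
<application>repmgrd</application> stores or queries node records.
</para>

```shell
# Illustrative model only: node records as "node_id:upstream_node_id" pairs.
# The primary (node 1) has an empty upstream; node 3 is a cascaded standby.
NODES="1: 2:1 3:2"

# upstream_of NODE_ID: print the upstream node ID recorded for NODE_ID
upstream_of() {
    for record in $NODES; do
        if [ "${record%%:*}" = "$1" ]; then
            echo "${record#*:}"
            return 0
        fi
    done
    return 1
}

# reconnect_target NODE_ID: print the parent of NODE_ID's direct upstream,
# i.e. the node a cascaded standby would try once its own upstream has failed
reconnect_target() {
    failed_upstream=$(upstream_of "$1") || return 1
    upstream_of "$failed_upstream"
}
```

<para>
Here <command>reconnect_target 3</command> prints <literal>1</literal>: node 3 follows node 2,
whose parent is node 1. For node 2 the result is empty, because its upstream is the primary
and has no parent; that case is a failover, not a reconnection.
</para>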
</chapter>

@@ -0,0 +1,557 @@
<chapter id="repmgrd-configuration">
<indexterm>
<primary>repmgrd</primary>
<secondary>configuration</secondary>
</indexterm>
<title>repmgrd configuration</title>
<para>
<application>repmgrd</application> is a daemon which runs on each PostgreSQL node,
monitoring the local node and (unless it's the primary node) the upstream server
(the primary server or, with cascading replication, another standby) which it's
connected to.
</para>
<para>
<application>repmgrd</application> can be configured to provide failover
capability in case the primary or upstream node becomes unreachable, and/or
provide monitoring data to the &repmgr; metadatabase.
</para>
<sect1 id="repmgrd-basic-configuration">
<title>repmgrd basic configuration</title>
<para>
To use <application>repmgrd</application>, its associated function library <emphasis>must</emphasis> be
included via <filename>postgresql.conf</filename> with:
<programlisting>
shared_preload_libraries = 'repmgr'</programlisting>
</para>
<para>
Changing this setting requires a restart of PostgreSQL; for more details see
the <ulink url="https://www.postgresql.org/docs/current/static/runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES">PostgreSQL documentation</ulink>.
</para>
<sect2 id="repmgrd-automatic-failover-configuration">
<title>automatic failover configuration</title>
<para>
If using automatic failover, the following <application>repmgrd</application> options <emphasis>must</emphasis> be set in
<filename>repmgr.conf</filename>:
<programlisting>
failover=automatic
promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'</programlisting>
</para>
<para>
Adjust file paths as appropriate; always specify the full path to the &repmgr; binary.
</para>
<note>
<para>
&repmgr; will not apply <option>pg_bindir</option> when executing <option>promote_command</option>
or <option>follow_command</option>; these can be user-defined scripts so must always be
specified with the full path.
</para>
</note>
<para>
Note that the <literal>--log-to-file</literal> option will cause
output generated by the &repmgr; command, when executed by <application>repmgrd</application>,
to be logged to the same destination configured to receive log output for <application>repmgrd</application>.
See <filename><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink></filename>
for further <application>repmgrd</application>-specific settings.
</para>
<para>
When <varname>failover</varname> is set to <literal>automatic</literal>, upon detecting failure
of the current primary, <application>repmgrd</application> will execute one of:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<varname>promote_command</varname> (if the current server is to become the new primary)
</simpara>
</listitem>
<listitem>
<simpara>
<varname>follow_command</varname> (if the current server needs to follow another server which has
become the new primary)
</simpara>
</listitem>
</itemizedlist>
<note>
<para>
These commands can be any valid shell script which results in one of these
two actions happening, but if &repmgr;'s <command>standby follow</command> or
<command>standby promote</command>
commands are not executed (either directly as shown here, or from a script which
performs other actions), the &repmgr; metadata will not be updated and
&repmgr; will no longer function reliably.
</para>
</note>
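<para>
A wrapper script which performs site-specific actions and then hands over to &repmgr; might
look like the following sketch. The function name and the <literal>REPMGR_BIN</literal> and
<literal>REPMGR_CONF</literal> variables are hypothetical conveniences for the example, not
&repmgr; settings; their defaults simply repeat the paths used above.
</para>

```shell
#!/bin/sh
# Hypothetical wrapper for promote_command. REPMGR_BIN and REPMGR_CONF are
# illustrative variables, not repmgr settings; the defaults repeat the
# example paths and should be adjusted to the local installation.
REPMGR_BIN=${REPMGR_BIN:-/usr/bin/repmgr}
REPMGR_CONF=${REPMGR_CONF:-/etc/repmgr.conf}

promote_with_hooks() {
    # Site-specific actions before promotion would go here
    echo "promotion requested at $(date)"

    # Always finish by running repmgr itself so the repmgr metadata is updated;
    # the function returns repmgr's exit status
    "$REPMGR_BIN" standby promote -f "$REPMGR_CONF" --log-to-file
}
```

<para>
To use this as a script, add a final line calling <command>promote_with_hooks</command> and
set <varname>promote_command</varname> to the full path of the script.
</para>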
<para>
The <varname>follow_command</varname> should provide the <literal>--upstream-node-id=%n</literal>
option to <command>repmgr standby follow</command>; the <literal>%n</literal> will be replaced by
<application>repmgrd</application> with the ID of the new primary node. If this is not provided, &repmgr;
will attempt to determine the new primary by itself, but if the
original primary comes back online after the new primary is promoted, there is a risk that
<command>repmgr standby follow</command> will result in the node continuing to follow
the original primary.
</para>
</sect2>
<sect2 id="repmgrd-service-configuration">
<indexterm>
<primary>repmgrd</primary>
<secondary>PostgreSQL service configuration</secondary>
</indexterm>
<title>PostgreSQL service configuration</title>
<para>
If using automatic failover, currently <application>repmgrd</application> will need to execute
<link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>
to restart PostgreSQL on standbys to have them follow a new primary.
</para>
<para>
To ensure this happens smoothly, it's essential to provide the system/service restart
command appropriate to your operating system via <varname>service_restart_command</varname>
in <filename>repmgr.conf</filename>. If you don't do this, <application>repmgrd</application>
will default to using <command>pg_ctl</command>, which can result in unexpected problems,
particularly on <application>systemd</application>-based systems.
</para>
<para>
For more details, see <xref linkend="configuration-file-service-commands">.
</para>
</sect2>
<sect2 id="repmgrd-monitoring-configuration" xreflabel="repmgrd monitoring configuration">
<indexterm>
<primary>repmgrd</primary>
<secondary>monitoring configuration</secondary>
</indexterm>
<title>Monitoring configuration</title>
<para>
To enable monitoring, set:
<programlisting>
monitoring_history=yes</programlisting>
in <filename>repmgr.conf</filename>.
</para>
<para>
The default monitoring interval is 2 seconds; this value can be explicitly set using:
<programlisting>
monitor_interval_secs=&lt;seconds&gt;</programlisting>
in <filename>repmgr.conf</filename>.
</para>
<para>
For more details on monitoring, see <xref linkend="repmgrd-monitoring">.
</para>
</sect2>
<sect2 id="repmgrd-reloading-configuration" xreflabel="reloading repmgrd configuration">
<indexterm>
<primary>repmgrd</primary>
<secondary>applying configuration changes</secondary>
</indexterm>
<title>Applying configuration changes to repmgrd</title>
<para>
To apply configuration file changes to a running <application>repmgrd</application>
daemon, execute the operating system's <application>repmgrd</application> service reload command
(see <xref linkend="appendix-packages"> for examples),
or for instances which were manually started, execute <command>kill -HUP</command>, e.g.
<command>kill -HUP `cat /tmp/repmgrd.pid`</command>.
</para>
<tip>
<para>
Check the <application>repmgrd</application> log to see what changes were
applied, or if any issues were encountered when reloading the configuration.
</para>
</tip>
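<para>
For manually started instances, the <command>kill -HUP</command> step can be wrapped in a
small helper which first checks that the PID file is readable and the process is still
running. This is a sketch only; <literal>reload_repmgrd</literal> is a hypothetical function
name, and the PID file path must match the one actually in use.
</para>

```shell
# reload_repmgrd PIDFILE: send SIGHUP to the process recorded in PIDFILE
# (hypothetical helper, not shipped with repmgr)
reload_repmgrd() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        echo "PID file $pidfile is not readable" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests that the process can be signalled
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no running process with PID $pid" >&2
        return 1
    fi
    kill -HUP "$pid"
}
```

<para>
Example: <command>reload_repmgrd /tmp/repmgrd.pid</command>.
</para>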
<para>
Note that only the following subset of configuration file parameters can be changed on a
running <application>repmgrd</application> daemon:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<varname>async_query_timeout</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>bdr_local_monitoring_only</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>bdr_recovery_timeout</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>conninfo</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>degraded_monitoring_timeout</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>event_notification_command</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>event_notifications</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>failover</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>follow_command</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>log_facility</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>log_file</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>log_level</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>log_status_interval</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>monitor_interval_secs</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>monitoring_history</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>primary_notification_timeout</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>promote_command</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>reconnect_attempts</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>reconnect_interval</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>repmgrd_standby_startup_timeout</varname>
</simpara>
</listitem>
</itemizedlist>
<para>
The following set of configuration file parameters must be updated via
<command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>,
as they require changes to the <literal>repmgr.nodes</literal> table so they are visible to
all nodes in the replication cluster:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<varname>node_id</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>node_name</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>data_directory</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>location</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>priority</varname>
</simpara>
</listitem>
</itemizedlist>
<note>
<para>
After executing <command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>,
<application>repmgrd</application> <emphasis>must</emphasis> be restarted for the changes to take effect.
</para>
</note>
</sect2>
</sect1>
<sect1 id="repmgrd-daemon">
<indexterm>
<primary>repmgrd</primary>
<secondary>starting and stopping</secondary>
</indexterm>
<title>repmgrd daemon</title>
<para>
If installed from a package, <application>repmgrd</application> can be started
via the operating system's service command, e.g. in <application>systemd</application>
using <command>systemctl</command>.
</para>
<para>
See appendix <xref linkend="appendix-packages"> for details of service commands
for different distributions.
</para>
<para>
<application>repmgrd</application> can be started manually like this:
<programlisting>
repmgrd -f /etc/repmgr.conf --pid-file /tmp/repmgrd.pid</programlisting>
and stopped with <command>kill `cat /tmp/repmgrd.pid`</command>. Adjust paths as appropriate.
</para>
<sect2 id="repmgrd-pid-file" xreflabel="repmgrd's PID file">
<indexterm>
<primary>repmgrd</primary>
<secondary>PID file</secondary>
</indexterm>
<indexterm>
<primary>PID file</primary>
<secondary>repmgrd</secondary>
</indexterm>
<title>repmgrd's PID file</title>
<para>
<application>repmgrd</application> will generate a PID file by default.
</para>
<note>
<simpara>
This is a behaviour change from previous versions (earlier than 4.1), where
the PID file had to be explicitly specified with the command line
parameter <option>--pid-file</option>.
</simpara>
</note>
<para>
The PID file can be specified in <filename>repmgr.conf</filename> with the configuration
parameter <varname>repmgrd_pid_file</varname>.
</para>
<para>
It can also be specified on the command line (as in previous versions) with
the command line parameter <option>--pid-file</option>. Note this will override
any value set in <filename>repmgr.conf</filename> with <varname>repmgrd_pid_file</varname>.
<option>--pid-file</option> may be deprecated in future releases.
</para>
<para>
If a PID file location was specified by the package maintainer, <application>repmgrd</application>
will use that. This only applies if &repmgr; was installed from a package and the package
maintainer has specified the PID file location.
</para>
<para>
If none of the above apply, <application>repmgrd</application> will create a PID file
in the operating system's temporary directory (as determined by the environment variable
<varname>TMPDIR</varname>, or if that is not set, will use <filename>/tmp</filename>).
</para>
<para>
To prevent a PID file being generated at all, provide the command line option
<option>--no-pid-file</option>.
</para>
<para>
To see which PID file <application>repmgrd</application> would use, execute <application>repmgrd</application>
with the option <option>--show-pid-file</option>. <application>repmgrd</application>
will not start if this option is provided. Note that the value shown is the
file <application>repmgrd</application> would use next time it starts, and is
not necessarily the PID file currently in use.
</para>
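<para>
The order of precedence described in this section can be summarised as a small shell function.
This is one reading of the text above, not <application>repmgrd</application>'s actual code:
the function name is hypothetical, and the file name <filename>repmgrd.pid</filename> used for
the temporary-directory fallback is an assumption. The output of
<option>--show-pid-file</option> remains the authoritative answer.
</para>

```shell
# resolve_pid_file CLI_VALUE CONF_VALUE PACKAGE_DEFAULT
# Hypothetical helper; an empty argument means "not set".
# The fallback file name "repmgrd.pid" is an assumption for illustration.
resolve_pid_file() {
    if [ -n "$1" ]; then
        echo "$1"        # --pid-file overrides any other setting
    elif [ -n "$2" ]; then
        echo "$2"        # repmgrd_pid_file from repmgr.conf
    elif [ -n "$3" ]; then
        echo "$3"        # location chosen by the package maintainer
    else
        # Temporary directory: TMPDIR if set, otherwise /tmp
        echo "${TMPDIR:-/tmp}/repmgrd.pid"
    fi
}
```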
</sect2>
<sect2 id="repmgrd-configuration-debian-ubuntu">
<indexterm>
<primary>repmgrd</primary>
<secondary>Debian/Ubuntu and daemon configuration</secondary>
</indexterm>
<indexterm>
<primary>Debian/Ubuntu</primary>
<secondary>repmgrd daemon configuration</secondary>
</indexterm>
<title>repmgrd daemon configuration on Debian/Ubuntu</title>
<para>
If &repmgr; was installed from Debian/Ubuntu packages, additional configuration
is required before <application>repmgrd</application> is started as a daemon.
</para>
<para>
This is done via the file <filename>/etc/default/repmgrd</filename>, which by default
looks like this:
<programlisting>
# default settings for repmgrd. This file is source by /bin/sh from
# /etc/init.d/repmgrd
# disable repmgrd by default so it won't get started upon installation
# valid values: yes/no
REPMGRD_ENABLED=no
# configuration file (required)
#REPMGRD_CONF="/path/to/repmgr.conf"
# additional options
REPMGRD_OPTS="--daemonize=false"
# user to run repmgrd as
#REPMGRD_USER=postgres
# repmgrd binary
#REPMGRD_BIN=/usr/bin/repmgrd
# pid file
#REPMGRD_PIDFILE=/var/run/repmgrd.pid</programlisting>
</para>
<para>
Set <varname>REPMGRD_ENABLED</varname> to <literal>yes</literal>, and <varname>REPMGRD_CONF</varname>
to the <filename>repmgr.conf</filename> file you are using.
</para>
<tip>
<para>
See <xref linkend="packages-debian-ubuntu"> for details of the Debian/Ubuntu packages and
typical file locations (including <filename>repmgr.conf</filename>).
</para>
</tip>
<para>
From <application>repmgrd</application> 4.1, ensure <varname>REPMGRD_OPTS</varname> includes
<option>--daemonize=false</option>, as daemonization is handled by the service command.
We recommend setting <varname>repmgrd_pid_file</varname> in <filename>repmgr.conf</filename> to the
same value set in <varname>REPMGRD_PIDFILE</varname> to prevent another <application>repmgrd</application>
instance from being started manually.
</para>
<para>
If using <application>systemd</application>, you may need to execute <command>systemctl daemon-reload</command>.
Also, if you attempted to start <application>repmgrd</application> using <command>systemctl start repmgrd</command>,
you'll need to execute <command>systemctl stop repmgrd</command>. Because that's how <application>systemd</application>
rolls.
</para>
</sect2>
</sect1>
<sect1 id="repmgrd-connection-settings">
<title>repmgrd connection settings</title>
<para>
In addition to the &repmgr; configuration settings, parameters in the
<varname>conninfo</varname> string influence how &repmgr; makes a network connection to
PostgreSQL. In particular, if another server in the replication cluster
is unreachable at network level, system network settings will influence
the length of time it takes to determine that the connection is not possible.
</para>
<para>
Consider explicitly setting <literal>connect_timeout</literal>; the effective
minimum value of <literal>2</literal> (seconds) will ensure that a connection
failure at network level is reported as soon as possible. Otherwise, depending
on the system settings (e.g. <varname>tcp_syn_retries</varname> in Linux),
a delay of a minute or more is possible.
</para>
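<para>
A quick way to spot <varname>conninfo</varname> strings which lack this setting is a simple
pattern check, sketched below. The function name is hypothetical, and the check only
recognises the plain space-separated <literal>keyword=value</literal> form without spaces
around the equals sign; connection URIs and other valid spellings are not handled.
</para>

```shell
# conninfo_has_connect_timeout CONNINFO
# Hypothetical helper: succeeds if the string contains a connect_timeout
# keyword in plain "keyword=value" form (URIs and "key = value" are not handled).
conninfo_has_connect_timeout() {
    case " $1 " in
        *" connect_timeout="*) return 0 ;;
        *) return 1 ;;
    esac
}
```

<para>
For example, <literal>host=node1 dbname=bdrtest user=repmgr connect_timeout=2</literal>
passes the check, while the same string without the final keyword does not.
</para>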
<para>
For further details on <varname>conninfo</varname> network connection
parameters, see the
<ulink url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS">PostgreSQL documentation</ulink>.
</para>
</sect1>
<sect1 id="repmgrd-log-rotation">
<indexterm>
<primary>log rotation</primary>
<secondary>repmgrd</secondary>
</indexterm>
<indexterm>
<primary>repmgrd</primary>
<secondary>log rotation</secondary>
</indexterm>
<title>repmgrd log rotation</title>
<para>
To ensure the current <application>repmgrd</application> logfile
(specified in <filename>repmgr.conf</filename> with the parameter
<option>log_file</option>) does not grow indefinitely, configure your
system's <command>logrotate</command> to regularly rotate it.
</para>
<para>
Sample configuration to rotate logfiles weekly with retention for
up to 52 weeks and rotation forced if a file grows beyond 100MB:
<programlisting>
/var/log/repmgr/repmgrd.log {
missingok
compress
rotate 52
maxsize 100M
weekly
create 0600 postgres postgres
postrotate
/usr/bin/killall -HUP repmgrd
endscript
}</programlisting>
</para>
</sect1>
</chapter>

Some files were not shown because too many files have changed in this diff