Compare commits

..

46 Commits

Author SHA1 Message Date
Ian Barwick
1337f68caf doc: note downstream node (dis)connection monitoring in more places 2020-06-09 16:42:45 +09:00
Ian Barwick
00494f1a15 repmgr 4.4.x is not compatible with PostgreSQL 12 or later 2020-02-14 10:42:38 +09:00
Ian Barwick
1460e6400a Add optional check for unsupported future PostgreSQL releases
This is for backbranches to prevent them running against newer
PostgreSQL versions with which they are not compatible, for example
4.4.x with PostgreSQL 12 and later.
2020-02-14 10:39:28 +09:00
Ian Barwick
daa7de1cd0 standby follow: don't attempt to delete slot if new upstream is same as current
An attempt will be made to delete an existing replication slot on the
old upstream node (this is important during e.g. a switchover operation
or when attaching a cascaded standby to a new upstream). However if the
standby is currently attached to the follow target node anyway, the
replication slot should never be deleted.
2019-12-10 15:58:45 +09:00
Ian Barwick
0fbc0f42b5 Prevent use of backend string functions
From PostgreSQL 12, port.h forcibly redefines printf() et al to use
the versions defined by PostgreSQL (pg_printf() et al). As this
causes linking issues in build environments which build pre-Pg12
versions against Pg12's libpq, ensure relevant macros defined
in port.h are undefined.
2019-10-09 09:12:58 +09:00
Ian Barwick
34aa54b766 doc: clarify use of --recovery-conf-only 2019-10-08 10:04:23 +09:00
Ian Barwick
01ec64c330 doc: fix minor formatting error 2019-10-08 10:04:19 +09:00
Ian Barwick
41637f0c43 doc: improve instructions for cloning from Barman 2019-10-07 11:58:04 +09:00
Ian Barwick
b61989bafe doc: improve Barman standby clone example 2019-10-07 11:58:01 +09:00
Ian Barwick
c2987b6c9b doc: clarify barman-cli package usage
As of Barman 2.8, the barman-cli package has been merged with the core
Barman code, so is only requires an explicit mention for Barman 2.0 ~ 2.7.
2019-10-07 11:57:58 +09:00
Ian Barwick
c6d151ef56 doc: remove reference to Barman 1.x.
Barman 1.x is very outdated and should no longer be used anyway.
2019-10-07 11:57:54 +09:00
Ian Barwick
867ebcb52d repmgr: note that --dry-run is not effective with "repmgr daemon status" 2019-08-28 15:17:01 +09:00
Ian Barwick
9abf72af07 doc: fix link text 2019-08-28 15:12:45 +09:00
Ian Barwick
2256c0a81f repmgrd: fix pidfile handling at shutdown 2019-08-19 20:16:21 +09:00
Ian Barwick
701fdd9622 Use appendPQExpBufferStr where appropriate 2019-08-13 16:32:56 +09:00
Ian Barwick
5c1ca6cba5 doc: mention not to use --siblings-follow in the repmgrd promote command
This is noted on the "repmgr standby promote" page but needs repeating
on the repmgrd configuration page.
2019-08-13 11:36:15 +09:00
Ian Barwick
2dce4b06b4 repmgrd: emit node name when reporting follow target attach error
This is consistent with other error messages.
2019-08-13 11:07:07 +09:00
Ian Barwick
f1716a67b0 "standby clone": improve error messages related to extension status
Previously repmgr would emit the "repmgr extension not found on source node"
which depending on context is somewhat misleading, as it may exist
but not be installed, or the user may be attempting to clone from the
wrong database.
2019-08-07 16:45:42 +09:00
Ian Barwick
1d43f1cdb5 Simplify pg_has_role() call
Specifying CURRENT_USER is superfluous here.
2019-08-07 14:42:11 +09:00
Ian Barwick
7cc12f08ed "node check": check role membership when trying to read pg_settings
From PostgreSQL 10, a member of the default roles "pg_monitor" and/or
"pg_read_all_settings" can read pg_settings without requiring superuser
privileges.

Previously, a hint was being emitted about making the repmgr user a
member of one of those groups, but no check for membership was being
made, meaning the check could only be run by a superuser.
2019-08-07 14:27:26 +09:00
Ian Barwick
8484bc6687 Add missing field in init_replication_info()
"upstream_node_id" was not being initialised.
2019-08-07 13:22:54 +09:00
Ian Barwick
d92d008cb3 doc: add note about parallel restore from Barman 2019-07-18 09:26:27 +09:00
Ian Barwick
38adf5daca Add 4.4 release date 2019-07-09 11:32:18 +09:00
Ian Barwick
4a8684e37b doc: define entity &releasedate;
This should be used wherever we need to show the latest release
date.

Don't use this in the release notes however, as it will be easy to
forget to update it when adding notes for a new release.
2019-07-09 11:30:05 +09:00
Ian Barwick
9c9f4c8c32 doc: update compatibility matrix
Use &repmgrversion; entity to generate the current version number and
prevent document bitrot.

Also define a "release-current" ID attribute for ease of linking to
the current release notes.

Per notification from the mailing list.
2019-07-09 11:10:15 +09:00
Ian Barwick
0e1319d788 doc: update release notes 2019-07-04 11:22:06 +09:00
Ian Barwick
9c784f804e doc: update "repmgr cluster show" examples 2019-06-27 15:36:09 +09:00
Ian Barwick
13097e30f0 doc: fix typo 2019-06-27 15:33:49 +09:00
Ian Barwick
40b6c92129 doc: update release notes
Finalize release date.
2019-06-26 15:57:17 +09:00
Ian Barwick
761d65526c Finalize version number
4.4
2019-06-26 15:56:33 +09:00
Ian Barwick
a13a1232e9 doc: document optional configuration settings 2019-06-20 16:50:30 +09:00
Ian Barwick
65965b3ba4 4.4rc1 2019-06-20 16:21:54 +09:00
Ian Barwick
629d2b8f85 doc: clean up release notes
Remove tabs.
2019-06-20 16:18:41 +09:00
Ian Barwick
23c285b73b doc: fix typo 2019-06-12 16:29:17 +09:00
Ian Barwick
915fb7d617 note that "standby follow" requires a primary to be available
While it's technically possible to have a standby follow another
standby while the primary is not available, repmgr will not be able
to update its metadata, which will cause Confusion and Chaos.

Update the documentation to make this clear, and provide a more helpful
error message if this situation occurs. The operation previously
failed anyway, but with an unhelpful message about not being able to
find a node record.
2019-06-11 15:18:41 +09:00
Ian Barwick
ae141b9d32 4.4beta2 2019-06-10 15:18:41 +09:00
Ian Barwick
d035550723 doc: add missing space 2019-06-10 09:02:51 +09:00
Ian Barwick
c7692b5d84 doc: improve repmgr.conf settings documentation 2019-06-07 12:50:05 +09:00
Ian Barwick
08b7f1294b doc: improve configuration documentation 2019-06-07 12:17:49 +09:00
Ian Barwick
81d01bf0e8 Canonicalize the data directory path when parsing the configuration file
This ensures the provided path matches the path PostgreSQL reports as its
data directory.
2019-06-07 09:53:44 +09:00
Ian Barwick
089c778e49 Fix extension version number query 2019-06-06 12:46:30 +09:00
Ian Barwick
b4b5681762 standby follow: remove some ineffective code
For some reason we were taking the trouble to extract an appliction_name
from the local node's conninfo, but this was being subsequently overwritten
with the node name (which is what we want anyway).
2019-06-06 12:15:23 +09:00
Ian Barwick
e5ef549aa7 doc: update release notes 2019-06-06 11:30:43 +09:00
Ian Barwick
cfc41392c3 Ensure parsed value of --upstream-conninfo is written to recovery.conf
Previously it was being parsed (a step which ensures any "application_name"
set by the caller is changed to the node name), but the original string
was being copied to "primary_conninfo" anyway.
2019-06-06 11:30:40 +09:00
Ian Barwick
55dc4f7a5f Remove redundant comment in .sql files 2019-06-04 13:46:30 +09:00
Ian Barwick
6616712346 4.4beta1 2019-06-04 13:22:56 +09:00
115 changed files with 7032 additions and 12494 deletions

3
.gitignore vendored
View File

@@ -53,6 +53,3 @@ repmgr
repmgrd
repmgr4
repmgrd4
# generated files
configfile-scan.c

View File

@@ -2,7 +2,7 @@ License and Contributions
=========================
`repmgr` is licensed under the GPL v3. All of its code and documentation is
Copyright 2010-2020, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
Copyright 2010-2019, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
details.
The development of repmgr has primarily been sponsored by 2ndQuadrant customers.

View File

@@ -1,4 +1,4 @@
Copyright (c) 2010-2020, 2ndQuadrant Limited
Copyright (c) 2010-2019, 2ndQuadrant Limited
All rights reserved.
This program is free software: you can redistribute it and/or modify

59
HISTORY
View File

@@ -1,63 +1,6 @@
5.2.0 2020-10-22
general: add support for PostgreSQL 13 (Ian)
general: remove support for PostgreSQL 9.3 (Ian)
config: add support for file inclusion directives (Ian)
repmgr: "primary unregister --force" will unregister an active primary
with no registered standby nodes (Ian)
repmgr: add option --verify-backup to "standby clone" (Ian)
repmgr: "standby clone" honours --waldir option if set in
"pg_basebackup_options" (Ian)
repmgr: add option --db-connection to "node check" (Ian)
repmgr: report database connection error if the --optformat option was
provided to "node check" (Ian)
repmgr: improve "node rejoin" checks (Ian)
repmgr: enable "node rejoin" to join a target with a lower timeline (Ian)
repmgr: support pg_rewind's automatic crash recovery in Pg13 and later (Ian)
repmgr: improve output formatting for cluster matrix/crosscheck (Ian)
repmgr: improve database connection failure error checking on the
demotion candidate during "standby switchover" (Ian)
repmgr: make repmgr metadata tables dumpable (Ian)
repmgr: fix issue with tablespace mapping when cloning from Barman;
GitHub #650 (Ian)
repmgr: improve handling of pg_control read errors (Ian)
repmgrd: add additional optional parameters to "failover_validation command"
(spaskalev; GitHub #651)
repmgrd: ensure primary connection is reset if same as upstream;
GitHub #633 (Ian)
5.1.0 2020-04-13
repmgr: remove BDR 2.x support
repmgr: don't query upstream's data directory (Ian)
repmgr: rename --recovery-conf-only to --replication-conf-only (Ian)
repmgr: ensure postgresql.auto.conf is created with correct permissions (Ian)
repmgr: minimize requirement to check upstream data directory location
during "standby clone" (Ian)
repmgr: warn about missing pg_rewind prerequisites when excuting
"standby clone" (Ian)
repmgr: add --upstream option to "node check"
repmgr: report error code on follow/rejoin failure due to non-available
0 replication slot (Ian)
repmgr: ensure "node rejoin" checks for available replication slots (Ian)
repmgr: improve "standby switchover" completion checks (Ian)
repmgr: add replication configuration file ownership check to
"standby switchover" (Ian)
repmgr: check the demotion candidate's registered repmgr.conf file can
be found (laixiong; GitHub 615)
repmgr: consolidate replication connection code (Ian)
repmgr: check permissions for "pg_promote()" and fall back to pg_ctl
if necessary (Ian)
repmgr: in --dry-run mode, display promote command which will be used (Ian)
repmgr: enable "service_promote_command" in PostgreSQL 12 (Ian)
repmgr: accept option -S/--superuser for "node check"; GitHub #612 (Ian)
5.0 2019-10-15
general: add PostgreSQL 12 support (Ian)
general: parse configuration file using flex (Ian)
repmgr: rename "repmgr daemon ..." commands to "repmgr service ..." (Ian)
4.4.1 2019-??-??
repmgr: improve data directory check (Ian)
repmgr: improve extension check during "standby clone" (Ian)
repmgr: pass provided log level when executing repmgr remotely (Ian)
repmgrd: fix handling of upstream node change check (Ian)
4.4 2019-06-27
repmgr: improve "daemon status" output (Ian)

View File

@@ -2,7 +2,6 @@
# Makefile.global.in
# @configure_input@
# Can only be built using pgxs
USE_PGXS=1
@@ -15,9 +14,6 @@ ifeq ($(vpath_build),yes)
VPATH := $(repmgr_abs_srcdir)/$(repmgr_subdir)
USE_VPATH :=$(VPATH)
endif
SED=@SED@
GIT_WORK_TREE=${repmgr_abs_srcdir}
GIT_DIR=${repmgr_abs_srcdir}/.git
export GIT_DIR
@@ -30,11 +26,3 @@ include $(PGXS)
REPMGR_VERSION=$(shell awk '/^\#define REPMGR_VERSION / { print $3; }' ${repmgr_abs_srcdir}/repmgr_version.h.in | cut -d '"' -f 2)
REPMGR_RELEASE_DATE=$(shell awk '/^\#define REPMGR_RELEASE_DATE / { print $3; }' ${repmgr_abs_srcdir}/repmgr_version.h.in | cut -d '"' -f 2)
FLEX = flex
##########################################################################
#
# Global targets and rules
%.c: %.l
$(FLEX) $(FLEXFLAGS) -o'$@' $<

View File

@@ -11,8 +11,6 @@ EXTENSION = repmgr
DATA = \
repmgr--unpackaged--4.0.sql \
repmgr--unpackaged--5.1.sql \
repmgr--unpackaged--5.2.sql \
repmgr--4.0.sql \
repmgr--4.0--4.1.sql \
repmgr--4.1.sql \
@@ -21,13 +19,7 @@ DATA = \
repmgr--4.2--4.3.sql \
repmgr--4.3.sql \
repmgr--4.3--4.4.sql \
repmgr--4.4.sql \
repmgr--4.4--5.0.sql \
repmgr--5.0.sql \
repmgr--5.0--5.1.sql \
repmgr--5.1.sql \
repmgr--5.1--5.2.sql \
repmgr--5.2.sql
repmgr--4.4.sql
REGRESS = repmgr_extension
@@ -59,19 +51,13 @@ $(info Building against PostgreSQL $(MAJORVERSION))
REPMGR_CLIENT_OBJS = repmgr-client.o \
repmgr-action-primary.o repmgr-action-standby.o repmgr-action-witness.o \
repmgr-action-cluster.o repmgr-action-node.o repmgr-action-service.o repmgr-action-daemon.o \
configdata.o configfile.o configfile-scan.o log.o strutil.o controldata.o dirutil.o compat.o \
dbutils.o sysutils.o
REPMGRD_OBJS = repmgrd.o repmgrd-physical.o configdata.o configfile.o configfile-scan.o log.o \
dbutils.o strutil.o controldata.o compat.o sysutils.o
repmgr-action-bdr.o repmgr-action-cluster.o repmgr-action-node.o repmgr-action-daemon.o \
configfile.o log.o strutil.o controldata.o dirutil.o compat.o dbutils.o sysutils.o
REPMGRD_OBJS = repmgrd.o repmgrd-physical.o repmgrd-bdr.o configfile.o log.o dbutils.o strutil.o controldata.o compat.o sysutils.o
DATE=$(shell date "+%Y-%m-%d")
repmgr_version.h: repmgr_version.h.in
$(SED) -E 's/REPMGR_VERSION_DATE.*""/REPMGR_VERSION_DATE "$(DATE)"/' $< >$@; \
$(SED) -i -E 's/PG_ACTUAL_VERSION_NUM/PG_ACTUAL_VERSION_NUM $(VERSION_NUM)/' $@
configfile-scan.c: configfile-scan.l
sed '0,/REPMGR_VERSION_DATE/s,\(REPMGR_VERSION_DATE\).*,\1 "$(DATE)",' $< >$@
$(REPMGR_CLIENT_OBJS): repmgr-client.h repmgr_version.h
@@ -112,7 +98,6 @@ maintainer-clean: additional-maintainer-clean
additional-clean:
rm -f *.o
rm -f repmgr_version.h
$(MAKE) -C doc clean
additional-maintainer-clean: clean

View File

@@ -7,30 +7,32 @@ replication capabilities with utilities to set up standby servers, monitor
replication, and perform administrative tasks such as failover or switchover
operations.
PostgreSQL 12, 11, 10, 9.6 and 9.5 are fully supported.
`repmgr 4` is a complete rewrite of the existing `repmgr` codebase, allowing
the use of all of the latest features in PostgreSQL replication.
PostgreSQL 11, 10, 9.6 and 9.5 are fully supported.
PostgreSQL 9.4 and 9.3 are supported, with some restrictions.
`repmgr` is distributed under the GNU GPL 3 and maintained by 2ndQuadrant.
### BDR support
`repmgr 4` supports monitoring of a two-node BDR 2.0 cluster on PostgreSQL 9.6
only. Note that BDR 2.0 is not publicly available; please contact 2ndQuadrant
for details.
Documentation
-------------
The full `repmgr` documentation is available here:
The main `repmgr` documentation is available here:
> [repmgr documentation](https://repmgr.org/docs/current/index.html)
The old `README` file for `repmgr` 3.x is available here:
The `README` file for `repmgr` 3.x is available here:
> https://github.com/2ndQuadrant/repmgr/blob/REL3_3_STABLE/README.md
Note that the `repmgr` 3.x series is no longer supported and contains known bugs;
please upgrade to the current `repmgr` version as soon as possible.
Versions
--------
For an overview of `repmgr` versions and PostgreSQL compatibility, see the
[repmgr compatibility matrix](https://repmgr.org/docs/current/install-requirements.html#INSTALL-COMPATIBILITY-MATRIX).
Files
------
@@ -70,8 +72,6 @@ Please report bugs and other issues to:
* https://github.com/2ndQuadrant/repmgr
See
Further information is available at https://repmgr.org/
We'd love to hear from you about how you use repmgr. Case studies and
@@ -98,8 +98,6 @@ Further reading
---------------
* [repmgr documentation](https://repmgr.org/docs/current/index.html)
* [How to Automate PostgreSQL 12 Replication and Failover with repmgr - Part 1](https://www.2ndquadrant.com/en/blog/how-to-automate-postgresql-12-replication-and-failover-with-repmgr-part-1/)
* [How to Automate PostgreSQL 12 Replication and Failover with repmgr - Part 2](https://www.2ndquadrant.com/en/blog/how-to-automate-postgresql-12-replication-and-failover-with-repmgr-part-2/)
* https://blog.2ndquadrant.com/repmgr-3-2-is-here-barman-support-brand-new-high-availability-features/
* https://blog.2ndquadrant.com/improvements-in-repmgr-3-1-4/
* https://blog.2ndquadrant.com/managing-useful-clusters-repmgr/

View File

@@ -6,7 +6,7 @@
* supported PostgreSQL versions. They're unlikely to change but
* it would be worth keeping an eye on them for any fixes/improvements.
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California

View File

@@ -1,6 +1,6 @@
/*
* compat.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California

View File

@@ -1,936 +0,0 @@
/*
* configdata.c - contains structs with parsed configuration data
*
* Copyright (c) 2ndQuadrant, 2010-2020
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "repmgr.h"
#include "configfile.h"
/*
* Parsed configuration settings are stored here
*/
t_configuration_options config_file_options;
/*
* Configuration settings are defined here
*/
struct ConfigFileSetting config_file_settings[] =
{
/* ================
* node information
* ================
*/
/* node_id */
{
"node_id",
CONFIG_INT,
{ .intptr = &config_file_options.node_id },
{ .intdefault = UNKNOWN_NODE_ID },
{ .intminval = MIN_NODE_ID },
{},
{}
},
/* node_name */
{
"node_name",
CONFIG_STRING,
{ .strptr = config_file_options.node_name },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.node_name) },
{}
},
/* conninfo */
{
"conninfo",
CONFIG_STRING,
{ .strptr = config_file_options.conninfo },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.conninfo) },
{}
},
/* replication_user */
{
"replication_user",
CONFIG_STRING,
{ .strptr = config_file_options.replication_user },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.replication_user) },
{}
},
/* data_directory */
{
"data_directory",
CONFIG_STRING,
{ .strptr = config_file_options.data_directory },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.data_directory) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* config_directory */
{
"config_directory",
CONFIG_STRING,
{ .strptr = config_file_options.config_directory },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.config_directory) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* pg_bindir */
{
"pg_bindir",
CONFIG_STRING,
{ .strptr = config_file_options.pg_bindir },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_bindir) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* repmgr_bindir */
{
"repmgr_bindir",
CONFIG_STRING,
{ .strptr = config_file_options.repmgr_bindir },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgr_bindir) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* replication_type */
{
"replication_type",
CONFIG_INT,
{ .intptr = &config_file_options.replication_type },
{ .intdefault = REPLICATION_TYPE_PHYSICAL },
{ .intminval = -1 },
{},
{}
},
/* ================
* logging settings
* ================
*/
/*
* log_level
* NOTE: the default for "log_level" is set in log.c and does not need
* to be initialised here
*/
{
"log_level",
CONFIG_STRING,
{ .strptr = config_file_options.log_level },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.log_level) },
{}
},
/* log_facility */
{
"log_facility",
CONFIG_STRING,
{ .strptr = config_file_options.log_facility },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.log_facility) },
{}
},
/* log_file */
{
"log_file",
CONFIG_STRING,
{ .strptr = config_file_options.log_file },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.log_file) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* log_status_interval */
{
"log_status_interval",
CONFIG_INT,
{ .intptr = &config_file_options.log_status_interval },
{ .intdefault = DEFAULT_LOG_STATUS_INTERVAL, },
{ .intminval = 0 },
{},
{}
},
/* ======================
* standby clone settings
* ======================
*/
/* use_replication_slots */
{
"use_replication_slots",
CONFIG_BOOL,
{ .boolptr = &config_file_options.use_replication_slots },
{ .booldefault = DEFAULT_USE_REPLICATION_SLOTS },
{},
{},
{}
},
/* pg_basebackup_options */
{
"pg_basebackup_options",
CONFIG_STRING,
{ .strptr = config_file_options.pg_basebackup_options },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_basebackup_options) },
{}
},
/* restore_command */
{
"restore_command",
CONFIG_STRING,
{ .strptr = config_file_options.restore_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.restore_command) },
{}
},
/* tablespace_mapping */
{
"tablespace_mapping",
CONFIG_TABLESPACE_MAPPING,
{ .tablespacemappingptr = &config_file_options.tablespace_mapping },
{},
{},
{},
{}
},
/* recovery_min_apply_delay */
{
"recovery_min_apply_delay",
CONFIG_STRING,
{ .strptr = config_file_options.recovery_min_apply_delay },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.recovery_min_apply_delay) },
{
.process_func = &parse_time_unit_parameter,
.providedptr = &config_file_options.recovery_min_apply_delay_provided
}
},
/* archive_cleanup_command */
{
"archive_cleanup_command",
CONFIG_STRING,
{ .strptr = config_file_options.archive_cleanup_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.archive_cleanup_command) },
{}
},
/* use_primary_conninfo_password */
{
"use_primary_conninfo_password",
CONFIG_BOOL,
{ .boolptr = &config_file_options.use_primary_conninfo_password },
{ .booldefault = DEFAULT_USE_PRIMARY_CONNINFO_PASSWORD },
{},
{},
{}
},
/* passfile */
{
"passfile",
CONFIG_STRING,
{ .strptr = config_file_options.passfile },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.passfile) },
{}
},
/* ======================
* standby clone settings
* ======================
*/
/* promote_check_timeout */
{
"promote_check_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.promote_check_timeout },
{ .intdefault = DEFAULT_PROMOTE_CHECK_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* promote_check_interval */
{
"promote_check_interval",
CONFIG_INT,
{ .intptr = &config_file_options.promote_check_interval },
{ .intdefault = DEFAULT_PROMOTE_CHECK_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* =======================
* standby follow settings
* =======================
*/
/* primary_follow_timeout */
{
"primary_follow_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.primary_follow_timeout },
{ .intdefault = DEFAULT_PRIMARY_FOLLOW_TIMEOUT, },
{ .intminval = 1 },
{},
{}
},
/* standby_follow_timeout */
{
"standby_follow_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.standby_follow_timeout },
{ .intdefault = DEFAULT_STANDBY_FOLLOW_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* standby_follow_restart */
{
"standby_follow_restart",
CONFIG_BOOL,
{ .boolptr = &config_file_options.standby_follow_restart },
{ .booldefault = DEFAULT_STANDBY_FOLLOW_RESTART },
{},
{},
{}
},
/* ===========================
* standby switchover settings
* ===========================
*/
/* shutdown_check_timeout */
{
"shutdown_check_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.shutdown_check_timeout },
{ .intdefault = DEFAULT_SHUTDOWN_CHECK_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* standby_reconnect_timeout */
{
"standby_reconnect_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.standby_reconnect_timeout },
{ .intdefault = DEFAULT_STANDBY_RECONNECT_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* wal_receive_check_timeout */
{
"wal_receive_check_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.wal_receive_check_timeout },
{ .intdefault = DEFAULT_WAL_RECEIVE_CHECK_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* ====================
* node rejoin settings
* ====================
*/
/* node_rejoin_timeout */
{
"node_rejoin_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.node_rejoin_timeout },
{ .intdefault = DEFAULT_NODE_REJOIN_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* ===================
* node check settings
* ===================
*/
/* archive_ready_warning */
{
"archive_ready_warning",
CONFIG_INT,
{ .intptr = &config_file_options.archive_ready_warning },
{ .intdefault = DEFAULT_ARCHIVE_READY_WARNING },
{ .intminval = 1 },
{},
{}
},
/* archive_ready_critical */
{
"archive_ready_critical",
CONFIG_INT,
{ .intptr = &config_file_options.archive_ready_critical },
{ .intdefault = DEFAULT_ARCHIVE_READY_CRITICAL },
{ .intminval = 1 },
{},
{}
},
/* replication_lag_warning */
{
"replication_lag_warning",
CONFIG_INT,
{ .intptr = &config_file_options.replication_lag_warning },
{ .intdefault = DEFAULT_REPLICATION_LAG_WARNING },
{ .intminval = 1 },
{},
{}
},
/* replication_lag_critical */
{
"replication_lag_critical",
CONFIG_INT,
{ .intptr = &config_file_options.replication_lag_critical },
{ .intdefault = DEFAULT_REPLICATION_LAG_CRITICAL },
{ .intminval = 1 },
{},
{}
},
/* ================
* witness settings
* ================
*/
/* witness_sync_interval */
{
"witness_sync_interval",
CONFIG_INT,
{ .intptr = &config_file_options.witness_sync_interval },
{ .intdefault = DEFAULT_WITNESS_SYNC_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* ================
* repmgrd settings
* ================
*/
/* failover */
{
"failover",
CONFIG_FAILOVER_MODE,
{ .failovermodeptr = &config_file_options.failover },
{ .failovermodedefault = FAILOVER_MANUAL },
{},
{},
{}
},
/* location */
{
"location",
CONFIG_STRING,
{ .strptr = config_file_options.location },
{ .strdefault = DEFAULT_LOCATION },
{},
{ .strmaxlen = sizeof(config_file_options.location) },
{}
},
/* priority */
{
"priority",
CONFIG_INT,
{ .intptr = &config_file_options.priority },
{ .intdefault = DEFAULT_PRIORITY, },
{ .intminval = 0 },
{},
{}
},
/* promote_command */
{
"promote_command",
CONFIG_STRING,
{ .strptr = config_file_options.promote_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.promote_command) },
{}
},
/* follow_command */
{
"follow_command",
CONFIG_STRING,
{ .strptr = config_file_options.follow_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.follow_command) },
{}
},
/* monitor_interval_secs */
{
"monitor_interval_secs",
CONFIG_INT,
{ .intptr = &config_file_options.monitor_interval_secs },
{ .intdefault = DEFAULT_MONITORING_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* reconnect_attempts */
{
"reconnect_attempts",
CONFIG_INT,
{ .intptr = &config_file_options.reconnect_attempts },
{ .intdefault = DEFAULT_RECONNECTION_ATTEMPTS },
{ .intminval = 0 },
{},
{}
},
/* reconnect_interval */
{
"reconnect_interval",
CONFIG_INT,
{ .intptr = &config_file_options.reconnect_interval },
{ .intdefault = DEFAULT_RECONNECTION_INTERVAL },
{ .intminval = 0 },
{},
{}
},
/* monitoring_history */
{
"monitoring_history",
CONFIG_BOOL,
{ .boolptr = &config_file_options.monitoring_history },
{ .booldefault = DEFAULT_MONITORING_HISTORY },
{},
{},
{}
},
/* degraded_monitoring_timeout */
{
"degraded_monitoring_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.degraded_monitoring_timeout },
{ .intdefault = DEFAULT_DEGRADED_MONITORING_TIMEOUT },
{ .intminval = -1 },
{},
{}
},
/* async_query_timeout */
{
"async_query_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.async_query_timeout },
{ .intdefault = DEFAULT_ASYNC_QUERY_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* primary_notification_timeout */
{
"primary_notification_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.primary_notification_timeout },
{ .intdefault = DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* repmgrd_standby_startup_timeout */
{
"repmgrd_standby_startup_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.repmgrd_standby_startup_timeout },
{ .intdefault = DEFAULT_REPMGRD_STANDBY_STARTUP_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* repmgrd_pid_file */
{
"repmgrd_pid_file",
CONFIG_STRING,
{ .strptr = config_file_options.repmgrd_pid_file },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgrd_pid_file) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* standby_disconnect_on_failover */
{
"standby_disconnect_on_failover",
CONFIG_BOOL,
{ .boolptr = &config_file_options.standby_disconnect_on_failover },
{ .booldefault = DEFAULT_STANDBY_DISCONNECT_ON_FAILOVER },
{},
{},
{}
},
/* sibling_nodes_disconnect_timeout */
{
"sibling_nodes_disconnect_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.sibling_nodes_disconnect_timeout },
{ .intdefault = DEFAULT_SIBLING_NODES_DISCONNECT_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* connection_check_type */
{
"connection_check_type",
CONFIG_CONNECTION_CHECK_TYPE,
{ .checktypeptr = &config_file_options.connection_check_type },
{ .checktypedefault = DEFAULT_CONNECTION_CHECK_TYPE },
{},
{},
{}
},
/* primary_visibility_consensus */
{
"primary_visibility_consensus",
CONFIG_BOOL,
{ .boolptr = &config_file_options.primary_visibility_consensus },
{ .booldefault = DEFAULT_PRIMARY_VISIBILITY_CONSENSUS },
{},
{},
{}
},
/* always_promote */
{
"always_promote",
CONFIG_BOOL,
{ .boolptr = &config_file_options.always_promote },
{ .booldefault = DEFAULT_ALWAYS_PROMOTE },
{},
{},
{}
},
/* failover_validation_command */
{
"failover_validation_command",
CONFIG_STRING,
{ .strptr = config_file_options.failover_validation_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.failover_validation_command) },
{}
},
/* election_rerun_interval */
{
"election_rerun_interval",
CONFIG_INT,
{ .intptr = &config_file_options.election_rerun_interval },
{ .intdefault = DEFAULT_ELECTION_RERUN_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* child_nodes_check_interval */
{
"child_nodes_check_interval",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_check_interval },
{ .intdefault = DEFAULT_CHILD_NODES_CHECK_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* child_nodes_disconnect_min_count */
{
"child_nodes_disconnect_min_count",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_disconnect_min_count },
{ .intdefault = DEFAULT_CHILD_NODES_DISCONNECT_MIN_COUNT },
{ .intminval = -1 },
{},
{}
},
/* child_nodes_connected_min_count */
{
"child_nodes_connected_min_count",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_connected_min_count },
{ .intdefault = DEFAULT_CHILD_NODES_CONNECTED_MIN_COUNT},
{ .intminval = -1 },
{},
{}
},
/* child_nodes_connected_include_witness */
{
"child_nodes_connected_include_witness",
CONFIG_BOOL,
{ .boolptr = &config_file_options.child_nodes_connected_include_witness },
{ .booldefault = DEFAULT_CHILD_NODES_CONNECTED_INCLUDE_WITNESS },
{},
{},
{}
},
/* child_nodes_disconnect_timeout */
{
"child_nodes_disconnect_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_disconnect_timeout },
{ .intdefault = DEFAULT_CHILD_NODES_DISCONNECT_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* child_nodes_disconnect_command */
{
"child_nodes_disconnect_command",
CONFIG_STRING,
{ .strptr = config_file_options.child_nodes_disconnect_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.child_nodes_disconnect_command) },
{}
},
/* ================
* service settings
* ================
*/
/* pg_ctl_options */
{
"pg_ctl_options",
CONFIG_STRING,
{ .strptr = config_file_options.pg_ctl_options },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_ctl_options) },
{}
},
/* service_start_command */
{
"service_start_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_start_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_start_command) },
{}
},
/* service_stop_command */
{
"service_stop_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_stop_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_stop_command) },
{}
},
/* service_restart_command */
{
"service_restart_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_restart_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_restart_command) },
{}
},
/* service_reload_command */
{
"service_reload_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_reload_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_reload_command) },
{}
},
/* service_promote_command */
{
"service_promote_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_promote_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_promote_command) },
{}
},
/* ========================
* repmgrd service settings
* ========================
*/
/* repmgrd_service_start_command */
{
"repmgrd_service_start_command",
CONFIG_STRING,
{ .strptr = config_file_options.repmgrd_service_start_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgrd_service_start_command) },
{}
},
/* repmgrd_service_stop_command */
{
"repmgrd_service_stop_command",
CONFIG_STRING,
{ .strptr = config_file_options.repmgrd_service_stop_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgrd_service_stop_command) },
{}
},
/* ===========================
* event notification settings
* ===========================
*/
/* event_notification_command */
{
"event_notification_command",
CONFIG_STRING,
{ .strptr = config_file_options.event_notification_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.event_notification_command) },
{}
},
{
"event_notifications",
CONFIG_EVENT_NOTIFICATION_LIST,
{ .notificationlistptr = &config_file_options.event_notifications },
{},
{},
{},
{}
},
/* ===============
* barman settings
* ===============
*/
/* barman_host */
{
"barman_host",
CONFIG_STRING,
{ .strptr = config_file_options.barman_host },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.barman_host) },
{}
},
/* barman_server */
{
"barman_server",
CONFIG_STRING,
{ .strptr = config_file_options.barman_server },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.barman_server) },
{}
},
/* barman_config */
{
"barman_config",
CONFIG_STRING,
{ .strptr = config_file_options.barman_config },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.barman_config) },
{}
},
/* ==================
* rsync/ssh settings
* ==================
*/
/* rsync_options */
{
"rsync_options",
CONFIG_STRING,
{ .strptr = config_file_options.rsync_options },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.rsync_options) },
{}
},
/* ssh_options */
{
"ssh_options",
CONFIG_STRING,
{ .strptr = config_file_options.ssh_options },
{ .strdefault = DEFAULT_SSH_OPTIONS },
{},
{ .strmaxlen = sizeof(config_file_options.ssh_options) },
{}
},
/* ==================================
* undocumented experimental settings
* ==================================
*/
/* reconnect_loop_sync */
{
"reconnect_loop_sync",
CONFIG_BOOL,
{ .boolptr = &config_file_options.reconnect_loop_sync },
{ .booldefault = false },
{},
{},
{}
},
/* ==========================
* undocumented test settings
* ==========================
*/
/* promote_delay */
{
"promote_delay",
CONFIG_INT,
{ .intptr = &config_file_options.promote_delay },
{ .intdefault = 0 },
{ .intminval = 1 },
{},
{}
},
/* failover_delay */
{
"failover_delay",
CONFIG_INT,
{ .intptr = &config_file_options.failover_delay },
{ .intdefault = 0 },
{ .intminval = 1 },
{},
{}
},
{
"connection_check_query",
CONFIG_STRING,
{ .strptr = config_file_options.connection_check_query },
{ .strdefault = "SELECT 1" },
{},
{ .strmaxlen = sizeof(config_file_options.connection_check_query) },
{}
},
/* End-of-list marker */
{
NULL, CONFIG_INT, {}, {}, {}, {}, {}
}
};

View File

@@ -1,644 +0,0 @@
/*
* Scanner for the configuration file
*/
%{
#include <setjmp.h>
#include <sys/stat.h>
#include <dirent.h>
#include "repmgr.h"
#include "configfile.h"
/*
* flex emits a yy_fatal_error() function that it calls in response to
* critical errors like malloc failure, file I/O errors, and detection of
* internal inconsistency. That function prints a message and calls exit().
* Mutate it to instead call our handler, which jumps out of the parser.
*/
#undef fprintf
#define fprintf(file, fmt, msg) CONF_flex_fatal(msg)
enum
{
CONF_ID = 1,
CONF_STRING = 2,
CONF_INTEGER = 3,
CONF_REAL = 4,
CONF_EQUALS = 5,
CONF_UNQUOTED_STRING = 6,
CONF_QUALIFIED_ID = 7,
CONF_EOL = 99,
CONF_ERROR = 100
};
static unsigned int ConfigFileLineno;
static const char *CONF_flex_fatal_errmsg;
static sigjmp_buf *CONF_flex_fatal_jmp;
static char *CONF_scanstr(const char *s);
static int CONF_flex_fatal(const char *msg);
static bool ProcessConfigFile(const char *base_dir, const char *config_file, const char *calling_file, bool strict, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
static bool ProcessConfigFp(FILE *fp, const char *config_file, const char *calling_file, int depth, const char *base_dir, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
static bool ProcessConfigDirectory(const char *base_dir, const char *includedir, const char *calling_file, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
static char *AbsoluteConfigLocation(const char *base_dir, const char *location, const char *calling_file);
%}
%option 8bit
%option never-interactive
%option nodefault
%option noinput
%option nounput
%option noyywrap
%option warn
%option prefix="CONF_yy"
SIGN ("-"|"+")
DIGIT [0-9]
HEXDIGIT [0-9a-fA-F]
UNIT_LETTER [a-zA-Z]
INTEGER {SIGN}?({DIGIT}+|0x{HEXDIGIT}+){UNIT_LETTER}*
EXPONENT [Ee]{SIGN}?{DIGIT}+
REAL {SIGN}?{DIGIT}*"."{DIGIT}*{EXPONENT}?
LETTER [A-Za-z_\200-\377]
LETTER_OR_DIGIT [A-Za-z_0-9\200-\377]
ID {LETTER}{LETTER_OR_DIGIT}*
QUALIFIED_ID {ID}"."{ID}
UNQUOTED_STRING {LETTER}({LETTER_OR_DIGIT}|[-._:/])*
STRING \'([^'\\\n]|\\.|\'\')*\'
%%
\n ConfigFileLineno++; return CONF_EOL;
[ \t\r]+ /* eat whitespace */
#.* /* eat comment (.* matches anything until newline) */
{ID} return CONF_ID;
{QUALIFIED_ID} return CONF_QUALIFIED_ID;
{STRING} return CONF_STRING;
{UNQUOTED_STRING} return CONF_UNQUOTED_STRING;
{INTEGER} return CONF_INTEGER;
{REAL} return CONF_REAL;
= return CONF_EQUALS;
. return CONF_ERROR;
%%
extern bool
ProcessRepmgrConfigFile(const char *config_file, const char *base_dir, ItemList *error_list, ItemList *warning_list)
{
return ProcessConfigFile(base_dir, config_file, NULL, true, 0, NULL, error_list, warning_list);
}
extern bool
ProcessPostgresConfigFile(const char *config_file, const char *base_dir, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
return ProcessConfigFile(base_dir, config_file, NULL, true, 0, contents, error_list, warning_list);
}
static bool
ProcessConfigFile(const char *base_dir, const char *config_file, const char *calling_file, bool strict, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
char *abs_path;
bool success = true;
FILE *fp;
/*
* Reject file name that is all-blank (including empty), as that leads to
* confusion --- we'd try to read the containing directory as a file.
*/
if (strspn(config_file, " \t\r\n") == strlen(config_file))
{
return false;
}
/*
* Reject too-deep include nesting depth. This is just a safety check to
* avoid dumping core due to stack overflow if an include file loops back
* to itself. The maximum nesting depth is pretty arbitrary.
*/
if (depth > 10)
{
item_list_append_format(error_list,
_("could not open configuration file \"%s\": maximum nesting depth exceeded"),
config_file);
return false;
}
abs_path = AbsoluteConfigLocation(base_dir, config_file, calling_file);
/* Reject direct recursion */
if (calling_file && strcmp(abs_path, calling_file) == 0)
{
item_list_append_format(error_list,
_("configuration file recursion in \"%s\""),
calling_file);
pfree(abs_path);
return false;
}
fp = fopen(abs_path, "r");
if (!fp)
{
if (strict == false)
{
item_list_append_format(error_list,
"skipping configuration file \"%s\"",
abs_path);
}
else
{
item_list_append_format(error_list,
"could not open configuration file \"%s\": %s",
abs_path,
strerror(errno));
success = false;
}
}
else
{
success = ProcessConfigFp(fp, abs_path, calling_file, depth + 1, base_dir, contents, error_list, warning_list);
}
free(abs_path);
return success;
}
static bool
ProcessConfigFp(FILE *fp, const char *config_file, const char *calling_file, int depth, const char *base_dir, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
volatile bool OK = true;
volatile YY_BUFFER_STATE lex_buffer = NULL;
sigjmp_buf flex_fatal_jmp;
int errorcount;
int token;
if (sigsetjmp(flex_fatal_jmp, 1) == 0)
{
CONF_flex_fatal_jmp = &flex_fatal_jmp;
}
else
{
/*
* Regain control after a fatal, internal flex error. It may have
* corrupted parser state. Consequently, abandon the file, but trust
* that the state remains sane enough for yy_delete_buffer().
*/
item_list_append_format(error_list,
"%s at file \"%s\" line %u",
CONF_flex_fatal_errmsg, config_file, ConfigFileLineno);
OK = false;
goto cleanup;
}
/*
* Parse
*/
ConfigFileLineno = 1;
errorcount = 0;
lex_buffer = yy_create_buffer(fp, YY_BUF_SIZE);
yy_switch_to_buffer(lex_buffer);
/* This loop iterates once per logical line */
while ((token = yylex()))
{
char *opt_name = NULL;
char *opt_value = NULL;
if (token == CONF_EOL) /* empty or comment line */
continue;
/* first token on line is option name */
if (token != CONF_ID && token != CONF_QUALIFIED_ID)
goto parse_error;
opt_name = pstrdup(yytext);
/* next we have an optional equal sign; discard if present */
token = yylex();
if (token == CONF_EQUALS)
token = yylex();
/* now we must have the option value */
if (token != CONF_ID &&
token != CONF_STRING &&
token != CONF_INTEGER &&
token != CONF_REAL &&
token != CONF_UNQUOTED_STRING)
goto parse_error;
if (token == CONF_STRING) /* strip quotes and escapes */
opt_value = CONF_scanstr(yytext);
else
opt_value = pstrdup(yytext);
/* now we'd like an end of line, or possibly EOF */
token = yylex();
if (token != CONF_EOL)
{
if (token != 0)
goto parse_error;
/* treat EOF like \n for line numbering purposes, cf bug 4752 */
ConfigFileLineno++;
}
/* Handle include files */
if (base_dir != NULL && strcasecmp(opt_name, "include_dir") == 0)
{
/*
* An include_dir directive isn't a variable and should be
* processed immediately.
*/
if (!ProcessConfigDirectory(base_dir, opt_value, config_file,
depth + 1, contents,
error_list, warning_list))
OK = false;
yy_switch_to_buffer(lex_buffer);
pfree(opt_name);
pfree(opt_value);
}
else if (base_dir != NULL && strcasecmp(opt_name, "include_if_exists") == 0)
{
if (!ProcessConfigFile(base_dir, opt_value, config_file,
false, depth + 1, contents,
error_list, warning_list))
OK = false;
yy_switch_to_buffer(lex_buffer);
pfree(opt_name);
pfree(opt_value);
}
else if (base_dir != NULL && strcasecmp(opt_name, "include") == 0)
{
if (!ProcessConfigFile(base_dir, opt_value, config_file,
true, depth + 1, contents,
error_list, warning_list))
OK = false;
yy_switch_to_buffer(lex_buffer);
pfree(opt_name);
pfree(opt_value);
}
else
{
/* OK, process the option name and value */
if (contents != NULL)
{
key_value_list_replace_or_set(contents,
opt_name,
opt_value);
}
else
{
parse_configuration_item(error_list,
warning_list,
opt_name,
opt_value);
}
}
/* break out of loop if read EOF, else loop for next line */
if (token == 0)
break;
continue;
parse_error:
/* release storage if we allocated any on this line */
if (opt_name)
pfree(opt_name);
if (opt_value)
pfree(opt_value);
/* report the error */
if (token == CONF_EOL || token == 0)
{
item_list_append_format(error_list,
_("syntax error in file \"%s\" line %u, near end of line"),
config_file, ConfigFileLineno - 1);
}
else
{
item_list_append_format(error_list,
_("syntax error in file \"%s\" line %u, near token \"%s\""),
config_file, ConfigFileLineno, yytext);
}
OK = false;
errorcount++;
/*
* To avoid producing too much noise when fed a totally bogus file,
* give up after 100 syntax errors per file (an arbitrary number).
* Also, if we're only logging the errors at DEBUG level anyway, might
* as well give up immediately. (This prevents postmaster children
* from bloating the logs with duplicate complaints.)
*/
if (errorcount >= 100)
{
fprintf(stderr,
_("too many syntax errors found, abandoning file \"%s\"\n"),
config_file);
break;
}
/* resync to next end-of-line or EOF */
while (token != CONF_EOL && token != 0)
token = yylex();
/* break out of loop on EOF */
if (token == 0)
break;
}
cleanup:
yy_delete_buffer(lex_buffer);
return OK;
}
/*
* Read and parse all config files in a subdirectory in alphabetical order
*
* includedir is the absolute or relative path to the subdirectory to scan.
*
* See ProcessConfigFp for further details.
*/
static bool
ProcessConfigDirectory(const char *base_dir, const char *includedir, const char *calling_file, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
char *directory;
DIR *d;
struct dirent *de;
char **filenames;
int num_filenames;
int size_filenames;
bool status;
/*
* Reject directory name that is all-blank (including empty), as that
* leads to confusion --- we'd read the containing directory, typically
* resulting in recursive inclusion of the same file(s).
*/
if (strspn(includedir, " \t\r\n") == strlen(includedir))
{
item_list_append_format(error_list,
_("empty configuration directory name: \"%s\""),
includedir);
return false;
}
directory = AbsoluteConfigLocation(base_dir, includedir, calling_file);
d = opendir(directory);
if (d == NULL)
{
item_list_append_format(error_list,
_("could not open configuration directory \"%s\": %s"),
directory,
strerror(errno));
status = false;
goto cleanup;
}
/*
* Read the directory and put the filenames in an array, so we can sort
* them prior to processing the contents.
*/
size_filenames = 32;
filenames = (char **) palloc(size_filenames * sizeof(char *));
num_filenames = 0;
while ((de = readdir(d)) != NULL)
{
struct stat st;
char filename[MAXPGPATH];
/*
* Only parse files with names ending in ".conf". Explicitly reject
* files starting with ".". This excludes things like "." and "..",
* as well as typical hidden files, backup files, and editor debris.
*/
if (strlen(de->d_name) < 6)
continue;
if (de->d_name[0] == '.')
continue;
if (strcmp(de->d_name + strlen(de->d_name) - 5, ".conf") != 0)
continue;
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
if (stat(filename, &st) == 0)
{
if (!S_ISDIR(st.st_mode))
{
/* Add file to array, increasing its size in blocks of 32 */
if (num_filenames >= size_filenames)
{
size_filenames += 32;
filenames = (char **) repalloc(filenames,
size_filenames * sizeof(char *));
}
filenames[num_filenames] = pstrdup(filename);
num_filenames++;
}
}
else
{
/*
* stat does not care about permissions, so the most likely reason
* a file can't be accessed now is if it was removed between the
* directory listing and now.
*/
item_list_append_format(error_list,
_("could not stat file \"%s\": %s"),
filename, strerror(errno));
status = false;
goto cleanup;
}
}
if (num_filenames > 0)
{
int i;
qsort(filenames, num_filenames, sizeof(char *), pg_qsort_strcmp);
for (i = 0; i < num_filenames; i++)
{
if (!ProcessConfigFile(base_dir, filenames[i], calling_file,
true, depth, contents,
error_list, warning_list))
{
status = false;
goto cleanup;
}
}
}
status = true;
cleanup:
if (d)
closedir(d);
pfree(directory);
return status;
}
/*
* scanstr
*
* Strip the quotes surrounding the given string, and collapse any embedded
* '' sequences and backslash escapes.
*
* the string returned is palloc'd and should eventually be pfree'd by the
* caller.
*/
static char *
CONF_scanstr(const char *s)
{
char *newStr;
int len,
i,
j;
Assert(s != NULL && s[0] == '\'');
len = strlen(s);
Assert(s != NULL);
Assert(len >= 2);
Assert(s[len - 1] == '\'');
/* Skip the leading quote; we'll handle the trailing quote below */
s++, len--;
/* Since len still includes trailing quote, this is enough space */
newStr = palloc(len);
for (i = 0, j = 0; i < len; i++)
{
if (s[i] == '\\')
{
i++;
switch (s[i])
{
case 'b':
newStr[j] = '\b';
break;
case 'f':
newStr[j] = '\f';
break;
case 'n':
newStr[j] = '\n';
break;
case 'r':
newStr[j] = '\r';
break;
case 't':
newStr[j] = '\t';
break;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
{
int k;
long octVal = 0;
for (k = 0;
s[i + k] >= '0' && s[i + k] <= '7' && k < 3;
k++)
octVal = (octVal << 3) + (s[i + k] - '0');
i += k - 1;
newStr[j] = ((char) octVal);
}
break;
default:
newStr[j] = s[i];
break;
} /* switch */
}
else if (s[i] == '\'' && s[i + 1] == '\'')
{
/* doubled quote becomes just one quote */
newStr[j] = s[++i];
}
else
newStr[j] = s[i];
j++;
}
/* We copied the ending quote to newStr, so replace with \0 */
Assert(j > 0 && j <= len);
newStr[--j] = '\0';
return newStr;
}
/*
* Given a configuration file or directory location that may be a relative
* path, return an absolute one. We consider the location to be relative to
* the directory holding the calling file, or to DataDir if no calling file.
*/
static char *
AbsoluteConfigLocation(const char *base_dir, const char *location, const char *calling_file)
{
char abs_path[MAXPGPATH];
if (is_absolute_path(location))
return strdup(location);
if (calling_file != NULL)
{
strlcpy(abs_path, calling_file, sizeof(abs_path));
get_parent_directory(abs_path);
join_path_components(abs_path, abs_path, location);
canonicalize_path(abs_path);
}
else if (base_dir != NULL)
{
join_path_components(abs_path, base_dir, location);
canonicalize_path(abs_path);
}
else
{
strlcpy(abs_path, location, sizeof(abs_path));
}
return strdup(abs_path);
}
/*
* Flex fatal errors bring us here. Stash the error message and jump back to
* ParseConfigFp(). Assume all msg arguments point to string constants; this
* holds for flex 2.5.31 (earliest we support) and flex 2.5.35 (latest as of
* this writing). Otherwise, we would need to copy the message.
*
* We return "int" since this takes the place of calls to fprintf().
*/
static int
CONF_flex_fatal(const char *msg)
{
CONF_flex_fatal_errmsg = msg;
siglongjmp(*CONF_flex_fatal_jmp, 1);
return 0; /* keep compiler quiet */
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,7 +1,7 @@
/*
* configfile.h
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
*
* This program is free software: you can redistribute it and/or modify
@@ -28,12 +28,6 @@
/* magic number for use in t_recovery_conf */
#define TARGET_TIMELINE_LATEST 0
/*
* This is defined in src/include/utils.h, however it's not practical
* to include that from a frontend application.
*/
#define PG_AUTOCONF_FILENAME "postgresql.auto.conf"
extern bool config_file_found;
extern char config_file_path[MAXPGPATH];
@@ -78,55 +72,6 @@ typedef struct TablespaceList
} TablespaceList;
typedef enum
{
CONFIG_BOOL,
CONFIG_INT,
CONFIG_STRING,
CONFIG_FAILOVER_MODE,
CONFIG_CONNECTION_CHECK_TYPE,
CONFIG_EVENT_NOTIFICATION_LIST,
CONFIG_TABLESPACE_MAPPING
} ConfigItemType;
typedef struct ConfigFileSetting
{
const char *name;
ConfigItemType type;
union
{
int *intptr;
char *strptr;
bool *boolptr;
failover_mode_opt *failovermodeptr;
ConnectionCheckType *checktypeptr;
EventNotificationList *notificationlistptr;
TablespaceList *tablespacemappingptr;
} val;
union {
int intdefault;
const char *strdefault;
bool booldefault;
failover_mode_opt failovermodedefault;
ConnectionCheckType checktypedefault;
} defval;
union {
int intminval;
} minval;
union {
int strmaxlen;
} maxval;
struct {
void (*process_func)(const char *, const char *, char *, ItemList *errors);
void (*postprocess_func)(const char *, const char *, char *, ItemList *errors);
bool *providedptr;
} process;
} ConfigFileSetting;
/* Declare the main configfile structure for client applications */
extern ConfigFileSetting config_file_settings[];
typedef struct
{
/* node information */
@@ -164,7 +109,6 @@ typedef struct
/* standby follow settings */
int primary_follow_timeout;
int standby_follow_timeout;
bool standby_follow_restart;
/* standby switchover settings */
int shutdown_check_timeout;
@@ -202,7 +146,6 @@ typedef struct
int sibling_nodes_disconnect_timeout;
ConnectionCheckType connection_check_type;
bool primary_visibility_consensus;
bool always_promote;
char failover_validation_command[MAXPGPATH];
int election_rerun_interval;
int child_nodes_check_interval;
@@ -212,6 +155,10 @@ typedef struct
int child_nodes_disconnect_timeout;
char child_nodes_disconnect_command[MAXPGPATH];
/* BDR settings */
bool bdr_local_monitoring_only;
bool bdr_recovery_timeout;
/* service settings */
char pg_ctl_options[MAXLEN];
char service_start_command[MAXPGPATH];
@@ -238,35 +185,79 @@ typedef struct
char rsync_options[MAXLEN];
char ssh_options[MAXLEN];
/*
* undocumented settings
*
* These settings are for testing or experimential features
* and may be changed without notice.
*/
/* experimental settings */
bool reconnect_loop_sync;
/* test settings */
/* undocumented test settings */
int promote_delay;
int failover_delay;
char connection_check_query[MAXLEN];
} t_configuration_options;
/*
* The following will initialize the structure with a minimal set of options;
* actual defaults are set in parse_config() before parsing the configuration file
*/
#define T_CONFIGURATION_OPTIONS_INITIALIZER { \
/* node information */ \
UNKNOWN_NODE_ID, "", "", "", "", "", "", "", REPLICATION_TYPE_PHYSICAL, \
/* log settings */ \
"", "", "", DEFAULT_LOG_STATUS_INTERVAL, \
/* standby clone settings */ \
false, "", "", { NULL, NULL }, "", false, "", false, "", \
/* standby promote settings */ \
DEFAULT_PROMOTE_CHECK_TIMEOUT, DEFAULT_PROMOTE_CHECK_INTERVAL, \
/* standby follow settings */ \
DEFAULT_PRIMARY_FOLLOW_TIMEOUT, \
DEFAULT_STANDBY_FOLLOW_TIMEOUT, \
/* standby switchover settings */ \
DEFAULT_SHUTDOWN_CHECK_TIMEOUT, \
DEFAULT_STANDBY_RECONNECT_TIMEOUT, \
DEFAULT_WAL_RECEIVE_CHECK_TIMEOUT, \
/* node rejoin settings */ \
DEFAULT_NODE_REJOIN_TIMEOUT, \
/* node check settings */ \
DEFAULT_ARCHIVE_READY_WARNING, DEFAULT_ARCHIVE_READY_CRITICAL, \
DEFAULT_REPLICATION_LAG_WARNING, DEFAULT_REPLICATION_LAG_CRITICAL, \
/* witness settings */ \
DEFAULT_WITNESS_SYNC_INTERVAL, \
/* repmgrd settings */ \
FAILOVER_MANUAL, DEFAULT_LOCATION, DEFAULT_PRIORITY, "", "", \
DEFAULT_MONITORING_INTERVAL, \
DEFAULT_RECONNECTION_ATTEMPTS, \
DEFAULT_RECONNECTION_INTERVAL, \
false, -1, \
DEFAULT_ASYNC_QUERY_TIMEOUT, \
DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT, \
-1, "", false, DEFAULT_SIBLING_NODES_DISCONNECT_TIMEOUT, \
CHECK_PING, true, "", DEFAULT_ELECTION_RERUN_INTERVAL, \
DEFAULT_CHILD_NODES_CHECK_INTERVAL, \
DEFAULT_CHILD_NODES_DISCONNECT_MIN_COUNT, \
DEFAULT_CHILD_NODES_CONNECTED_MIN_COUNT, \
DEFAULT_CHILD_NODES_CONNECTED_INCLUDE_WITNESS, \
DEFAULT_CHILD_NODES_DISCONNECT_TIMEOUT, "", \
/* BDR settings */ \
false, DEFAULT_BDR_RECOVERY_TIMEOUT, \
/* service settings */ \
"", "", "", "", "", "", \
/* repmgrd service settings */ \
"", "", \
/* event notification settings */ \
"", "", { NULL, NULL }, \
/* barman settings */ \
"", "", "", \
/* rsync/ssh settings */ \
"", "", \
/* undocumented test settings */ \
0 \
}
/* Declare the main configfile structure for client applications */
extern t_configuration_options config_file_options;
typedef struct
{
char slot[MAXLEN];
char wal_method[MAXLEN];
char waldir[MAXPGPATH];
bool no_slot; /* from PostgreSQL 10 */
} t_basebackup_options;
#define T_BASEBACKUP_OPTIONS_INITIALIZER { "", "", "", false }
#define T_BASEBACKUP_OPTIONS_INITIALIZER { "", "", false }
typedef enum
@@ -323,11 +314,8 @@ typedef struct
void set_progname(const char *argv0);
const char *progname(void);
void load_config(const char *config_file, bool verbose, bool terse, char *argv0);
bool reload_config(t_server_type server_type);
void dump_config(void);
void parse_configuration_item(ItemList *error_list, ItemList *warning_list, const char *name, const char *value);
void load_config(const char *config_file, bool verbose, bool terse, t_configuration_options *options, char *argv0);
bool reload_config(t_configuration_options *orig_options, t_server_type server_type);
bool parse_recovery_conf(const char *data_dir, t_recovery_conf *conf);
@@ -340,9 +328,6 @@ int repmgr_atoi(const char *s,
ItemList *error_list,
int minval);
void parse_time_unit_parameter(const char *name, const char *value, char *dest, ItemList *errors);
void repmgr_canonicalize_path(const char *name, const char *value, char *config_item, ItemList *errors);
bool parse_pg_basebackup_options(const char *pg_basebackup_options,
t_basebackup_options *backup_options,
int server_version_num,
@@ -350,20 +335,11 @@ bool parse_pg_basebackup_options(const char *pg_basebackup_options,
int parse_output_to_argv(const char *string, char ***argv_array);
void free_parsed_argv(char ***argv_array);
const char *format_failover_mode(failover_mode_opt failover);
/* called by repmgr-client and repmgrd */
void exit_with_cli_errors(ItemList *error_list, const char *repmgr_command);
void print_item_list(ItemList *item_list);
const char *print_connection_check_type(ConnectionCheckType type);
char *print_event_notification_list(EventNotificationList *list);
char *print_tablespace_mapping(TablespaceList *tablespacemappingptr);
extern bool modify_auto_conf(const char *data_dir, KeyValueList *items);
extern bool ProcessRepmgrConfigFile(const char *config_file, const char *base_dir, ItemList *error_list, ItemList *warning_list);
extern bool ProcessPostgresConfigFile(const char *config_file, const char *base_dir, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
#endif /* _REPMGR_CONFIGFILE_H_ */

154
configure vendored
View File

@@ -1,6 +1,6 @@
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.69 for repmgr 5.2.0.
# Generated by GNU Autoconf 2.69 for repmgr 4.4.
#
# Report bugs to <repmgr@googlegroups.com>.
#
@@ -11,7 +11,7 @@
# This configure script is free software; the Free Software Foundation
# gives unlimited permission to copy, distribute and modify it.
#
# Copyright (c) 2010-2020, 2ndQuadrant Ltd.
# Copyright (c) 2010-2019, 2ndQuadrant Ltd.
## -------------------- ##
## M4sh Initialization. ##
## -------------------- ##
@@ -582,16 +582,13 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='repmgr'
PACKAGE_TARNAME='repmgr'
PACKAGE_VERSION='5.2.0'
PACKAGE_STRING='repmgr 5.2.0'
PACKAGE_VERSION='4.4'
PACKAGE_STRING='repmgr 4.4'
PACKAGE_BUGREPORT='repmgr@googlegroups.com'
PACKAGE_URL='https://repmgr.org/'
ac_subst_vars='LTLIBOBJS
LIBOBJS
HAVE_SED
HAVE_GSED
HAVE_GNUSED
vpath_build
SED
PG_CONFIG
@@ -1181,7 +1178,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures repmgr 5.2.0 to adapt to many kinds of systems.
\`configure' configures repmgr 4.4 to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
@@ -1242,7 +1239,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of repmgr 5.2.0:";;
short | recursive ) echo "Configuration of repmgr 4.4:";;
esac
cat <<\_ACEOF
@@ -1316,14 +1313,14 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
repmgr configure 5.2.0
repmgr configure 4.4
generated by GNU Autoconf 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
This configure script is free software; the Free Software Foundation
gives unlimited permission to copy, distribute and modify it.
Copyright (c) 2010-2020, 2ndQuadrant Ltd.
Copyright (c) 2010-2019, 2ndQuadrant Ltd.
_ACEOF
exit
fi
@@ -1335,7 +1332,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by repmgr $as_me 5.2.0, which was
It was created by repmgr $as_me 4.4, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@
@@ -1824,7 +1821,7 @@ if test "$major_version_num" -lt '10'; then
version_num_int=$(echo "$version_num"|
$SED -e 's/^\([0-9]*\)\.\([0-9]*\)$/\1\2/')
if test "$version_num_int" -lt '94'; then
if test "$version_num_int" -lt '93'; then
as_fn_error $? "repmgr is not compatible with detected PostgreSQL version: $version_num" "$LINENO" 5
fi
else
@@ -1850,133 +1847,6 @@ else
fi
# Extract the first word of "gnused", so it can be a program name with args.
set dummy gnused; ac_word=$2
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
$as_echo_n "checking for $ac_word... " >&6; }
if ${ac_cv_prog_HAVE_GNUSED+:} false; then :
$as_echo_n "(cached) " >&6
else
if test -n "$HAVE_GNUSED"; then
ac_cv_prog_HAVE_GNUSED="$HAVE_GNUSED" # Let the user override the test.
else
as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
for as_dir in $PATH
do
IFS=$as_save_IFS
test -z "$as_dir" && as_dir=.
for ac_exec_ext in '' $ac_executable_extensions; do
if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
ac_cv_prog_HAVE_GNUSED="yes"
$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
break 2
fi
done
done
IFS=$as_save_IFS
test -z "$ac_cv_prog_HAVE_GNUSED" && ac_cv_prog_HAVE_GNUSED="no"
fi
fi
HAVE_GNUSED=$ac_cv_prog_HAVE_GNUSED
if test -n "$HAVE_GNUSED"; then
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $HAVE_GNUSED" >&5
$as_echo "$HAVE_GNUSED" >&6; }
else
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
$as_echo "no" >&6; }
fi
# Extract the first word of "gsed", so it can be a program name with args.
set dummy gsed; ac_word=$2
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
$as_echo_n "checking for $ac_word... " >&6; }
if ${ac_cv_prog_HAVE_GSED+:} false; then :
$as_echo_n "(cached) " >&6
else
if test -n "$HAVE_GSED"; then
ac_cv_prog_HAVE_GSED="$HAVE_GSED" # Let the user override the test.
else
as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
for as_dir in $PATH
do
IFS=$as_save_IFS
test -z "$as_dir" && as_dir=.
for ac_exec_ext in '' $ac_executable_extensions; do
if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
ac_cv_prog_HAVE_GSED="yes"
$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
break 2
fi
done
done
IFS=$as_save_IFS
test -z "$ac_cv_prog_HAVE_GSED" && ac_cv_prog_HAVE_GSED="no"
fi
fi
HAVE_GSED=$ac_cv_prog_HAVE_GSED
if test -n "$HAVE_GSED"; then
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $HAVE_GSED" >&5
$as_echo "$HAVE_GSED" >&6; }
else
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
$as_echo "no" >&6; }
fi
# Extract the first word of "sed", so it can be a program name with args.
set dummy sed; ac_word=$2
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
$as_echo_n "checking for $ac_word... " >&6; }
if ${ac_cv_prog_HAVE_SED+:} false; then :
$as_echo_n "(cached) " >&6
else
if test -n "$HAVE_SED"; then
ac_cv_prog_HAVE_SED="$HAVE_SED" # Let the user override the test.
else
as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
for as_dir in $PATH
do
IFS=$as_save_IFS
test -z "$as_dir" && as_dir=.
for ac_exec_ext in '' $ac_executable_extensions; do
if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
ac_cv_prog_HAVE_SED="yes"
$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
break 2
fi
done
done
IFS=$as_save_IFS
test -z "$ac_cv_prog_HAVE_SED" && ac_cv_prog_HAVE_SED="no"
fi
fi
HAVE_SED=$ac_cv_prog_HAVE_SED
if test -n "$HAVE_SED"; then
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $HAVE_SED" >&5
$as_echo "$HAVE_SED" >&6; }
else
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
$as_echo "no" >&6; }
fi
if test "$HAVE_GNUSED" = yes; then
SED=gnused
else
if test "$HAVE_GSED" = yes; then
SED=gsed
else
SED=sed
fi
fi
ac_config_files="$ac_config_files Makefile"
ac_config_files="$ac_config_files Makefile.global"
@@ -2487,7 +2357,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by repmgr $as_me 5.2.0, which was
This file was extended by repmgr $as_me 4.4, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
@@ -2550,7 +2420,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\
repmgr config.status 5.2.0
repmgr config.status 4.4
configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\"

View File

@@ -1,6 +1,6 @@
AC_INIT([repmgr], [5.2.0], [repmgr@googlegroups.com], [repmgr], [https://repmgr.org/])
AC_INIT([repmgr], [4.4], [repmgr@googlegroups.com], [repmgr], [https://repmgr.org/])
AC_COPYRIGHT([Copyright (c) 2010-2020, 2ndQuadrant Ltd.])
AC_COPYRIGHT([Copyright (c) 2010-2019, 2ndQuadrant Ltd.])
AC_CONFIG_HEADER(config.h)
@@ -32,7 +32,7 @@ if test "$major_version_num" -lt '10'; then
version_num_int=$(echo "$version_num"|
$SED -e 's/^\([[0-9]]*\)\.\([[0-9]]*\)$/\1\2/')
if test "$version_num_int" -lt '94'; then
if test "$version_num_int" -lt '93'; then
AC_MSG_ERROR([repmgr is not compatible with detected PostgreSQL version: $version_num])
fi
else
@@ -57,22 +57,6 @@ else
fi
AC_SUBST(vpath_build)
AC_CHECK_PROG(HAVE_GNUSED,gnused,yes,no)
AC_CHECK_PROG(HAVE_GSED,gsed,yes,no)
AC_CHECK_PROG(HAVE_SED,sed,yes,no)
if test "$HAVE_GNUSED" = yes; then
SED=gnused
else
if test "$HAVE_GSED" = yes; then
SED=gsed
else
SED=sed
fi
fi
AC_SUBST(SED)
AC_CONFIG_FILES([Makefile])
AC_CONFIG_FILES([Makefile.global])
AC_OUTPUT

View File

@@ -73,16 +73,7 @@ while(<$fh>) {
if ($param eq 'data_directory') {
$data_directory_found = 1;
}
# Don't quote numbers
if ($value =~ /^\d+$/) {
push @outp, sprintf(q|%s=%s|, $param, $value);
}
# Quote everything else
else {
$value =~ s/'/''/g;
push @outp, sprintf(q|%s='%s'|, $param, $value);
}
push @outp, $line;
}
}
@@ -92,6 +83,6 @@ print join("\n", @outp);
print "\n";
if ($data_directory_found == 0) {
print "data_directory=''\n";
print "data_directory=\n";
}

View File

@@ -6,7 +6,7 @@
* running. For that reason we can't use on the pg_control_*() functions
* provided in PostgreSQL 9.6 and later.
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -90,9 +90,7 @@ get_system_identifier(const char *data_directory)
uint64 system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
control_file_info = get_controlfile(data_directory);
if (control_file_info->control_file_processed == true)
system_identifier = control_file_info->system_identifier;
system_identifier = control_file_info->system_identifier;
pfree(control_file_info);
@@ -100,21 +98,19 @@ get_system_identifier(const char *data_directory)
}
bool
get_db_state(const char *data_directory, DBState *state)
DBState
get_db_state(const char *data_directory)
{
ControlFileInfo *control_file_info = NULL;
bool control_file_processed;
DBState state;
control_file_info = get_controlfile(data_directory);
control_file_processed = control_file_info->control_file_processed;
if (control_file_processed == true)
*state = control_file_info->state;
state = control_file_info->state;
pfree(control_file_info);
return control_file_processed;
return state;
}
@@ -126,8 +122,7 @@ get_latest_checkpoint_location(const char *data_directory)
control_file_info = get_controlfile(data_directory);
if (control_file_info->control_file_processed == true)
checkPoint = control_file_info->checkPoint;
checkPoint = control_file_info->checkPoint;
pfree(control_file_info);
@@ -139,12 +134,11 @@ int
get_data_checksum_version(const char *data_directory)
{
ControlFileInfo *control_file_info = NULL;
int data_checksum_version = UNKNOWN_DATA_CHECKSUM_VERSION;
int data_checksum_version = -1;
control_file_info = get_controlfile(data_directory);
if (control_file_info->control_file_processed == true)
data_checksum_version = (int) control_file_info->data_checksum_version;
data_checksum_version = (int) control_file_info->data_checksum_version;
pfree(control_file_info);
@@ -283,19 +277,8 @@ get_controlfile(const char *DataDir)
return control_file_info;
}
if (version_num >= 120000)
{
#if PG_ACTUAL_VERSION_NUM >= 120000
expected_size = sizeof(ControlFileData12);
ControlFileDataPtr = palloc0(expected_size);
#endif
}
else if (version_num >= 110000)
{
expected_size = sizeof(ControlFileData11);
ControlFileDataPtr = palloc0(expected_size);
}
else if (version_num >= 90500)
if (version_num >= 90500)
{
expected_size = sizeof(ControlFileData95);
ControlFileDataPtr = palloc0(expected_size);
@@ -305,6 +288,12 @@ get_controlfile(const char *DataDir)
expected_size = sizeof(ControlFileData94);
ControlFileDataPtr = palloc0(expected_size);
}
else if (version_num >= 90300)
{
expected_size = sizeof(ControlFileData93);
ControlFileDataPtr = palloc0(expected_size);
}
if (read(fd, ControlFileDataPtr, expected_size) != expected_size)
{
@@ -323,7 +312,6 @@ get_controlfile(const char *DataDir)
if (version_num >= 120000)
{
#if PG_ACTUAL_VERSION_NUM >= 120000
ControlFileData12 *ptr = (struct ControlFileData12 *)ControlFileDataPtr;
control_file_info->system_identifier = ptr->system_identifier;
control_file_info->state = ptr->state;
@@ -332,10 +320,6 @@ get_controlfile(const char *DataDir)
control_file_info->timeline = ptr->checkPointCopy.ThisTimeLineID;
control_file_info->minRecoveryPointTLI = ptr->minRecoveryPointTLI;
control_file_info->minRecoveryPoint = ptr->minRecoveryPoint;
#else
fprintf(stderr, "ERROR: please use a repmgr version built for PostgreSQL 12\n");
exit(ERR_BAD_CONFIG);
#endif
}
else if (version_num >= 110000)
{
@@ -370,6 +354,17 @@ get_controlfile(const char *DataDir)
control_file_info->minRecoveryPointTLI = ptr->minRecoveryPointTLI;
control_file_info->minRecoveryPoint = ptr->minRecoveryPoint;
}
else if (version_num >= 90300)
{
ControlFileData93 *ptr = (struct ControlFileData93 *)ControlFileDataPtr;
control_file_info->system_identifier = ptr->system_identifier;
control_file_info->state = ptr->state;
control_file_info->checkPoint = ptr->checkPoint;
control_file_info->data_checksum_version = ptr->data_checksum_version;
control_file_info->timeline = ptr->checkPointCopy.ThisTimeLineID;
control_file_info->minRecoveryPointTLI = ptr->minRecoveryPointTLI;
control_file_info->minRecoveryPoint = ptr->minRecoveryPoint;
}
pfree(ControlFileDataPtr);

View File

@@ -1,6 +1,6 @@
/*
* controldata.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -12,7 +12,6 @@
#include "postgres_fe.h"
#include "catalog/pg_control.h"
#define MAX_VERSION_STRING 24
/*
* A simplified representation of pg_control containing only those fields
@@ -31,7 +30,9 @@ typedef struct
} ControlFileInfo;
typedef struct CheckPoint94
/* Same for 9.3, 9.4 */
typedef struct CheckPoint93
{
XLogRecPtr redo; /* next RecPtr available when we began to
* create CheckPoint (i.e. REDO start point) */
@@ -51,10 +52,10 @@ typedef struct CheckPoint94
pg_time_t time; /* time stamp of checkpoint */
TransactionId oldestActiveXid;
} CheckPoint94;
} CheckPoint93;
/* Same for 9.5, 9.6, 10, 11 */
/* Same for 9.5, 9.6, 10, HEAD */
typedef struct CheckPoint95
{
XLogRecPtr redo; /* next RecPtr available when we began to
@@ -82,50 +83,65 @@ typedef struct CheckPoint95
} CheckPoint95;
#if PG_ACTUAL_VERSION_NUM >= 120000
/*
* Following fields removed in PostgreSQL 12;
*
* uint32 nextXidEpoch;
* TransactionId nextXid;
*
* and replaced by:
*
* FullTransactionId nextFullXid;
*/
typedef struct CheckPoint12
typedef struct ControlFileData93
{
XLogRecPtr redo; /* next RecPtr available when we began to
* create CheckPoint (i.e. REDO start point) */
TimeLineID ThisTimeLineID; /* current TLI */
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
FullTransactionId nextFullXid; /* next free full transaction ID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
Oid oldestXidDB; /* database with minimum datfrozenxid */
MultiXactId oldestMulti; /* cluster-wide minimum datminmxid */
Oid oldestMultiDB; /* database with minimum datminmxid */
pg_time_t time; /* time stamp of checkpoint */
TransactionId oldestCommitTsXid; /* oldest Xid with valid commit
* timestamp */
TransactionId newestCommitTsXid; /* newest Xid with valid commit
* timestamp */
uint64 system_identifier;
/*
* Oldest XID still running. This is only needed to initialize hot standby
* mode from an online checkpoint, so we only bother calculating this for
* online checkpoints and only when wal_level is replica. Otherwise it's
* set to InvalidTransactionId.
*/
TransactionId oldestActiveXid;
} CheckPoint12;
#endif
uint32 pg_control_version; /* PG_CONTROL_VERSION */
uint32 catalog_version_no; /* see catversion.h */
DBState state; /* see enum above */
pg_time_t time; /* time stamp of last pg_control update */
XLogRecPtr checkPoint; /* last check point record ptr */
XLogRecPtr prevCheckPoint; /* previous check point record ptr */
CheckPoint93 checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
XLogRecPtr minRecoveryPoint;
TimeLineID minRecoveryPointTLI;
XLogRecPtr backupStartPoint;
XLogRecPtr backupEndPoint;
bool backupEndRequired;
int wal_level;
int MaxConnections;
int max_prepared_xacts;
int max_locks_per_xact;
uint32 maxAlign; /* alignment requirement for tuples */
double floatFormat; /* constant 1234567.0 */
uint32 blcksz; /* data block size for this DB */
uint32 relseg_size; /* blocks per segment of large relation */
uint32 xlog_blcksz; /* block size within WAL files */
uint32 xlog_seg_size; /* size of each WAL segment */
uint32 nameDataLen; /* catalog name field width */
uint32 indexMaxKeys; /* max number of columns in an index */
uint32 toast_max_chunk_size; /* chunk size in TOAST tables */
/* flag indicating internal format of timestamp, interval, time */
bool enableIntTimes; /* int64 storage enabled? */
/* flags indicating pass-by-value status of various types */
bool float4ByVal; /* float4 pass-by-value? */
bool float8ByVal; /* float8, int8, etc pass-by-value? */
/* Are data pages protected by checksums? Zero if no checksum version */
uint32 data_checksum_version;
} ControlFileData93;
/*
* Following field added since 9.3:
*
* int max_worker_processes;
*/
typedef struct ControlFileData94
{
@@ -139,7 +155,7 @@ typedef struct ControlFileData94
XLogRecPtr checkPoint; /* last check point record ptr */
XLogRecPtr prevCheckPoint; /* previous check point record ptr */
CheckPoint94 checkPointCopy; /* copy of last check point record */
CheckPoint93 checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
@@ -317,11 +333,19 @@ typedef struct ControlFileData11
} ControlFileData11;
#if PG_ACTUAL_VERSION_NUM >= 120000
/*
* Following field added in Pg12:
*
* int max_wal_senders;
*
* Following fields removed:
*
* uint32 nextXidEpoch;
* TransactionId nextXid;
*
* and replaced by:
*
* FullTransactionId nextFullXid;
*/
typedef struct ControlFileData12
@@ -335,7 +359,7 @@ typedef struct ControlFileData12
pg_time_t time; /* time stamp of last pg_control update */
XLogRecPtr checkPoint; /* last check point record ptr */
CheckPoint12 checkPointCopy; /* copy of last check point record */
CheckPoint checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
@@ -374,10 +398,9 @@ typedef struct ControlFileData12
uint32 data_checksum_version;
} ControlFileData12;
#endif
extern int get_pg_version(const char *data_directory, char *version_string);
extern bool get_db_state(const char *data_directory, DBState *state);
extern DBState get_db_state(const char *data_directory);
extern const char *describe_db_state(DBState state);
extern int get_data_checksum_version(const char *data_directory);
extern uint64 get_system_identifier(const char *data_directory);

1706
dbutils.c

File diff suppressed because it is too large Load Diff

121
dbutils.h
View File

@@ -1,7 +1,7 @@
/*
* dbutils.h
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -60,6 +60,11 @@
"NULL AS attached "
#define BDR2_NODES_COLUMNS "node_sysid, node_timeline, node_dboid, node_name, node_local_dsn, ''"
#define BDR3_NODES_COLUMNS "ns.node_id, 0, 0, ns.node_name, ns.interface_connstr, ns.peer_state_name"
#define ERRBUFF_SIZE 512
typedef enum
@@ -67,7 +72,8 @@ typedef enum
UNKNOWN = 0,
PRIMARY,
STANDBY,
WITNESS
WITNESS,
BDR
} t_server_type;
typedef enum
@@ -119,21 +125,14 @@ typedef enum
typedef enum
{
/* unable to query "pg_stat_replication" or other error */
NODE_ATTACHED_UNKNOWN = -1,
/* node has record in "pg_stat_replication" and state is not "streaming" */
NODE_ATTACHED,
/* node has record in "pg_stat_replication" but state is not "streaming" */
NODE_NOT_ATTACHED,
/* node has no record in "pg_stat_replication" */
NODE_DETACHED
NODE_DETACHED,
NODE_ATTACHED
} NodeAttached;
typedef enum
{
SLOT_UNKNOWN = -1,
SLOT_NOT_FOUND,
SLOT_NOT_PHYSICAL,
SLOT_INACTIVE,
SLOT_ACTIVE
} ReplSlotStatus;
@@ -171,7 +170,6 @@ typedef struct
char current_timestamp[MAXLEN];
bool in_recovery;
TimeLineID timeline_id;
char timeline_id_str[MAXLEN];
XLogRecPtr last_wal_receive_lsn;
XLogRecPtr last_wal_replay_lsn;
char last_xact_replay_timestamp[MAXLEN];
@@ -326,6 +324,45 @@ typedef struct s_connection_user
#define T_CONNECTION_USER_INITIALIZER { "", false }
/* represents an entry in bdr.bdr_nodes */
typedef struct s_bdr_node_info
{
char node_sysid[MAXLEN];
uint32 node_timeline;
uint32 node_dboid;
char node_name[MAXLEN];
char node_local_dsn[MAXLEN];
char peer_state_name[MAXLEN];
} t_bdr_node_info;
#define T_BDR_NODE_INFO_INITIALIZER { \
"", InvalidOid, InvalidOid, \
"", "", "" \
}
/* structs to store a list of BDR node records */
typedef struct BdrNodeInfoListCell
{
struct BdrNodeInfoListCell *next;
t_bdr_node_info *node_info;
} BdrNodeInfoListCell;
typedef struct BdrNodeInfoList
{
BdrNodeInfoListCell *head;
BdrNodeInfoListCell *tail;
int node_count;
} BdrNodeInfoList;
#define T_BDR_NODE_INFO_LIST_INITIALIZER { \
NULL, \
NULL, \
0 \
}
typedef struct
{
char filepath[MAXPGPATH];
@@ -335,7 +372,6 @@ typedef struct
#define T_CONFIGFILE_INFO_INITIALIZER { "", "", false }
typedef struct
{
int size;
@@ -345,7 +381,6 @@ typedef struct
#define T_CONFIGFILE_LIST_INITIALIZER { 0, 0, NULL }
typedef struct
{
uint64 system_identifier;
@@ -385,6 +420,10 @@ typedef struct RepmgrdInfo {
/* utility functions */
XLogRecPtr parse_lsn(const char *str);
extern void
wrap_ddl_query(PQExpBufferData *query_buf, int replication_type, const char *fmt,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
bool atobool(const char *value);
/* connection functions */
@@ -393,19 +432,13 @@ PGconn *establish_db_connection(const char *conninfo,
PGconn *establish_db_connection_quiet(const char *conninfo);
PGconn *establish_db_connection_by_params(t_conninfo_param_list *param_list,
const bool exit_on_error);
PGconn *establish_db_connection_with_replacement_param(const char *conninfo,
const char *param,
const char *value,
const bool exit_on_error);
PGconn *establish_replication_connection_from_conn(PGconn *conn, const char *repluser);
PGconn *establish_replication_connection_from_conninfo(const char *conninfo, const char *repluser);
PGconn *establish_primary_db_connection(PGconn *conn,
const bool exit_on_error);
PGconn *get_primary_connection(PGconn *standby_conn, int *primary_id, char *primary_conninfo_out);
PGconn *get_primary_connection_quiet(PGconn *standby_conn, int *primary_id, char *primary_conninfo_out);
PGconn *duplicate_connection(PGconn *conn, const char *user, bool replication);
bool is_superuser_connection(PGconn *conn, t_connection_user *userinfo);
bool connection_has_pg_settings(PGconn *conn);
void close_connection(PGconn **conn);
/* conninfo manipulation functions */
@@ -418,7 +451,6 @@ void conn_to_param_list(PGconn *conn, t_conninfo_param_list *param_list);
void param_set(t_conninfo_param_list *param_list, const char *param, const char *value);
void param_set_ine(t_conninfo_param_list *param_list, const char *param, const char *value);
char *param_get(t_conninfo_param_list *param_list, const char *param);
bool validate_conninfo_string(const char *conninfo_str, char **errmsg);
bool parse_conninfo_string(const char *conninfo_str, t_conninfo_param_list *param_list, char **errmsg, bool ignore_local_params);
char *param_list_to_string(t_conninfo_param_list *param_list);
char *normalize_conninfo_string(const char *conninfo_str);
@@ -435,7 +467,6 @@ bool set_config(PGconn *conn, const char *config_param, const char *config_valu
bool set_config_bool(PGconn *conn, const char *config_param, bool state);
int guc_set(PGconn *conn, const char *parameter, const char *op, const char *value);
bool get_pg_setting(PGconn *conn, const char *setting, char *output);
bool get_pg_setting_bool(PGconn *conn, const char *setting, bool *output);
bool get_pg_setting_int(PGconn *conn, const char *setting, int *output);
bool alter_system_int(PGconn *conn, const char *name, int value);
bool pg_reload_conf(PGconn *conn);
@@ -451,12 +482,6 @@ bool identify_system(PGconn *repl_conn, t_system_identification *identification
uint64 system_identifier(PGconn *conn);
TimeLineHistoryEntry *get_timeline_history(PGconn *repl_conn, TimeLineID tli);
/* user/role information functions */
bool can_execute_pg_promote(PGconn *conn);
bool connection_has_pg_monitor_role(PGconn *conn, const char *subrole);
bool is_replication_role(PGconn *conn, char *rolname);
bool is_superuser_connection(PGconn *conn, t_connection_user *userinfo);
/* repmgrd shared memory functions */
bool repmgrd_set_local_node_id(PGconn *conn, int local_node_id);
int repmgrd_get_local_node_id(PGconn *conn);
@@ -496,7 +521,6 @@ bool get_local_node_record(PGconn *conn, int node_id, t_node_info *node_info);
bool get_primary_node_record(PGconn *conn, t_node_info *node_info);
bool get_all_node_records(PGconn *conn, NodeInfoList *node_list);
bool get_all_nodes_count(PGconn *conn, int *count);
void get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *nodes);
void get_active_sibling_node_records(PGconn *conn, int node_id, int upstream_node_id, NodeInfoList *node_list);
bool get_child_nodes(PGconn *conn, int node_id, NodeInfoList *node_list);
@@ -535,14 +559,10 @@ PGresult *get_event_records(PGconn *conn, int node_id, const char *node_name,
/* replication slot functions */
void create_slot_name(char *slot_name, int node_id);
bool create_replication_slot_sql(PGconn *conn, char *slot_name, PQExpBufferData *error_msg);
bool create_replication_slot_replprot(PGconn *conn, PGconn *repl_conn, char *slot_name, PQExpBufferData *error_msg);
bool drop_replication_slot_sql(PGconn *conn, char *slot_name);
bool drop_replication_slot_replprot(PGconn *repl_conn, char *slot_name);
bool create_replication_slot(PGconn *conn, char *slot_name, PQExpBufferData *error_msg);
bool drop_replication_slot(PGconn *conn, char *slot_name);
RecordStatus get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record);
int get_free_replication_slot_count(PGconn *conn, int *max_replication_slots);
int get_free_replication_slot_count(PGconn *conn);
int get_inactive_replication_slots(PGconn *conn, KeyValueList *list);
/* tablespace functions */
@@ -594,14 +614,35 @@ XLogRecPtr get_last_wal_receive_location(PGconn *conn);
void init_replication_info(ReplInfo *replication_info);
bool get_replication_info(PGconn *conn, t_server_type node_type, ReplInfo *replication_info);
int get_replication_lag_seconds(PGconn *conn);
TimeLineID get_node_timeline(PGconn *conn, char *timeline_id_str);
TimeLineID get_node_timeline(PGconn *conn);
void get_node_replication_stats(PGconn *conn, t_node_info *node_info);
NodeAttached is_downstream_node_attached(PGconn *conn, char *node_name, char **node_state);
NodeAttached is_downstream_node_attached(PGconn *conn, char *node_name);
void set_upstream_last_seen(PGconn *conn, int upstream_node_id);
int get_upstream_last_seen(PGconn *conn, t_server_type node_type);
bool is_wal_replay_paused(PGconn *conn, bool check_pending_wal);
/* BDR functions */
int get_bdr_version_num(void);
void get_all_bdr_node_records(PGconn *conn, BdrNodeInfoList *node_list);
RecordStatus get_bdr_node_record_by_name(PGconn *conn, const char *node_name, t_bdr_node_info *node_info);
bool is_bdr_db(PGconn *conn, PQExpBufferData *output);
bool is_bdr_db_quiet(PGconn *conn);
bool is_active_bdr_node(PGconn *conn, const char *node_name);
bool is_bdr_repmgr(PGconn *conn);
char *get_default_bdr_replication_set(PGconn *conn);
bool is_table_in_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
bool add_table_to_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
void add_extension_tables_to_bdr_replication_set(PGconn *conn);
bool bdr_node_name_matches(PGconn *conn, const char *node_name, PQExpBufferData *bdr_local_node_name);
ReplSlotStatus get_bdr_node_replication_slot_status(PGconn *conn, const char *node_name);
void get_bdr_other_node_name(PGconn *conn, int node_id, char *name_buf);
bool am_bdr_failover_handler(PGconn *conn, int node_id);
void unset_bdr_failover_handler(PGconn *conn);
bool bdr_node_has_repmgr_set(PGconn *conn, const char *node_name);
bool bdr_node_set_repmgr_set(PGconn *conn, const char *node_name);
/* miscellaneous debugging functions */
const char *print_node_status(NodeStatus node_status);
const char *print_pqping_status(PGPing ping_status);

View File

@@ -3,7 +3,7 @@
* dirmod.c
* directory handling functions
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

View File

@@ -1,6 +1,6 @@
/*
* dirutil.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

View File

@@ -95,7 +95,6 @@ clean:
rm -f repmgr.html
rm -f repmgr-A4.pdf
rm -f repmgr-US.pdf
rm -f html/*
maintainer-clean:
rm -rf html

View File

@@ -3,486 +3,437 @@
<title>FAQ (Frequently Asked Questions)</title>
<indexterm>
<primary>FAQ (Frequently Asked Questions)</primary>
<primary>FAQ (Frequently Asked Questions)</primary>
</indexterm>
<sect1 id="faq-general" xreflabel="General">
<title>General</title>
<sect1 id="faq-general" xreflabel="General">
<title>General</title>
<sect2 id="faq-xrepmgr-version-diff" xreflabel="Version differences">
<title>What's the difference between the repmgr versions?</title>
<para>
&repmgr; 4 is a complete rewrite of the previous &repmgr; code base
and implements &repmgr; as a PostgreSQL extension. It
supports all PostgreSQL versions from 9.3 (although some &repmgr;
features are not available for PostgreSQL 9.3 and 9.4).
</para>
<note>
<para>
&repmgr; 5 is fundamentally the same code base as &repmgr; 4, but provides
support for the revised replication configuration mechanism in PostgreSQL 12.
</para>
<para>
Support for PostgreSQL 9.3 is no longer available from &repmgr; 5.2.
</para>
</note>
<para>
&repmgr; 3.x builds on the improved replication facilities added
in PostgreSQL 9.3, as well as improved automated failover support
via &repmgrd;, and is not compatible with PostgreSQL 9.2
and earlier. We recommend upgrading to &repmgr; 4, as the &repmgr; 3.x
series is no longer maintained.
</para>
<para>
&repmgr; 2.x supports PostgreSQL 9.0 ~ 9.3. While it is compatible
with PostgreSQL 9.3, we recommend using repmgr 4.x. &repmgr; 2.x is
no longer maintained.
</para>
<para>
See also <link linkend="install-compatibility-matrix">&repmgr; compatibility matrix</link>
and <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
</para>
</sect2>
<sect2 id="faq-xrepmgr-version-diff" xreflabel="Version differences">
<title>What's the difference between the repmgr versions?</title>
<para>
&repmgr; 4 is a complete rewrite of the existing &repmgr; code base
and implements &repmgr; as a PostgreSQL extension. It
supports all PostgreSQL versions from 9.3 (although some &repmgr;
features are not available for PostgreSQL 9.3 and 9.4).
</para>
<para>
&repmgr; 3.x builds on the improved replication facilities added
in PostgreSQL 9.3, as well as improved automated failover support
via &repmgrd;, and is not compatible with PostgreSQL 9.2
and earlier. We recommend upgrading to &repmgr; 4, as the &repmgr; 3.x
series is no longer maintained.
</para>
<para>
&repmgr; 2.x supports PostgreSQL 9.0 ~ 9.3. While it is compatible
with PostgreSQL 9.3, we recommend using repmgr 4.x. &repmgr; 2.x is
no longer maintained.
</para>
<para>
See also <link linkend="install-compatibility-matrix">&repmgr; compatibility matrix</link>
and <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
</para>
</sect2>
<sect2 id="faq-replication-slots-advantage" xreflabel="Advantages of replication slots">
<title>What's the advantage of using replication slots?</title>
<para>
Replication slots, introduced in PostgreSQL 9.4, ensure that the
primary server will retain WAL files until they have been consumed
by all standby servers. This means standby servers should never
fail due to not being able to retrieve required WAL files from the
primary.
</para>
<para>
However this does mean that if a standby is no longer connected to the
primary, the presence of the replication slot will cause WAL files
to be retained indefinitely, and eventually lead to disk space
exhaustion.
</para>
<sect2 id="faq-replication-slots-advantage" xreflabel="Advantages of replication slots">
<title>What's the advantage of using replication slots?</title>
<para>
Replication slots, introduced in PostgreSQL 9.4, ensure that the
primary server will retain WAL files until they have been consumed
by all standby servers. This means standby servers should never
fail due to not being able to retrieve required WAL files from the
primary.
</para>
<para>
However this does mean that if a standby is no longer connected to the
primary, the presence of the replication slot will cause WAL files
to be retained indefinitely, and eventually lead to disk space
exhaustion.
</para>
<tip>
<para>
2ndQuadrant's recommended configuration is to configure
<ulink url="https://www.pgbarman.org/">Barman</ulink> as a fallback
source of WAL files, rather than maintain replication slots for
each standby. See also: <link linkend="cloning-from-barman-restore-command">Using Barman as a WAL file source</link>.
</para>
</tip>
</sect2>
<tip>
<para>
2ndQuadrant's recommended configuration is to configure
<ulink url="https://www.pgbarman.org/">Barman</ulink> as a fallback
source of WAL files, rather than maintain replication slots for
each standby. See also: <link linkend="cloning-from-barman-restore-command">Using Barman as a WAL file source</link>.
</para>
</tip>
</sect2>
<sect2 id="faq-replication-slots-number" xreflabel="Number of replication slots">
<title>How many replication slots should I define in <varname>max_replication_slots</varname>?</title>
<para>
Normally at least same number as the number of standbys which will connect
to the node. Note that changes to <varname>max_replication_slots</varname> require a server
restart to take effect, and as there is no particular penalty for unused
replication slots, setting a higher figure will make adding new nodes
easier.
</para>
</sect2>
<sect2 id="faq-replication-slots-number" xreflabel="Number of replication slots">
<title>How many replication slots should I define in <varname>max_replication_slots</varname>?</title>
<para>
Normally at least same number as the number of standbys which will connect
to the node. Note that changes to <varname>max_replication_slots</varname> require a server
restart to take effect, and as there is no particular penalty for unused
replication slots, setting a higher figure will make adding new nodes
easier.
</para>
</sect2>
<sect2 id="faq-hash-index" xreflabel="Hash indexes">
<title>Does &repmgr; support hash indexes?</title>
<para>
Before PostgreSQL 10, hash indexes were not WAL logged and are therefore not suitable
for use in streaming replication in PostgreSQL 9.6 and earlier. See the
<ulink url="https://www.postgresql.org/docs/9.6/sql-createindex.html#AEN80279">PostgreSQL documentation</ulink>
for details.
</para>
<para>
From PostgreSQL 10, this restriction has been lifted and hash indexes can be used
in a streaming replication cluster.
</para>
</sect2>
<sect2 id="faq-hash-index" xreflabel="Hash indexes">
<title>Does &repmgr; support hash indexes?</title>
<para>
Before PostgreSQL 10, hash indexes were not WAL logged and are therefore not suitable
for use in streaming replication in PostgreSQL 9.6 and earlier. See the
<ulink url="https://www.postgresql.org/docs/9.6/sql-createindex.html#AEN80279">PostgreSQL documentation</ulink>
for details.
</para>
<para>
From PostgreSQL 10, this restriction has been lifted and hash indexes can be used
in a streaming replication cluster.
</para>
</sect2>
<sect2 id="faq-upgrades" xreflabel="Upgrading PostgreSQL with repmgr">
<title>Can &repmgr; assist with upgrading a PostgreSQL cluster?</title>
<para>
For <emphasis>minor</emphasis> version upgrades, e.g. from 9.6.7 to 9.6.8, a common
approach is to upgrade a standby to the latest version, perform a
<link linkend="performing-switchover">switchover</link> promoting it to a primary,
then upgrade the former primary.
</para>
<para>
For <emphasis>major</emphasis> version upgrades (e.g. from PostgreSQL 9.6 to PostgreSQL 10),
the traditional approach is to "reseed" a cluster by upgrading a single
node with <ulink url="https://www.postgresql.org/docs/current/pgupgrade.html">pg_upgrade</ulink>
and recloning standbys from this.
</para>
<para>
To minimize downtime during major upgrades from PostgreSQL 9.4 and later,
<ulink url="https://www.2ndquadrant.com/en/resources/pglogical/">pglogical</ulink>
can be used to set up a parallel cluster using the newer PostgreSQL version,
which can be kept in sync with the existing production cluster until the
new cluster is ready to be put into production.
</para>
</sect2>
<sect2 id="faq-upgrades" xreflabel="Upgrading PostgreSQL with repmgr">
<title>Can &repmgr; assist with upgrading a PostgreSQL cluster?</title>
<para>
For <emphasis>minor</emphasis> version upgrades, e.g. from 9.6.7 to 9.6.8, a common
approach is to upgrade a standby to the latest version, perform a
<link linkend="performing-switchover">switchover</link> promoting it to a primary,
then upgrade the former primary.
</para>
<para>
For <emphasis>major</emphasis> version upgrades (e.g. from PostgreSQL 9.6 to PostgreSQL 10),
the traditional approach is to "reseed" a cluster by upgrading a single
node with <ulink url="https://www.postgresql.org/docs/current/pgupgrade.html">pg_upgrade</ulink>
and recloning standbys from this.
</para>
<para>
To minimize downtime during major upgrades from PostgreSQL 9.4 and later,
<ulink url="https://www.2ndquadrant.com/en/resources/pglogical/">pglogical</ulink>
can be used to set up a parallel cluster using the newer PostgreSQL version,
which can be kept in sync with the existing production cluster until the
new cluster is ready to be put into production.
</para>
</sect2>
<sect2 id="faq-libdir-repmgr-error">
<title>What does this error mean: <literal>ERROR: could not access file "$libdir/repmgr"</literal>?</title>
<para>
It means the &repmgr; extension code is not installed in the
PostgreSQL application directory. This typically happens when using PostgreSQL
packages provided by a third-party vendor, which often have different
filesystem layouts.
</para>
<para>
Either use PostgreSQL packages provided by the community or 2ndQuadrant; if this
is not possible, contact your vendor for assistance.
</para>
</sect2>
<sect2 id="faq-libdir-repmgr-error">
<title>What does this error mean: <literal>ERROR: could not access file "$libdir/repmgr"</literal>?</title>
<para>
It means the &repmgr; extension code is not installed in the
PostgreSQL application directory. This typically happens when using PostgreSQL
packages provided by a third-party vendor, which often have different
filesystem layouts.
</para>
<para>
Either use PostgreSQL packages provided by the community or 2ndQuadrant; if this
is not possible, contact your vendor for assistance.
</para>
</sect2>
<sect2 id="faq-old-packages">
<title>How can I obtain old versions of &repmgr; packages?</title>
<para>
See appendix <xref linkend="packages-old-versions"/> for details.
</para>
</sect2>
<sect2 id="faq-old-packages">
<title>How can I obtain old versions of &repmgr; packages?</title>
<para>
See appendix <xref linkend="packages-old-versions"/> for details.
</para>
</sect2>
<sect2 id="faq-repmgr-required-for-replication">
<title>Is &repmgr; required for streaming replication?</title>
<para>
No.
</para>
<para>
&repmgr; (together with &repmgrd;) assists with
<emphasis>managing</emphasis> replication. It does not actually perform replication, which
is part of the core PostgreSQL functionality.
</para>
</sect2>
<sect2 id="faq-repmgr-required-for-replication">
<title>Is &repmgr; required for streaming replication?</title>
<para>
No.
</para>
<para>
&repmgr; (together with &repmgrd;) assists with
<emphasis>managing</emphasis> replication. It does not actually perform replication, which
is part of the core PostgreSQL functionality.
</para>
</sect2>
<sect2 id="faq-what-if-repmgr-uninstalled">
<title>Will replication stop working if &repmgr; is uninstalled?</title>
<para>
No. See preceding question.
</para>
</sect2>
<sect2 id="faq-what-if-repmgr-uninstalled">
<title>Will replication stop working if &repmgr; is uninstalled?</title>
<para>
No. See preceding question.
</para>
</sect2>
<sect2 id="faq-version-mix">
<title>Does it matter if different &repmgr; versions are present in the replication cluster?</title>
<para>
Yes. If different &quot;major&quot; &repmgr; versions (e.g. 3.3.x and 4.1.x) are present,
&repmgr; (in particular &repmgrd;)
may not run, or run properly, or in the worst case (if different &repmgrd;
versions are running and there are differences in the failover implementation) break
your replication cluster.
</para>
<para>
If different &quot;minor&quot; &repmgr; versions (e.g. 4.1.1 and 4.1.6) are installed,
&repmgr; will function, but we strongly recommend always running the same version
to ensure there are no unexpected suprises, e.g. a newer version behaving slightly
differently to the older version.
</para>
<para>
See also <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
</para>
</sect2>
<sect2 id="faq-version-mix">
<title>Does it matter if different &repmgr; versions are present in the replication cluster?</title>
<para>
Yes. If different &quot;major&quot; &repmgr; versions (e.g. 3.3.x and 4.1.x) are present,
&repmgr; (in particular &repmgrd;)
may not run, or run properly, or in the worst case (if different &repmgrd;
versions are running and there are differences in the failover implementation) break
your replication cluster.
</para>
<para>
If different &quot;minor&quot; &repmgr; versions (e.g. 4.1.1 and 4.1.6) are installed,
&repmgr; will function, but we strongly recommend always running the same version
to ensure there are no unexpected suprises, e.g. a newer version behaving slightly
differently to the older version.
</para>
<para>
See also <link linkend="faq-upgrade-repmgr">Should I upgrade &repmgr;?</link>.
</para>
</sect2>
<sect2 id="faq-upgrade-repmgr">
<title>Should I upgrade &repmgr;?</title>
<para>
Yes.
</para>
<para>
We don't release new versions for fun, you know. Upgrading may require a little effort,
but running an older &repmgr; version with bugs which have since been fixed may end up
costing you more effort. The same applies to PostgreSQL itself.
</para>
</sect2>
<sect2 id="faq-upgrade-repmgr">
<title>Should I upgrade &repmgr;?</title>
<para>
Yes.
</para>
<para>
We don't release new versions for fun, you know. Upgrading may require a little effort,
but running an older &repmgr; version with bugs which have since been fixed may end up
costing you more effort. The same applies to PostgreSQL itself.
</para>
<sect2 id="faq-repmgr-conf-data-directory">
<title>Why do I need to specify the data directory location in repmgr.conf?</title>
<para>
In some circumstances &repmgr; may need to access a PostgreSQL data
directory while the PostgreSQL server is not running, e.g. to confirm
it shut down cleanly during a <link linkend="performing-switchover">switchover</link>.
</para>
<para>
Additionally, this provides support when using &repmgr; on PostgreSQL 9.6 and
earlier, where the <literal>repmgr</literal> user is not a superuser; in that
case the <literal>repmgr</literal> user will not be able to access the
<literal>data_directory</literal> configuration setting, access to which is restricted
to superusers.
</para>
<para>
In PostgreSQL 10 and later, non-superusers can be added to the
<ulink url="https://www.postgresql.org/docs/current/default-roles.html">default role</ulink>
<option>pg_read_all_settings</option> (or the meta-role <option>pg_monitor</option>)
which will enable them to read this setting.
</para>
</sect2>
</sect2>
<sect2 id="faq-third-party-packages" xreflabel="Compatability with third party vendor packages">
<title>Are &repmgr; packages compatible with <literal>$third_party_vendor</literal>'s packages?</title>
<para>
&repmgr; packages provided by 2ndQuadrant are compatible with the community-provided PostgreSQL
packages and any software provided by 2ndQuadrant.
</para>
<para>
A number of other vendors provide their own versions of PostgreSQL packages, often with different
package naming schemes and/or file locations.
</para>
<para>
We cannot guarantee that &repmgr; packages will be compatible with these packages.
It may be possible to override package dependencies (e.g. <literal>rpm --nodeps</literal>
for CentOS-based systems or <literal>dpkg --force-depends</literal> for Debian-based systems).
</para>
</sect2>
</sect1>
<sect2 id="faq-repmgr-conf-data-directory">
<title>Why do I need to specify the data directory location in repmgr.conf?</title>
<para>
In some circumstances &repmgr; may need to access a PostgreSQL data
directory while the PostgreSQL server is not running, e.g. to confirm
it shut down cleanly during a <link linkend="performing-switchover">switchover</link>.
</para>
<para>
Additionally, this provides support when using &repmgr; on PostgreSQL 9.6 and
earlier, where the <literal>repmgr</literal> user is not a superuser; in that
case the <literal>repmgr</literal> user will not be able to access the
<literal>data_directory</literal> configuration setting, access to which is restricted
to superusers. (In PostgreSQL 10 and later, non-superusers can be added to the
group <option>pg_read_all_settings</option> which will enable them to read this setting).
</para>
</sect2>
</sect1>
<sect1 id="faq-repmgr" xreflabel="repmgr">
<title><command>repmgr</command></title>
<sect1 id="faq-repmgr" xreflabel="repmgr">
<title><command>repmgr</command></title>
<sect2 id="faq-register-existing-node" xreflabel="registering an existing node">
<title>Can I register an existing PostgreSQL server with repmgr?</title>
<para>
Yes, any existing PostgreSQL server which is part of the same replication
cluster can be registered with &repmgr;. There's no requirement for a
standby to have been cloned using &repmgr;.
</para>
</sect2>
<sect2 id="faq-register-existing-node" xreflabel="registering an existing node">
<title>Can I register an existing PostgreSQL server with repmgr?</title>
<para>
Yes, any existing PostgreSQL server which is part of the same replication
cluster can be registered with &repmgr;. There's no requirement for a
standby to have been cloned using &repmgr;.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-other-source" >
<title>Can I use a standby not cloned by &repmgr; as a &repmgr; node?</title>
<sect2 id="faq-repmgr-clone-other-source" >
<title>Can I use a standby not cloned by &repmgr; as a &repmgr; node?</title>
<para>
For a standby which has been manually cloned or recovered from an external
backup manager such as Barman, the command
<command><link linkend="repmgr-standby-clone">repmgr standby clone --replication-conf-only</link></command>
can be used to create the correct <filename>recovery.conf</filename> file for
use with &repmgr; (and will create a replication slot if required). Once this has been done,
<link linkend="repmgr-standby-register">register the node</link> as usual.
</para>
</sect2>
<para>
For a standby which has been manually cloned or recovered from an external
backup manager such as Barman, the command
<command><link linkend="repmgr-standby-clone">repmgr standby clone --recovery-conf-only</link></command>
can be used to create the correct <filename>recovery.conf</filename> file for
use with &repmgr; (and will create a replication slot if required). Once this has been done,
<link linkend="repmgr-standby-register">register the node</link> as usual.
</para>
</sect2>
<sect2 id="faq-repmgr-recovery-conf" >
<title>What does &repmgr; write in <filename>recovery.conf</filename>, and what options can be set there?</title>
<para>
See section <link linkend="repmgr-standby-clone-recovery-conf">Customising recovery.conf</link>.
</para>
</sect2>
<sect2 id="faq-repmgr-recovery-conf" >
<title>What does &repmgr; write in <filename>recovery.conf</filename>, and what options can be set there?</title>
<para>
See section <link linkend="repmgr-standby-clone-recovery-conf">Customising recovery.conf</link>.
</para>
</sect2>
<sect2 id="faq-repmgr-failed-primary-standby" xreflabel="Reintegrate a failed primary as a standby">
<title>How can a failed primary be re-added as a standby?</title>
<para>
This is a two-stage process. First, the failed primary's data directory
must be re-synced with the current primary; secondly the failed primary
needs to be re-registered as a standby.
</para>
<para>
It's possible to use <command>pg_rewind</command> to re-synchronise the existing data
directory, which will usually be much
faster than re-cloning the server. However <command>pg_rewind</command> can only
be used if PostgreSQL either has <varname>wal_log_hints</varname> enabled, or
data checksums were enabled when the cluster was initialized.
</para>
<para>
Note that <command>pg_rewind</command> is available as part of the core PostgreSQL
distribution from PostgreSQL 9.5, and as a third-party utility for PostgreSQL 9.3 and 9.4.
</para>
<para>
&repmgr; provides the command <command>repmgr node rejoin</command> which can
optionally execute <command>pg_rewind</command>; see the <xref linkend="repmgr-node-rejoin"/>
documentation for details, in particular the section <xref linkend="repmgr-node-rejoin-pg-rewind"/>.
</para>
<para>
If <command>pg_rewind</command> cannot be used, then the data directory will need
to be re-cloned from scratch.
</para>
<sect2 id="faq-repmgr-failed-primary-standby" xreflabel="Reintegrate a failed primary as a standby">
<title>How can a failed primary be re-added as a standby?</title>
<para>
This is a two-stage process. First, the failed primary's data directory
must be re-synced with the current primary; secondly the failed primary
needs to be re-registered as a standby.
</para>
<para>
It's possible to use <command>pg_rewind</command> to re-synchronise the existing data
directory, which will usually be much
faster than re-cloning the server. However <command>pg_rewind</command> can only
be used if PostgreSQL either has <varname>wal_log_hints</varname> enabled, or
data checksums were enabled when the cluster was initialized.
</para>
<para>
Note that <command>pg_rewind</command> is available as part of the core PostgreSQL
distribution from PostgreSQL 9.5, and as a third-party utility for PostgreSQL 9.3 and 9.4.
</para>
<para>
&repmgr; provides the command <command>repmgr node rejoin</command> which can
optionally execute <command>pg_rewind</command>; see the <xref linkend="repmgr-node-rejoin"/>
documentation for details, in particular the section <xref linkend="repmgr-node-rejoin-pg-rewind"/>.
</para>
<para>
If <command>pg_rewind</command> cannot be used, then the data directory will need
to be re-cloned from scratch.
</para>
</sect2>
</sect2>
<sect2 id="faq-repmgr-check-configuration" xreflabel="Check PostgreSQL configuration">
<title>Is there an easy way to check my primary server is correctly configured for use with &repmgr;?</title>
<para>
Execute <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>
with the <literal>--dry-run</literal> option; this will report any configuration problems
which need to be rectified.
</para>
</sect2>
<sect2 id="faq-repmgr-check-configuration" xreflabel="Check PostgreSQL configuration">
<title>Is there an easy way to check my primary server is correctly configured for use with &repmgr;?</title>
<para>
Execute <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>
with the <literal>--dry-run</literal> option; this will report any configuration problems
which need to be rectified.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-skip-config-files" xreflabel="">
<title>When cloning a standby, how can I get &repmgr; to copy
<filename>postgresql.conf</filename> and <filename>pg_hba.conf</filename> from the PostgreSQL configuration
directory in <filename>/etc</filename>?</title>
<para>
Use the command line option <literal>--copy-external-config-files</literal>. For more details
see <xref linkend="repmgr-standby-clone-config-file-copying"/>.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-skip-config-files" xreflabel="">
<title>When cloning a standby, how can I get &repmgr; to copy
<filename>postgresql.conf</filename> and <filename>pg_hba.conf</filename> from the PostgreSQL configuration
directory in <filename>/etc</filename>?</title>
<para>
Use the command line option <literal>--copy-external-config-files</literal>. For more details
see <xref linkend="repmgr-standby-clone-config-file-copying"/>.
</para>
</sect2>
<sect2 id="faq-repmgr-shared-preload-libaries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
<title>Do I need to include <literal>shared_preload_libraries = 'repmgr'</literal>
in <filename>postgresql.conf</filename> if I'm not using &repmgrd;?</title>
<para>
No, the <literal>repmgr</literal> shared library is only needed when running &repmgrd;.
If you later decide to run &repmgrd;, you just need to add
<literal>shared_preload_libraries = 'repmgr'</literal> and restart PostgreSQL.
</para>
</sect2>
<sect2 id="faq-repmgr-shared-preload-libaries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
<title>Do I need to include <literal>shared_preload_libraries = 'repmgr'</literal>
in <filename>postgresql.conf</filename> if I'm not using &repmgrd;?</title>
<para>
No, the <literal>repmgr</literal> shared library is only needed when running &repmgrd;.
If you later decide to run &repmgrd;, you just need to add
<literal>shared_preload_libraries = 'repmgr'</literal> and restart PostgreSQL.
</para>
</sect2>
<sect2 id="faq-repmgr-permissions" xreflabel="Replication permission problems">
<title>I've provided replication permission for the <literal>repmgr</literal> user in <filename>pg_hba.conf</filename>
but <command>repmgr</command>/&repmgrd; complains it can't connect to the server... Why?</title>
<para>
<command>repmgr</command> and &repmgrd; need to be able to connect to the repmgr database
with a normal connection to query metadata. The <literal>replication</literal> connection
permission is for PostgreSQL's streaming replication (and doesn't necessarily need to be the <literal>repmgr</literal> user).
</para>
</sect2>
<sect2 id="faq-repmgr-permissions" xreflabel="Replication permission problems">
<title>I've provided replication permission for the <literal>repmgr</literal> user in <filename>pg_hba.conf</filename>
but <command>repmgr</command>/&repmgrd; complains it can't connect to the server... Why?</title>
<para>
<command>repmgr</command> and &repmgrd; need to be able to connect to the repmgr database
with a normal connection to query metadata. The <literal>replication</literal> connection
permission is for PostgreSQL's streaming replication (and doesn't necessarily need to be the <literal>repmgr</literal> user).
</para>
</sect2>
<sect2 id="faq-repmgr-clone-provide-primary-conninfo" xreflabel="Providing primary connection parameters">
<title>When cloning a standby, why do I need to provide the connection parameters
for the primary server on the command line, not in the configuration file?</title>
<para>
Cloning a standby is a one-time action; the role of the server being cloned
from could change, so fixing it in the configuration file would create
confusion. If &repmgr; needs to establish a connection to the primary
server, it can retrieve this from the <literal>repmgr.nodes</literal> table on the local
node, and if necessary scan the replication cluster until it locates the active primary.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-provide-primary-conninfo" xreflabel="Providing primary connection parameters">
<title>When cloning a standby, why do I need to provide the connection parameters
for the primary server on the command line, not in the configuration file?</title>
<para>
Cloning a standby is a one-time action; the role of the server being cloned
from could change, so fixing it in the configuration file would create
confusion. If &repmgr; needs to establish a connection to the primary
server, it can retrieve this from the <literal>repmgr.nodes</literal> table on the local
node, and if necessary scan the replication cluster until it locates the active primary.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-waldir-xlogdir" xreflabel="Providing a custom WAL directory">
<title>When cloning a standby, how do I ensure the WAL files are placed in a custom directory?</title>
<para>
Provide the option <literal>--waldir</literal> (<literal>--xlogdir</literal> in PostgreSQL 9.6
and earlier) with the absolute path to the WAL directory in <varname>pg_basebackup_options</varname>.
For more details see <xref linkend="cloning-advanced-pg-basebackup-options"/>.
</para>
<para>
In &repmgr; 5.2 and later, this setting will also be honoured when cloning from Barman.
</para>
</sect2>
<sect2 id="faq-repmgr-clone-waldir-xlogdir" xreflabel="Providing a custom WAL directory">
<title>When cloning a standby, how do I ensure the WAL files are placed in a custom directory?</title>
<para>
Provide the option <literal>--waldir</literal> (<literal>--xlogdir</literal> in PostgreSQL 9.6
and earlier) with the absolute path to the WAL directory in <varname>pg_basebackup_options</varname>.
For more details see <xref linkend="cloning-advanced-pg-basebackup-options"/>.
</para>
</sect2>
<sect2 id="faq-repmgr-events-no-fkey" xreflabel="No foreign key on node_id in repmgr.events">
<title>Why is there no foreign key on the <literal>node_id</literal> column in the <literal>repmgr.events</literal>
table?</title>
<para>
Under some circumstances event notifications can be generated for servers
which have not yet been registered; it's also useful to retain a record
of events which includes servers removed from the replication cluster
which no longer have an entry in the <literal>repmgr.nodes</literal> table.
</para>
</sect2>
<sect2 id="faq-repmgr-events-no-fkey" xreflabel="No foreign key on node_id in repmgr.events">
<title>Why is there no foreign key on the <literal>node_id</literal> column in the <literal>repmgr.events</literal>
table?</title>
<para>
Under some circumstances event notifications can be generated for servers
which have not yet been registered; it's also useful to retain a record
of events which includes servers removed from the replication cluster
which no longer have an entry in the <literal>repmgr.nodes</literal> table.
</para>
</sect2>
<sect2 id="faq-repmgr-recovery-conf-quoted-values" xreflabel="Quoted values in recovery.conf">
<title>Why are some values in <filename>recovery.conf</filename> surrounded by pairs of single quotes?</title>
<para>
This is to ensure that user-supplied values which are written as parameter values in <filename>recovery.conf</filename>
are escaped correctly and do not cause errors when <filename>recovery.conf</filename> is parsed.
</para>
<para>
The escaping is performed by an internal PostgreSQL routine, which leaves strings consisting
of digits and alphabetical characters only as-is, but wraps everything else in pairs of single quotes,
even if the string does not contain any characters which need escaping.
</para>
</sect2>
<sect2 id="faq-repmgr-exclude-metadata-from-dump" xreflabel="Excluding repmgr metadata from pg_dump output">
<title>How can I exclude &repmgr; metadata from <application>pg_dump</application> output?</title>
<para>
Beginning with &repmgr; 5.2, the metadata tables associated with the &repmgr; extension
(<literal>repmgr.nodes</literal>, <literal>repmgr.events</literal> and <literal>repmgr.monitoring_history</literal>)
have been marked as dumpable as they contain configuration and user-generated data.
</para>
<para>
To exclude these from <application>pg_dump</application> output, add the flag <option>--exclude-schema=repmgr</option>.
</para>
<para>
To exclude individual &repmgr; metadata tables from <application>pg_dump</application> output, add the flag
e.g. <option>--exclude-table=repmgr.monitoring_history</option>. This flag can be provided multiple times
to exclude individual tables,
</para>
</sect2>
</sect1>
<sect1 id="faq-repmgrd" xreflabel="repmgrd">
<title>&repmgrd;</title>
<sect2 id="faq-repmgr-recovery-conf-quoted-values" xreflabel="Quoted values in recovery.conf">
<title>Why are some values in <filename>recovery.conf</filename> surrounded by pairs of single quotes?</title>
<para>
This is to ensure that user-supplied values which are written as parameter values in <filename>recovery.conf</filename>
are escaped correctly and do not cause errors when <filename>recovery.conf</filename> is parsed.
</para>
<para>
The escaping is performed by an internal PostgreSQL routine, which leaves strings consisting
of digits and alphabetical characters only as-is, but wraps everything else in pairs of single quotes,
even if the string does not contain any characters which need escaping.
</para>
</sect2>
<sect2 id="faq-repmgrd-prevent-promotion" xreflabel="Prevent standby from being promoted to primary">
<title>How can I prevent a node from ever being promoted to primary?</title>
<para>
In <filename>repmgr.conf</filename>, set its priority to a value of <literal>0</literal>; apply the changed setting with
<command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>.
</para>
<para>
Additionally, if <varname>failover</varname> is set to <literal>manual</literal>, the node will never
be considered as a promotion candidate.
</para>
</sect2>
</sect1>
<sect2 id="faq-repmgrd-delayed-standby" xreflabel="Delayed standby support">
<title>Does &repmgrd; support delayed standbys?</title>
<para>
&repmgrd; can monitor delayed standbys - those set up with
<varname>recovery_min_apply_delay</varname> set to a non-zero value
in <filename>recovery.conf</filename> - but as it's not currently possible
to directly examine the value applied to the standby, &repmgrd;
may not be able to properly evaluate the node as a promotion candidate.
</para>
<para>
We recommend that delayed standbys are explicitly excluded from promotion
by setting <varname>priority</varname> to <literal>0</literal> in
<filename>repmgr.conf</filename>.
</para>
<para>
Note that after registering a delayed standby, &repmgrd; will only start
once the metadata added in the primary node has been replicated.
</para>
</sect2>
<sect1 id="faq-repmgrd" xreflabel="repmgrd">
<title>&repmgrd;</title>
<sect2 id="faq-repmgrd-logfile-rotate" xreflabel="repmgrd logfile rotation">
<title>How can I get &repmgrd; to rotate its logfile?</title>
<para>
Configure your system's <literal>logrotate</literal> service to do this; see <xref linkend="repmgrd-log-rotation"/>.
</para>
</sect2>
<sect2 id="faq-repmgrd-prevent-promotion" xreflabel="Prevent standby from being promoted to primary">
<title>How can I prevent a node from ever being promoted to primary?</title>
<para>
In <filename>repmgr.conf</filename>, set its priority to a value of <literal>0</literal>; apply the changed setting with
<command><link linkend="repmgr-standby-register">repmgr standby register --force</link></command>.
</para>
<para>
Additionally, if <varname>failover</varname> is set to <literal>manual</literal>, the node will never
be considered as a promotion candidate.
</para>
</sect2>
<sect2 id="faq-repmgrd-recloned-no-start" xreflabel="repmgrd not restarting after node cloned">
<title>I've recloned a failed primary as a standby, but &repmgrd; refuses to start?</title>
<para>
Check you registered the standby after recloning. If unregistered, the standby
cannot be considered as a promotion candidate even if <varname>failover</varname> is set to
<literal>automatic</literal>, which is probably not what you want. &repmgrd; will start if
<varname>failover</varname> is set to <literal>manual</literal> so the node's replication status can still
be monitored, if desired.
</para>
</sect2>
<sect2 id="faq-repmgrd-delayed-standby" xreflabel="Delayed standby support">
<title>Does &repmgrd; support delayed standbys?</title>
<para>
&repmgrd; can monitor delayed standbys - those set up with
<varname>recovery_min_apply_delay</varname> set to a non-zero value
in <filename>recovery.conf</filename> - but as it's not currently possible
to directly examine the value applied to the standby, &repmgrd;
may not be able to properly evaluate the node as a promotion candidate.
</para>
<para>
We recommend that delayed standbys are explicitly excluded from promotion
by setting <varname>priority</varname> to <literal>0</literal> in
<filename>repmgr.conf</filename>.
</para>
<para>
Note that after registering a delayed standby, &repmgrd; will only start
once the metadata added in the primary node has been replicated.
</para>
</sect2>
<sect2 id="faq-repmgrd-pg-bindir" xreflabel="repmgrd does not apply pg_bindir to promote_command or follow_command">
<title>
&repmgrd; ignores pg_bindir when executing <varname>promote_command</varname> or <varname>follow_command</varname>
</title>
<para>
<varname>promote_command</varname> or <varname>follow_command</varname> can be user-defined scripts,
so &repmgr; will not apply <option>pg_bindir</option> even if excuting &repmgr;. Always provide the full
path; see <xref linkend="repmgrd-automatic-failover-configuration"/> for more details.
</para>
</sect2>
<sect2 id="faq-repmgrd-logfile-rotate" xreflabel="repmgrd logfile rotation">
<title>How can I get &repmgrd; to rotate its logfile?</title>
<para>
Configure your system's <literal>logrotate</literal> service to do this; see <xref linkend="repmgrd-log-rotation"/>.
</para>
<sect2 id="faq-repmgrd-startup-no-upstream" xreflabel="repmgrd does not start if upstream node is not running">
<title>
&repmgrd; aborts startup with the error "<literal>upstream node must be running before repmgrd can start</literal>"
</title>
<para>
&repmgrd; does this to avoid starting up on a replication cluster
which is not in a healthy state. If the upstream is unavailable, &repmgrd;
may initiate a failover immediately after starting up, which could have unintended side-effects,
particularly if &repmgrd; is not running on other nodes.
</para>
<para>
In particular, it's possible that the node's local copy of the <literal>repmgr.nodes</literal> copy
is out-of-date, which may lead to incorrect failover behaviour.
</para>
<para>
The onus is therefore on the adminstrator to manually set the cluster to a stable, healthy state before
starting &repmgrd;.
</para>
</sect2>
</sect2>
</sect1>
<sect2 id="faq-repmgrd-recloned-no-start" xreflabel="repmgrd not restarting after node cloned">
<title>I've recloned a failed primary as a standby, but &repmgrd; refuses to start?</title>
<para>
Check you registered the standby after recloning. If unregistered, the standby
cannot be considered as a promotion candidate even if <varname>failover</varname> is set to
<literal>automatic</literal>, which is probably not what you want. &repmgrd; will start if
<varname>failover</varname> is set to <literal>manual</literal> so the node's replication status can still
be monitored, if desired.
</para>
</sect2>
<sect2 id="faq-repmgrd-pg-bindir" xreflabel="repmgrd does not apply pg_bindir to promote_command or follow_command">
<title>
&repmgrd; ignores pg_bindir when executing <varname>promote_command</varname> or <varname>follow_command</varname>
</title>
<para>
<varname>promote_command</varname> or <varname>follow_command</varname> can be user-defined scripts,
so &repmgr; will not apply <option>pg_bindir</option> even if excuting &repmgr;. Always provide the full
path; see <xref linkend="repmgrd-automatic-failover-configuration"/> for more details.
</para>
</sect2>
<sect2 id="faq-repmgrd-startup-no-upstream" xreflabel="repmgrd does not start if upstream node is not running">
<title>
&repmgrd; aborts startup with the error "<literal>upstream node must be running before repmgrd can start</literal>"
</title>
<para>
&repmgrd; does this to avoid starting up on a replication cluster
which is not in a healthy state. If the upstream is unavailable, &repmgrd;
may initiate a failover immediately after starting up, which could have unintended side-effects,
particularly if &repmgrd; is not running on other nodes.
</para>
<para>
In particular, it's possible that the node's local copy of the <literal>repmgr.nodes</literal> copy
is out-of-date, which may lead to incorrect failover behaviour.
</para>
<para>
The onus is therefore on the adminstrator to manually set the cluster to a stable, healthy state before
starting &repmgrd;.
</para>
</sect2>
</sect1>
</appendix>

View File

@@ -122,7 +122,7 @@
<row>
<entry>Package name example:</entry>
<entry><filename>repmgr11-4.4.0-1.rhel7.x86_64</filename></entry>
<entry><filename>repmgr10-4.0.4-1.rhel7.x86_64</filename></entry>
</row>
<row>
@@ -132,12 +132,12 @@
<row>
<entry>Installation command:</entry>
<entry><literal>yum install repmgr11</literal></entry>
<entry><literal>yum install repmgr10</literal></entry>
</row>
<row>
<entry>Binary location:</entry>
<entry><filename>/usr/pgsql-11/bin</filename></entry>
<entry><filename>/usr/pgsql-10/bin</filename></entry>
</row>
<row>
@@ -147,22 +147,22 @@
<row>
<entry>Configuration file location:</entry>
<entry><filename>/etc/repmgr/11/repmgr.conf</filename></entry>
<entry><filename>/etc/repmgr/10/repmgr.conf</filename></entry>
</row>
<row>
<entry>Data directory:</entry>
<entry><filename>/var/lib/pgsql/11/data</filename></entry>
<entry><filename>/var/lib/pgsql/10/data</filename></entry>
</row>
<row>
<entry>repmgrd service command:</entry>
<entry><command>systemctl [start|stop|restart|reload] repmgr11</command></entry>
<entry><command>systemctl [start|stop|restart|reload] repmgr10</command></entry>
</row>
<row>
<entry>repmgrd service file location:</entry>
<entry><filename>/usr/lib/systemd/system/repmgr11.service</filename></entry>
<entry><filename>/usr/lib/systemd/system/repmgr10.service</filename></entry>
</row>
<row>
@@ -253,14 +253,20 @@
</indexterm>
<para>
&repmgr; <literal>.deb</literal> packages are provided by 2ndQuadrant as well as the
&repmgr; <literal>.deb</literal> packages are provided via the
PostgreSQL Community APT repository, and are available for each community-supported
PostgreSQL version, currently supported Debian releases, and currently supported
Ubuntu LTS releases.
</para>
<sect2 id="packages-apt-repository">
<title>APT repositories</title>
<title>APT repository</title>
<para>
&repmgr; packages are available from the PostgreSQL Community APT repository,
which is updated immediately after each &repmgr; release.
</para>
<table id="apt-2ndquadrant-repository">
<title>2ndQuadrant public repository</title>
@@ -285,7 +291,7 @@
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://apt.postgresql.org/">https://apt.postgresql.org/</ulink></entry>
<entry><ulink url="http://apt.postgresql.org/">http://apt.postgresql.org/</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
@@ -317,7 +323,7 @@
<row>
<entry>Package name example:</entry>
<entry><filename>postgresql-11-repmgr</filename></entry>
<entry><filename>postgresql-10-repmgr</filename></entry>
</row>
<row>
@@ -327,12 +333,12 @@
<row>
<entry>Installation command:</entry>
<entry><literal>apt-get install postgresql-11-repmgr</literal></entry>
<entry><literal>apt-get install postgresql-10-repmgr</literal></entry>
</row>
<row>
<entry>Binary location:</entry>
<entry><filename>/usr/lib/postgresql/11/bin</filename></entry>
<entry><filename>/usr/lib/postgresql/10/bin</filename></entry>
</row>
<row>
@@ -347,12 +353,12 @@
<row>
<entry>Data directory:</entry>
<entry><filename>/var/lib/postgresql/11/main</filename></entry>
<entry><filename>/var/lib/postgresql/10/main</filename></entry>
</row>
<row>
<entry>PostgreSQL service command:</entry>
<entry><command>systemctl [start|stop|restart|reload] postgresql@11-main</command></entry>
<entry><command>systemctl [start|stop|restart|reload] postgresql@10-main</command></entry>
</row>
@@ -376,11 +382,11 @@
</table>
<note>
<para>
When using Debian packages, instead of using the <application>systemd</application> service
command directly, it's recommended to execute <command>pg_ctlcluster</command>
(as <literal>root</literal>, either directly or via <command>sudo</command>), e.g.:
Instead of using the <application>systemd</application> service command directly,
it's recommended to execute <command>pg_ctlcluster</command> (as <literal>root</literal>,
either directly or via <command>sudo</command>), e.g.:
<programlisting>
<command>pg_ctlcluster 11 main [start|stop|restart|reload]</command></programlisting>
<command>pg_ctlcluster 10 main [start|stop|restart|reload]</command></programlisting>
</para>
<para>
For pre-<application>systemd</application> systems, <command>pg_ctlcluster</command>
@@ -471,7 +477,7 @@ repmgr96-4.1.1-0.0git320.g5113ab0.1.el7.x86_64.rpm</programlisting>
<title>Debian/Ubuntu</title>
<para>
An archive of old packages (<literal>3.3.2</literal> and later) for Debian/Ubuntu-based systems is available here:
<ulink url="https://apt-archive.postgresql.org/">https://apt-archive.postgresql.org/</ulink>
<ulink url="http://atalia.postgresql.org/morgue/r/repmgr/">http://atalia.postgresql.org/morgue/r/repmgr/</ulink>
</para>
</sect2>

View File

@@ -15,658 +15,15 @@
See also: <xref linkend="upgrading-repmgr"/>
</para>
<!-- remember to update the release date in ../repmgr_version.h.in -->
<sect1 id="release-5.2.0">
<title id="release-current">Release 5.2.0</title>
<para><emphasis>Thu 22 October, 2020</emphasis></para>
<para>
&repmgr; 5.2.0 is a major release.
</para>
<para>
This release provides support for <ulink url="https://www.postgresql.org/docs/13/release-13.html">PostgreSQL 13</ulink>, released in September 2020.
</para>
<para>
This release removes support for PostgreSQL 9.3, which was
<ulink url="https://www.postgresql.org/docs/9.3/release-9-3-25.html">designated EOL in November 2018</ulink>.
</para>
<sect2>
<title>General improvements</title>
<para>
<itemizedlist>
<listitem>
<para>
Configuration: support <command>include</command>, <command>include_dir</command> and
<command>include_if_exists</command> directives (see <xref linkend="configuration-file-include-directives"/>).
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>:
Improve sanity check failure log output from the demotion candidate.
</para>
<para>
If database connection configuration is not consistent across all nodes, it's
possible remote &repmgr; invocations (e.g. during switchover, from the promotion candidate
to the demotion candidate) will not be able to connect to the database. This will
now be explicitly reported as a database connection failure, rather than as a failure
of the respective sanity check.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link> /
<link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link>:
improve text mode output format, in particular so that node identifiers of arbitrary length are
displayed correctly.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-primary-unregister">repmgr primary unregister</link>:
the <option>--force</option> can be provided to unregister an active primary node, provided
it has no registered standby nodes.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-clone">repmgr standby clone</link>: new option
<option>--verify-backup</option> to run PostgreSQL's
<ulink url="https://www.postgresql.org/docs/13/app-pgverifybackup.html">pg_verifybackup</ulink>
utility after cloning a standby to verify the integrity of the copied data
(PostgreSQL 13 and later).
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-clone">repmgr standby clone</link>:
when cloning from Barman, setting <option>--waldir</option>
(PostgreSQL 9.6 and earlier: <option>--xlogdir</option>) in
<option>pg_basebackup_options</option> will cause &repmgr; to create
a WAL directory outside of the main data directory and symlink
it from there, in the same way as would happen when cloning
using <application>pg_basebackup</application>.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-follow">repmgr standby follow</link>:
In PostgreSQL 13 and later, a standby no longer requires a restart to
follow a new upstream node.
</para>
<para>
The old behaviour (always restarting the standby to follow a new node)
can be restored by setting the configuration file parameter
<varname>standby_follow_restart</varname> to <literal>true</literal>.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-rejoin">repmgr node rejoin</link>:
enable a node to attach to a target node even the target node
has a lower timeline (PostgreSQL 9.6 and later).
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-rejoin">repmgr node rejoin</link>:
in PostgreSQL 13 and later, support <application>pg_rewind</application>'s
ability to automatically run crash recovery on a PostgreSQL instance
which was not shut down cleanly.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-check">repmgr node check</link>:
option <option>--db-connection</option> added to check if &repmgr;
can connect to the database on the local node.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-check">repmgr node check</link>:
report database connection error if the <option>--optformat</option> was provided.
</para>
</listitem>
<listitem>
<para>
Improve handling of pg_control read errors.
</para>
</listitem>
<listitem>
<para>
It is now possible to dump the contents of &repmgr; metadata tables with
<application>pg_dump</application>.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>repmgrd enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
Following additional parameters can be provided to <varname>failover_validation_command</varname>:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal>%n</literal>: node ID</simpara>
</listitem>
<listitem>
<simpara><literal>%a</literal>: node name</simpara>
</listitem>
<listitem>
<simpara><literal>%v</literal>: number of visible nodes</simpara>
</listitem>
<listitem>
<simpara><literal>%u</literal>: number of shared upstream nodes</simpara>
</listitem>
<listitem>
<simpara><literal>%t</literal>: total number of nodes</simpara>
</listitem>
</itemizedlist>
</para>
</listitem>
<listitem>
<para>
Configuration option <varname>always_promote</varname> (default: <literal>false</literal>)
to control whether a node should be promoted if the &repmgr; metadata is not up-to-date
on that node.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
<link linkend="repmgr-standby-clone">repmgr standby clone</link>:
fix issue with cloning from Barman where the tablespace mapping file was
not flushed to disk before attempting to retrieve files from Barman. GitHub #650.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-rejoin">repmgr node rejoin</link>:
ensure that when verifying a standby node has attached to its upstream, the
node has started streaming before confirming the success of the rejoin operation.
</para>
</listitem>
<listitem>
<para>
&repmgrd;: ensure primary connection is reset if same as upstream. GitHub #633.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-5.1.0">
<title>Release 5.1.0</title>
<para><emphasis>Mon 13 April, 2020</emphasis></para>
<para>
&repmgr; 5.1.0 is a major release.
</para>
<para>
For details on how to upgrade an existing &repmgr; installation, see
documentation section <link linkend="upgrading-major-version">Upgrading a major version release</link>.
</para>
<para>
If &repmgrd; is in use, a PostgreSQL restart <emphasis>is</emphasis> required;
in that case we suggest combining this &repmgr; upgrade with the next PostgreSQL
minor release, which will require a PostgreSQL restart in any case.
</para>
<sect2>
<title>Compatibility changes</title>
<para>
The <link linkend="repmgr-standby-clone"><command>repmgr standby clone</command></link>
<option>--recovery-conf-only</option> option has been renamed to
<option>--replication-conf-only</option>. <option>--recovery-conf-only</option> will
still be accepted as an alias.
</para>
</sect2>
<sect2>
<title>General improvements</title>
<para>
<itemizedlist>
<listitem>
<para>
The requirement that the &repmgr; user is a database superuser has been
removed as far as possible.
</para>
<para>
In theory, &repmgr; can be operated with a normal database user for managing
the &repmgr; database, and a separate replication user for managing replication
connections (and replication slots, if these are in use).
</para>
<para>
Some operations will still require superuser permissions, e.g. for issuing
a <command>CHECKPOINT</command> as par of a switchover operation; in this case
a valid superuser should be provided with the <option>-S</option>/<option>--superuser</option>
option.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-clone"><command>repmgr standby clone</command></link>:
Warn if neither of data page checksums or <option>wal_log_hints</option> are active,
as this will preclude later usage of <command>pg_rewind</command>.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-promote"><command>repmgr standby promote</command></link>:
when executed with <option>--dry-run</option>, the method which would be used to promote the node
will be displayed.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>:
Improve logging and checking of potential failure situations.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>:
Replication configuration files (PostgreSQL 11 and earlier:
<filename>recovery.conf</filename>; PostgreSQL 12 and later: <filename>postgresql.auto.conf</filename>)
will be checked to ensure they are owned by the same user who owns the PostgreSQL
data directory.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>:
Provide additional information in <option>--dry-run mode</option> output.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>:
Checks that the demotion candidate's registered repmgr.conf file can be found, to
prevent confusing references to an incorrectly configured data directory. GitHub 615.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-check"><command>repmgr node check</command></link>:
accept option <option>-S</option>/<option>--superuser</option>. GitHub #621.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-check"><command>repmgr node check</command></link>:
add <option>--upstream</option> option to check whether the node is attached
to the expected upstream node.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
Ensure <link linkend="repmgr-node-rejoin"><command>repmgr node rejoin</command></link>
checks for available replication slots on the rejoin target.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link> and
<link linkend="repmgr-node-rejoin"><command>repmgr node rejoin</command></link> will now return
an error code if the operation fails if a replication slot is not available or cannot
be created on the follow/rejoin target.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-promote"><command>repmgr standby promote</command></link>:
in <option>--dry-run mode</option>, display promote command which will be executed.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-promote"><command>repmgr standby promote</command></link>
will check if the <literal>repmgr</literal> user has permission to execute
<function>pg_promote()</function> and fall back to <command>pg_ctl promote</command> if
necessary.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>:
check for demotion candidate reattachment as late as possible to avoid spurious failure
reports.
</para>
</listitem>
<listitem>
<para>
&repmgrd;: check for presence of <option>promote_command</option> and
<option>follow_command</option> on receipt of <literal>SIGHUP</literal>. GitHub 614.
</para>
</listitem>
<listitem>
<para>
Fix situation where replication connections were not created correctly, which
could lead to spurious replication connection failures in some situations, e.g.
where password files are used.
</para>
</listitem>
<listitem>
<para>
Ensure <filename>postgresql.auto.conf</filename> is created with
correct permissions (PostgreSQL 12 and later).
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-5.0">
<title>Release 5.0</title>
<para><emphasis>Tue 15 October, 2019</emphasis></para>
<para>
&repmgr; 5.0 is a major release.
</para>
<para>
For details on how to upgrade an existing &repmgr; installation, see
documentation section <link linkend="upgrading-major-version">Upgrading a major version release</link>.
</para>
<para>
If &repmgrd; is in use, a PostgreSQL restart <emphasis>is</emphasis> required;
in that case we suggest combining this &repmgr; upgrade with the next PostgreSQL
minor release, which will require a PostgreSQL restart in any case.
</para>
<sect2>
<title>Compatibility changes</title>
<sect3 id="repmgr-5-0-config-file-parsing">
<title>Configuration file parsing has been made stricter</title>
<para>
&repmgr; now parses <link linkend="configuration-file-format">configuration files</link>
in the same way that PostgreSQL itself does, which means some files used with
earlier &repmgr; versions may need slight modification before they can be used
with &repmgr; 5 and later.
</para>
<para>
The main change is that string parameters should always be enclosed in single quotes.
</para>
<para>
For example, in &repmgr; 4.4 and earlier, the following <filename>repmgr.conf</filename>
entry was valid:
<programlisting>
conninfo=host=node1 user=repmgr dbname=repmgr connect_timeout=2</programlisting>
This must now be changed to:
<programlisting>
conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlisting>
</para>
<para>
Note that simple string identifiers (e.g. <literal>node_name=node1</literal>)
may remain unquoted, though we recommend always enclosing
strings in single quotes.
</para>
<para>
Additionally, leading/trailing white space between single quotes will no longer
be trimmed; the entire string between single quotes will be
taken literally.
</para>
<para>
Strings enclosed in double quotes (e.g. <literal>node_name=&quot;node1&quot;</literal>)
will now be rejected; previously they were accepted, but the double quotes were
interpreted as part of the string, which was a frequent cause of confusion.
</para>
<para>
This syntax matches that used by PostgreSQL itself.
</para>
</sect3>
<sect3>
<title>Some &quot;repmgr daemon ...&quot; commands renamed</title>
<para>
Some &quot;<command>repmgr daemon ...</command>&quot; commands have been renamed to
&quot;<command>repmgr service ...</command>&quot; as they have a cluster-wide effect
and to avoid giving the impression they affect only the local &repmgr; daemon.
</para>
<para>
The following commands are affected:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<command>repmgr daemon pause</command>
(now <link linkend="repmgr-service-pause"><command>repmgr service pause</command></link>)
</simpara>
</listitem>
<listitem>
<simpara>
<command>repmgr daemon unpause</command>
(now <link linkend="repmgr-service-unpause"><command>repmgr service unpause</command></link>)
</simpara>
</listitem>
<listitem>
<simpara>
<command>repmgr daemon status</command>
(now <link linkend="repmgr-service-status"><command>repmgr service status</command></link>)
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
The &quot;<command>repmgr daemon ...</command>&quot; form will still be accepted
for backwards compatibility.
</para>
</sect3>
<sect3>
<title>Some deprecated command line options removed</title>
<para>
The following command line options, which have been deprecated since &repmgr; 3.3
(and which no longer had any effect other than to generate a warning about their use)
have been removed:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><option>--data-dir</option></simpara>
</listitem>
<listitem>
<simpara><option>--no-conninfo-password</option></simpara>
</listitem>
<listitem>
<simpara><option>--recovery-min-apply-delay</option></simpara>
</listitem>
</itemizedlist>
</para>
</sect3>
</sect2>
<sect2>
<title>General enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
Support for PostgreSQL 12 added.
</para>
<para>
Beginning with PostgreSQL 12, replication configuration has been integrated
into the main PostgreSQL configuraton system and the conventional
<filename>recovery.conf</filename> file is no longer valid.
</para>
<para>
&repmgr; has been modified to be compatible with this change.
</para>
<para>
&repmgr; additionally takes advantage of the new <command>pg_promote()</command>
function, which enables a standby to be promoted to primary using an SQL
command.
</para>
<para>
For an overview of general changes to replication configuration, see this blog entry:
<ulink url="https://www.2ndquadrant.com/en/blog/replication-configuration-changes-in-postgresql-12/">Replication configuration changes in PostgreSQL 12</ulink>
</para>
</listitem>
<listitem>
<para>
The &repmgr; configuration file is now parsed using
<command>flex</command>, meaning it will be parsed in
the same way as PostgreSQL parses its own configuration
files.
</para>
<para>
This makes configuration file parsing more robust
and consistent.
</para>
<para>
See item <link linkend="repmgr-5-0-config-file-parsing">Configuration file parsing has been made stricter</link>
for details.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-clone"><command>repmgr standby clone</command></link>:
checks for availability of the &repmgr; extension on the upstream node have
been improved and error messages improved.
</para>
</listitem>
<listitem>
<para>
When executing &repmgr; remotely, if the &repmgr; log level was explicitly
provided (with <option>-L</option>/<option>--log-level</option>), that log level
will be passed to the remote &repmgr;.
</para>
<para>
This makes it possible to return log output when executing repmgr
remotely at a different level to the one defined in the remote
&repmgr;'s <filename>repmgr.conf</filename>.
</para>
<para>
This is particularly useful when <literal>DEBUG</literal> output is required.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
Check role membership when trying to read <literal>pg_settings</literal>.
</para>
<para>
Previously &repmgr; assumed only superusers could read <literal>pg_settings</literal>,
but from PostgreSQL 10, all members of the roles <literal>pg_read_all_settings</literal>
or <literal>pg_monitor</literal> are permitted to do this as well.
</para>
</listitem>
<listitem>
<para>
&repmgrd;: Fix handling of upstream node change check.
</para>
<para>
&repmgrd; has a check to see if the upstream node has unexpectedly
changed, e.g. if the repmgrd service is paused and the PostgreSQL
instance has been pointed to another node.
</para>
<para>
However this check was relying on the node record on the local node
being up-to-date, which may not be the case immediately after a
failover, when the node is still replaying records updated prior
to the node's own record being updated. In this case it will
mistakenly assume the node is following the original primary
and attempt to restart monitoring, which will fail as the original
primary is no longer available.
</para>
<para>
To prevent this, the node's record on the upstream node is checked
to see if the reported upstream <literal>node_id</literal> matches
the expected <literal>node_id</literal>. GitHub #587/#588.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-4.4">
<title>Release 4.4</title>
<para><emphasis>Thu 27 June, 2019</emphasis></para>
<title id="release-current">Release 4.4</title>
<para><emphasis>27 June, 2019</emphasis></para>
<para>
&repmgr; 4.4 is a major release.
</para>
<para>
For details on how to upgrade an existing &repmgr; installation, see
For details on how to upgrade an existing &repmgr; instrallation, see
documentation section <link linkend="upgrading-major-version">Upgrading a major version release</link>.
</para>
<para>
@@ -783,7 +140,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<link linkend="repmgr-service-status"><command>repmgr daemon status</command></link>:
<link linkend="repmgr-daemon-status"><command>repmgr daemon status</command></link>:
make output similar to that of
<link linkend="repmgr-cluster-show"><command>repmgr cluster show</command></link>
for consistency and to make it easier to identify nodes not in the expected
@@ -801,7 +158,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<link linkend="repmgr-cluster-show"><command>repmgr cluster show</command></link>
and <link linkend="repmgr-service-status"><command>repmgr daemon status</command></link>:
and <link linkend="repmgr-daemon-status"><command>repmgr daemon status</command></link>:
show the upstream node name as reported by each individual node - this helps visualise
situations where the cluster is in an unexpected state, and provide a better idea of the
actual cluster state.
@@ -819,7 +176,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<link linkend="repmgr-cluster-show"><command>repmgr cluster show</command></link>
and <link linkend="repmgr-service-status"><command>repmgr daemon status</command></link>:
and <link linkend="repmgr-daemon-status"><command>repmgr daemon status</command></link>:
check if a node is attached to its advertised upstream node, and issue a
warning if the node is not attached.
</para>
@@ -841,7 +198,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
execute a custom script.
</para>
<para>
This provided an additional method for fencing an isolated primary node, and/or taking
This provides an additional method for fencing an isolated primary node, and/or taking
other action if one or more standys become disconnected.
</para>
<para>
@@ -967,7 +324,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
&repmgr; 4.3 is a major release.
</para>
<para>
For details on how to upgrade an existing &repmgr; installation, see
For details on how to upgrade an existing &repmgr; instrallation, see
documentation section <link linkend="upgrading-major-version">Upgrading a major version release</link>.
</para>
<para>
@@ -1036,7 +393,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
<listitem>
<para>
<link linkend="repmgr-service-status"><command>repmgr daemon status</command></link>
<link linkend="repmgr-daemon-status"><command>repmgr daemon status</command></link>
additionally displays the node priority and the interval (in seconds) since the
&repmgrd; instance last verified its upstream node was available.
</para>

View File

@@ -43,12 +43,6 @@
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
PostgreSQL version
</simpara>
</listitem>
<listitem>
<simpara>
&repmgr; version
@@ -74,18 +68,9 @@
</simpara>
</listitem>
<listitem>
<simpara>
PostgreSQL 11 and earlier: contents of the <filename>recovery.conf</filename> file
(suitably anonymized if necessary).
</simpara>
</listitem>
<listitem>
<simpara>
PostgreSQL 12 and later: contents of the <filename>postgresql.auto.conf</filename> file
(suitably anonymized if necessary), and whether or not the PostgreSQL data directory
contains the files <filename>standby.signal</filename> and/or <filename>recovery.signal</filename>.
PostgreSQL version
</simpara>
</listitem>
@@ -105,8 +90,8 @@
</para>
<para>
In all cases it is <emphasis>extremely</emphasis> useful to receive
as much detail as possible on how to reliably reproduce
an issue.
information on how to reliably reproduce an issue with as much detail as
possible.
</para>
</sect1>

8
doc/bdr-failover.md Normal file
View File

@@ -0,0 +1,8 @@
BDR failover with repmgrd
=========================
This document has been integrated into the main `repmgr` documentation
and is now located here:
> [BDR failover with repmgrd](https://repmgr.org/docs/current/repmgrd-bdr.html)

View File

@@ -46,7 +46,6 @@
<para>
WAL management on the primary becomes much easier as there's no need
to use replication slots, and <varname>wal_keep_segments</varname>
(PostgreSQL 13 and later: <varname>wal_keep_size</varname>)
does not need to be set.
</para>
</listitem>
@@ -65,9 +64,8 @@
</para>
<para>
Barman's parallel restore facility can be used by executing it manually on
the Barman server and configuring replication on the resulting cloned
standby using
<command><link linkend="repmgr-standby-clone">repmgr standby clone --replication-conf-only</link></command>.
the Barman server and integrating the resulting cloned standby using
<command><link linkend="repmgr-standby-clone">repmgr standby clone --recovery-conf-only</link></command>.
</para>
</note>
@@ -104,29 +102,16 @@
under the &quot;<literal>barman</literal>&quot; user account,
<filename>repmgr.conf</filename> should contain the following entries:
<programlisting>
barman_host='barman@barmansrv'
barman_server='pg'</programlisting>
</para>
<para>
Here <literal>pg</literal> corresponds to a section in Barman's configuration file for a specific
server backup configuration, which would look something like:
<programlisting>
[pg]
description = "Main cluster"
...
</programlisting>
</para>
<para>
More details on Barman configuration can be found in the
<ulink url="https://docs.pgbarman.org/">Barman documentation</ulink>'s
<ulink url="https://docs.pgbarman.org/#configuration">configuration section</ulink>.
barman_host=barman@barmansrv
barman_server=somedb</programlisting>
</para>
<note>
<para>
To use a non-default Barman configuration file on the Barman server,
specify this in <filename>repmgr.conf</filename> with <filename>barman_config</filename>:
<programlisting>
barman_config='/path/to/barman.conf'</programlisting>
barman_config=/path/to/barman.conf</programlisting>
</para>
</note>
@@ -148,15 +133,6 @@ description = "Main cluster"
section in <command>man 5 ssh_config</command> for more details.
</simpara>
</tip>
<para>
If you wish to place WAL files in a location outside the main
PostgreSQL data directory, set <option>--waldir</option>
(PostgreSQL 9.6 and earlier: <option>--xlogdir</option>) in
<option>pg_basebackup_options</option> to the target directory
(must be an absolute filepath). &repmgr; will create and
symlink to this directory in exactly the same way
<application>pg_basebackup</application> would.
</para>
<para>
It's now possible to clone a standby from Barman, e.g.:
<programlisting>
@@ -212,9 +188,9 @@ description = "Main cluster"
<filename>/usr/bin/barman-wal-restore</filename>,
<filename>repmgr.conf</filename> should include the following lines:
<programlisting>
barman_host='barman@barmansrv'
barman_server='pg'
restore_command='/usr/bin/barman-wal-restore barmansrv pg %f %p'</programlisting>
barman_host=barman@barmansrv
barman_server=somedb
restore_command=/usr/bin/barman-wal-restore barmansrv somedb %f %p</programlisting>
</para>
<note>
<simpara>
@@ -244,8 +220,7 @@ description = "Main cluster"
that any standby connected to the primary using a replication slot will always
be able to retrieve the required WAL files. This removes the need to manually
manage WAL file retention by estimating the number of WAL files that need to
be maintained on the primary using <varname>wal_keep_segments</varname>
(PostgreSQL 13 and later: <varname>wal_keep_size</varname>).
be maintained on the primary using <varname>wal_keep_segments</varname>.
Do however be aware that if a standby is disconnected, WAL will continue to
accumulate on the primary until either the standby reconnects or the replication
slot is dropped.
@@ -449,13 +424,6 @@ description = "Main cluster"
WAL directory. Any WALs generated during the cloning process will be copied here, and
a symlink will automatically be created from the main data directory.
</para>
<tip>
<para>
The <literal>--waldir</literal> (<literal>--xlogdir</literal>) option,
if present in <varname>pg_basebackup_options</varname>, will be honoured by &repmgr;
when cloning from Barman (&repmgr; 5.2 and later).
</para>
</tip>
<para>
See the <ulink url="https://www.postgresql.org/docs/current/app-pgbasebackup.html">PostgreSQL pg_basebackup documentation</ulink>
for more details of available options.
@@ -476,8 +444,10 @@ description = "Main cluster"
<para>
The recommended way to do this is to store the password in the <literal>postgres</literal> system
user's <filename>~/.pgpass</filename> file. For more information on using the password file, see
the documentation section <xref linkend="configuration-password-file"/>.
user's <filename>~/.pgpass</filename> file. It's also possible to store the password in the
environment variable <varname>PGPASSWORD</varname>, however this is not recommended for
security reasons. For more details see the
<ulink url="https://www.postgresql.org/docs/current/libpq-pgpass.html">PostgreSQL password file documentation</ulink>.
</para>
<note>
@@ -499,6 +469,19 @@ description = "Main cluster"
will need to be set during any action which causes <filename>recovery.conf</filename> to be
rewritten, e.g. <xref linkend="repmgr-standby-follow"/>.
</para>
<para>
It is of course also possible to include the password value in the <varname>conninfo</varname>
string for each node, but this is obviously a security risk and should be avoided.
</para>
<para>
From PostgreSQL 9.6, <application>libpq</application> supports the <varname>passfile</varname>
parameter in connection strings, which can be used to specify a password file other than
the default <filename>~/.pgpass</filename>.
</para>
<para>
To have &repmgr; write a custom password file in <varname>primary_conninfo</varname>,
specify its location in <varname>passfile</varname> in <filename>repmgr.conf</filename>.
</para>
</sect2>
<sect2 id="cloning-advanced-replication-user" xreflabel="Separate replication user">
@@ -514,34 +497,6 @@ description = "Main cluster"
cloning a node or executing <xref linkend="repmgr-standby-follow"/>.
</para>
</sect2>
<sect2 id="cloning-advanced-tablespace-mapping" xreflabel="Tablespace mapping">
<title>Tablespace mapping</title>
<indexterm>
<primary>tablespace mapping</primary>
</indexterm>
<para>
&repmgr; provides a <option>tablespace_mapping</option> configuration
file option, which will makes it possible to map the tablespace on the source node to
a different location on the local node.
</para>
<para>
To use this, add <option>tablespace_mapping</option> to <filename>repmgr.conf</filename>
like this:
<programlisting>
tablespace_mapping='/var/lib/pgsql/tblspc1=/data/pgsql/tblspc1'
</programlisting>
</para>
<para>
where the left-hand value represents the tablespace on the source node,
and the right-hand value represents the tablespace on the standby to be cloned.
</para>
<para>
This parameter can be provided multiple times.
</para>
</sect2>
</sect1>

View File

@@ -66,8 +66,17 @@
</term>
<listitem>
<para>
Must be <literal>physical</literal> (the default).
Must be one of <literal>physical</literal> (for standard streaming replication)
or <literal>bdr</literal>.
</para>
<note>
<para>
Replication type <literal>bdr</literal> can only be used with BDR 2.x
</para>
<para>
BDR 3.x users should use <literal>physical</literal>.
</para>
</note>
</listitem>
</varlistentry>
@@ -110,27 +119,6 @@
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-ssh-options" xreflabel="ssh_options">
<term><varname>ssh_options</varname> (<type>string</type>)
<indexterm>
<primary><varname>ssh_options</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Options to append to the <command>ssh</command> command when executed
by &repmgr;.
</para>
<para>
We recommend adding <literal>-q</literal> to suppress any superfluous
SSH chatter such as login banners, and also an explicit
<option>ConnectTimeout</option> value,
e.g.:
<programlisting>
ssh_options='-q -o ConnectTimeout=10'</programlisting>
</para>
</listitem>
</varlistentry>
</variablelist>
</sect1>

View File

@@ -31,8 +31,6 @@
<secondary>format</secondary>
</indexterm>
<para>
<filename>repmgr.conf</filename> is a plain text file with one parameter/value
combination per line.
@@ -41,10 +39,14 @@
Whitespace is insignificant (except within a quoted parameter value) and blank lines are ignored.
Hash marks (<literal>#</literal>) designate the remainder of the line as a comment.
Parameter values that are not simple identifiers or numbers should be single-quoted.
Note that single quote cannot be embedded in a parameter value.
</para>
<para>
To embed a single quote in a parameter value, write either two quotes (preferred) or backslash-quote.
</para>
<important>
<para>
&repmgr; will interpret double-quotes as being part of a string value; only use single quotes
to quote parameter values.
</para>
</important>
<para>
Example of a valid <filename>repmgr.conf</filename> file:
@@ -54,77 +56,9 @@
node_id=1
node_name= node1
conninfo ='host=node1 dbname=repmgr user=repmgr connect_timeout=2'
data_directory = '/var/lib/pgsql/12/data'</programlisting>
data_directory = /var/lib/pgsql/11/data</programlisting>
</para>
<note>
<para>
Beginning with <link linkend="release-5.0">repmgr 5.0</link>, configuration
file parsing has been tightened up and now matches the way PostgreSQL
itself parses configuration files.
</para>
<para>
This means <filename>repmgr.conf</filename> files used with earlier &repmgr;
versions may need slight modification before they can be used with &repmgr; 5
and later.
</para>
<para>
The main change is that &repmgr; requires most string values to be
enclosed in single quotes. For example, this was previously valid:
<programlisting>
conninfo=host=node1 user=repmgr dbname=repmgr connect_timeout=2</programlisting>
but must now be changed to:
<programlisting>
conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlisting>
</para>
</note>
<sect3 id="configuration-file-include-directives" xreflabel="configuration file include directives">
<title>Configuration file include directives</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>include directives</secondary>
</indexterm>
<para>
From &repmgr; 5.2, the configuration file can contain the following include directives:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>include</option>: include the specified file,
either as an absolute path or path relative to the current file
</simpara>
</listitem>
<listitem>
<simpara>
<option>include_if_exists</option>: include the specified file.
The file is specified as an absolute path or path relative to the current file.
However, if it does not exist, an error will not be raised.
</simpara>
</listitem>
<listitem>
<simpara>
<option>include_dir</option>: include files in the specified directory
which have the <filename>.conf</filename> suffix.
The directory is specified either as an absolute path or path
relative to the current file
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
These behave in exactly the same way as the PostgreSQL configuration file processing;
see the <ulink url="https://www.postgresql.org/docs/current/config-setting.html#CONFIG-INCLUDES">PostgreSQL documentation</ulink>
for additional details.
</para>
</sect3>
</sect2>
@@ -227,14 +161,6 @@ conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlistin
</itemizedlist>
</para>
<para>
In examples provided in this documentation, it is assumed the configuration file is located
at <filename>/etc/repmgr.conf</filename>. If &repmgr; is installed from a package, the
configuration file will probably be located at another location specified by the packager;
see appendix <xref linkend="appendix-packages"/> for configuration file locations in
different packaging systems.
</para>
<para>
Note that if a file is explicitly specified with <literal>-f/--config-file</literal>,
an error will be raised if it is not found or not readable, and no attempt will be made to
@@ -255,61 +181,6 @@ conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlistin
<filename>/path/to/repmgr.conf</filename>).
</para>
</note>
</sect2>
<sect2 id="configuration-file-postgresql-major-upgrades" xreflabel="configuration file and PostgreSQL major version upgrades">
<title>Configuration file and PostgreSQL major version upgrades</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>ostgreSQL major version upgrades</secondary>
</indexterm>
<para>
When upgrading the PostgreSQL cluster to a new major version, <filename>repmgr.conf</filename>
will probably needed to be updated.
</para>
<para>
Usually <option>pg_bindir</option> and <option>data_directory</option> will need to be modified,
particularly if the default package locations are used, as these usually change.
</para>
<para>
It's also possible the location of <filename>repmgr.conf</filename> itself will change
(e.g. from <filename>/etc/repmgr/11/repmgr.conf</filename> to <filename>/etc/repmgr/12/repmgr.conf</filename>).
This is stored as part of the &repmgr; metadata and is used by &repmgr; to execute &repmgr; remotely
(e.g. during a <link linkend="performing-switchover">switchover operation</link>).
</para>
<para>
If the content and/or location of <filename>repmgr.conf</filename> has changed, the &repmgr; metadata
needs to be updated to reflect this. The &repmgr; metadata can be updated on each node with:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<link linkend="repmgr-primary-register">
<command>repmgr primary register --force -f /path/to/repmgr.conf</command>
</link>
</simpara>
</listitem>
<listitem>
<simpara>
<link linkend="repmgr-standby-register">
<command>repmgr standby register --force -f /path/to/repmgr.conf</command>
</link>
</simpara>
</listitem>
<listitem>
<simpara>
<link linkend="repmgr-witness-register">
<command>repmgr witness register --force -f /path/to/repmgr.conf -h primary_host</command>
</link>
</simpara>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>

View File

@@ -1,175 +0,0 @@
<sect1 id="configuration-password-management" xreflabel="password management">
<title>Password Management</title>
<indexterm>
<primary>passwords</primary>
</indexterm>
<sect2 id="configuration-password-management-options" xreflabel="password management options">
<title>Password Management Options</title>
<indexterm>
<primary>passwords</primary>
<secondary>options for managing</secondary>
</indexterm>
<para>
For security purposes it's desirable to protect database access using a password.
</para>
<para>
PostgreSQL has three ways of providing a password:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
including the password in the <option>conninfo</option> string
(e.g. &quot;<literal>host=node1 dbname=repmgr user=repmgr password=foo</literal>&quot;)
</simpara>
</listitem>
<listitem>
<simpara>
exporting the password as an environment variable (<envar>PGPASSWORD</envar>)
</simpara>
</listitem>
<listitem>
<simpara>
storing the password in a dedicated password file
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
We strongly advise against including the password in the <option>conninfo</option> string, as
this will result in the database password being exposed in various places, including in the
<filename>repmgr.conf</filename> file, the <literal>repmgr.nodes</literal> table, any output
generated by &repmgr; which lists the node <option>conninfo</option> strings (e.g.
<link linkend="repmgr-cluster-show">repmgr cluster show</link>) and in the &repmgr; log file,
particularly at <option>log_level=DEBUG</option>.
</para>
<note>
<para>
Currently &repmgr; does not fully support use of the <option>password</option> option in the
<option>conninfo</option> string.
</para>
</note>
<para>
Exporting the password as an environment variable (<envar>PGPASSWORD</envar>) is considered
less insecure, but the PostgreSQL documentation explicitly recommends against doing this:
<blockquote>
<attribution><ulink url="https://www.postgresql.org/docs/current/libpq-envars.html">Environment Variables</ulink></attribution>
<para>
<envar>PGPASSWORD</envar> behaves the same as the <option>password</option>
connection parameter. Use of this environment variable
is not recommended for security reasons, as some operating systems
allow non-root users to see process environment variables via
<application>ps</application>; instead consider using a password file.
</para>
</blockquote>
</para>
<para>
The most secure option for managing passwords is to use a dedicated password file; see the following
section for more details.
</para>
</sect2>
<sect2 id="configuration-password-file" xreflabel="password file">
<title>Using a password file</title>
<indexterm>
<primary>pgpass</primary>
</indexterm>
<indexterm>
<primary>.pgpass</primary>
</indexterm>
<indexterm>
<primary>passwords</primary>
<secondary>using a password file</secondary>
</indexterm>
<para>
The most secure way of storing passwords is in a password file,
which by default is <filename>~/.pgpass</filename>. This file
can only be read by the system user who owns the file, and
PostgreSQL will refuse to use the file unless read/write
permissions are restricted to the file owner. The password(s)
contained in the file will not be directly accessed by
&repmgr; (or any other libpq-based client software such as <application>psql</application>).
</para>
<para>
For full details see the
<ulink url="https://www.postgresql.org/docs/current/libpq-pgpass.html">PostgreSQL password file documentation</ulink>.
</para>
<para>
For use with &repmgr;, the <filename>~/.pgpass</filename> must two entries for each
node in the replication cluster: one for the &repmgr; user who accesses the &repmgr; metadatabase,
and one for replication connections (regardless of whether a dedicated replication user is used).
The file must be present on each node in the replication cluster.
</para>
<para>
A <filename>~/.pgpass</filename> file for a 3-node cluster where the <literal>repmgr</literal> database user
is used for both for accessing the &repmgr; metadatabase and for replication connections would look like this:
<programlisting>
node1:5432:repmgr:repmgr:foo
node1:5432:replication:repmgr:foo
node2:5432:repmgr:repmgr:foo
node2:5432:replication:repmgr:foo
node3:5432:repmgr:repmgr:foo
node3:5432:replication:repmgr:foo</programlisting>
If a dedicated replication user (here: <literal>repluser</literal>) is in use, the file would look like this:
<programlisting>
node1:5432:repmgr:repmgr:foo
node1:5432:replication:repluser:foo
node2:5432:repmgr:repmgr:foo
node2:5432:replication:repluser:foo
node3:5432:repmgr:repmgr:foo
node3:5432:replication:repluser:foo</programlisting>
If you are planning to use the <option>-S</option>/<option>--superuser</option> option,
there must also be an entry enabling the superuser to connect to the &repmgr; database.
Assuming the superuser is <literal>postgres</literal>, the file would look like this:
<programlisting>
node1:5432:repmgr:repmgr:foo
node1:5432:repmgr:postgres:foo
node1:5432:replication:repluser:foo
node2:5432:repmgr:repmgr:foo
node2:5432:repmgr:postgres:foo
node2:5432:replication:repluser:foo
node3:5432:repmgr:repmgr:foo
node3:5432:repmgr:postgres:foo
node3:5432:replication:repluser:foo</programlisting>
</para>
<para>
The <filename>~/.pgpass</filename> file can be simplified with the use of wildcards if
there is no requirement to restrict provision of passwords to particular hosts, ports
or databases. The preceding file could then be formatted like this:
<programlisting>
*:*:*:repmgr:foo
*:*:*:postgres:foo
</programlisting>
</para>
<note>
<para>
It's possible to specify an alternative location for the <filename>~/.pgpass</filename> file, either via
the environment variable <envar>PGPASSFILE</envar>, or (from PostgreSQL 9.6) using the
<varname>passfile</varname> parameter in connection strings.
</para>
<para>
If using the <varname>passfile</varname> parameter, it's essential to ensure the file is in the same
location on all nodes, as when connecting to a remote node, the file referenced is the one on the
local node.
</para>
<para>
Additionally, you <emphasis>must</emphasis> specify the passfile location in <filename>repmgr.conf</filename>
with the <option>passfile</option> option so &repmgr; can write the correct path when creating the
<option>primary_conninfo</option> parameter for replication configuration on standbys.
</para>
</note>
</sect2>
</sect1>

View File

@@ -1,25 +0,0 @@
<sect1 id="configuration-permissions" xreflabel="Database user permissions">
<title>repmgr database user permissions</title>
<indexterm>
<primary>configuration</primary>
<secondary>database user permissions</secondary>
</indexterm>
<para>
&repmgr; requires that the database defined in the <varname>conninfo</varname>
setting contains the <literal>repmgr</literal> extension. The database user defined in the
<varname>conninfo</varname> setting must be able to access this database and
the database objects contained within the extension.
</para>
<para>
The <literal>repmgr</literal> extension can only be installed by a superuser.
If the &repmgr; user is a superuser, &repmgr; will create the extension automatically.
</para>
<para>
Alternatively, the extension can be created manually by a superuser
(with &quot;<command>CREATE EXTENSION repmgr</command>&quot;) before executing
<link linkend="repmgr-primary-register">repmgr primary register</link>.
</para>
</sect1>

View File

@@ -148,19 +148,9 @@
instances in the replication cluster which may potentially become a primary server or
(in cascading replication) the upstream server of a standby.
</para>
<para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-WAL-SENDERS">max_wal_senders</ulink>.
</para>
<note>
<para>
From <productname>PostgreSQL 12</productname>, <option>max_wal_senders</option>
<emphasis>must</emphasis> be set to the same or a higher value as the primary node
(at the time the node was cloned), otherwise the standby will refuse
to start (unless <option>hot_standby</option> is set to <literal>off</literal>, which
will prevent the node from accepting queries).
</para>
</note>
</listitem>
</varlistentry>
@@ -270,7 +260,7 @@
<varlistentry>
<term><option>wal_keep_segments</option> / <option>wal_keep_size</option></term>
<term><option>wal_keep_segments</option></term>
<listitem>
@@ -279,36 +269,25 @@
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<indexterm>
<primary>wal_keep_size</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<para>
Normally there is no need to set <option>wal_keep_segments</option>
(PostgreSQL 13 and later: <varname>wal_keep_size</varname>; default: <literal>0</literal>),
as it is <emphasis>not</emphasis> a reliable way of ensuring that all required WAL
segments are available to standbys. Replication slots and/or an archiving solution
such as Barman are recommended to ensure standbys have a reliable
Normally there is no need to set <option>wal_keep_segments</option> (default: <literal>0</literal>), as it
is <emphasis>not</emphasis> a reliable way of ensuring that all required WAL segments are available to standbys.
Replication slots and/or an archiving solution such as Barman are recommended to ensure standbys have a reliable
source of WAL segments at all times.
</para>
<para>
The only reason ever to set <option>wal_keep_segments</option> / <option>wal_keep_size</option>
is you have you have configured <option>pg_basebackup_options</option>
The only reason ever to set <option>wal_keep_segments</option> is you have
you have configured <option>pg_basebackup_options</option>
in <filename>repmgr.conf</filename> to include the setting <literal>--wal-method=fetch</literal>
(PostgreSQL 9.6 and earlier: <literal>--xlog-method=fetch</literal>)
<emphasis>and</emphasis> you have <emphasis>not</emphasis> set <option>restore_command</option>
in <filename>repmgr.conf</filename> to fetch WAL files from a reliable source such as Barman,
in which case you'll need to set <option>wal_keep_segments</option>
to a sufficiently high number to ensure that all WAL files required by the standby
are retained. However we do not recommend WAL retention in this way.
are retained. However we do not recommend managing replication in this way.
</para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-WAL-KEEP-SEGMENTS">wal_keep_segments</ulink>.
<!--
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-WAL-KEEP-SIZE">wal_keep_size</ulink>.
-->
</para>
</listitem>
</varlistentry>
@@ -329,7 +308,23 @@
&configuration-file-optional-settings;
&configuration-file-log-settings;
&configuration-file-service-commands;
&configuration-permissions;
&configuration-password-management;
<sect1 id="configuration-permissions" xreflabel="Database user permissions">
<title>repmgr database user permissions</title>
<indexterm>
<primary>configuration</primary>
<secondary>database user permissions</secondary>
</indexterm>
<para>
&repmgr; will create an extension database containing objects
for administering &repmgr; metadata. The user defined in the <varname>conninfo</varname>
setting must be able to access all objects. Additionally, superuser permissions
are required to install the &repmgr; extension. The easiest way to do this
is create the &repmgr; user as a superuser, however if this is not
desirable, the &repmgr; user can be created as a normal user and a
superuser specified with <literal>--superuser</literal> when registering a &repmgr; node.
</para>
</sect1>
</chapter>

View File

@@ -117,6 +117,10 @@
<literal>conninfo</literal> string of the primary node
(<xref linkend="repmgr-standby-register"/> and <xref linkend="repmgr-standby-follow"/>)
</para>
<para>
<literal>conninfo</literal> string of the next available node
(<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
@@ -126,6 +130,9 @@
<para>
name of the current primary node (<xref linkend="repmgr-standby-register"/> and <xref linkend="repmgr-standby-follow"/>)
</para>
<para>
name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
@@ -141,7 +148,7 @@
the notification types can be filtered to explicitly named ones using the
<varname>event_notifications</varname> parameter, e.g.:
<programlisting>
event_notifications='primary_register,standby_register,witness_register'</programlisting>
event_notifications=primary_register,standby_register,witness_register</programlisting>
</para>
@@ -266,6 +273,28 @@
</itemizedlist>
</para>
<para>
Events generated by &repmgrd; (BDR mode):
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal>bdr_failover</literal></simpara>
</listitem>
<listitem>
<simpara><literal>bdr_reconnect</literal></simpara>
</listitem>
<listitem>
<simpara><literal>bdr_recovery</literal></simpara>
</listitem>
<listitem>
<simpara><literal>bdr_register</literal></simpara>
</listitem>
<listitem>
<simpara><literal>bdr_unregister</literal></simpara>
</listitem>
</itemizedlist>
</para>
<para>
Note that under some circumstances (e.g. when no replication cluster primary
could be located), it will not be possible to write an entry into the

View File

@@ -21,8 +21,6 @@
<!ENTITY configuration-file-optional-settings SYSTEM "configuration-file-optional-settings.xml">
<!ENTITY configuration-file-log-settings SYSTEM "configuration-file-log-settings.xml">
<!ENTITY configuration-file-service-commands SYSTEM "configuration-file-service-commands.xml">
<!ENTITY configuration-permissions SYSTEM "configuration-permissions.xml">
<!ENTITY configuration-password-management SYSTEM "configuration-password-management.xml">
<!ENTITY cloning-standbys SYSTEM "cloning-standbys.xml">
<!ENTITY promoting-standby SYSTEM "promoting-standby.xml">
<!ENTITY follow-new-primary SYSTEM "follow-new-primary.xml">
@@ -35,6 +33,7 @@
<!ENTITY repmgrd-automatic-failover SYSTEM "repmgrd-automatic-failover.xml">
<!ENTITY repmgrd-configuration SYSTEM "repmgrd-configuration.xml">
<!ENTITY repmgrd-operation SYSTEM "repmgrd-operation.xml">
<!ENTITY repmgrd-bdr SYSTEM "repmgrd-bdr.xml">
<!ENTITY repmgr-primary-register SYSTEM "repmgr-primary-register.xml">
<!ENTITY repmgr-primary-unregister SYSTEM "repmgr-primary-unregister.xml">
@@ -55,11 +54,11 @@
<!ENTITY repmgr-cluster-crosscheck SYSTEM "repmgr-cluster-crosscheck.xml">
<!ENTITY repmgr-cluster-event SYSTEM "repmgr-cluster-event.xml">
<!ENTITY repmgr-cluster-cleanup SYSTEM "repmgr-cluster-cleanup.xml">
<!ENTITY repmgr-service-status SYSTEM "repmgr-service-status.xml">
<!ENTITY repmgr-service-pause SYSTEM "repmgr-service-pause.xml">
<!ENTITY repmgr-service-unpause SYSTEM "repmgr-service-unpause.xml">
<!ENTITY repmgr-daemon-status SYSTEM "repmgr-daemon-status.xml">
<!ENTITY repmgr-daemon-start SYSTEM "repmgr-daemon-start.xml">
<!ENTITY repmgr-daemon-stop SYSTEM "repmgr-daemon-stop.xml">
<!ENTITY repmgr-daemon-pause SYSTEM "repmgr-daemon-pause.xml">
<!ENTITY repmgr-daemon-unpause SYSTEM "repmgr-daemon-unpause.xml">
<!ENTITY appendix-release-notes SYSTEM "appendix-release-notes.xml">
<!ENTITY appendix-faq SYSTEM "appendix-faq.xml">

View File

@@ -26,18 +26,10 @@
<ulink url="https://dl.2ndquadrant.com/">public repository</ulink>; see following
section for details.
</para>
<note>
<para>
Currently the <ulink url="https://2ndquadrant.com">2ndQuadrant</ulink>
<ulink url="https://dl.2ndquadrant.com/">public repository</ulink> provides
support for RedHat/CentOS versions 5, 6 and 7. Support for version 8 is
available via the PGDG repository; see below for details.
</para>
</note>
<para>
RPM packages for &repmgr; are also available via Yum through
the PostgreSQL Global Development Group (PGDG) RPM repository
(<ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink>).
the PostgreSQL Global Development Group RPM repository
(<ulink url="https://yum.postgresql.org/">http://yum.postgresql.org/</ulink>).
Follow the instructions for your distribution (RedHat, CentOS,
Fedora, etc.) and architecture as detailed there. Note that it can take some days
for new &repmgr; packages to become available via the this repository.
@@ -50,10 +42,6 @@
customers, as the PostgreSQL filesystem layout may be different to the community RPMs.
Please contact your support vendor for assistance.
</para>
<para>
See also <link linkend="appendix-faq">FAQ</link> entry
<xref linkend="faq-third-party-packages"/>.
</para>
</note>
<para>
@@ -93,9 +81,9 @@
(this enables the 2ndQuadrant repository as a source of &repmgr; packages).
</para>
<para>
For example, for PostgreSQL 11 on CentOS, execute:
For example, for PostgreSQL 10 on CentOS, execute:
<programlisting>
curl https://dl.2ndquadrant.com/default/release/get/11/rpm | sudo bash</programlisting>
curl https://dl.2ndquadrant.com/default/release/get/10/rpm | sudo bash</programlisting>
</para>
<para>
@@ -111,8 +99,8 @@ curl https://dl.2ndquadrant.com/default/release/get/9.6/rpm | sudo bash</program
sudo yum repolist</programlisting>
The output should contain two entries like this:
<programlisting>
2ndquadrant-dl-default-release-pg11/7/x86_64 2ndQuadrant packages (PG11) for 7 - x86_64 18
2ndquadrant-dl-default-release-pg11-debug/7/x86_64 2ndQuadrant packages (PG11) for 7 - x86_64 - Debug 8</programlisting>
2ndquadrant-dl-default-release-pg10/7/x86_64 2ndQuadrant packages (PG10) for 7 - x86_64 4
2ndquadrant-dl-default-release-pg10-debug/7/x86_64 2ndQuadrant packages (PG10) for 7 - x86_64 - Debug 3</programlisting>
</para>
</listitem>
@@ -120,7 +108,7 @@ sudo yum repolist</programlisting>
<para>
Install the &repmgr; version appropriate for your PostgreSQL version (e.g. <literal>repmgr10</literal>):
<programlisting>
sudo yum install repmgr11</programlisting>
sudo yum install repmgr10</programlisting>
</para>
<note>
<para>
@@ -165,23 +153,19 @@ yum search repmgr</programlisting>
To install a specific package version, execute <command>yum --showduplicates list</command>
for the package in question:
<programlisting>
[root@localhost ~]# yum --showduplicates list repmgr11
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: ftp.tsukuba.wide.ad.jp
* epel: nrt.edge.kernel.org
* extras: ftp.tsukuba.wide.ad.jp
* updates: ftp.tsukuba.wide.ad.jp
Installed Packages
repmgr11.x86_64 4.4.0-1.rhel7 @pgdg11
Available Packages
repmgr11.x86_64 4.2-1.el7 2ndquadrant-dl-default-release-pg11
repmgr11.x86_64 4.2-2.el7 2ndquadrant-dl-default-release-pg11
repmgr11.x86_64 4.3-1.el7 2ndquadrant-dl-default-release-pg11
repmgr11.x86_64 4.4-1.el7 2ndquadrant-dl-default-release-pg11</programlisting>
[root@localhost ~]# yum --showduplicates list repmgr10
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: ftp.iij.ad.jp
* extras: ftp.iij.ad.jp
* updates: ftp.iij.ad.jp
Available Packages
repmgr10.x86_64 4.0.3-1.rhel7 pgdg10
repmgr10.x86_64 4.0.4-1.rhel7 pgdg10
repmgr10.x86_64 4.0.5-1.el7 2ndquadrant-repo-10</programlisting>
then append the appropriate version number to the package name with a hyphen, e.g.:
<programlisting>
[root@localhost ~]# yum install repmgr11-4.3-1.el7</programlisting>
[root@localhost ~]# yum install repmgr10-4.0.3-1.rhel7</programlisting>
</para>
<para>
@@ -206,7 +190,7 @@ repmgr11.x86_64 4.4-1.el7 2nd
</indexterm>
<para>.deb packages for &repmgr; are available from the
PostgreSQL Community APT repository (<ulink url="https://apt.postgresql.org/">https://apt.postgresql.org/</ulink>).
PostgreSQL Community APT repository (<ulink url="http://apt.postgresql.org/">http://apt.postgresql.org/</ulink>).
Instructions can be found in the APT section of the PostgreSQL Wiki
(<ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink>).
</para>
@@ -236,13 +220,13 @@ repmgr11.x86_64 4.4-1.el7 2nd
<itemizedlist>
<listitem>
<para>
<listitem>
<para>
Install the repository definition for your distribution and PostgreSQL version
(this enables the 2ndQuadrant repository as a source of &repmgr; packages) by executing:
(this enables the 2ndQuadrant repository as a source of &repmgr; packages) by executing:
<programlisting>
curl https://dl.2ndquadrant.com/default/release/get/deb | sudo bash</programlisting>
</para>
curl https://dl.2ndquadrant.com/default/release/get/deb | sudo bash</programlisting>
</para>
<note>
<para>
This will automatically install the following additional packages, if not already present:
@@ -258,20 +242,20 @@ repmgr11.x86_64 4.4-1.el7 2nd
</note>
</listitem>
<listitem>
<para>
Install the &repmgr; version appropriate for your PostgreSQL version (e.g. <literal>repmgr11</literal>):
<listitem>
<para>
Install the &repmgr; version appropriate for your PostgreSQL version (e.g. <literal>repmgr10</literal>):
<programlisting>
sudo apt-get install postgresql-11-repmgr</programlisting>
</para>
sudo apt-get install postgresql-10-repmgr</programlisting>
</para>
<note>
<para>
For packages for PostgreSQL 9.6 and earlier, the package name includes
a period between major and minor version numbers, e.g.
<literal>postgresql-9.6-repmgr</literal>.
For packages for PostgreSQL 9.6 and earlier, the package name includes
a period between major and minor version numbers, e.g.
<literal>postgresql-9.6-repmgr</literal>.
</para>
</note>
</listitem>
</listitem>
</itemizedlist>

View File

@@ -14,7 +14,7 @@
</para>
<para>
&repmgr; &repmgrversion; is compatible with all PostgreSQL versions from 9.4. See
&repmgr; &repmgrversion; is compatible with all PostgreSQL versions from 9.3. See
section <link linkend="install-compatibility-matrix">&repmgr; compatibility matrix</link>
for an overview of version compatibility.
</para>
@@ -99,9 +99,6 @@
<entry>
&repmgr; version
</entry>
<entry>
Supported?
</entry>
<entry>
Latest release
</entry>
@@ -112,62 +109,12 @@
</thead>
<tbody>
<row>
<entry>
&repmgr; 5.2
</entry>
<entry>
YES
</entry>
<entry>
<link linkend="release-current">&repmgrversion;</link> (&releasedate;)
</entry>
<entry>
9.4, 9.5, 9.6, 10, 11, 12, 13
</entry>
</row>
<row>
<entry>
&repmgr; 5.1
</entry>
<entry>
NO
</entry>
<entry>
<link linkend="release-5.1.0">5.1.0</link> (2020-04-13)
</entry>
<entry>
9.3, 9.4, 9.5, 9.6, 10, 11, 12
</entry>
</row>
<row>
<entry>
&repmgr; 5.0
</entry>
<entry>
NO
</entry>
<entry>
<link linkend="release-5.0">5.0</link> (2019-10-15)
</entry>
<entry>
9.3, 9.4, 9.5, 9.6, 10, 11, 12
</entry>
</row>
<row>
<entry>
&repmgr; 4.x
</entry>
<entry>
NO
</entry>
<entry>
<link linkend="release-4.4">4.4</link> (2019-06-27)
<link linkend="release-current">&repmgrversion;</link> (&releasedate;)
</entry>
<entry>
9.3, 9.4, 9.5, 9.6, 10, 11
@@ -178,9 +125,6 @@
<entry>
&repmgr; 3.x
</entry>
<entry>
NO
</entry>
<entry>
<ulink url="https://repmgr.org/release-notes-3.3.2.html">3.3.2</ulink> (2017-05-30)
</entry>
@@ -193,9 +137,6 @@
<entry>
&repmgr; 2.x
</entry>
<entry>
NO
</entry>
<entry>
<ulink url="https://repmgr.org/release-notes-2.0.3.html">2.0.3</ulink> (2015-04-16)
</entry>
@@ -213,61 +154,28 @@
The &repmgr; 2.x and 3.x series are no longer maintained or supported.
We strongly recommend upgrading to the latest &repmgr; version.
</para>
<para>
Following the release of &repmgr; 5.0, there will be no further releases of
the &repmgr; 4.x series. Note that &repmgr; 5.x is an incremental development
of the 4.x series and &repmgr; 4.x users should upgrade to this as soon as possible.
</para>
</important>
</sect2>
<sect2 id="install-postgresql-93-94">
<title>PostgreSQL 9.4 support</title>
<indexterm>
<primary>PostgreSQL 9.4</primary>
<secondary>repmgr support</secondary>
</indexterm>
<para>
Note that some &repmgr; functionality is not available in PostgreSQL 9.4:
Note that some &repmgr; functionality is not available in PostgreSQL 9.3 and PostgreSQL 9.4:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
In PostgreSQL 9.4, <command>pg_rewind</command> is not part of the core
PostgreSQL 9.3 does not support replication slots, so corresponding &repmgr; functionality
is not available.
</para>
</listitem>
<listitem>
<para>
In PostgreSQL 9.3 and PostgreSQL 9.4, <command>pg_rewind</command> is not part of the core
distribution. <command>pg_rewind</command> will need to be compiled separately to be able
to use any &repmgr; functionality which takes advantage of it.
</para>
</listitem>
</itemizedlist>
<warning>
<para>
PostgreSQL 9.3 has reached the end of its community support period (final release was
<ulink url="https://www.postgresql.org/docs/9.3/release-9-3-25.html">9.3.25</ulink>
in November 2018) and will no longer be updated with security or bugfixes.
</para>
<para>
Beginning with &repmgr; 5.2, &repmgr; no longer supports PostgreSQL 9.3.
</para>
<para>
PostgreSQL 9.4 has reached the end of its community support period (final release was
<ulink url="https://www.postgresql.org/docs/9.4/release-9-4-26.html">9.4.26</ulink>
in February 2020) and will no longer be updated with security or bugfixes.
</para>
<para>
We recommend that users of these versions migrate to a supported PostgreSQL version
as soon as possible.
</para>
<para>
For further details, see the <ulink url="https://www.postgresql.org/support/versioning/">PostgreSQL Versioning Policy</ulink>.
</para>
</warning>
</sect2>
</sect1>

View File

@@ -24,7 +24,8 @@
<listitem>
<para>
<literal>Debian</literal> and <literal>Ubuntu</literal>: First
add the <ulink url="https://apt.postgresql.org/">apt.postgresql.org</ulink>
add the <ulink
url="http://apt.postgresql.org/">apt.postgresql.org</ulink>
repository to your <filename>sources.list</filename> if you
have not already done so, and ensure the source repository is enabled.
</para>
@@ -35,8 +36,8 @@
line in the repository file, which is usually
<filename>/etc/apt/sources.list.d/pgdg.list</filename>, e.g.:
<programlisting>
deb https://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main
deb-src https://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main</programlisting>
deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main
deb-src http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main</programlisting>
</para>
</tip>
<para>
@@ -60,9 +61,6 @@ deb-src https://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main</programlist
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal>flex</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libedit-dev</literal></simpara>
</listitem>
@@ -117,9 +115,6 @@ deb-src https://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main</programlist
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal>flex</literal></simpara>
</listitem>
<listitem>
<simpara><literal>libselinux-devel</literal></simpara>
</listitem>
@@ -182,8 +177,7 @@ deb-src https://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main</programlist
</para>
<para>
There are also tags for each <ulink url="https://github.com/2ndQuadrant/repmgr/releases">&repmgr; release</ulink>, e.g.
<literal><ulink url="https://github.com/2ndQuadrant/repmgr/releases/tag/v4.4.0">v4.4.0</ulink></literal>.
There are also tags for each &repmgr; release, e.g. <literal>v4.2.0</literal>.
</para>
<para>

View File

@@ -3,7 +3,7 @@
<date>2017</date>
<copyright>
<year>2010-2020</year>
<year>2010-2019</year>
<holder>2ndQuadrant, Ltd.</holder>
</copyright>
@@ -11,7 +11,7 @@
<title>Legal Notice</title>
<para>
<productname>repmgr</productname> is Copyright &copy; 2010-2020
<productname>repmgr</productname> is Copyright &copy; 2010-2019
by 2ndQuadrant, Ltd. All rights reserved.
</para>

View File

@@ -167,7 +167,7 @@
For the sake of simplicity, the <literal>repmgr</literal> user is created
as a superuser. If desired, it's possible to create the <literal>repmgr</literal>
user as a normal user. However for certain operations superuser permissions
are required; in this case the command line option <command>--superuser</command>
are requiredl; in this case the command line option <command>--superuser</command>
can be provided to specify a superuser.
</para>
<para>
@@ -254,7 +254,7 @@
</para>
<programlisting>
node_id=1
node_name='node1'
node_name=node1
conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'
</programlisting>
@@ -265,6 +265,7 @@
server. See sections <xref linkend="configuration"/> and <xref linkend="configuration-file"/>
for further details about <filename>repmgr.conf</filename>.
</para>
<note>
<para>
&repmgr; only uses <option>pg_bindir</option> when it executes
@@ -367,7 +368,7 @@
</para>
<programlisting>
node_id=2
node_name='node2'
node_name=node2
conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'</programlisting>
<para>
@@ -477,8 +478,6 @@
latest_end_lsn | 0/7000538
latest_end_time | 2017-08-28 15:20:56.418735+09
slot_name |
sender_host | node1
sender_port | 5432
conninfo | user=repmgr dbname=replication host=node1 application_name=node2
</programlisting>
Note that the <varname>conninfo</varname> value is that generated in <filename>recovery.conf</filename>
@@ -498,12 +497,11 @@
<para>
Check the node is registered by executing <command>repmgr cluster show</command> on the standby:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | host=node1 dbname=repmgr user=repmgr
2 | node2 | standby | running | node1 | default | 100 | 1 | host=node2 dbname=repmgr user=repmgr</programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Connection string
----+-------+---------+-----------+----------+----------+--------------------------------------
1 | node1 | primary | * running | | default | host=node1 dbname=repmgr user=repmgr
2 | node2 | standby | running | node1 | default | host=node2 dbname=repmgr user=repmgr</programlisting>
</para>
<para>
Both nodes are now registered with &repmgr; and the records have been copied to the standby server.

View File

@@ -18,7 +18,7 @@
<para>
Displays information about each registered node in the replication cluster. This
command polls each registered server and shows its role (<literal>primary</literal> /
<literal>standby</literal>) and status. It polls each server
<literal>standby</literal> / <literal>bdr</literal>) and status. It polls each server
directly and can be run on any node in the cluster; this is also useful when analyzing
connectivity from a particular node.
</para>
@@ -53,13 +53,12 @@
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | host=db_node1 dbname=repmgr user=repmgr
2 | node2 | standby | running | node1 | default | 100 | 1 | host=db_node2 dbname=repmgr user=repmgr
3 | node3 | standby | running | node1 | default | 100 | 1 | host=db_node3 dbname=repmgr user=repmgr
4 | node4 | standby | running | node1 | default | 100 | 1 | host=db_node4 dbname=repmgr user=repmgr
5 | node5 | witness | * running | node1 | default | 0 | n/a | host=db_node5 dbname=repmgr user=repmgr</programlisting>
3 | node3 | standby | running | node1 | default | 100 | 1 | host=db_node3 dbname=repmgr user=repmgr</programlisting>
</para>
</refsect1>
<refsect1>
@@ -84,20 +83,16 @@
<literal>node4</literal> is running but rejecting connections (from <literal>node3</literal> at least).
<programlisting>
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+----------------------+----------+----------+----------+----------+----------------------------------------------------
1 | node1 | primary | ? unreachable | | default | 100 | | host=db_node1 dbname=repmgr user=repmgr
2 | node2 | standby | ! running as primary | ? node1 | default | 100 | 2 | host=db_node2 dbname=repmgr user=repmgr
3 | node3 | standby | running | ? node1 | default | 100 | 1 | host=db_node3 dbname=repmgr user=repmgr
4 | node4 | standby | ? running | ? node1 | default | 100 | | host=db_node4 dbname=repmgr user=repmgr
----+-------+---------+----------------------+----------+----------+----------+----------+-----------------------------------------
1 | node1 | primary | ? unreachable | | default | 100 | 1 | host=db_node1 dbname=repmgr user=repmgr
2 | node2 | standby | ! running as primary | node1 | default | 100 | 2 | host=db_node2 dbname=repmgr user=repmgr
3 | node3 | standby | running | node1 | default | 100 | 1 | host=db_node3 dbname=repmgr user=repmgr
4 | node4 | standby | ? running | node1 | default | 100 | ? | host=db_node4 dbname=repmgr user=repmgr
WARNING: following issues were detected
- unable to connect to node "node1" (ID: 1)
- node "node1" (ID: 1) is registered as an active primary but is unreachable
- node "node2" (ID: 2) is registered as standby but running as primary
- unable to connect to node "node2" (ID: 2)'s upstream node "node1" (ID: 1)
- unable to determine if node "node2" (ID: 2) is attached to its upstream node "node1" (ID: 1)
- unable to connect to node "node3" (ID: 3)'s upstream node "node1" (ID: 1)
- unable to determine if node "node3" (ID: 3) is attached to its upstream node "node1" (ID: 1)
- unable to connect to node "node4" (ID: 4)
HINT: execute with --verbose option to see connection error messages</programlisting>
</para>
@@ -238,7 +233,7 @@
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-node-status"/>, <xref linkend="repmgr-node-check"/>, <xref linkend="repmgr-service-status"/>
<xref linkend="repmgr-node-status"/>, <xref linkend="repmgr-node-check"/>, <xref linkend="repmgr-daemon-status"/>
</para>
</refsect1>

View File

@@ -1,6 +1,6 @@
<refentry id="repmgr-service-pause">
<refentry id="repmgr-daemon-pause">
<indexterm>
<primary>repmgr service pause</primary>
<primary>repmgr daemon pause</primary>
</indexterm>
<indexterm>
@@ -9,11 +9,11 @@
</indexterm>
<refmeta>
<refentrytitle>repmgr service pause</refentrytitle>
<refentrytitle>repmgr daemon pause</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr service pause</refname>
<refname>repmgr daemon pause</refname>
<refpurpose>Instruct all &repmgrd; instances in the replication cluster to pause failover operations</refpurpose>
</refnamediv>
@@ -32,29 +32,20 @@
<note>
<para>
It's important to wait a few seconds after restarting PostgreSQL on any node before running
<command>repmgr service pause</command>, as the &repmgrd; instance
<command>repmgr daemon pause</command>, as the &repmgrd; instance
on the restarted node will take a second or two before it has updated its status.
</para>
</note>
<para>
<xref linkend="repmgr-service-unpause"/> will instruct all previously paused &repmgrd;
<xref linkend="repmgr-daemon-unpause"/> will instruct all previously paused &repmgrd;
instances to resume normal failover operation.
</para>
</refsect1>
<refsect1>
<title>Prerequisites</title>
<para>
PostgreSQL must be accessible on all nodes (using the <varname>conninfo</varname> string shown by
<link linkend="repmgr-cluster-show"><command>repmgr cluster show</command></link>)
from the node where <command>repmgr service pause</command> is executed.
</para>
</refsect1>
<refsect1>
<title>Execution</title>
<para>
<command>repmgr service pause</command> can be executed on any active node in the
<command>repmgr daemon pause</command> can be executed on any active node in the
replication cluster. A valid <filename>repmgr.conf</filename> file is required.
It will have no effect on previously paused nodes.
</para>
@@ -64,7 +55,7 @@
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf service pause
$ repmgr -f /etc/repmgr.conf daemon pause
NOTICE: node 1 (node1) paused
NOTICE: node 2 (node2) paused
NOTICE: node 3 (node3) paused</programlisting>
@@ -88,7 +79,7 @@ NOTICE: node 3 (node3) paused</programlisting>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr service unpause</command>:
One of the following exit codes will be emitted by <command>repmgr daemon unpause</command>:
</para>
<variablelist>
@@ -116,11 +107,7 @@ NOTICE: node 3 (node3) paused</programlisting>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-service-unpause"/>,
<xref linkend="repmgr-service-status"/>,
<xref linkend="repmgrd-pausing"/>,
<xref linkend="repmgr-daemon-start"/>,
<xref linkend="repmgr-daemon-stop"/>
<xref linkend="repmgr-daemon-unpause"/>, <xref linkend="repmgr-daemon-status"/>
</para>
</refsect1>
</refentry>

View File

@@ -14,13 +14,13 @@
<refnamediv>
<refname>repmgr daemon start</refname>
<refpurpose>Start the &repmgrd; daemon on the local node</refpurpose>
<refpurpose>Start the &repmgrd; daemon</refpurpose>
</refnamediv>
<refsect1>
<title>Description</title>
<para>
This command starts the &repmgrd; service on the
This command starts the &repmgrd; daemon on the
local node.
</para>
<para>
@@ -197,11 +197,7 @@
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-daemon-stop"/>,
<xref linkend="repmgrd-daemon"/>,
<xref linkend="repmgr-service-status"/>,
<xref linkend="repmgr-service-pause"/>,
<xref linkend="repmgr-service-unpause"/>
<xref linkend="repmgr-daemon-stop"/>, <xref linkend="repmgr-daemon-status"/>, <xref linkend="repmgrd-daemon"/>
</para>
</refsect1>

View File

@@ -1,19 +1,19 @@
<refentry id="repmgr-service-status">
<refentry id="repmgr-daemon-status">
<indexterm>
<primary>repmgr service status</primary>
<primary>repmgr daemon status</primary>
</indexterm>
<indexterm>
<primary>repmgrd</primary>
<secondary>displaying service status</secondary>
<secondary>displaying daemon status</secondary>
</indexterm>
<refmeta>
<refentrytitle>repmgr service status</refentrytitle>
<refentrytitle>repmgr daemon status</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr service status</refname>
<refname>repmgr daemon status</refname>
<refpurpose>display information about the status of &repmgrd; on each node in the cluster</refpurpose>
</refnamediv>
@@ -22,33 +22,21 @@
<para>
This command provides an overview over all active nodes in the cluster and the state
of each node's &repmgrd; instance. It can be used to check
the result of <xref linkend="repmgr-service-pause"/> and <xref linkend="repmgr-service-unpause"/>
the result of <xref linkend="repmgr-daemon-pause"/> and <xref linkend="repmgr-daemon-unpause"/>
operations.
</para>
</refsect1>
<refsect1>
<title>Prerequisites</title>
<para>
PostgreSQL should be accessible on all nodes (using the <varname>conninfo</varname> string shown by
<link linkend="repmgr-cluster-show"><command>repmgr cluster show</command></link>)
from the node where <command>repmgr service status</command> is executed.
</para>
</refsect1>
<refsect1>
<title>Execution</title>
<para>
<command>repmgr service status</command> can be executed on any active node in the
<command>repmgr daemon status</command> can be executed on any active node in the
replication cluster. A valid <filename>repmgr.conf</filename> file is required.
</para>
<para>
If a node is not accessible, or PostgreSQL itself is not running on the node,
&repmgr; will not be able to determine the status of that node's &repmgrd; instance,
and &quot;<literal>n/a</literal>&quot; will be displayed in the node's <literal>repmgrd</literal>
column.
If PostgreSQL is not running on a node, &repmgr; will not be able to determine the
status of that node's &repmgrd; instance.
</para>
<note>
<para>
After restarting PostgreSQL on any node, the &repmgrd; instance
@@ -63,7 +51,7 @@
<title>Examples</title>
<para>
&repmgrd; running normally on all nodes:
<programlisting>$ repmgr -f /etc/repmgr.conf service status
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 96563 | no | n/a
@@ -72,8 +60,8 @@
</para>
<para>
&repmgrd; paused on all nodes (using <xref linkend="repmgr-service-pause"/>):
<programlisting>$ repmgr -f /etc/repmgr.conf service status
&repmgrd; paused on all nodes (using <xref linkend="repmgr-daemon-pause"/>):
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 96563 | yes | n/a
@@ -83,7 +71,7 @@
<para>
&repmgrd; not running on one node:
<programlisting>$ repmgr -f /etc/repmgr.conf service status
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+-------------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 96563 | yes | n/a
@@ -101,11 +89,11 @@
<term><option>--csv</option></term>
<listitem>
<para>
<command>repmgr service status</command> accepts an optional parameter <literal>--csv</literal>, which
<command>repmgr daemon status</command> accepts an optional parameter <literal>--csv</literal>, which
outputs the replication cluster's status in a simple CSV format, suitable for
parsing by scripts, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf service status --csv
$ repmgr -f /etc/repmgr.conf daemon status --csv
1,node1,primary,1,1,5722,1,100,-1,default
2,node2,standby,1,0,-1,1,100,1,default
3,node3,standby,1,1,5779,1,100,1,default</programlisting>
@@ -183,7 +171,7 @@
<listitem>
<para>
Display additional information (<literal>location</literal>, <literal>priority</literal>)
about the &repmgr; configuration.
about the &repmgr; configuration.
</para>
</listitem>
</varlistentry>
@@ -192,7 +180,7 @@
<term><option>--verbose</option></term>
<listitem>
<para>
Display the full text of any database connection error messages.
Display the full text of any database connection error messages
</para>
</listitem>
</varlistentry>
@@ -204,12 +192,7 @@
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-service-pause"/>,
<xref linkend="repmgr-service-unpause"/>,
<xref linkend="repmgr-cluster-show"/>,
<xref linkend="repmgrd-pausing"/>,
<xref linkend="repmgr-daemon-start"/>,
<xref linkend="repmgr-daemon-stop"/>
<xref linkend="repmgr-daemon-pause"/>, <xref linkend="repmgr-daemon-unpause"/>, <xref linkend="repmgr-cluster-show"/>
</para>
</refsect1>
</refentry>

View File

@@ -14,7 +14,7 @@
<refnamediv>
<refname>repmgr daemon stop</refname>
<refpurpose>Stop the &repmgrd; daemon on the local node</refpurpose>
<refpurpose>Stop the &repmgrd; daemon</refpurpose>
</refnamediv>
<refsect1>
@@ -194,11 +194,7 @@
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-daemon-start"/>,
<xref linkend="repmgrd-daemon"/>,
<xref linkend="repmgr-service-status"/>,
<xref linkend="repmgr-service-pause"/>,
<xref linkend="repmgr-service-unpause"/>
<xref linkend="repmgr-daemon-start"/>, <xref linkend="repmgr-daemon-status"/>, <xref linkend="repmgrd-daemon"/>
</para>
</refsect1>

View File

@@ -1,6 +1,6 @@
<refentry id="repmgr-service-unpause">
<refentry id="repmgr-daemon-unpause">
<indexterm>
<primary>repmgr service unpause</primary>
<primary>repmgr daemon unpause</primary>
</indexterm>
<indexterm>
@@ -8,12 +8,13 @@
<secondary>unpausing</secondary>
</indexterm>
<refmeta>
<refentrytitle>repmgr service unpause</refentrytitle>
<refentrytitle>repmgr daemon unpause</refentrytitle>
</refmeta>
<refnamediv>
<refname>repmgr service unpause</refname>
<refname>repmgr daemon unpause</refname>
<refpurpose>Instruct all &repmgrd; instances in the replication cluster to resume failover operations</refpurpose>
</refnamediv>
@@ -22,33 +23,24 @@
<para>
This command can be run on any active node in the replication cluster to instruct all
running &repmgrd; instances to &quot;unpause&quot;
(following a previous execution of <xref linkend="repmgr-service-pause"/>)
(following a previous execution of <xref linkend="repmgr-daemon-pause"/>)
and resume normal failover/monitoring operation.
</para>
<note>
<para>
It's important to wait a few seconds after restarting PostgreSQL on any node before running
<command>repmgr service pause</command>, as the &repmgrd; instance
<command>repmgr daemon pause</command>, as the &repmgrd; instance
on the restarted node will take a second or two before it has updated its status.
</para>
</note>
</refsect1>
<refsect1>
<title>Prerequisites</title>
<para>
PostgreSQL must be accessible on all nodes (using the <varname>conninfo</varname> string shown by
<link linkend="repmgr-cluster-show"><command>repmgr cluster show</command></link>)
from the node where <command>repmgr service pause</command> is executed.
</para>
</refsect1>
<refsect1>
<title>Execution</title>
<para>
<command>repmgr service unpause</command> can be executed on any active node in the
<command>repmgr daemon unpause</command> can be executed on any active node in the
replication cluster. A valid <filename>repmgr.conf</filename> file is required.
It will have no effect on nodes which are not already paused.
</para>
@@ -58,7 +50,7 @@
<title>Example</title>
<para>
<programlisting>
$ repmgr -f /etc/repmgr.conf service unpause
$ repmgr -f /etc/repmgr.conf daemon unpause
NOTICE: node 1 (node1) unpaused
NOTICE: node 2 (node2) unpaused
NOTICE: node 3 (node3) unpaused</programlisting>
@@ -82,7 +74,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<refsect1>
<title>Exit codes</title>
<para>
One of the following exit codes will be emitted by <command>repmgr service unpause</command>:
One of the following exit codes will be emitted by <command>repmgr daemon unpause</command>:
</para>
<variablelist>
@@ -110,11 +102,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-service-pause"/>,
<xref linkend="repmgr-service-status"/>,
<xref linkend="repmgrd-pausing"/>,
<xref linkend="repmgr-daemon-start"/>,
<xref linkend="repmgr-daemon-stop"/>
<xref linkend="repmgr-daemon-pause"/>, <xref linkend="repmgr-daemon-status"/>
</para>
</refsect1>
</refentry>

View File

@@ -31,32 +31,15 @@
<refsect1>
<title>Example</title>
<para>
Execution on the primary server:
<programlisting>
$ repmgr -f /etc/repmgr.conf node check
Node "node1":
Server role: OK (node is primary)
Replication lag: OK (N/A - node is primary)
WAL archiving: OK (0 pending files)
Upstream connection: OK (N/A - is primary)
Downstream servers: OK (2 of 2 downstream nodes attached)
Replication slots: OK (node has no physical replication slots)
Missing replication slots: OK (node has no missing physical replication slots)
Configured data directory: OK (configured "data_directory" is "/var/lib/postgresql/data")</programlisting>
</para>
<para>
Execution on a standby server:
<programlisting>
$ repmgr -f /etc/repmgr.conf node check
Node "node2":
Server role: OK (node is standby)
Replication lag: OK (0 seconds)
WAL archiving: OK (0 pending archive ready files)
Upstream connection: OK (node "node2" (ID: 2) is attached to expected upstream node "node1" (ID: 1))
Downstream servers: OK (this node has no downstream nodes)
Replication slots: OK (node has no physical replication slots)
Missing physical replication slots: OK (node has no missing physical replication slots)
Configured data directory: OK (configured "data_directory" is "/var/lib/postgresql/data")</programlisting>
Missing replication slots: OK (node has no missing physical replication slots)</programlisting>
</para>
</refsect1>
<refsect1>
@@ -74,20 +57,20 @@
<listitem>
<simpara>
<option>--role</option>: checks if the node has the expected role
<literal>--role</literal>: checks if the node has the expected role
</simpara>
</listitem>
<listitem>
<simpara>
<option>--replication-lag</option>: checks if the node is lagging by more than
<literal>--replication-lag</literal>: checks if the node is lagging by more than
<varname>replication_lag_warning</varname> or <varname>replication_lag_critical</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<option>--archive-ready</option>: checks for WAL files which have not yet been archived,
<literal>--archive-ready</literal>: checks for WAL files which have not yet been archived,
and returns <literal>WARNING</literal> or <literal>CRITICAL</literal> if the number
exceeds <varname>archive_ready_warning</varname> or <varname>archive_ready_critical</varname> respectively.
</simpara>
@@ -95,31 +78,25 @@
<listitem>
<simpara>
<option>--downstream</option>: checks that the expected downstream nodes are attached
<literal>--downstream</literal>: checks that the expected downstream nodes are attached
</simpara>
</listitem>
<listitem>
<simpara>
<option>--upstream</option>: checks that the node is attached to its expected upstream
<literal>--slots</literal>: checks there are no inactive physical replication slots
</simpara>
</listitem>
<listitem>
<simpara>
<option>--slots</option>: checks there are no inactive physical replication slots
<literal>--missing-slots</literal>: checks there are no missing physical replication slots
</simpara>
</listitem>
<listitem>
<simpara>
<option>--missing-slots</option>: checks there are no missing physical replication slots
</simpara>
</listitem>
<listitem>
<simpara>
<option>--data-directory-config</option>: checks the data directory configured in
<literal>--data-directory-config</literal>: checks the data directory configured in
<filename>repmgr.conf</filename> matches the actual data directory.
This check is not directly related to replication, but is useful to verify &repmgr;
is correctly configured.
@@ -131,62 +108,6 @@
</para>
</refsect1>
<refsect1>
<title>Additional checks</title>
<para>
Several checks are provided for diagnostic purposes and are not
included in the general output:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>--db-connection</option>: checks if &repmgr; can connect to the
database on the local node.
</simpara>
<simpara>
This option is particularly useful in combination with <command>SSH</command>, as
it can be used to troubleshoot connection issues encountered when &repmgr; is
executed remotely (e.g. during a switchover operation).
</simpara>
</listitem>
<listitem>
<simpara>
<option>--replication-config-owner</option>: checks if the file containing replication
configuration (PostgreSQL 12 and later: <filename>postgresql.auto.conf</filename>;
PostgreSQL 11 and earlier: <filename>recovery.conf</filename>) is
owned by the same user who owns the data directory.
</simpara>
<simpara>
Incorrect ownership of these files (e.g. if they are owned by <literal>root</literal>)
will cause operations which need to update the replication configuration
(e.g. <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>
or <link linkend="repmgr-standby-promote"><command>repmgr standby promote</command></link>)
to fail.
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Connection options</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>-S</option>/<option>--superuser</option>: connect as the
named superuser instead of the &repmgr; user
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Output format</title>
<para>
@@ -194,14 +115,14 @@
<listitem>
<simpara>
<option>--csv</option>: generate output in CSV format (not available
<literal>--csv</literal>: generate output in CSV format (not available
for individual checks)
</simpara>
</listitem>
<listitem>
<simpara>
<option>--nagios</option>: generate output in a Nagios-compatible format
<literal>--nagios</literal>: generate output in a Nagios-compatible format
(for individual checks only)
</simpara>
</listitem>
@@ -209,15 +130,13 @@
</para>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
When executing <command>repmgr node check</command> with one of the individual
checks listed above, &repmgr; will emit one of the following Nagios-style exit codes
(even if <option>--nagios</option> is not supplied):
(even if <literal>--nagios</literal> is not supplied):
<itemizedlist spacing="compact" mark="bullet">

View File

@@ -43,12 +43,7 @@
<programlisting>
repmgr node rejoin -d '$conninfo'</programlisting>
where <literal>$conninfo</literal> is the PostgreSQL <literal>conninfo</literal> string of the
<emphasis>current</emphasis> primary node (or that of any reachable node in the cluster, but
<emphasis>not</emphasis> the local node). This is so that &repmgr; can fetch up-to-date information
about the current state of the cluster.
</para>
<para>
where <literal>$conninfo</literal> is the conninfo string of any reachable node in the cluster.
<filename>repmgr.conf</filename> for the stopped node *must* be supplied explicitly if not
otherwise available.
</para>
@@ -76,7 +71,7 @@
</para>
<para>
It is only necessary to provide the <application>pg_rewind</application> path
if using PostgreSQL 9.4, and <application>pg_rewind</application>
if using PostgreSQL 9.3 or 9.4, and <application>pg_rewind</application>
is not installed in the PostgreSQL <filename>bin</filename> directory.
</para>
</listitem>
@@ -212,18 +207,9 @@
a standby to the current primary, not another standby.
</para>
<para>
The node's PostgreSQL instance must have been shut down cleanly. If this was not the
case, it will need to be started up until it has reached a consistent recovery point,
then shut down cleanly.
</para>
<para>
In PostgreSQL 13 and later, this will be done automatically
if the <option>--force-rewind</option> is provided (even if an actual rewind
is not necessary).
</para>
<para>
With PostgreSQL 12 and earlier, PostgreSQL will need to
be started and shut down manually; see below for the best way to do this.
The node must have been shut down cleanly; if this was not the case, it will
need to be manually started (remove any existing <filename>recovery.conf</filename> file first)
until it has reached a consistent recovery point, then shut down cleanly.
</para>
<tip>
<para>
@@ -235,14 +221,11 @@
rm -f /var/lib/pgsql/data/recovery.conf
postgres --single -D /var/lib/pgsql/data/ &lt; /dev/null</programlisting>
</para>
<para>
Note that <filename>standby.signal</filename> (PostgreSQL 11 and earlier:
<filename>recovery.conf</filename>) <emphasis>must</emphasis> be removed
from the data directory for PostgreSQL to be able to start in single
user mode.
</para>
</tip>
<para>
&repmgr; will attempt to verify whether the node can rejoin as-is, or whether
<command>pg_rewind</command> must be used (see following section).
</para>
</refsect1>
<refsect1 id="repmgr-node-rejoin-pg-rewind" xreflabel="Using pg_rewind">
@@ -258,7 +241,7 @@
<command>repmgr node rejoin</command> can optionally use <command>pg_rewind</command> to re-integrate a
node which has diverged from the rest of the cluster, typically a failed primary.
<command>pg_rewind</command> is available in PostgreSQL 9.5 and later as part of the core distribution,
and can be installed from external sources for PostgreSQL 9.4.
and can be installed from external sources for PostgreSQL 9.3 and 9.4.
</para>
<note>
<para>
@@ -300,15 +283,7 @@
to execute <command>pg_rewind</command> to ensure the node can be rejoined successfully.
</para>
<refsect2 id="repmgr-node-rejoin-pg-rewind-config-files" xreflabel="pg_rewind and configuration files">
<title><command>pg_rewind</command> and configuration file retention</title>
<indexterm>
<primary>pg_rewind</primary>
<secondary>configuration file retention</secondary>
</indexterm>
<important>
<para>
Be aware that if <command>pg_rewind</command> is executed and actually performs a
rewind operation, any configuration files in the PostgreSQL data directory will be
@@ -316,27 +291,17 @@
</para>
<para>
To prevent this happening, provide a comma-separated list of files to retain
using the <option>--config-file</option> command line option; the specified files
using the <literal>--config-file</literal> command line option; the specified files
will be archived in a temporary directory (whose parent directory can be specified with
<option>--config-archive-dir</option>, default: <filename>/tmp</filename>)
and restored once the rewind operation is complete.
<literal>--config-archive-dir</literal>) and restored once the rewind operation is
complete.
</para>
</refsect2>
</important>
<refsect2 id="repmgr-node-rejoin-pg-rewind-example" xreflabel="example using repmgr node rejoin and pg_rewind">
<title>Example using <command>repmgr node rejoin</command> and <command>pg_rewind</command></title>
<indexterm>
<primary>pg_rewind</primary>
<secondary>configuration file retention</secondary>
</indexterm>
<para>
Example, first using <option>--dry-run</option>, then actually executing the
<literal>node rejoin command</literal>.
<programlisting>
<para>
Example, first using <literal>--dry-run</literal>, then actually executing the
<literal>node rejoin command</literal>.
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node3 dbname=repmgr user=repmgr' \
--config-files=postgresql.local.conf,postgresql.conf --verbose --force-rewind --dry-run
INFO: replication connection to the rejoin target node was successful
@@ -352,17 +317,17 @@
pg_rewind -D '/var/lib/postgresql/data' --source-server='host=node3 dbname=repmgr user=repmgr'
INFO: prerequisites for executing NODE REJOIN are met</programlisting>
<note>
<para>
If <option>--force-rewind</option> is used with the <option>--dry-run</option> option,
this checks the prerequisites for using <application>pg_rewind</application>, but is
not an absolute guarantee that actually executing <application>pg_rewind</application>
will succeed. See also section <xref linkend="repmgr-node-rejoin-caveats"/> below.
</para>
<note>
<para>
If <option>--force-rewind</option> is used with the <option>--dry-run</option> option,
this checks the prerequisites for using <application>pg_rewind</application>, but is
not an absolute guarantee that actually executing <application>pg_rewind</application>
will succeed. See also section <xref linkend="repmgr-node-rejoin-caveats"/> below.
</para>
</note>
</note>
<programlisting>
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node3 dbname=repmgr user=repmgr' \
--config-files=postgresql.local.conf,postgresql.conf --verbose --force-rewind
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 3
@@ -374,8 +339,8 @@
NOTICE: starting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' start"
NOTICE: NODE REJOIN successful
DETAIL: node 2 is now attached to node 3</programlisting>
</para>
</refsect2>
</para>
</refsect1>
<refsect1 id="repmgr-node-rejoin-caveats" xreflabel="Caveats">
@@ -391,7 +356,7 @@
<command>repmgr node rejoin</command> attempts to determine whether it will succeed by
comparing the timelines and relative WAL positions of the local node (rejoin candidate) and primary
(rejoin target). This is particularly important if planning to use <application>pg_rewind</application>,
which currently (as of PostgreSQL 12) may appear to succeed (or indicate there is no action
which currently (as of PostgreSQL 11) may appear to succeed (or indicate there is no action
needed) but potentially allow an impossible action, such as trying to rejoin a standby to a
primary which is behind the standby. &repmgr; will prevent this situation from occurring.
</para>
@@ -404,11 +369,6 @@
the current standby's PostgreSQL log will contain entries with the text
&quot;<literal>record with incorrect prev-link</literal>&quot;.
</para>
<para>
In PostgreSQL 9.5 and earlier, it is <emphasis>not</emphasis> possible to use
<application>pg_rewind</application> to attach to a target node with a lower
timeline than the local node.
</para>
<para>
We strongly recommend running <command>repmgr node rejoin</command> with the
<option>--dry-run</option> option first. Additionally it might be a good idea
@@ -418,51 +378,6 @@
is running in <option>--dry-run</option> mode.
</para>
<warning>
<para>
In all current PostgreSQL versions (as of September 2020), <application>pg_rewind</application>
contains a corner-case bug which affects standbys in a very specific situation.
</para>
<para>
This situation occurs when a standby was shut down <emphasis>before</emphasis> its
primary node, and an attempt is made to attach this standby to another primary
in the same cluster (following a &quot;split brain&quot; situation where the standby
was connected to the wrong primary). In this case, &repmgr; will correctly determine
that <application>pg_rewind</application> should be executed, however
<application>pg_rewind</application> incorrectly decides that no action is necessary.
</para>
<para>
In this situation, &repmgr; will report something like:
<programlisting>
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 1
DETAIL: rejoin target server's timeline 3 forked off current database system timeline 2 before current recovery point 0/7019C10</programlisting>
but when executed, <application>pg_rewind</application> will report:
<programlisting>
pg_rewind: servers diverged at WAL location 0/7015540 on timeline 2
pg_rewind: no rewind required</programlisting>
and if an attempt is made to attach the standby to the new primary, PostgreSQL logs on the standby
will contain errors like:
<programlisting>
[2020-09-07 15:01:41 UTC] LOG: 00000: replication terminated by primary server
[2020-09-07 15:01:41 UTC] DETAIL: End of WAL reached on timeline 2 at 0/7015540.
[2020-09-07 15:01:41 UTC] LOG: 00000: new timeline 3 forked off current database system timeline 2 before current recovery point 0/7019C10</programlisting>
</para>
<para>
Currently it is not possible to resolve this situation using <application>pg_rewind</application>.
A <ulink url="https://www.postgresql.org/message-id/flat/CABvVfJU-LDWvoz4-Yow3Ay5LZYTuPD7eSjjE4kGyNZpXC6FrVQ@mail.gmail.com">patch</ulink>
has been submitted and will hopefully be included in a forthcoming PostgreSQL minor release.
</para>
<para>
As a workaround, start the primary server the standby was previously attached to,
and ensure the standby can be attached to it. If <application>pg_rewind</application> was actually executed,
it will have copied in the <filename>.history</filename> file from the target primary server; this must
be removed. <command>repmgr node rejoin</command> can then be used to attach the standby to the original
primary. Ensure any changes pending on the primary have propogated to the standby. Then shut down the primary
server <emphasis>first</emphasis>, before shutting down the standby. It should then be possible to
use <command>repmgr node rejoin</command> to attach the standby to the new primary.
</para>
</warning>
</refsect1>
<refsect1>

View File

@@ -75,22 +75,8 @@
<para>
Issue a <command>CHECKPOINT</command> before stopping or restarting the node.
</para>
<para>
Note that a superuser connection is required to be able to execute the
<command>CHECKPOINT</command> command.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-S</option>/<option>--superuser</option></term>
<listitem>
<para>
Connect as the named superuser instead of the normal &repmgr; user.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
@@ -128,38 +114,38 @@
<para>
See what action would be taken for a restart:
<programlisting>
[postgres@node1 ~]$ repmgr -f /etc/repmgr/12/repmgr.conf node service --action=restart --checkpoint --dry-run
[postgres@node1 ~]$ repmgr -f /etc/repmgr/11/repmgr.conf node service --action=restart --checkpoint --dry-run
INFO: a CHECKPOINT would be issued here
INFO: would execute server command "sudo service postgresql-12 restart"</programlisting>
INFO: would execute server command "sudo service postgresql-11 restart"</programlisting>
</para>
<para>
Restart the PostgreSQL instance:
<programlisting>
[postgres@node1 ~]$ repmgr -f /etc/repmgr/12/repmgr.conf node service --action=restart --checkpoint
[postgres@node1 ~]$ repmgr -f /etc/repmgr/11/repmgr.conf node service --action=restart --checkpoint
NOTICE: issuing CHECKPOINT
DETAIL: executing server command "sudo service postgresql-12 restart"
Redirecting to /bin/systemctl restart postgresql-12.service</programlisting>
DETAIL: executing server command "sudo service postgresql-11 restart"
Redirecting to /bin/systemctl restart postgresql-11.service</programlisting>
</para>
<para>
List all commands:
<programlisting>
[postgres@node1 ~]$ repmgr -f /etc/repmgr/12/repmgr.conf node service --list-actions
[postgres@node1 ~]$ repmgr -f /etc/repmgr/11/repmgr.conf node service --list-actions
Following commands would be executed for each action:
start: "sudo service postgresql-12 start"
stop: "sudo service postgresql-12 stop"
restart: "sudo service postgresql-12 restart"
reload: "sudo service postgresql-12 reload"
promote: "/usr/pgsql-12/bin/pg_ctl -w -D '/var/lib/pgsql/12/data' promote"</programlisting>
start: "sudo service postgresql-11 start"
stop: "sudo service postgresql-11 stop"
restart: "sudo service postgresql-11 restart"
reload: "sudo service postgresql-11 reload"
promote: "/usr/pgsql-11/bin/pg_ctl -w -D '/var/lib/pgsql/11/data' promote"</programlisting>
</para>
<para>
List a single command:
<programlisting>
[postgres@node1 ~]$ repmgr -f /etc/repmgr/12/repmgr.conf node service --list-actions --action=promote
/usr/pgsql-12/bin/pg_ctl -w -D '/var/lib/pgsql/12/data' promote </programlisting>
[postgres@node1 ~]$ repmgr -f /etc/repmgr/11/repmgr.conf node service --list-actions --action=promote
/usr/pgsql-11/bin/pg_ctl -w -D '/var/lib/pgsql/11/data' promote </programlisting>
</para>
</refsect1>
</refentry>

View File

@@ -24,15 +24,10 @@
<note>
<para>
&repmgr; will attempt to install the &repmgr; extension as part of this command,
however this will fail if the <literal>repmgr</literal> user is not a superuser.
</para>
<para>
It's possible to install the &repmgr; extension manually before executing
It's possibly to install the &repmgr; extension manually before executing
<command>repmgr primary register</command>; in this case &repmgr; will
detect the presence of the extension and skip that step.
</para>
</note>
</refsect1>
@@ -64,21 +59,6 @@
</refsect1>
<refsect1>
<title>User permission requirements</title>
<para>
The <literal>repmgr</literal> user must be a superuser in order for &repmgr;
to be able to install the <literal>repmgr</literal> extension.
</para>
<para>
If this is not the case, the <literal>repmgr</literal> extension can be installed
manually before executing <command>repmgr primary register</command>.
</para>
<para>
A future &repmgr; release will enable the provision of a <option>--superuser</option>
name for the installation of the extension.
</para>
</refsect1>
<refsect1>
<title>Options</title>

View File

@@ -60,17 +60,6 @@
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force</option></term>
<listitem>
<para>
Forcibly unregister the node if it is registered as an active primary,
as long as it has no registered standbys; or if it is registered as
a primary but running as a standby.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>

View File

@@ -18,7 +18,7 @@
<para>
<command>repmgr standby clone</command> clones a PostgreSQL node from another
PostgreSQL node, typically the primary, but optionally from any other node in
the cluster or from Barman. It creates the replication configuration required
the cluster or from Barman. It creates the <filename>recovery.conf</filename> file required
to attach the cloned node to the primary node (or another standby, if cascading replication
is in use).
</para>
@@ -85,25 +85,28 @@
</refsect1>
<refsect1 id="repmgr-standby-clone-recovery-conf">
<title>Customising replication configuration</title>
<title>Customising recovery.conf</title>
<indexterm>
<primary>recovery.conf</primary>
<secondary>customising with &quot;repmgr standby clone&quot;</secondary>
</indexterm>
<indexterm>
<primary>replication configuration</primary>
<secondary>customising with &quot;repmgr standby clone&quot;</secondary>
</indexterm>
<para>
By default, &repmgr; will create a minimal replication configuration
By default, &repmgr; will create a minimal <filename>recovery.conf</filename>
containing following parameters:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><varname>standby_mode</varname> (always <literal>'on'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>recovery_target_timeline</varname> (always <literal>'latest'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>primary_conninfo</varname></simpara>
</listitem>
@@ -114,24 +117,9 @@
</itemizedlist>
<para>
For PostgreSQL 11 and earlier, these parameters will also be set:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><varname>standby_mode</varname> (always <literal>'on'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>recovery_target_timeline</varname> (always <literal>'latest'</literal>)</simpara>
</listitem>
</itemizedlist>
<para>
The following additional parameters can be specified in <filename>repmgr.conf</filename>
for inclusion in the replication configuration:
for inclusion in <filename>recovery.conf</filename>:
</para>
<itemizedlist spacing="compact" mark="bullet">
@@ -182,10 +170,7 @@
<programlisting>
pg_basebackup_options='--wal-method=fetch'</programlisting>
and ensure that <literal>wal_keep_segments</literal> (PostgreSQL 13 and later:
<literal>wal_keep_size</literal>) is set to an appropriately high value. Note
however that this is not a particularly reliable way of ensuring sufficient
WAL is retained and is not recommended.
and ensure that <literal>wal_keep_segments</literal> is set to an appropriately high value.
See the <ulink url="https://www.postgresql.org/docs/current/app-pgbasebackup.html">
pg_basebackup</ulink> documentation for details.
</para>
@@ -198,35 +183,11 @@
</note>
</refsect1>
<refsect1 id="repmgr-standby-clone-wal-directory">
<title>Placing WAL files into a different directory</title>
<para>
To ensure that WAL files are placed in a directory outside of the main data
directory (e.g. to keep them on a separate disk for performance reasons),
specify the location with <option>--waldir</option>
(PostgreSQL 9.6 and earlier: <option>--xlogdir</option>) in
the <filename>repmgr.conf</filename> parameter <option>pg_basebackup_options</option>,
e.g.:
<programlisting>
pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
This setting will also be honored by &repmgr; when cloning from Barman
(&repmgr; 5.2 and later).
</para>
</refsect1>
<!-- don't rename this id as it may be used in external links -->
<refsect1 id="repmgr-standby-create-recovery-conf">
<title>Using a standby cloned by another method</title>
<indexterm>
<primary>replication configuration</primary>
<secondary>generating for a standby cloned by another method</secondary>
</indexterm>
<indexterm>
<primary>recovery.conf</primary>
<secondary>generating for a standby cloned by another method</secondary>
@@ -234,7 +195,7 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
<para>
&repmgr; supports standbys cloned by another method (e.g. using <application>barman</application>'s
<command><ulink url="https://docs.pgbarman.org/#recover">barman recover</ulink></command> command).
<command><ulink url="http://docs.pgbarman.org/release/2.5/#recover">barman recover</ulink></command> command).
</para>
<para>
To integrate the standby as a &repmgr; node, once the standby has been cloned,
@@ -253,24 +214,28 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
</para>
</tip>
<para>
Then execute the command <command>repmgr standby clone --replication-conf-only</command>.
Then execute the command <command>repmgr standby clone --recovery-conf-only</command>.
This will create the <filename>recovery.conf</filename> file needed to attach
the node to its upstream (in PostgreSQL 12 and later: append replication configuration
to <filename>postgresql.auto.conf</filename>), and will also create a replication slot on the
the node to its upstream, and will also create a replication slot on the
upstream node if required.
</para>
<para>
Note that the upstream node must be running. In PostgreSQL 11 and earlier, an existing
Note that the upstream node must be running. An existing
<filename>recovery.conf</filename> will not be overwritten unless the
<option>-F/--force</option> option is provided.
</para>
<para>
Execute <command>repmgr standby clone --replication-conf-only --dry-run</command>
to check the prerequisites for creating the recovery configuration,
and display the contents of the configuration which would be added without actually
making any changes.
Execute <command>repmgr standby clone --recovery-conf-only --dry-run</command>
to check the prerequisites for creating the <filename>recovery.conf</filename> file,
and display the contents of the file without actually creating it.
</para>
<note>
<para>
<option>--recovery-conf-only</option> was introduced in &repmgr; <link linkend="release-4.0.4">4.0.4</link>.
</para>
</note>
</refsect1>
<refsect1>
@@ -295,9 +260,9 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
Check prerequisites but don't actually clone the standby.
</para>
<para>
If <option>--replication-conf-only</option> specified, the contents of
the generated recovery configuration will be displayed
but not written.
If <option>--recovery-conf-only</option> specified, the contents of
the generated <filename>recovery.conf</filename> file will be displayed
but the file itself not written.
</para>
</listitem>
</varlistentry>
@@ -341,19 +306,11 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
</varlistentry>
<varlistentry>
<term><option>--replication-conf-only</option></term>
<term><option> --recovery-conf-only</option></term>
<listitem>
<para>
Create recovery configuration for a previously cloned instance.
Create <filename>recovery.conf</filename> file for a previously cloned instance. &repmgr; 4.0.4 and later.
</para>
<para>
In PostgreSQL 11 and earlier, the replication configuration will be
written to <filename>recovery.conf</filename>.
</para>
<para>
In PostgreSQL 12 and later, the replication configuration will be
written to <filename>postgresql.auto.conf</filename>.
</para>
</listitem>
</varlistentry>
@@ -381,7 +338,7 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
<term><option>--upstream-conninfo</option></term>
<listitem>
<para>
<literal>primary_conninfo</literal> value to include in the recovery configuration
<literal>primary_conninfo</literal> value to write in <filename>recovery.conf</filename>
when the intended upstream server does not yet exist.
</para>
<para>
@@ -399,23 +356,6 @@ pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--verify-backup</option></term>
<listitem>
<para>
<!-- update link after Pg13 release -->
Verify a cloned node using the
<ulink url="https://www.postgresql.org/docs/13/app-pgverifybackup.html">pg_verifybackup</ulink>
utility (PostgreSQL 13 and later).
</para>
<para>
This option can currently only be used when cloning directly from an upstream
node.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--without-barman </option></term>
<listitem>

View File

@@ -47,15 +47,7 @@
</para>
<para>
In PostgreSQL 12 and earlier, this command will force a restart of PostgreSQL on the standby node.
</para>
<para>
In PostgreSQL 13 and later, by default this command will signal PostgreSQL to reload its
configuration, which will cause PostgreSQL to follow the new upstream without
a restart. If this behaviour is not desired for whatever reason, the configuration
file parameter <varname>standby_follow_restart</varname> can be set <literal>true</literal>
to always force a restart.
This command will force a restart of PostgreSQL on the standby node.
</para>
<para>

View File

@@ -23,8 +23,7 @@
<important>
<para>
If &repmgrd; is active, you must execute
<command><link linkend="repmgr-service-pause">repmgr service pause</link></command>
(&repmgr; 4.2 - 4.4: <command><link linkend="repmgr-service-pause">repmgr service pause</link></command>)
<command><link linkend="repmgr-daemon-pause">repmgr daemon pause</link></command>
to temporarily disable &repmgrd; while making any changes
to the replication cluster.
</para>
@@ -66,10 +65,10 @@
Both values can be defined in <filename>repmgr.conf</filename>.
</para>
<warning>
<note>
<para>
In PostgreSQL 12 and earlier, if WAL replay is paused on the standby, and not all
WAL files on the standby have been replayed, &repmgr; will not attempt to promote it.
If WAL replay is paused on the standby, and not all WAL files on the standby have been
replayed, &repmgr; will not attempt to promote it.
</para>
<para>
This is because if WAL replay is paused, PostgreSQL itself will not react to a promote command
@@ -81,15 +80,11 @@
Note that if the standby is in archive recovery, &repmgr; will not be able to determine
if more WAL is pending replay, and will abort the promotion attempt if WAL replay is paused.
</para>
<para>
This restriction does <emphasis>not</emphasis> apply to PostgreSQL 13 and later.
</para>
</warning>
</note>
</refsect1>
<refsect1>
<title>Example</title>
<para>
@@ -98,46 +93,13 @@
NOTICE: promoting standby to primary
DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/postgres/data' promote"
server promoting
DEBUG: setting node 2 as primary and marking existing primary as failed
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node2" (ID: 2) was successfully promoted to primary</programlisting>
</para>
</refsect1>
<refsect1>
<title>User permission requirements</title>
<para><emphasis>pg_promote() (PostgreSQL 12 and later)</emphasis></para>
<para>
From PostgreSQL 12, &repmgr; will attempt to use the built-in <function>pg_promote()</function>
function to promote a standby to primary.
</para>
<para>
By default, execution of <function>pg_promote()</function> is restricted to superusers.
If the <literal>repmgr</literal> user does not have permission to execute
<function>pg_promote()</function>, &repmgr; will fall back to using &quot;<command>pg_ctl promote</command>&quot;.
</para>
<tip>
<para>
Execute <command>repmgr standby promote</command> with the <option>--dry-run</option>
to check whether the &repmgr; user has permission to execute <function>pg_promote()</function>.
</para>
<para>
If the <literal>repmgr</literal> user is not a superuser, execution permission for this
function can be granted with e.g.:
<programlisting>
GRANT EXECUTE ON FUNCTION pg_catalog.pg_promote TO repmgr</programlisting>
</para>
<para>
Note that permissions are only effective for the database they are granted in, so
this <emphasis>must</emphasis> be executed in the &repmgr; database to be effective.
</para>
</tip>
<para>
For more details on <function>pg_promote()</function>, see the
<ulink url="https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-RECOVERY-CONTROL-TABLE">PostgreSQL documentation</ulink>.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
@@ -172,42 +134,6 @@
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option></term>
<term><option>--force</option></term>
<listitem>
<para>
Ignore warnings and continue anyway.
</para>
<para>
This option is relevant in the following situations if <option>--siblings-follow</option> was specified:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
If one or more sibling nodes was not reachable via SSH, the standby will be promoted anyway.
</simpara>
</listitem>
<listitem>
<simpara>
If the promotion candidate has insufficient free walsenders to accomodate the standbys which will
be attached to it, the standby will be promoted anyway.
</simpara>
</listitem>
<listitem>
<simpara>
If replication slots are in use but the promotion candidate has insufficient free replication slots
to accomodate the standbys which will be attached to it, the standby will be promoted anyway.
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
Note that if the <option>-F</option>/<option>--force</option> option is used when any of the above
situations is encountered, the onus is on the user to manually resolve any resulting issues.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
@@ -245,23 +171,6 @@
</simpara>
</listitem>
<listitem>
<indexterm>
<primary>service_promote_command</primary>
<secondary>with &quot;repmgr standby promote &quot;</secondary>
</indexterm>
<simpara>
<literal>service_promote_command</literal>:
a command which will be executed instead of <command>pg_ctl promote</command>
or (in PostgreSQL 12 and later) <function>pg_promote()</function>.
</simpara>
<simpara>
This is intended for systems which provide a package-level promote command,
such as Debian's <application>pg_ctlcluster</application>, to promote the
PostgreSQL from standby to primary.
</simpara>
</listitem>
</itemizedlist>
</para>

View File

@@ -108,10 +108,9 @@
<title>Registering a node not cloned by repmgr</title>
<para>
If you've cloned a standby using another method (e.g. <application>barman</application>'s
<command><ulink url="https://docs.pgbarman.org/#recover">barman recover</ulink></command>
command), register the node as detailed in section
<command>barman recover</command> command), register the node as detailed in section
<xref linkend="repmgr-standby-register-inactive-node"/> then execute
<link linkend="repmgr-standby-create-recovery-conf">repmgr standby clone --replication-conf-only</link>
<link linkend="repmgr-standby-create-recovery-conf">repmgr standby clone --recovery-conf-only</link>
to generate the appropriate replication configuration.
</para>
</refsect1>

View File

@@ -63,38 +63,6 @@
</refsect1>
<refsect1>
<title>User permission requirements</title>
<para><emphasis>CHECKPOINT</emphasis></para>
<para>
&repmgr; executes <command>CHECKPOINT</command> on the demotion candidate as part of the shutdown
process to ensure it shuts down as smoothly as possible.
</para>
<para>
Note that <command>CHECKPOINT</command> requires database superuser permissions to execute.
If the <literal>repmgr</literal> user is not a superuser, the name of a superuser should be
provided with the <option>-S</option>/<option>--superuser</option>.
</para>
<para>
If &repmgr; is unable to execute the <command>CHECKPOINT</command> command, the switchover
can still be carried out, albeit at a greater risk that the demotion candidate may not
be able to shut down as smoothly as might otherwise have been the case.
</para>
<para><emphasis>pg_promote() (PostgreSQL 12 and later)</emphasis></para>
<para>
From PostgreSQL 12, &repmgr; defaults to using the built-in <command>pg_promote()</command> function to
promote a standby to primary.
</para>
<para>
Note that execution of <function>pg_promote()</function> is restricted to superusers or to
any user who has been granted execution permission for this function. If the &repmgr; user
is not permitted to execute <function>pg_promote()</function>, &repmgr; will fall back to using
&quot;<command>pg_ctl promote</command>&quot;. For more details see
<link linkend="repmgr-standby-promote">repmgr standby promote</link>.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
@@ -147,7 +115,7 @@
<para>
Use <application>pg_rewind</application> to reintegrate the old primary if necessary
(and the prerequisites for using <application>pg_rewind</application> are met).
If using PostgreSQL 9.4, and the <application>pg_rewind</application>
If using PostgreSQL 9.3 or 9.4, and the <application>pg_rewind</application>
binary is not installed in the PostgreSQL <filename>bin</filename> directory,
provide its full path. For more details see also <xref linkend="switchover-pg-rewind"/>.
</para>
@@ -216,17 +184,6 @@
</note>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-S</option>/<option>--superuser</option></term>
<listitem>
<para>
Use the named superuser instead of the normal &repmgr; user to perform
actions requiring superuser permissions.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>

View File

@@ -63,34 +63,6 @@
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually register the witness
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option>/<option>--force</option></term>
<listitem>
<para>
Overwrite an existing node record
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1 id="repmgr-witness-register-events">
<title>Event notifications</title>
<para>

View File

@@ -26,7 +26,7 @@
<abstract>
<para>
This is the official documentation of &repmgr; &repmgrversion; for
use with PostgreSQL 9.4 - PostgreSQL 13.
use with PostgreSQL 9.3 - PostgreSQL 11.
</para>
<para>
&repmgr; is being continually developed and we strongly recommend using the
@@ -91,6 +91,7 @@
&repmgrd-automatic-failover;
&repmgrd-configuration;
&repmgrd-operation;
&repmgrd-bdr;
</part>
<part id="repmgr-command-reference">
@@ -115,11 +116,11 @@
&repmgr-cluster-crosscheck;
&repmgr-cluster-event;
&repmgr-cluster-cleanup;
&repmgr-service-status;
&repmgr-service-pause;
&repmgr-service-unpause;
&repmgr-daemon-status;
&repmgr-daemon-start;
&repmgr-daemon-stop;
&repmgr-daemon-pause;
&repmgr-daemon-unpause;
</part>
&appendix-release-notes;

View File

@@ -192,23 +192,21 @@
connected. Beginning with <link linkend="release-4.4">&repmgr; 4.4</link>
it is now possible for the affected standbys to build a consensus about whether
the primary is still available to some standbys (&quot;primary visibility consensus&quot;).
This is done by polling each standby (and the witness, if present) for the time it last saw the
primary; if any have seen the primary very recently, it's reasonable
This is done by polling each standby for the time it last saw the primary;
if any have seen the primary very recently, it's reasonable
to infer that the primary is still available and a failover should not be started.
</para>
<para>
The time the primary was last seen by each node can be checked by executing
<link linkend="repmgr-service-status"><command>repmgr service status</command></link>
(&repmgr; 4.2 - 4.4: <link linkend="repmgr-service-status"><command>repmgr daemon status</command></link>)
<link linkend="repmgr-daemon-status"><command>repmgr daemon status</command></link>,
which includes this in its output, e.g.:
<programlisting>$ repmgr -f /etc/repmgr.conf service status
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node1 | primary | * running | | running | 27259 | no | n/a
2 | node2 | standby | running | node1 | running | 27272 | no | 1 second(s) ago
3 | node3 | standby | running | node1 | running | 27282 | no | 0 second(s) ago
4 | node4 | witness | * running | node1 | running | 27298 | no | 1 second(s) ago</programlisting>
1 | node1 | primary | * running | | running | 96563 | no | n/a
2 | node2 | standby | running | node1 | running | 96572 | no | 1 second(s) ago
3 | node3 | standby | running | node1 | running | 96584 | no | 0 second(s) ago</programlisting>
</para>
@@ -269,12 +267,11 @@
<para>
If <option>standby_disconnect_on_failover</option> is set to <literal>true</literal> in
<filename>repmgr.conf</filename>, in a failover situation &repmgrd; will forcibly disconnect
the local node's WAL receiver, and wait for the WAL receiver on all sibling nodes to be
disconnected, before making a failover decision.
the local node's WAL receiver before making a failover decision.
</para>
<note>
<para>
<option>standby_disconnect_on_failover</option> is available with PostgreSQL 9.5 and later.
<option>standby_disconnect_on_failover</option> is available from PostgreSQL 9.5 and later.
Additionally this requires that the <literal>repmgr</literal> database user is a superuser.
</para>
</note>
@@ -293,12 +290,6 @@
plus however many seconds it takes to confirm the WAL receiver is disconnected before
&repmgrd; proceeds with the failover decision.
</para>
<para>
&repmgrd; will wait up to <option>sibling_nodes_disconnect_timeout</option> seconds (default:
<literal>30</literal>) to confirm that the WAL receiver on all sibling nodes hase been
disconnected before proceding with the failover operation. If the timeout is reached, the
failover operation will go ahead anyway.
</para>
<para>
Following the failover operation, no matter what the outcome, each node will reconnect its WAL receiver.
</para>
@@ -331,12 +322,11 @@
To use this, <option>failover_validation_command</option> in <filename>repmgr.conf</filename>
to a script executable by the <literal>postgres</literal> system user, e.g.:
<programlisting>
failover_validation_command=/path/to/script.sh %n</programlisting>
failover_validation_command=/path/to/script.sh %n %a</programlisting>
</para>
<para>
The <literal>%n</literal> parameter will be replaced with the node ID when the script is
executed. A number of other parameters are also available, see section
&quot;<xref linkend="repmgrd-automatic-failover-configuration-optional"/>&quot; for details.
The <literal>%n</literal> parameter will be replaced with the node ID, and the
<literal>%a</literal> parameter will be replaced by the node name when the script is executed.
</para>
<para>
This script must return an exit code of <literal>0</literal> to indicate the node should promote itself.

429
doc/repmgrd-bdr.xml Normal file
View File

@@ -0,0 +1,429 @@
<chapter id="repmgrd-bdr">
<title>BDR failover with repmgrd</title>
<indexterm>
<primary>repmgrd</primary>
<secondary>BDR</secondary>
</indexterm>
<indexterm>
<primary>BDR</primary>
</indexterm>
<para>
&repmgr; 4.x provides support for monitoring a pair of BDR 2.x nodes and taking action in
case one of the nodes fails.
</para>
<note>
<simpara>
Due to the nature of BDR 1.x/2.x, it's only safe to use this solution for
a two-node scenario. Introducing additional nodes will create an inherent
risk of node desynchronisation if a node goes down without being cleanly
removed from the cluster.
</simpara>
</note>
<para>
In contrast to streaming replication, there's no concept of "promoting" a new
primary node with BDR. Instead, "failover" involves monitoring both nodes
with &repmgrd; and redirecting queries from the failed node to the remaining
active node. This can be done by using an
<link linkend="event-notifications">event notification</link> script
which is called by &repmgrd; to dynamically
reconfigure a proxy server/connection pooler such as <application>PgBouncer</application>.
</para>
<note>
<simpara>
This &repmgr; functionality is for BDR 2.x only running on PostgreSQL 9.4/9.6.
It is <emphasis>not</emphasis> required for later BDR versions.
</simpara>
</note>
<sect1 id="bdr-prerequisites" xreflabel="BDR prequisites">
<title>Prerequisites</title>
<important>
<para>
This &repmgr; functionality is for BDR 2.x only running on PostgreSQL 9.4/9.6.
It is <emphasis>not</emphasis> required for later BDR versions.
</para>
</important>
<para>
&repmgr; 4 requires PostgreSQL 9.4 or 9.6 with the BDR 2 extension
enabled and configured for a two-node BDR network. &repmgr; 4 packages
must be installed on each node before attempting to configure
<application>repmgr</application>.
</para>
<note>
<simpara>
&repmgr; 4 will refuse to install if it detects more than two BDR nodes.
</simpara>
</note>
<para>
Application database connections *must* be passed through a proxy server/
connection pooler such as <application>PgBouncer</application>, and it must be possible to dynamically
reconfigure that from &repmgrd;. The example demonstrated in this document
will use <application>PgBouncer</application>
</para>
<para>
The proxy server / connection poolers must <emphasis>not</emphasis>
be installed on the database servers.
</para>
<para>
For this example, it's assumed password-less SSH connections are available
from the PostgreSQL servers to the servers where <application>PgBouncer</application>
runs, and that the user on those servers has permission to alter the
<application>PgBouncer</application> configuration files.
</para>
<para>
PostgreSQL connections must be possible between each node, and each node
must be able to connect to each PgBouncer instance.
</para>
</sect1>
<sect1 id="bdr-configuration" xreflabel="BDR configuration">
<title>Configuration</title>
<para>
A sample configuration for <filename>repmgr.conf</filename> on each
BDR node would look like this:
<programlisting>
# Node information
node_id=1
node_name='node1'
conninfo='host=node1 dbname=bdrtest user=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/data'
replication_type='bdr'
# Event notification configuration
event_notifications=bdr_failover
event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a" >> /tmp/bdr-failover.log 2>&amp;1'
# repmgrd options
monitor_interval_secs=5
reconnect_attempts=6
reconnect_interval=5</programlisting>
</para>
<para>
Adjust settings as appropriate; copy and adjust for the second node (particularly
the values <varname>node_id</varname>, <varname>node_name</varname>
and <varname>conninfo</varname>).
</para>
<para>
Note that the values provided for the <varname>conninfo</varname> string
must be valid for connections from <emphasis>both</emphasis> nodes in the
replication cluster. The database must be the BDR-enabled database.
</para>
<para>
If defined, the <varname>event_notifications</varname> parameter will restrict
execution of the script defined in <varname>event_notification_command</varname>
to the specified event(s).
</para>
<note>
<simpara>
<varname>event_notification_command</varname> is the script which does the actual "heavy lifting"
of reconfiguring the proxy server/ connection pooler. It is fully
user-definable; see section <xref linkend="bdr-event-notification-command"/> for a reference
implementation.
</simpara>
</note>
</sect1>
<sect1 id="bdr-repmgr-setup" xreflabel="repmgr setup with BDR">
<title>repmgr setup</title>
<para>
Register both nodes; example on <literal>node1</literal>:
<programlisting>
$ repmgr -f /etc/repmgr.conf bdr register
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: node record created for node 'node1' (ID: 1)
NOTICE: BDR node 1 registered (conninfo: host=node1 dbname=bdrtest user=repmgr)</programlisting>
</para>
<para>
and on <literal>node1</literal>:
<programlisting>
$ repmgr -f /etc/repmgr.conf bdr register
NOTICE: node record created for node 'node2' (ID: 2)
NOTICE: BDR node 2 registered (conninfo: host=node2 dbname=bdrtest user=repmgr)</programlisting>
</para>
<para>
The <literal>repmgr</literal> extension will be automatically created
when the first node is registered, and will be propagated to the second
node.
</para>
<important>
<simpara>
Ensure the &repmgr; package is available on both nodes before
attempting to register the first node.
</simpara>
</important>
<para>
At this point the meta data for both nodes has been created; executing
<xref linkend="repmgr-cluster-show"/> (on either node) should produce output like this:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Connection string
----+-------+------+-----------+----------+--------------------------------------------------------
1 | node1 | bdr | * running | | default | host=node1 dbname=bdrtest user=repmgr connect_timeout=2
2 | node2 | bdr | * running | | default | host=node2 dbname=bdrtest user=repmgr connect_timeout=2</programlisting>
</para>
<para>
Additionally it's possible to display log of significant events; executing
<xref linkend="repmgr-cluster-event"/> (on either node) should produce output like this:
<programlisting>
$ repmgr -f /etc/repmgr.conf cluster event
Node ID | Event | OK | Timestamp | Details
---------+--------------+----+---------------------+----------------------------------------------
2 | bdr_register | t | 2017-07-27 17:51:48 | node record created for node 'node2' (ID: 2)
1 | bdr_register | t | 2017-07-27 17:51:00 | node record created for node 'node1' (ID: 1)
</programlisting>
</para>
<para>
At this point there will only be records for the two node registrations (displayed here
in reverse chronological order).
</para>
</sect1>
<sect1 id="bdr-event-notification-command" xreflabel="Defining the BDR failover &quot;event_notification command&quot;">
<title>Defining the BDR failover "event_notification_command"</title>
<para>
Key to "failover" execution is the <literal>event_notification_command</literal>,
which is a user-definable script specified in <filename>repmpgr.conf</filename>
and which can use a &repmgr; <link linkend="event-notifications">event notification</link>
to reconfigure the proxy server / connection pooler so it points to the other, still-active node.
Details of the event will be passed as parameters to the script.
</para>
<para>
Following parameter placeholders are available for the script definition in <filename>repmpgr.conf</filename>;
these will be replaced with the appropriate value when the script is executed:
</para>
<variablelist>
<varlistentry>
<term><option>%n</option></term>
<listitem>
<para>
node ID
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%e</option></term>
<listitem>
<para>
event type
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%t</option></term>
<listitem>
<para>
success (1 or 0)
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%t</option></term>
<listitem>
<para>
timestamp
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%d</option></term>
<listitem>
<para>
details
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%c</option></term>
<listitem>
<para>
conninfo string of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%a</option></term>
<listitem>
<para>
name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
Note that <literal>%c</literal> and <literal>%a</literal> are only provided with
particular failover events, in this case <varname>bdr_failover</varname>.
</para>
<para>
The provided sample script
(<literal><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/scripts/bdr-pgbouncer.sh">scripts/bdr-pgbouncer.sh</ulink></literal>)
is configured as follows:
<programlisting>
event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a"'</programlisting>
</para>
<para>
and parses the placeholder parameters like this:
<programlisting>
NODE_ID=$1
EVENT_TYPE=$2
SUCCESS=$3
NEXT_CONNINFO=$4
NEXT_NODE_NAME=$5</programlisting>
</para>
<note>
<para>
The sample script also contains some hard-coded values for the <application>PgBouncer</application>
configuration for both nodes; these will need to be adjusted for your local environment
(ideally the scripts would be maintained as templates and generated by some
kind of provisioning system).
</para>
</note>
<para>
The script performs following steps:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>pauses <application>PgBouncer</application> on all nodes</simpara>
</listitem>
<listitem>
<simpara>recreates the <application>PgBouncer</application> configuration file on each
node using the information provided by &repmgrd;
(primarily the <varname>conninfo</varname> string) to configure
<application>PgBouncer</application></simpara>
</listitem>
<listitem>
<simpara>reloads the <application>PgBouncer</application> configuration</simpara>
</listitem>
<listitem>
<simpara>executes the <command>RESUME</command> command (in <application>PgBouncer</application>)</simpara>
</listitem>
</itemizedlist>
</para>
<para>
Following successful script execution, any connections to PgBouncer on the failed BDR node
will be redirected to the active node.
</para>
</sect1>
<sect1 id="bdr-monitoring-failover" xreflabel="Node monitoring and failover">
<title>Node monitoring and failover</title>
<para>
At the intervals specified by <varname>monitor_interval_secs</varname>
in <filename>repmgr.conf</filename>, &repmgrd;
will ping each node to check if it's available. If a node isn't available,
&repmgrd; will enter failover mode and check <varname>reconnect_attempts</varname>
times at intervals of <varname>reconnect_interval</varname> to confirm the node is definitely unreachable.
This buffer period is necessary to avoid false positives caused by transient
network outages.
</para>
<para>
If the node is still unavailable, &repmgrd; will enter failover mode and execute
the script defined in <varname>event_notification_command</varname>; an entry will be logged
in the <literal>repmgr.events</literal> table and &repmgrd; will
(unless otherwise configured) resume monitoring of the node in "degraded" mode until it reappears.
</para>
<para>
&repmgrd; logfile output during a failover event will look something like this
on one node (usually the node which has failed, here <literal>node2</literal>):
<programlisting>
...
[2017-07-27 21:08:39] [INFO] starting continuous BDR node monitoring
[2017-07-27 21:08:39] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
[2017-07-27 21:08:55] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
[2017-07-27 21:09:11] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
[2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
[2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
[2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
[2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
[2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
[2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
[2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
[2017-07-27 21:09:28] [NOTICE] setting node record for node 2 to inactive
[2017-07-27 21:09:28] [INFO] executing notification command for event "bdr_failover"
[2017-07-27 21:09:28] [DETAIL] command is:
/path/to/bdr-pgbouncer.sh 2 bdr_failover 1 "host=host=node1 dbname=bdrtest user=repmgr connect_timeout=2" "node1"
[2017-07-27 21:09:28] [INFO] node 'node2' (ID: 2) detected as failed; next available node is 'node1' (ID: 1)
[2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
[2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
...</programlisting>
</para>
<para>
Output on the other node (<literal>node1</literal>) during the same event will look like this:
<programlisting>
...
[2017-07-27 21:08:35] [INFO] starting continuous BDR node monitoring
[2017-07-27 21:08:35] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
[2017-07-27 21:08:51] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
[2017-07-27 21:09:07] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
[2017-07-27 21:09:23] [WARNING] unable to connect to node node2 (ID 2)
[2017-07-27 21:09:23] [INFO] checking state of node 2, 0 of 5 attempts
[2017-07-27 21:09:23] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:24] [INFO] checking state of node 2, 1 of 5 attempts
[2017-07-27 21:09:24] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:25] [INFO] checking state of node 2, 2 of 5 attempts
[2017-07-27 21:09:25] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:26] [INFO] checking state of node 2, 3 of 5 attempts
[2017-07-27 21:09:26] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:27] [INFO] checking state of node 2, 4 of 5 attempts
[2017-07-27 21:09:27] [INFO] sleeping 1 seconds until next reconnection attempt
[2017-07-27 21:09:28] [WARNING] unable to reconnect to node 2 after 5 attempts
[2017-07-27 21:09:28] [NOTICE] other node's repmgrd is handling failover
[2017-07-27 21:09:28] [INFO] monitoring BDR replication status on node "node1" (ID: 1)
[2017-07-27 21:09:28] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
...</programlisting>
</para>
<para>
This assumes only the PostgreSQL instance on <literal>node2</literal> has failed. In this case the
&repmgrd; instance running on <literal>node2</literal> has performed the failover. However if
the entire server becomes unavailable, &repmgrd; on <literal>node1</literal> will perform
the failover.
</para>
</sect1>
<sect1 id="bdr-node-recovery" xreflabel="Node recovery">
<title>Node recovery</title>
<para>
Following failure of a BDR node, if the node subsequently becomes available again,
a <varname>bdr_recovery</varname> event will be generated. This could potentially be used to
reconfigure PgBouncer automatically to bring the node back into the available pool,
however it would be prudent to manually verify the node's status before
exposing it to the application.
</para>
<para>
If the failed node comes back up and connects correctly, output similar to this
will be visible in the &repmgrd; log:
<programlisting>
[2017-07-27 21:25:30] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
[2017-07-27 21:25:46] [INFO] monitoring BDR replication status on node "node2" (ID: 2)
[2017-07-27 21:25:46] [DETAIL] monitoring node "node2" (ID: 2) in degraded mode
[2017-07-27 21:25:55] [INFO] active replication slot for node "node1" found after 1 seconds
[2017-07-27 21:25:55] [NOTICE] node "node2" (ID: 2) has recovered after 986 seconds</programlisting>
</para>
</sect1>
<sect1 id="bdr-complete-shutdown" xreflabel="Shutdown of both nodes">
<title>Shutdown of both nodes</title>
<para>
If both PostgreSQL instances are shut down, &repmgrd; will try and handle the
situation as gracefully as possible, though with no failover candidates available
there's not much it can do. Should this case ever occur, we recommend shutting
down &repmgrd; on both nodes and restarting it once the PostgreSQL instances
are running properly.
</para>
</sect1>
</chapter>

View File

@@ -8,7 +8,7 @@
</indexterm>
<para>
&repmgrd; is a daemon process which runs on each PostgreSQL node,
&repmgrd; is a daemon which runs on each PostgreSQL node,
monitoring the local node, and (unless it's the primary node) the upstream server
(the primary server or with cascading replication, another standby) which it's
connected to.
@@ -81,7 +81,7 @@
<listitem>
<simpara>
<literal>connection</literal> - determines server availability
by attempting to make a new connection to the upstream node
by attempt ingto make a new connection to the upstream node
</simpara>
</listitem>
<listitem>
@@ -325,7 +325,7 @@
</sect2>
<sect2 id="repmgrd-automatic-failover-configuration-optional" xreflabel="Optional configuration for automatic failover">
<sect2 id="repmgrd-automatic-failover-configuration-optional">
<title>Optional configuration for automatic failover</title>
<para>
@@ -370,8 +370,8 @@
</para>
</note>
<para>
One or more of the following parameter placeholders
may be provided, which will be replaced by repmgrd with the appropriate
One or both of the following parameter placeholders
should be provided, which will be replaced by repmgrd with the appropriate
value:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
@@ -380,15 +380,6 @@
<listitem>
<simpara><literal>%a</literal>: node name</simpara>
</listitem>
<listitem>
<simpara><literal>%v</literal>: number of visible nodes</simpara>
</listitem>
<listitem>
<simpara><literal>%u</literal>: number of shared upstream nodes</simpara>
</listitem>
<listitem>
<simpara><literal>%t</literal>: total number of nodes</simpara>
</listitem>
</itemizedlist>
</para>
<para>
@@ -407,8 +398,8 @@
</indexterm>
<para>
If <literal>true</literal>, only continue with failover if no standbys
(or the witness server, if present) have seen the primary node recently.
If <literal>true</literal>, only continue with failover if no standbys have seen
the primary node recently.
</para>
<note>
<para>
@@ -419,33 +410,6 @@
</listitem>
</varlistentry>
<varlistentry>
<term><option>always_promote</option></term>
<listitem>
<indexterm>
<primary>always_promote</primary>
</indexterm>
<para>
Default: <literal>false</literal>.
</para>
<para>
If <literal>true</literal>, promote the local node even if its
&repmgr; metadata is not up-to-date.
</para>
<para>
Normally &repmgr; expects its metadata (stored in the <varname>repmgr.nodes</varname>
table) to be up-to-date so &repmgrd; can take the correct action during a failover.
However it's possible that updates made on the primary may not
have propagated to the standby (promotion candidate). In this case &repmgrd; will
default to not promoting the standby. This behaviour can be overridden by setting
<option>always_promote</option> to <literal>true</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>standby_disconnect_on_failover</option></term>
@@ -526,23 +490,6 @@
</sect2>
<sect2 id="repmgrd-automatic-failover-configuration-pgbouncer-fencing">
<title>Configuring &repmgrd; and pgbouncer to fence a failed primary node</title>
<indexterm>
<primary>fencing</primary>
<secondary>using repmgrd and pgbouncer to fence a failed primary node</secondary>
</indexterm>
<indexterm>
<primary>PgBouncer</primary>
<secondary>using repmgrd and pgbouncer to fence a failed primary node</secondary>
</indexterm>
<para>
For further details and a reference implementation, see the separate document
<ulink url="https://github.com/2ndQuadrant/repmgr/blob/master/doc/repmgrd-node-fencing.md">Fencing a failed master node with repmgrd and PgBouncer</ulink>.
</para>
</sect2>
<sect2 id="postgresql-service-configuration">
<title>PostgreSQL service configuration</title>
@@ -576,8 +523,7 @@
</indexterm>
<para>
If you are intending to use the <link linkend="repmgr-daemon-start"><command>repmgr daemon start</command></link>
and <link linkend="repmgr-daemon-stop"><command>repmgr daemon stop</command></link>
commands, the following
and <link linkend="repmgr-daemon-stop"><command>repmgr daemon stop</command></link> commands, the following
parameters <emphasis>must</emphasis> be set in <filename>repmgr.conf</filename>:
<itemizedlist spacing="compact" mark="bullet">
@@ -593,10 +539,10 @@
</para>
<para>
Example (for &repmgr; with PostgreSQL 12 on CentOS 7):
Example (for &repmgr; with PostgreSQL 11 on CentOS 7):
<programlisting>
repmgrd_service_start_command='sudo systemctl repmgr12 start'
repmgrd_service_stop_command='sudo systemctl repmgr12 stop'
repmgrd_service_start_command='sudo systemctl repmgr11 start'
repmgrd_service_stop_command='sudo systemctl repmgr11 stop'
</programlisting>
</para>
<para>
@@ -660,6 +606,18 @@ repmgrd_service_stop_command='sudo systemctl repmgr12 stop'
</simpara>
</listitem>
<listitem>
<simpara>
<varname>bdr_local_monitoring_only</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>bdr_recovery_timeout</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>child_nodes_check_interval</varname>
@@ -792,12 +750,6 @@ repmgrd_service_stop_command='sudo systemctl repmgr12 stop'
</simpara>
</listitem>
<listitem>
<simpara>
<varname>always_promote</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>promote_command</varname>
@@ -913,7 +865,7 @@ repmgrd_service_stop_command='sudo systemctl repmgr12 stop'
<para>
The commands <link linkend="repmgr-daemon-start"><command>repmgr daemon start</command></link> and
<link linkend="repmgr-daemon-stop"><command>repmgr daemon stop</command></link> can be used
as convenience wrappers to start and stop &repmgrd; on the local node.
as convenience wrappers to start and stop &repmgrd;.
</para>
<important>
<para>

View File

@@ -108,7 +108,7 @@ The actual script is as follows; adjust the configurable items as appropriate:
# 1. Promote this node from standby to primary
repmgr standby promote -f /etc/repmgr.conf --log-to-file
repmgr standby promote -f /etc/repmgr.conf
# 2. Reconfigure pgbouncer instances
@@ -146,7 +146,7 @@ Script and template file should be installed on each node where `repmgrd` is run
Finally, set `promote_command` in `repmgr.conf` on each node to
point to the custom promote script:
promote_command='/var/lib/postgres/repmgr/promote.sh'
promote_command=/var/lib/postgres/repmgr/promote.sh
and reload/restart any running `repmgrd` instances for the changes to take
effect.

View File

@@ -6,9 +6,9 @@
<secondary>operation</secondary>
</indexterm>
<sect1 id="repmgrd-pausing" xreflabel="pausing the repmgrd service">
<sect1 id="repmgrd-pausing">
<title>Pausing the repmgrd service</title>
<title>Pausing repmgrd</title>
<indexterm>
<primary>repmgrd</primary>
@@ -47,7 +47,7 @@
<note>
<para>
For major PostgreSQL upgrades, e.g. from PostgreSQL 11 to PostgreSQL 12,
For major PostgreSQL upgrades, e.g. from PostgreSQL 10 to PostgreSQL 11,
&repmgrd; should be shut down completely and only started up
once the &repmgr; packages for the new PostgreSQL major version have been installed.
</para>
@@ -88,21 +88,17 @@
<sect2 id="repmgrd-pausing-execution">
<title>Pausing/unpausing &repmgrd;</title>
<para>
To pause &repmgrd;, execute <link linkend="repmgr-service-pause"><command>repmgr service pause</command></link>
(&repmgr; 4.2 - 4.4: <link linkend="repmgr-service-pause"><command>repmgr daemon pause</command></link>),
e.g.:
To pause &repmgrd;, execute <link linkend="repmgr-daemon-pause"><command>repmgr daemon pause</command></link>, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf service pause
$ repmgr -f /etc/repmgr.conf daemon pause
NOTICE: node 1 (node1) paused
NOTICE: node 2 (node2) paused
NOTICE: node 3 (node3) paused</programlisting>
</para>
<para>
The state of &repmgrd; on each node can be checked with
<link linkend="repmgr-service-status"><command>repmgr service status</command></link>
(&repmgr; 4.2 - 4.4: <link linkend="repmgr-service-status"><command>repmgr daemon status</command></link>),
e.g.:
<programlisting>$ repmgr -f /etc/repmgr.conf service status
<link linkend="repmgr-daemon-status"><command>repmgr daemon status</command></link>, e.g.:
<programlisting>$ repmgr -f /etc/repmgr.conf daemon status
ID | Name | Role | Status | repmgrd | PID | Paused?
----+-------+---------+---------+---------+------+---------
1 | node1 | primary | running | running | 7851 | yes
@@ -112,8 +108,8 @@ NOTICE: node 3 (node3) paused</programlisting>
<note>
<para>
If executing a switchover with <link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>,
&repmgr; will automatically pause/unpause the &repmgrd; service as part of the switchover process.
If executing a switchover with <link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>,
&repmgr; will automatically pause/unpause &repmgrd; as part of the switchover process.
</para>
</note>
@@ -121,32 +117,29 @@ NOTICE: node 3 (node3) paused</programlisting>
If the primary (in this example, <literal>node1</literal>) is stopped, &repmgrd;
running on one of the standbys (here: <literal>node2</literal>) will react like this:
<programlisting>
[2019-08-28 12:22:21] [WARNING] unable to connect to upstream node "node1" (node ID: 1)
[2019-08-28 12:22:21] [INFO] checking state of node 1, 1 of 5 attempts
[2019-08-28 12:22:21] [INFO] sleeping 1 seconds until next reconnection attempt
[2018-09-20 12:22:21] [WARNING] unable to connect to upstream node "node1" (ID: 1)
[2018-09-20 12:22:21] [INFO] checking state of node 1, 1 of 5 attempts
[2018-09-20 12:22:21] [INFO] sleeping 1 seconds until next reconnection attempt
...
[2019-08-28 12:22:24] [INFO] sleeping 1 seconds until next reconnection attempt
[2019-08-28 12:22:25] [INFO] checking state of node 1, 5 of 5 attempts
[2019-08-28 12:22:25] [WARNING] unable to reconnect to node 1 after 5 attempts
[2019-08-28 12:22:25] [NOTICE] node is paused
[2019-08-28 12:22:33] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (node ID: 1) in degraded state
[2019-08-28 12:22:33] [DETAIL] repmgrd paused by administrator
[2019-08-28 12:22:33] [HINT] execute "repmgr service unpause" to resume normal failover mode</programlisting>
[2018-09-20 12:22:24] [INFO] sleeping 1 seconds until next reconnection attempt
[2018-09-20 12:22:25] [INFO] checking state of node 1, 5 of 5 attempts
[2018-09-20 12:22:25] [WARNING] unable to reconnect to node 1 after 5 attempts
[2018-09-20 12:22:25] [NOTICE] node is paused
[2018-09-20 12:22:33] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in degraded state
[2018-09-20 12:22:33] [DETAIL] repmgrd paused by administrator
[2018-09-20 12:22:33] [HINT] execute "repmgr daemon unpause" to resume normal failover mode</programlisting>
</para>
<para>
If the primary becomes available again (e.g. following a software upgrade), &repmgrd;
will automatically reconnect, e.g.:
<programlisting>
[2019-08-28 12:25:41] [NOTICE] reconnected to upstream node "node1" (ID: 1) after 8 seconds, resuming monitoring</programlisting>
[2018-09-20 13:12:41] [NOTICE] reconnected to upstream node 1 after 8 seconds, resuming monitoring</programlisting>
</para>
<para>
To unpause the &repmgrd; service, execute
<link linkend="repmgr-service-unpause"><command>repmgr service unpause</command></link>
((&repmgr; 4.2 - 4.4: <link linkend="repmgr-service-unpause"><command>repmgr daemon unpause</command></link>),
e.g.:
To unpause &repmgrd;, execute <link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link>, e.g.:
<programlisting>
$ repmgr -f /etc/repmgr.conf service unpause
$ repmgr -f /etc/repmgr.conf daemon unpause
NOTICE: node 1 (node1) unpaused
NOTICE: node 2 (node2) unpaused
NOTICE: node 3 (node3) unpaused</programlisting>
@@ -157,11 +150,11 @@ NOTICE: node 3 (node3) unpaused</programlisting>
If the previous primary is no longer accessible when &repmgrd;
is unpaused, no failover action will be taken. Instead, a new primary must be manually promoted using
<link linkend="repmgr-standby-promote"><command>repmgr standby promote</command></link>,
and any standbys attached to the new primary with
<link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>.
and any standbys attached to the new primary with
<link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>.
</para>
<para>
This is to prevent execution of <link linkend="repmgr-service-unpause"><command>repmgr service unpause</command></link>
This is to prevent <link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link>
resulting in the automatic promotion of a new primary, which may be a problem particularly
in larger clusters, where &repmgrd; could select a different promotion
candidate to the one intended by the administrator.
@@ -175,23 +168,17 @@ NOTICE: node 3 (node3) unpaused</programlisting>
The pause state of each node will be stored over a PostgreSQL restart.
</para>
<para>
<link linkend="repmgr-service-pause"><command>repmgr service pause</command></link> and
<link linkend="repmgr-service-unpause"><command>repmgr service unpause</command></link> can be
executed even if &repmgrd; is not running; in this case,
&repmgrd; will start up in whichever pause state has been set.
</para>
<para>
<link linkend="repmgr-daemon-pause"><command>repmgr daemon pause</command></link> and
<link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link> can be
executed even if &repmgrd; is not running; in this case,
&repmgrd; will start up in whichever pause state has been set.
</para>
<note>
<para>
<link linkend="repmgr-service-pause"><command>repmgr service pause</command></link> and
<link linkend="repmgr-service-unpause"><command>repmgr service unpause</command></link>
<emphasis>do not</emphasis> start/stop &repmgrd;.
</para>
<para>
The commands <link linkend="repmgr-daemon-start"><command>repmgr daemon start</command></link>
and <link linkend="repmgr-daemon-stop"><command>repmgr daemon stop</command></link>
(<link linkend="repmgrd-service-configuration">if correctly configured</link>) can be used to start/stop
&repmgrd; on individual nodes.
<link linkend="repmgr-daemon-pause"><command>repmgr daemon pause</command></link> and
<link linkend="repmgr-daemon-unpause"><command>repmgr daemon unpause</command></link>
<emphasis>do not</emphasis> stop/start &repmgrd;.
</para>
</note>
</sect2>
@@ -295,7 +282,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
[2017-08-29 10:59:37] [HINT] use "repmgr standby promote" to manually promote this node
[2017-08-29 10:59:37] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in degraded state (automatic failover disabled)
[2017-08-29 10:59:53] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in degraded state (automatic failover disabled)
[2017-08-29 11:00:45] [NOTICE] reconnected to upstream node "node1" (ID: 1) after 68 seconds, resuming monitoring
[2017-08-29 11:00:45] [NOTICE] reconnected to upstream node 1 after 68 seconds, resuming monitoring
[2017-08-29 11:00:57] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in normal state (automatic failover disabled)</programlisting>
</para>

View File

@@ -38,7 +38,7 @@
<simpara>
ability to <link linkend="repmgrd-pausing">pause repmgrd</link>
operation on all nodes with a
<link linkend="repmgr-service-pause"><command>single command</command></link>
<link linkend="repmgr-daemon-pause"><command>single command</command></link>
</simpara>
</listitem>
@@ -100,11 +100,11 @@
Start &repmgrd; on each standby and verify that it's running by examining the
log output, which at log level <literal>INFO</literal> will look like this:
<programlisting>
[2019-08-15 07:14:42] [NOTICE] repmgrd (repmgrd 5.0) starting up
[2019-08-15 07:14:42] [INFO] connecting to database "host=node2 dbname=repmgr user=repmgr connect_timeout=2"
INFO: set_repmgrd_pid(): provided pidfile is /var/run/repmgr/repmgrd-12.pid
[2019-08-15 07:14:42] [NOTICE] starting monitoring of node "node2" (ID: 2)
[2019-08-15 07:14:42] [INFO] monitoring connection to upstream node "node1" (ID: 1)</programlisting>
[2019-03-15 06:32:05] [NOTICE] repmgrd (repmgrd 4.3) starting up
[2019-03-15 06:32:05] [INFO] connecting to database "host=node2 dbname=repmgr user=repmgr connect_timeout=2"
INFO: set_repmgrd_pid(): provided pidfile is /var/run/repmgr/repmgrd-11.pid
[2019-03-15 06:32:05] [NOTICE] starting monitoring of node "node2" (ID: 2)
[2019-03-15 06:32:05] [INFO] monitoring connection to upstream node "node1" (ID: 1)</programlisting>
</para>
<para>
Each &repmgrd; should also have recorded its successful startup as an event:
@@ -112,9 +112,9 @@
$ repmgr -f /etc/repmgr.conf cluster event --event=repmgrd_start
Node ID | Name | Event | OK | Timestamp | Details
---------+-------+---------------+----+---------------------+--------------------------------------------------------
3 | node3 | repmgrd_start | t | 2019-08-15 07:14:42 | monitoring connection to upstream node "node1" (ID: 1)
2 | node2 | repmgrd_start | t | 2019-08-15 07:14:41 | monitoring connection to upstream node "node1" (ID: 1)
1 | node1 | repmgrd_start | t | 2019-08-15 07:14:39 | monitoring cluster primary "node1" (ID: 1)</programlisting>
3 | node3 | repmgrd_start | t | 2019-03-14 04:17:30 | monitoring connection to upstream node "node1" (ID: 1)
2 | node2 | repmgrd_start | t | 2019-03-14 04:11:47 | monitoring connection to upstream node "node1" (ID: 1)
1 | node1 | repmgrd_start | t | 2019-03-14 04:04:31 | monitoring cluster primary "node1" (ID: 1)</programlisting>
</para>
<para>
Now stop the current primary server with e.g.:
@@ -128,33 +128,33 @@
decision is made. This is an extract from the log of a standby server (<literal>node2</literal>)
which has promoted to new primary after failure of the original primary (<literal>node1</literal>).
<programlisting>
[2019-08-15 07:27:50] [WARNING] unable to connect to upstream node "node1" (ID: 1)
[2019-08-15 07:27:50] [INFO] checking state of node 1, 1 of 3 attempts
[2019-08-15 07:27:50] [INFO] sleeping 5 seconds until next reconnection attempt
[2019-08-15 07:27:55] [INFO] checking state of node 1, 2 of 3 attempts
[2019-08-15 07:27:55] [INFO] sleeping 5 seconds until next reconnection attempt
[2019-08-15 07:28:00] [INFO] checking state of node 1, 3 of 3 attempts
[2019-08-15 07:28:00] [WARNING] unable to reconnect to node 1 after 3 attempts
[2019-08-15 07:28:00] [INFO] primary and this node have the same location ("default")
[2019-08-15 07:28:00] [INFO] local node's last receive lsn: 0/900CBF8
[2019-08-15 07:28:00] [INFO] node 3 last saw primary node 12 second(s) ago
[2019-08-15 07:28:00] [INFO] last receive LSN for sibling node "node3" (ID: 3) is: 0/900CBF8
[2019-08-15 07:28:00] [INFO] node "node3" (ID: 3) has same LSN as current candidate "node2" (ID: 2)
[2019-08-15 07:28:00] [INFO] visible nodes: 2; total nodes: 2; no nodes have seen the primary within the last 4 seconds
[2019-08-15 07:28:00] [NOTICE] promotion candidate is "node2" (ID: 2)
[2019-08-15 07:28:00] [NOTICE] this node is the winner, will now promote itself and inform other nodes
[2019-08-15 07:28:00] [INFO] promote_command is:
"/usr/pgsql-12/bin/repmgr -f /etc/repmgr/12/repmgr.conf standby promote"
[2019-03-15 06:37:50] [WARNING] unable to connect to upstream node "node1" (ID: 1)
[2019-03-15 06:37:50] [INFO] checking state of node 1, 1 of 3 attempts
[2019-03-15 06:37:50] [INFO] sleeping 5 seconds until next reconnection attempt
[2019-03-15 06:37:55] [INFO] checking state of node 1, 2 of 3 attempts
[2019-03-15 06:37:55] [INFO] sleeping 5 seconds until next reconnection attempt
[2019-03-15 06:38:00] [INFO] checking state of node 1, 3 of 3 attempts
[2019-03-15 06:38:00] [WARNING] unable to reconnect to node 1 after 3 attempts
[2019-03-15 06:38:00] [INFO] primary and this node have the same location ("default")
[2019-03-15 06:38:00] [INFO] local node's last receive lsn: 0/900CBF8
[2019-03-15 06:38:00] [INFO] node 3 last saw primary node 12 second(s) ago
[2019-03-15 06:38:00] [INFO] last receive LSN for sibling node "node3" (ID: 3) is: 0/900CBF8
[2019-03-15 06:38:00] [INFO] node "node3" (ID: 3) has same LSN as current candidate "node2" (ID: 2)
[2019-03-15 06:38:00] [INFO] visible nodes: 2; total nodes: 2; no nodes have seen the primary within the last 4 seconds
[2019-03-15 06:38:00] [NOTICE] promotion candidate is "node2" (ID: 2)
[2019-03-15 06:38:00] [NOTICE] this node is the winner, will now promote itself and inform other nodes
[2019-03-15 06:38:00] [INFO] promote_command is:
"/usr/pgsql-11/bin/repmgr -f /etc/repmgr/11/repmgr.conf standby promote"
NOTICE: promoting standby to primary
DETAIL: promoting server "node2" (ID: 2) using "/usr/pgsql-12/bin/pg_ctl -w -D '/var/lib/pgsql/12/data' promote"
DETAIL: promoting server "node2" (ID: 2) using "/usr/pgsql-11/bin/pg_ctl -w -D '/var/lib/pgsql/11/data' promote"
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node2" (ID: 2) was successfully promoted to primary
[2019-08-15 07:28:01] [INFO] 3 followers to notify
[2019-08-15 07:28:01] [NOTICE] notifying node "node3" (ID: 3) to follow node 2
[2019-03-15 06:38:01] [INFO] 3 followers to notify
[2019-03-15 06:38:01] [NOTICE] notifying node "node3" (ID: 3) to follow node 2
INFO: node 3 received notification to follow node 2
[2019-08-15 07:28:01] [INFO] switching to primary monitoring mode
[2019-08-15 07:28:01] [NOTICE] monitoring cluster primary "node2" (ID: 2)</programlisting>
[2019-03-15 06:38:01] [INFO] switching to primary monitoring mode
[2019-03-15 06:38:01] [NOTICE] monitoring cluster primary "node2" (ID: 2)</programlisting>
</para>
<para>
The cluster status will now look like this, with the original primary (<literal>node1</literal>)
@@ -176,11 +176,11 @@
$ repmgr -f /etc/repmgr.conf cluster event
Node ID | Name | Event | OK | Timestamp | Details
---------+-------+----------------------------+----+---------------------+-------------------------------------------------------------
3 | node3 | repmgrd_failover_follow | t | 2019-08-15 07:38:03 | node 3 now following new upstream node 2
3 | node3 | standby_follow | t | 2019-08-15 07:38:02 | standby attached to upstream node "node2" (ID: 2)
2 | node2 | repmgrd_reload | t | 2019-08-15 07:38:01 | monitoring cluster primary "node2" (ID: 2)
2 | node2 | repmgrd_failover_promote | t | 2019-08-15 07:38:01 | node 2 promoted to primary; old primary 1 marked as failed
2 | node2 | standby_promote | t | 2019-08-15 07:38:01 | server "node2" (ID: 2) was successfully promoted to primary</programlisting>
3 | node3 | repmgrd_failover_follow | t | 2019-03-15 06:38:03 | node 3 now following new upstream node 2
3 | node3 | standby_follow | t | 2019-03-15 06:38:02 | standby attached to upstream node "node2" (ID: 2)
2 | node2 | repmgrd_reload | t | 2019-03-15 06:38:01 | monitoring cluster primary "node2" (ID: 2)
2 | node2 | repmgrd_failover_promote | t | 2019-03-15 06:38:01 | node 2 promoted to primary; old primary 1 marked as failed
2 | node2 | standby_promote | t | 2019-03-15 06:38:01 | server "node2" (ID: 2) was successfully promoted to primary</programlisting>
</para>
</sect1>

View File

@@ -186,7 +186,6 @@
NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
INFO: following shutdown command would be run on node "node1":
"pg_ctl -l /var/log/postgresql/startup.log -D '/var/lib/postgresql/data' -m fast -W stop"
INFO: parameter "shutdown_check_timeout" is set to 60 seconds
</programlisting>
</para>
@@ -244,15 +243,16 @@
</para>
<para>
<application>pg_rewind</application> has been part of the core PostgreSQL distribution since
version 9.5. Users of PostgreSQL 9.4 will need to manually install it; the source code is available here:
version 9.5. Users of versions 9.3 and 9.4 will need to manually install it; the source code is available here:
<ulink url="https://github.com/vmware/pg_rewind">https://github.com/vmware/pg_rewind</ulink>.
If the <application>pg_rewind</application>
binary is not installed in the PostgreSQL <filename>bin</filename> directory, provide
its full path on the demotion candidate with <option>--force-rewind</option>.
</para>
<para>
Note that building the 9.4 version of <application>pg_rewind</application> requires the PostgreSQL
source code.
Note that building the 9.3/9.4 version of <application>pg_rewind</application> requires the PostgreSQL
source code. Also, PostgreSQL 9.3 does not provide <varname>wal_log_hints</varname>,
meaning data checksums must have been enabled when the database was initialized.
</para>
</sect2>
@@ -317,9 +317,7 @@
</para>
<para>
If &repmgrd; is in use, it's worth double-checking that
all nodes are unpaused by executing
<command><link linkend="repmgr-service-status">repmgr service status</link></command>
(&repmgr; 4.2 - 4.4: <command><link linkend="repmgr-service-status">repmgr daemon status</link></command>).
all nodes are unpaused by executing <command><link linkend="repmgr-daemon-status">repmgr daemon status</link></command>.
</para>
<note>
@@ -342,7 +340,7 @@
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
If using PostgreSQL 9.4, you should ensure that the shutdown command
If using PostgreSQL 9.3 or 9.4, you should ensure that the shutdown command
is configured to use PostgreSQL's <varname>fast</varname> shutdown mode (the default in 9.5
and later). If relying on <command>pg_ctl</command> to perform database server operations,
you should include <literal>-m fast</literal> in <varname>pg_ctl_options</varname>

View File

@@ -201,13 +201,9 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
</para>
<tip>
<para>
If the &repmgr; upgrade requires a PostgreSQL restart, combine the &repmgr; upgrade
with a PostgreSQL minor version upgrade, which will require a restart in any case.
</para>
<para>
New PostgreSQL minor versions are usually released every couple of months;
see the <ulink url="https://www.postgresql.org/developer/roadmap/">Roadmap</ulink>
for the current schedule.
If the &repmgr; upgrade requires a PostgreSQL restart, combine the &repmgr; upgrade
with a PostgreSQL minor version upgrade, which will require a restart in any case.
New PostgreSQL minor version are usually released every couple of months.
</para>
</tip>
</sect2>
@@ -220,9 +216,7 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<secondary>checking repmgrd status</secondary>
</indexterm>
<para>
From <link linkend="release-4.2">repmgr 4.2</link>, once the upgrade is complete, execute the
<command><link linkend="repmgr-service-status">repmgr service status</link></command>
(&repmgr; 4.2 - 4.4: <command><link linkend="repmgr-service-status">repmgr daemon status</link></command>)
From <link linkend="release-4.2">repmgr 4.2</link>, once the upgrade is complete, execute the <command><link linkend="repmgr-daemon-status">repmgr daemon status</link></command>
command (on any node) to show an overview of the status of &repmgrd; on all nodes.
</para>
</sect2>
@@ -273,29 +267,6 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
</para>
</tip>
<sect2 id="upgrading-pg-upgrade-standby" xreflabel="pg_upgrade and upgrading standbys">
<title>Upgrading standbys with pg_upgrade and rsync</title>
<para>
If you are intending to upgrade a standby using the <command>rsync</command> method described
in the <ulink url="https://www.postgresql.org/docs/current/pgupgrade.html#PGUPGRADE-STEP-REPLICAS">pg_upgrade documentation</ulink>,
you <emphasis>must</emphasis> ensure the standby's replication configuration is present and correct
before starting the standby.
</para>
<para>
Use <link linkend="repmgr-standby-clone">repmgr standby clone --replication-conf-only</link> to generate
the correct replication configuration.
</para>
<tip>
<para>
If upgrading from PostgreSQL 11 or earlier, be sure to delete <filename>recovery.conf</filename>, if present,
otherwise PostgreSQL will refuse to start.
</para>
</tip>
</sect2>
</sect1>
@@ -312,12 +283,12 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<orderedlist>
<listitem>
<simpara>
converting the <filename>repmgr.conf</filename> configuration files
converting the repmgr.conf configuration files
</simpara>
</listitem>
<listitem>
<simpara>
upgrading the repmgr schema using <command>CREATE EXTENSION</command> (PostgreSQL 12 and earlier)
upgrading the repmgr schema using <command>CREATE EXTENSION</command>
</simpara>
</listitem>
</orderedlist>
@@ -446,13 +417,13 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<programlisting>
$ ./convert-config.pl /etc/repmgr.conf
node_id=2
node_name='node2'
conninfo='host=node2 dbname=repmgr user=repmgr connect_timeout=2'
node_name=node2
conninfo=host=node2 dbname=repmgr user=repmgr connect_timeout=2
pg_ctl_options='-l /var/log/postgres/startup.log'
rsync_options='--exclude=postgresql.local.conf --archive'
log_level='INFO'
pg_basebackup_options='--no-slot'
data_directory=''</programlisting>
rsync_options=--exclude=postgresql.local.conf --archive
log_level=INFO
pg_basebackup_options=--no-slot
data_directory=</programlisting>
</para>
<para>
The converted file is printed to <literal>STDOUT</literal> and the original file is not
@@ -461,31 +432,23 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<para>
Please note that the the conversion script will add an empty
placeholder parameter for <varname>data_directory</varname>, which
is a required parameter from &repmgr; 4. This must be manually modified to contain
the correct data directory.
is a required parameter in repmgr4 and which <emphasis>must</emphasis>
be provided.
</para>
</sect3>
</sect2>
<sect2>
<title>Upgrading the repmgr schema (PostgreSQL 12 and earlier)</title>
<title>Upgrading the repmgr schema</title>
<para>
Ensure &repmgrd; is not running, or any cron jobs which execute the
<command>repmgr</command> binary.
</para>
<para>
Install the latest &repmgr; package; any <literal>repmgr 3.x</literal> packages
Install <literal>repmgr 4</literal> packages; any <literal>repmgr 3.x</literal> packages
should be uninstalled (if not automatically uninstalled already by your packaging system).
</para>
<sect3>
<title>Upgrading from repmgr 3.1.1 or earlier</title>
<tip>
<simpara>
If you don't care about any data from the existing &repmgr; installation,
(e.g. the contents of the <structname>events</structname> and <structname>monitoring</structname>
tables), the follwing steps can be skipped; proceed to <xref linkend="upgrade-reregister-nodes"/>.
</simpara>
</tip>
<para>
If your repmgr version is 3.1.1 or earlier, you will need to update
the schema to the latest version in the 3.x series (3.3.2) before
@@ -514,37 +477,19 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<para>
In the database used by the existing &repmgr; installation, execute:
<programlisting>
CREATE EXTENSION repmgr FROM unpackaged</programlisting>
CREATE EXTENSION repmgr FROM unpackaged;</programlisting>
</para>
<para>
This will move and convert all objects from the existing schema
into the new, standard <literal>repmgr</literal> schema.
</para>
<note>
<simpara>There must be only one schema matching <literal>repmgr_%</literal> in the
<simpara>there must be only one schema matching <literal>repmgr_%</literal> in the
database, otherwise this step may not work.
</simpara>
</note>
</sect3>
</sect2>
<sect2>
<title>Upgrading the repmgr schema (PostgreSQL 13 and later)</title>
<para>
Beginning with PostgreSQL 13, the <command>CREATE EXTENSION ... FROM unpackaged</command>
syntax is no longer available. In the unlikely event you have ended up with an
installation running PostgreSQL 13 or later and containing the legacy &repmgr;
schema, there is no convenient way of upgrading this; instead you'll just need
to re-register the nodes as detailed in <link linkend="upgrade-reregister-nodes">the following section</link>,
which will create the &repmgr; extension automatically.
</para>
<para>
Any historical data you wish to retain (e.g. the contents of the <structname>events</structname>
and <structname>monitoring</structname> tables) will need to be exported manually.
</para>
</sect2>
<sect2 id="upgrade-reregister-nodes">
<sect3>
<title>Re-register each node</title>
<para>
This is necessary to update the <literal>repmgr</literal> metadata with some additional items.
@@ -554,10 +499,6 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<programlisting>
repmgr primary register -f /etc/repmgr.conf --force</programlisting>
</para>
<para>
If not already present (e.g. after executing <command>CREATE EXTENSION repmgr FROM unpackaged</command>),
the &repmgr; extension will be automatically created by <command>repmgr primary register</command>.
</para>
<para>
On each standby node, execute e.g.
<programlisting>
@@ -570,20 +511,18 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<para>
The original <literal>repmgr_$cluster</literal> schema can be dropped at any time.
</para>
<tip>
<simpara>
If you don't care about any data from the existing &repmgr; installation,
(e.g. the contents of the <structname>events</structname> and <structname>monitoring</structname>
tables), the manual <command>CREATE EXTENSION</command> step can be skipped; just re-register
each node, starting with the primary node, and the <literal>repmgr</literal> extension will be
automatically created.
</simpara>
</tip>
</sect3>
</sect2>
<sect2 id="upgrade-drop-repmgr-cluster-schema">
<title>Drop the legacy repmgr schema</title>
<para>
Once the cluster has been registered with the current &repmgr; version, the legacy
<literal>repmgr_$cluster</literal> schema can be dropped at any time with:
<programlisting>
DROP SCHEMA repmgr_$cluster CASCADE</programlisting>
(substitute <literal>$cluster</literal> with the value of the <varname>clustername</varname>
variable used in &repmgr; 3.x).
</para>
</sect2>
</sect1>
</chapter>

View File

@@ -1,6 +1,6 @@
/*
* errcode.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

View File

@@ -32,6 +32,18 @@ SELECT * FROM repmgr.show_nodes;
(0 rows)
-- functions
SELECT repmgr.am_bdr_failover_handler(-1);
am_bdr_failover_handler
-------------------------
(1 row)
SELECT repmgr.am_bdr_failover_handler(NULL);
am_bdr_failover_handler
-------------------------
(1 row)
SELECT repmgr.get_new_primary();
get_new_primary
-----------------
@@ -80,3 +92,9 @@ SELECT repmgr.standby_set_last_updated();
(1 row)
SELECT repmgr.unset_bdr_failover_handler();
unset_bdr_failover_handler
----------------------------
(1 row)

2
log.c
View File

@@ -1,6 +1,6 @@
/*
* log.c - Logging methods
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

2
log.h
View File

@@ -1,6 +1,6 @@
/*
* log.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

View File

@@ -1,5 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
ALTER FUNCTION set_repmgrd_pid(INT, TEXT) RETURNS NULL ON NULL INPUT;

View File

@@ -1,5 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
DROP FUNCTION am_bdr_failover_handler(INT);
DROP FUNCTION unset_bdr_failover_handler();

View File

@@ -1,214 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
CREATE TABLE repmgr.nodes (
node_id INTEGER PRIMARY KEY,
upstream_node_id INTEGER NULL REFERENCES nodes (node_id) DEFERRABLE,
active BOOLEAN NOT NULL DEFAULT TRUE,
node_name TEXT NOT NULL,
type TEXT NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
location TEXT NOT NULL DEFAULT 'default',
priority INT NOT NULL DEFAULT 100,
conninfo TEXT NOT NULL,
repluser VARCHAR(63) NOT NULL,
slot_name TEXT NULL,
config_file TEXT NOT NULL
);
CREATE TABLE repmgr.events (
node_id INTEGER NOT NULL,
event TEXT NOT NULL,
successful BOOLEAN NOT NULL DEFAULT TRUE,
event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
details TEXT NULL
);
DO $repmgr$
DECLARE
DECLARE server_version_num INT;
BEGIN
SELECT setting
FROM pg_catalog.pg_settings
WHERE name = 'server_version_num'
INTO server_version_num;
IF server_version_num >= 90400 THEN
EXECUTE $repmgr_func$
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location PG_LSN NOT NULL,
last_wal_standby_location PG_LSN,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
)
$repmgr_func$;
ELSE
EXECUTE $repmgr_func$
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location TEXT NOT NULL,
last_wal_standby_location TEXT,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
)
$repmgr_func$;
END IF;
END$repmgr$;
CREATE INDEX idx_monitoring_history_time
ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
CREATE VIEW repmgr.show_nodes AS
SELECT n.node_id,
n.node_name,
n.active,
n.upstream_node_id,
un.node_name AS upstream_node_name,
n.type,
n.priority,
n.conninfo
FROM repmgr.nodes n
LEFT JOIN repmgr.nodes un
ON un.node_id = n.upstream_node_id;
CREATE TABLE repmgr.voting_term (
term INT NOT NULL
);
CREATE UNIQUE INDEX voting_term_restrict
ON repmgr.voting_term ((TRUE));
CREATE RULE voting_term_delete AS
ON DELETE TO repmgr.voting_term
DO INSTEAD NOTHING;
/* ================= */
/* repmgrd functions */
/* ================= */
/* monitoring functions */
CREATE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_set_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_get_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_node_id'
LANGUAGE C STRICT;
/* failover functions */
CREATE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'notify_follow_primary'
LANGUAGE C STRICT;
CREATE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_new_primary'
LANGUAGE C STRICT;
CREATE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'reset_voting_status'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pidfile()
RETURNS TEXT
AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
LANGUAGE C STRICT;
CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
LANGUAGE C CALLED ON NULL INPUT;
CREATE FUNCTION repmgrd_is_running()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_running'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_pause(BOOL)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgrd_pause'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_is_paused()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
LANGUAGE C STRICT;
CREATE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_wal_receiver_pid'
LANGUAGE C STRICT;
/* views */
CREATE VIEW repmgr.replication_status AS
SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
n.type AS node_type, n.active, last_monitor_time,
CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
m.last_wal_standby_location,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
CASE WHEN n.type='standby' THEN
CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
ELSE NULL
END AS replication_time_lag,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
FROM repmgr.monitoring_history m
JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
WHERE (m.standby_node_id, m.last_monitor_time) IN (
SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
FROM repmgr.monitoring_history m1 GROUP BY 1
);

View File

@@ -1,7 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
SELECT pg_catalog.pg_extension_config_dump('repmgr.nodes', '');
SELECT pg_catalog.pg_extension_config_dump('repmgr.events', '');
SELECT pg_catalog.pg_extension_config_dump('repmgr.monitoring_history', '');

View File

@@ -1,214 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
CREATE TABLE repmgr.nodes (
node_id INTEGER PRIMARY KEY,
upstream_node_id INTEGER NULL REFERENCES nodes (node_id) DEFERRABLE,
active BOOLEAN NOT NULL DEFAULT TRUE,
node_name TEXT NOT NULL,
type TEXT NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
location TEXT NOT NULL DEFAULT 'default',
priority INT NOT NULL DEFAULT 100,
conninfo TEXT NOT NULL,
repluser VARCHAR(63) NOT NULL,
slot_name TEXT NULL,
config_file TEXT NOT NULL
);
CREATE TABLE repmgr.events (
node_id INTEGER NOT NULL,
event TEXT NOT NULL,
successful BOOLEAN NOT NULL DEFAULT TRUE,
event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
details TEXT NULL
);
DO $repmgr$
DECLARE
DECLARE server_version_num INT;
BEGIN
SELECT setting
FROM pg_catalog.pg_settings
WHERE name = 'server_version_num'
INTO server_version_num;
IF server_version_num >= 90400 THEN
EXECUTE $repmgr_func$
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location PG_LSN NOT NULL,
last_wal_standby_location PG_LSN,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
)
$repmgr_func$;
ELSE
EXECUTE $repmgr_func$
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location TEXT NOT NULL,
last_wal_standby_location TEXT,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
)
$repmgr_func$;
END IF;
END$repmgr$;
CREATE INDEX idx_monitoring_history_time
ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
CREATE VIEW repmgr.show_nodes AS
SELECT n.node_id,
n.node_name,
n.active,
n.upstream_node_id,
un.node_name AS upstream_node_name,
n.type,
n.priority,
n.conninfo
FROM repmgr.nodes n
LEFT JOIN repmgr.nodes un
ON un.node_id = n.upstream_node_id;
CREATE TABLE repmgr.voting_term (
term INT NOT NULL
);
CREATE UNIQUE INDEX voting_term_restrict
ON repmgr.voting_term ((TRUE));
CREATE RULE voting_term_delete AS
ON DELETE TO repmgr.voting_term
DO INSTEAD NOTHING;
/* ================= */
/* repmgrd functions */
/* ================= */
/* monitoring functions */
CREATE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_set_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_get_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_node_id'
LANGUAGE C STRICT;
/* failover functions */
CREATE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'notify_follow_primary'
LANGUAGE C STRICT;
CREATE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_new_primary'
LANGUAGE C STRICT;
CREATE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'reset_voting_status'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pidfile()
RETURNS TEXT
AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
LANGUAGE C STRICT;
CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
LANGUAGE C CALLED ON NULL INPUT;
CREATE FUNCTION repmgrd_is_running()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_running'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_pause(BOOL)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgrd_pause'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_is_paused()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
LANGUAGE C STRICT;
CREATE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_wal_receiver_pid'
LANGUAGE C STRICT;
/* views */
CREATE VIEW repmgr.replication_status AS
SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
n.type AS node_type, n.active, last_monitor_time,
CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
m.last_wal_standby_location,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
CASE WHEN n.type='standby' THEN
CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
ELSE NULL
END AS replication_time_lag,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
FROM repmgr.monitoring_history m
JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
WHERE (m.standby_node_id, m.last_monitor_time) IN (
SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
FROM repmgr.monitoring_history m1 GROUP BY 1
);

View File

@@ -1,192 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
CREATE TABLE repmgr.nodes (
node_id INTEGER PRIMARY KEY,
upstream_node_id INTEGER NULL REFERENCES nodes (node_id) DEFERRABLE,
active BOOLEAN NOT NULL DEFAULT TRUE,
node_name TEXT NOT NULL,
type TEXT NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
location TEXT NOT NULL DEFAULT 'default',
priority INT NOT NULL DEFAULT 100,
conninfo TEXT NOT NULL,
repluser VARCHAR(63) NOT NULL,
slot_name TEXT NULL,
config_file TEXT NOT NULL
);
SELECT pg_catalog.pg_extension_config_dump('repmgr.nodes', '');
CREATE TABLE repmgr.events (
node_id INTEGER NOT NULL,
event TEXT NOT NULL,
successful BOOLEAN NOT NULL DEFAULT TRUE,
event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
details TEXT NULL
);
SELECT pg_catalog.pg_extension_config_dump('repmgr.events', '');
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location PG_LSN NOT NULL,
last_wal_standby_location PG_LSN,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
);
CREATE INDEX idx_monitoring_history_time
ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
SELECT pg_catalog.pg_extension_config_dump('repmgr.monitoring_history', '');
CREATE VIEW repmgr.show_nodes AS
SELECT n.node_id,
n.node_name,
n.active,
n.upstream_node_id,
un.node_name AS upstream_node_name,
n.type,
n.priority,
n.conninfo
FROM repmgr.nodes n
LEFT JOIN repmgr.nodes un
ON un.node_id = n.upstream_node_id;
CREATE TABLE repmgr.voting_term (
term INT NOT NULL
);
CREATE UNIQUE INDEX voting_term_restrict
ON repmgr.voting_term ((TRUE));
CREATE RULE voting_term_delete AS
ON DELETE TO repmgr.voting_term
DO INSTEAD NOTHING;
/* ================= */
/* repmgrd functions */
/* ================= */
/* monitoring functions */
CREATE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_set_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_get_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_node_id'
LANGUAGE C STRICT;
/* failover functions */
CREATE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'notify_follow_primary'
LANGUAGE C STRICT;
CREATE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_new_primary'
LANGUAGE C STRICT;
CREATE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'reset_voting_status'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pidfile()
RETURNS TEXT
AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
LANGUAGE C STRICT;
CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
LANGUAGE C CALLED ON NULL INPUT;
CREATE FUNCTION repmgrd_is_running()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_running'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_pause(BOOL)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgrd_pause'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_is_paused()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
LANGUAGE C STRICT;
CREATE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_wal_receiver_pid'
LANGUAGE C STRICT;
/* views */
CREATE VIEW repmgr.replication_status AS
SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
n.type AS node_type, n.active, last_monitor_time,
CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
m.last_wal_standby_location,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
CASE WHEN n.type='standby' THEN
CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
ELSE NULL
END AS replication_time_lag,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
FROM repmgr.monitoring_history m
JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
WHERE (m.standby_node_id, m.last_monitor_time) IN (
SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
FROM repmgr.monitoring_history m1 GROUP BY 1
);

View File

@@ -1,265 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
-- extract the current schema name
-- NOTE: this assumes there will be only one schema matching 'repmgr_%';
-- user is responsible for ensuring this is the case
CREATE TEMPORARY TABLE repmgr_old_schema (schema_name TEXT);
INSERT INTO repmgr_old_schema (schema_name)
SELECT nspname AS schema_name
FROM pg_catalog.pg_namespace
WHERE nspname LIKE 'repmgr_%'
LIMIT 1;
-- move old objects into new schema
DO $repmgr$
DECLARE
old_schema TEXT;
BEGIN
SELECT schema_name FROM repmgr_old_schema
INTO old_schema;
EXECUTE format('ALTER TABLE %I.repl_nodes SET SCHEMA repmgr', old_schema);
EXECUTE format('ALTER TABLE %I.repl_events SET SCHEMA repmgr', old_schema);
EXECUTE format('ALTER TABLE %I.repl_monitor SET SCHEMA repmgr', old_schema);
EXECUTE format('DROP VIEW IF EXISTS %I.repl_show_nodes', old_schema);
EXECUTE format('DROP VIEW IF EXISTS %I.repl_status', old_schema);
END$repmgr$;
-- convert "repmgr_$cluster.repl_nodes" to "repmgr.nodes"
CREATE TABLE repmgr.nodes (
node_id INTEGER PRIMARY KEY,
upstream_node_id INTEGER NULL REFERENCES repmgr.nodes (node_id) DEFERRABLE,
active BOOLEAN NOT NULL DEFAULT TRUE,
node_name TEXT NOT NULL,
type TEXT NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
location TEXT NOT NULL DEFAULT 'default',
priority INT NOT NULL DEFAULT 100,
conninfo TEXT NOT NULL,
repluser VARCHAR(63) NOT NULL,
slot_name TEXT NULL,
config_file TEXT NOT NULL
);
INSERT INTO repmgr.nodes
(node_id, upstream_node_id, active, node_name, type, location, priority, conninfo, repluser, slot_name, config_file)
SELECT id, upstream_node_id, active, name,
CASE WHEN type = 'master' THEN 'primary' ELSE type END,
'default', priority, conninfo, 'unknown', slot_name, 'unknown'
FROM repmgr.repl_nodes
ORDER BY id;
-- convert "repmgr_$cluster.repl_event" to "event"
ALTER TABLE repmgr.repl_events RENAME TO events;
-- create new table "repmgr.voting_term"
CREATE TABLE repmgr.voting_term (
term INT NOT NULL
);
CREATE UNIQUE INDEX voting_term_restrict
ON repmgr.voting_term ((TRUE));
CREATE RULE voting_term_delete AS
ON DELETE TO repmgr.voting_term
DO INSTEAD NOTHING;
INSERT INTO repmgr.voting_term (term) VALUES (1);
-- convert "repmgr_$cluster.repl_monitor" to "monitoring_history"
DO $repmgr$
DECLARE
DECLARE server_version_num INT;
BEGIN
SELECT setting
FROM pg_catalog.pg_settings
WHERE name = 'server_version_num'
INTO server_version_num;
IF server_version_num >= 90400 THEN
EXECUTE $repmgr_func$
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location PG_LSN NOT NULL,
last_wal_standby_location PG_LSN,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
)
$repmgr_func$;
INSERT INTO repmgr.monitoring_history
(primary_node_id, standby_node_id, last_monitor_time, last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
SELECT primary_node, standby_node, last_monitor_time, last_apply_time, last_wal_primary_location::pg_lsn, last_wal_standby_location::pg_lsn, replication_lag, apply_lag
FROM repmgr.repl_monitor;
ELSE
EXECUTE $repmgr_func$
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location TEXT NOT NULL,
last_wal_standby_location TEXT,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
)
$repmgr_func$;
INSERT INTO repmgr.monitoring_history
(primary_node_id, standby_node_id, last_monitor_time, last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
SELECT primary_node, standby_node, last_monitor_time, last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag
FROM repmgr.repl_monitor;
END IF;
END$repmgr$;
CREATE INDEX idx_monitoring_history_time
ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
CREATE VIEW repmgr.show_nodes AS
SELECT n.node_id,
n.node_name,
n.active,
n.upstream_node_id,
un.node_name AS upstream_node_name,
n.type,
n.priority,
n.conninfo
FROM repmgr.nodes n
LEFT JOIN repmgr.nodes un
ON un.node_id = n.upstream_node_id;
/* ================= */
/* repmgrd functions */
/* ================= */
/* monitoring functions */
CREATE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_set_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_get_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_node_id'
LANGUAGE C STRICT;
/* failover functions */
CREATE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'notify_follow_primary'
LANGUAGE C STRICT;
CREATE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_new_primary'
LANGUAGE C STRICT;
CREATE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'reset_voting_status'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pidfile()
RETURNS TEXT
AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
LANGUAGE C STRICT;
CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
LANGUAGE C CALLED ON NULL INPUT;
CREATE FUNCTION repmgrd_is_running()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_running'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_pause(BOOL)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgrd_pause'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_is_paused()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
LANGUAGE C STRICT;
CREATE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_wal_receiver_pid'
LANGUAGE C STRICT;
/* views */
CREATE VIEW repmgr.replication_status AS
SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
n.type AS node_type, n.active, last_monitor_time,
CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
m.last_wal_standby_location,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
CASE WHEN n.type='standby' THEN
CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
ELSE NULL
END AS replication_time_lag,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
FROM repmgr.monitoring_history m
JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
WHERE (m.standby_node_id, m.last_monitor_time) IN (
SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
FROM repmgr.monitoring_history m1 GROUP BY 1
);
/* drop old tables */
DROP TABLE repmgr.repl_nodes;
DROP TABLE repmgr.repl_monitor;
-- remove temporary table
DROP TABLE repmgr_old_schema;

View File

@@ -1,245 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
-- extract the current schema name
-- NOTE: this assumes there will be only one schema matching 'repmgr_%';
-- user is responsible for ensuring this is the case
CREATE TEMPORARY TABLE repmgr_old_schema (schema_name TEXT);
INSERT INTO repmgr_old_schema (schema_name)
SELECT nspname AS schema_name
FROM pg_catalog.pg_namespace
WHERE nspname LIKE 'repmgr_%'
LIMIT 1;
-- move old objects into new schema
DO $repmgr$
DECLARE
old_schema TEXT;
BEGIN
SELECT schema_name FROM repmgr_old_schema
INTO old_schema;
EXECUTE format('ALTER TABLE %I.repl_nodes SET SCHEMA repmgr', old_schema);
EXECUTE format('ALTER TABLE %I.repl_events SET SCHEMA repmgr', old_schema);
EXECUTE format('ALTER TABLE %I.repl_monitor SET SCHEMA repmgr', old_schema);
EXECUTE format('DROP VIEW IF EXISTS %I.repl_show_nodes', old_schema);
EXECUTE format('DROP VIEW IF EXISTS %I.repl_status', old_schema);
END$repmgr$;
-- convert "repmgr_$cluster.repl_nodes" to "repmgr.nodes"
CREATE TABLE repmgr.nodes (
node_id INTEGER PRIMARY KEY,
upstream_node_id INTEGER NULL REFERENCES repmgr.nodes (node_id) DEFERRABLE,
active BOOLEAN NOT NULL DEFAULT TRUE,
node_name TEXT NOT NULL,
type TEXT NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
location TEXT NOT NULL DEFAULT 'default',
priority INT NOT NULL DEFAULT 100,
conninfo TEXT NOT NULL,
repluser VARCHAR(63) NOT NULL,
slot_name TEXT NULL,
config_file TEXT NOT NULL
);
INSERT INTO repmgr.nodes
(node_id, upstream_node_id, active, node_name, type, location, priority, conninfo, repluser, slot_name, config_file)
SELECT id, upstream_node_id, active, name,
CASE WHEN type = 'master' THEN 'primary' ELSE type END,
'default', priority, conninfo, 'unknown', slot_name, 'unknown'
FROM repmgr.repl_nodes
ORDER BY id;
-- convert "repmgr_$cluster.repl_event" to "event"
CREATE TABLE repmgr.events (
node_id INTEGER NOT NULL,
event TEXT NOT NULL,
successful BOOLEAN NOT NULL DEFAULT TRUE,
event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
details TEXT NULL
);
INSERT INTO repmgr.events
(node_id, event, successful, event_timestamp, details)
SELECT node_id, event, successful, event_timestamp, details
FROM repmgr.repl_events;
-- create new table "repmgr.voting_term"
CREATE TABLE repmgr.voting_term (
term INT NOT NULL
);
CREATE UNIQUE INDEX voting_term_restrict
ON repmgr.voting_term ((TRUE));
CREATE RULE voting_term_delete AS
ON DELETE TO repmgr.voting_term
DO INSTEAD NOTHING;
INSERT INTO repmgr.voting_term (term) VALUES (1);
-- convert "repmgr_$cluster.repl_monitor" to "monitoring_history"
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location PG_LSN NOT NULL,
last_wal_standby_location PG_LSN,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
);
INSERT INTO repmgr.monitoring_history
(primary_node_id, standby_node_id, last_monitor_time, last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
SELECT primary_node, standby_node, last_monitor_time, last_apply_time, last_wal_primary_location::pg_lsn, last_wal_standby_location::pg_lsn, replication_lag, apply_lag
FROM repmgr.repl_monitor;
CREATE INDEX idx_monitoring_history_time
ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
CREATE VIEW repmgr.show_nodes AS
SELECT n.node_id,
n.node_name,
n.active,
n.upstream_node_id,
un.node_name AS upstream_node_name,
n.type,
n.priority,
n.conninfo
FROM repmgr.nodes n
LEFT JOIN repmgr.nodes un
ON un.node_id = n.upstream_node_id;
/* ================= */
/* repmgrd functions */
/* ================= */
/* monitoring functions */
CREATE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_set_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_get_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_node_id'
LANGUAGE C STRICT;
/* failover functions */
CREATE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'notify_follow_primary'
LANGUAGE C STRICT;
CREATE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_new_primary'
LANGUAGE C STRICT;
CREATE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'reset_voting_status'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pidfile()
RETURNS TEXT
AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
LANGUAGE C STRICT;
CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
LANGUAGE C CALLED ON NULL INPUT;
CREATE FUNCTION repmgrd_is_running()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_running'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_pause(BOOL)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgrd_pause'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_is_paused()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
LANGUAGE C STRICT;
CREATE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_wal_receiver_pid'
LANGUAGE C STRICT;
/* views */
CREATE VIEW repmgr.replication_status AS
SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
n.type AS node_type, n.active, last_monitor_time,
CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
m.last_wal_standby_location,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
CASE WHEN n.type='standby' THEN
CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
ELSE NULL
END AS replication_time_lag,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
FROM repmgr.monitoring_history m
JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
WHERE (m.standby_node_id, m.last_monitor_time) IN (
SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
FROM repmgr.monitoring_history m1 GROUP BY 1
);
/* drop old tables */
DROP TABLE repmgr.repl_nodes;
DROP TABLE repmgr.repl_monitor;
DROP TABLE repmgr.repl_events;
-- remove temporary table
DROP TABLE repmgr_old_schema;

557
repmgr-action-bdr.c Normal file
View File

@@ -0,0 +1,557 @@
/*
* repmgr-action-bdr.c
*
* Implements BDR-related actions for the repmgr command line utility
*
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "repmgr.h"
#include "repmgr-client-global.h"
#include "repmgr-action-bdr.h"
/*
* do_bdr_register()
*
* As each BDR node is its own primary, registering a BDR node
* will create the repmgr metadata schema if necessary.
*/
void
do_bdr_register(void)
{
PGconn *conn = NULL;
BdrNodeInfoList bdr_nodes = T_BDR_NODE_INFO_LIST_INITIALIZER;
ExtensionStatus extension_status = REPMGR_UNKNOWN;
t_node_info node_info = T_NODE_INFO_INITIALIZER;
RecordStatus record_status = RECORD_NOT_FOUND;
PQExpBufferData event_details;
bool success = true;
char *dbname = NULL;
/* sanity-check configuration for BDR-compatability */
if (config_file_options.replication_type != REPLICATION_TYPE_BDR)
{
log_error(_("cannot run BDR REGISTER on a non-BDR node"));
exit(ERR_BAD_CONFIG);
}
dbname = pg_malloc0(MAXLEN);
if (dbname == NULL)
{
log_error(_("unable to allocate memory; terminating."));
exit(ERR_OUT_OF_MEMORY);
}
/* store the database name for future reference */
get_conninfo_value(config_file_options.conninfo, "dbname", dbname);
conn = establish_db_connection(config_file_options.conninfo, true);
if (!is_bdr_db(conn, NULL))
{
log_error(_("database \"%s\" is not BDR-enabled"), dbname);
log_hint(_("when using repmgr with BDR, the repmgr schema must be stored in the BDR database"));
PQfinish(conn);
pfree(dbname);
exit(ERR_BAD_CONFIG);
}
/* Check that there are at most 2 BDR nodes */
get_all_bdr_node_records(conn, &bdr_nodes);
if (bdr_nodes.node_count == 0)
{
log_error(_("database \"%s\" is BDR-enabled but no BDR nodes were found"), dbname);
PQfinish(conn);
pfree(dbname);
exit(ERR_BAD_CONFIG);
}
/* BDR 2 implementation is for 2 nodes only */
if (get_bdr_version_num() < 3 && bdr_nodes.node_count > 2)
{
log_error(_("repmgr can only support BDR 2.x clusters with 2 nodes"));
log_detail(_("this BDR cluster has %i nodes"), bdr_nodes.node_count);
PQfinish(conn);
pfree(dbname);
exit(ERR_BAD_CONFIG);
}
if (get_bdr_version_num() > 2)
{
log_error(_("\"repmgr bdr register\" is for BDR 2.x only"));
PQfinish(conn);
pfree(dbname);
exit(ERR_BAD_CONFIG);
}
/* check for a matching BDR node */
{
PQExpBufferData bdr_local_node_name;
bool node_match = false;
initPQExpBuffer(&bdr_local_node_name);
node_match = bdr_node_name_matches(conn, config_file_options.node_name, &bdr_local_node_name);
if (node_match == false)
{
if (strlen(bdr_local_node_name.data))
{
log_error(_("local node BDR node name is \"%s\", expected: \"%s\""),
bdr_local_node_name.data,
config_file_options.node_name);
log_hint(_("\"node_name\" in repmgr.conf must match \"node_name\" in bdr.bdr_nodes"));
}
else
{
log_error(_("local node does not report BDR node name"));
log_hint(_("ensure this is an active BDR node"));
}
PQfinish(conn);
pfree(dbname);
termPQExpBuffer(&bdr_local_node_name);
exit(ERR_BAD_CONFIG);
}
termPQExpBuffer(&bdr_local_node_name);
}
/* check whether repmgr extension exists, and there are no non-BDR nodes registered */
extension_status = get_repmgr_extension_status(conn, NULL);
if (extension_status == REPMGR_UNKNOWN)
{
log_error(_("unable to determine status of \"repmgr\" extension in database \"%s\""),
dbname);
PQfinish(conn);
pfree(dbname);
exit(ERR_BAD_CONFIG);
}
if (extension_status == REPMGR_UNAVAILABLE)
{
log_error(_("\"repmgr\" extension is not available"));
PQfinish(conn);
pfree(dbname);
exit(ERR_BAD_CONFIG);
}
if (extension_status == REPMGR_INSTALLED)
{
if (!is_bdr_repmgr(conn))
{
log_error(_("repmgr metadatabase contains records for non-BDR nodes"));
PQfinish(conn);
pfree(dbname);
exit(ERR_BAD_CONFIG);
}
}
else
{
log_debug("creating repmgr extension in database \"%s\"", dbname);
begin_transaction(conn);
if (!create_repmgr_extension(conn))
{
log_error(_("unable to create repmgr extension - see preceding error message(s); aborting"));
rollback_transaction(conn);
pfree(dbname);
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
commit_transaction(conn);
}
pfree(dbname);
if (bdr_node_has_repmgr_set(conn, config_file_options.node_name) == false)
{
log_debug("bdr_node_has_repmgr_set() = false");
bdr_node_set_repmgr_set(conn, config_file_options.node_name);
}
/*
* before adding the extension tables to the replication set, if any other
* BDR nodes exist, populate repmgr.nodes with a copy of existing entries
*
* currently we won't copy the contents of any other tables
*
*/
{
NodeInfoList local_node_records = T_NODE_INFO_LIST_INITIALIZER;
(void) get_all_node_records(conn, &local_node_records);
if (local_node_records.node_count == 0)
{
BdrNodeInfoList bdr_nodes = T_BDR_NODE_INFO_LIST_INITIALIZER;
BdrNodeInfoListCell *bdr_cell = NULL;
get_all_bdr_node_records(conn, &bdr_nodes);
if (bdr_nodes.node_count == 0)
{
log_error(_("unable to retrieve any BDR node records"));
log_detail("%s", PQerrorMessage(conn));
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
for (bdr_cell = bdr_nodes.head; bdr_cell; bdr_cell = bdr_cell->next)
{
PGconn *bdr_node_conn = NULL;
NodeInfoList existing_nodes = T_NODE_INFO_LIST_INITIALIZER;
NodeInfoListCell *cell = NULL;
ExtensionStatus other_node_extension_status = REPMGR_UNKNOWN;
/* skip the local node */
if (strncmp(node_info.node_name, bdr_cell->node_info->node_name, sizeof(node_info.node_name)) == 0)
{
continue;
}
log_debug("connecting to BDR node \"%s\" (conninfo: \"%s\")",
bdr_cell->node_info->node_name,
bdr_cell->node_info->node_local_dsn);
bdr_node_conn = establish_db_connection_quiet(bdr_cell->node_info->node_local_dsn);
if (PQstatus(bdr_node_conn) != CONNECTION_OK)
{
continue;
}
/* check repmgr schema exists, skip if not */
other_node_extension_status = get_repmgr_extension_status(bdr_node_conn, NULL);
if (other_node_extension_status != REPMGR_INSTALLED)
{
continue;
}
(void) get_all_node_records(bdr_node_conn, &existing_nodes);
for (cell = existing_nodes.head; cell; cell = cell->next)
{
log_debug("creating record for node \"%s\" (ID: %i)",
cell->node_info->node_name, cell->node_info->node_id);
create_node_record(conn, "bdr register", cell->node_info);
}
PQfinish(bdr_node_conn);
break;
}
}
}
/* Add the repmgr extension tables to a replication set */
if (get_bdr_version_num() < 3)
{
add_extension_tables_to_bdr_replication_set(conn);
}
else
{
/* this is the only table we need to replicate */
char *replication_set = get_default_bdr_replication_set(conn);
/*
* this probably won't happen, but we need to be sure we're using
* the replication set metadata correctly...
*/
if (conn == NULL)
{
log_error(_("unable to retrieve default BDR replication set"));
log_hint(_("see preceding messages"));
log_debug("check query in get_default_bdr_replication_set()");
exit(ERR_BAD_CONFIG);
}
if (is_table_in_bdr_replication_set(conn, "nodes", replication_set) == false)
{
add_table_to_bdr_replication_set(conn, "nodes", replication_set);
}
pfree(replication_set);
}
initPQExpBuffer(&event_details);
begin_transaction(conn);
/*
* we'll check if a record exists (even if the schema was just created),
* as there's a faint chance of a race condition
*/
record_status = get_node_record(conn, config_file_options.node_id, &node_info);
/* Update internal node record */
node_info.type = BDR;
node_info.node_id = config_file_options.node_id;
node_info.upstream_node_id = NO_UPSTREAM_NODE;
node_info.active = true;
node_info.priority = config_file_options.priority;
strncpy(node_info.node_name, config_file_options.node_name, sizeof(node_info.node_name));
strncpy(node_info.location, config_file_options.location, sizeof(node_info.location));
strncpy(node_info.conninfo, config_file_options.conninfo, sizeof(node_info.conninfo));
if (record_status == RECORD_FOUND)
{
bool node_updated = false;
/*
* At this point we will have established there are no non-BDR
* records, so no need to verify the node type
*/
if (!runtime_options.force)
{
log_error(_("this node is already registered"));
log_hint(_("use -F/--force to overwrite the existing node record"));
rollback_transaction(conn);
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
/*
* don't permit changing the node name - this must match the BDR node
* name set when the node was registered.
*/
if (strncmp(node_info.node_name, config_file_options.node_name, sizeof(node_info.node_name)) != 0)
{
log_error(_("a record for node %i is already registered with node_name \"%s\""),
config_file_options.node_id, node_info.node_name);
log_hint(_("node_name configured in repmgr.conf is \"%s\""), config_file_options.node_name);
rollback_transaction(conn);
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
node_updated = update_node_record(conn, "bdr register", &node_info);
if (node_updated == true)
{
appendPQExpBuffer(&event_details, _("node record updated for node \"%s\" (%i)"),
config_file_options.node_name, config_file_options.node_id);
log_verbose(LOG_NOTICE, "%s", event_details.data);
}
else
{
success = false;
}
}
else
{
/* create new node record */
bool node_created = create_node_record(conn, "bdr register", &node_info);
if (node_created == true)
{
appendPQExpBuffer(&event_details,
_("node record created for node \"%s\" (ID: %i)"),
config_file_options.node_name, config_file_options.node_id);
log_notice("%s", event_details.data);
}
else
{
success = false;
}
}
if (success == false)
{
rollback_transaction(conn);
PQfinish(conn);
exit(ERR_DB_QUERY);
}
commit_transaction(conn);
/* Log the event */
create_event_notification(
conn,
&config_file_options,
config_file_options.node_id,
"bdr_register",
true,
event_details.data);
termPQExpBuffer(&event_details);
PQfinish(conn);
log_notice(_("BDR node %i registered (conninfo: %s)"),
config_file_options.node_id, config_file_options.conninfo);
return;
}
void
do_bdr_unregister(void)
{
PGconn *conn = NULL;
ExtensionStatus extension_status = REPMGR_UNKNOWN;
int target_node_id = UNKNOWN_NODE_ID;
t_node_info node_info = T_NODE_INFO_INITIALIZER;
RecordStatus record_status = RECORD_NOT_FOUND;
bool node_record_deleted = false;
PQExpBufferData event_details;
char *dbname;
/* sanity-check configuration for BDR-compatability */
if (config_file_options.replication_type != REPLICATION_TYPE_BDR)
{
log_error(_("cannot run BDR UNREGISTER on a non-BDR node"));
exit(ERR_BAD_CONFIG);
}
dbname = pg_malloc0(MAXLEN);
if (dbname == NULL)
{
log_error(_("unable to allocate memory; terminating."));
exit(ERR_OUT_OF_MEMORY);
}
/* store the database name for future reference */
get_conninfo_value(config_file_options.conninfo, "dbname", dbname);
conn = establish_db_connection(config_file_options.conninfo, true);
if (!is_bdr_db(conn, NULL))
{
log_error(_("database \"%s\" is not BDR-enabled"), dbname);
PQfinish(conn);
pfree(dbname);
exit(ERR_BAD_CONFIG);
}
extension_status = get_repmgr_extension_status(conn, NULL);
if (extension_status != REPMGR_INSTALLED)
{
log_error(_("repmgr is not installed on database \"%s\""), dbname);
PQfinish(conn);
pfree(dbname);
exit(ERR_BAD_CONFIG);
}
pfree(dbname);
if (!is_bdr_repmgr(conn))
{
log_error(_("repmgr metadatabase contains records for non-BDR nodes"));
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
initPQExpBuffer(&event_details);
if (runtime_options.node_id != UNKNOWN_NODE_ID)
target_node_id = runtime_options.node_id;
else
target_node_id = config_file_options.node_id;
/* Check node exists and is really a BDR node */
record_status = get_node_record(conn, target_node_id, &node_info);
if (record_status != RECORD_FOUND)
{
log_error(_("no record found for node %i"), target_node_id);
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
begin_transaction(conn);
log_debug("unregistering node %i", target_node_id);
node_record_deleted = delete_node_record(conn, target_node_id);
if (node_record_deleted == false)
{
appendPQExpBuffer(&event_details,
"unable to delete node record for node \"%s\" (ID: %i)",
node_info.node_name,
target_node_id);
rollback_transaction(conn);
}
else
{
appendPQExpBuffer(&event_details,
"node record deleted for node \"%s\" (ID: %i)",
node_info.node_name,
target_node_id);
commit_transaction(conn);
}
/* Log the event */
create_event_notification(
conn,
&config_file_options,
config_file_options.node_id,
"bdr_unregister",
true,
event_details.data);
PQfinish(conn);
log_notice(_("bdr node \"%s\" (ID: %i) successfully unregistered"),
node_info.node_name, target_node_id);
termPQExpBuffer(&event_details);
return;
}
void
do_bdr_help(void)
{
print_help_header();
printf(_("Usage:\n"));
printf(_(" %s [OPTIONS] bdr register\n"), progname());
printf(_(" %s [OPTIONS] bdr unregister\n"), progname());
puts("");
printf(_("BDR REGISTER\n"));
puts("");
printf(_(" \"bdr register\" initialises the repmgr cluster and registers the initial bdr node.\n"));
puts("");
printf(_(" -F, --force overwrite an existing node record\n"));
puts("");
printf(_("BDR UNREGISTER\n"));
puts("");
printf(_(" \"bdr unregister\" unregisters an inactive BDR node.\n"));
puts("");
printf(_(" --node-id ID node to unregister (optional, used when the node to unregister\n" \
" is offline)\n"));
puts("");
}

View File

@@ -1,6 +1,6 @@
/*
* repmgr-action-service.h
* Copyright (c) 2ndQuadrant, 2010-2020
* repmgr-action-bdr.h
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -16,13 +16,13 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef _REPMGR_ACTION_SERVICE_H_
#define _REPMGR_ACTION_SERVICE_H_
#ifndef _REPMGR_ACTION_BDR_H_
#define _REPMGR_ACTION_BDR_H_
extern void do_bdr_register(void);
extern void do_bdr_unregister(void);
extern void do_bdr_help(void);
extern void do_service_status(void);
extern void do_service_pause(void);
extern void do_service_unpause(void);
extern void do_service_help(void);
#endif
#endif /* _REPMGR_ACTION_BDR_H_ */

View File

@@ -3,7 +3,7 @@
*
* Implements cluster information actions for the repmgr command line utility
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -55,8 +55,10 @@ typedef enum
struct ColHeader headers_show[SHOW_HEADER_COUNT];
struct ColHeader headers_event[EVENT_HEADER_COUNT];
static int build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, ItemList *warnings, int *error_code);
static int build_cluster_crosscheck(t_node_status_cube ***cube_dest, ItemList *warnings, int *error_code);
static int build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, int *name_length, ItemList *warnings, int *error_code);
static int build_cluster_crosscheck(t_node_status_cube ***cube_dest, int *name_length, ItemList *warnings, int *error_code);
static void cube_set_node_status(t_node_status_cube **cube, int n, int node_id, int matrix_node_id, int connection_node_id, int connection_status);
/*
@@ -65,8 +67,6 @@ static void cube_set_node_status(t_node_status_cube **cube, int n, int node_id,
* Parameters:
* --compact
* --csv
* --terse
* --verbose
*/
void
do_cluster_show(void)
@@ -206,8 +206,7 @@ do_cluster_show(void)
else
{
/* NOP on pre-9.6 servers */
cell->node_info->replication_info->timeline_id = get_node_timeline(cell->node_info->conn,
cell->node_info->replication_info->timeline_id_str);
cell->node_info->replication_info->timeline_id = get_node_timeline(cell->node_info->conn);
}
initPQExpBuffer(&node_status);
@@ -245,13 +244,18 @@ do_cluster_show(void)
headers_show[SHOW_LOCATION].cur_length = strlen(cell->node_info->location);
/* Format timeline ID */
if (cell->node_info->type == WITNESS)
if (cell->node_info->replication_info->timeline_id == UNKNOWN_TIMELINE_ID)
{
/* The witness node's timeline ID is irrelevant */
strncpy(cell->node_info->replication_info->timeline_id_str, _("n/a"), MAXLEN);
/* display "?" */
headers_show[SHOW_PRIORITY].cur_length = 1;
}
else
{
initPQExpBuffer(&buf);
appendPQExpBuffer(&buf, "%i", cell->node_info->replication_info->timeline_id);
headers_show[SHOW_PRIORITY].cur_length = strlen(buf.data);
termPQExpBuffer(&buf);
}
headers_show[SHOW_TIMELINE_ID].cur_length = strlen(cell->node_info->replication_info->timeline_id_str);
headers_show[SHOW_CONNINFO].cur_length = strlen(cell->node_info->conninfo);
@@ -318,7 +322,10 @@ do_cluster_show(void)
if (headers_show[SHOW_TIMELINE_ID].display == true)
{
printf("| %-*s ", headers_show[SHOW_TIMELINE_ID].max_length, cell->node_info->replication_info->timeline_id_str);
if (cell->node_info->replication_info->timeline_id == UNKNOWN_TIMELINE_ID)
printf("| %-*c ", headers_show[SHOW_TIMELINE_ID].max_length, '?');
else
printf("| %-*i ", headers_show[SHOW_TIMELINE_ID].max_length, (int)cell->node_info->replication_info->timeline_id);
}
if (headers_show[SHOW_CONNINFO].display == true)
@@ -336,25 +343,14 @@ do_cluster_show(void)
/* emit any warnings */
if (warnings.head != NULL && runtime_options.terse == false && runtime_options.output_mode != OM_CSV)
{
ItemListCell *cell = NULL;
PQExpBufferData warning;
initPQExpBuffer(&warning);
appendPQExpBufferStr(&warning,
_("following issues were detected\n"));
printf(_("\nWARNING: following issues were detected\n"));
for (cell = warnings.head; cell; cell = cell->next)
{
appendPQExpBuffer(&warning,
_(" - %s\n"), cell->string);
printf(_(" - %s\n"), cell->string);
}
puts("");
log_warning("%s", warning.data);
termPQExpBuffer(&warning);
if (runtime_options.verbose == false && connection_error_found == true)
{
log_hint(_("execute with --verbose option to see connection error messages"));
@@ -536,6 +532,9 @@ do_cluster_crosscheck(void)
{
int i = 0,
n = 0;
char c;
const char *node_header = "Name";
int name_length = strlen(node_header);
t_node_status_cube **cube;
@@ -543,7 +542,7 @@ do_cluster_crosscheck(void)
int error_code = SUCCESS;
ItemList warnings = {NULL, NULL};
n = build_cluster_crosscheck(&cube, &warnings, &error_code);
n = build_cluster_crosscheck(&cube, &name_length, &warnings, &error_code);
if (runtime_options.output_mode == OM_CSV)
{
@@ -577,56 +576,24 @@ do_cluster_crosscheck(void)
}
else
{
/* output header contains node name, node ID and one column for each node in the cluster */
struct ColHeader *headers_crosscheck = NULL;
int header_count = n + 2;
int header_id = 2;
headers_crosscheck = palloc0(sizeof(ColHeader) * header_count);
/* Initialize column headers */
strncpy(headers_crosscheck[0].title, _("Name"), MAXLEN);
strncpy(headers_crosscheck[1].title, _("ID"), MAXLEN);
printf("%*s | Id ", name_length, node_header);
for (i = 0; i < n; i++)
{
maxlen_snprintf(headers_crosscheck[header_id].title, "%i", cube[i]->node_id);
header_id++;
}
/* Initialize column max values */
for (i = 0; i < header_count; i++)
{
headers_crosscheck[i].display = true;
headers_crosscheck[i].max_length = strlen(headers_crosscheck[i].title);
headers_crosscheck[i].cur_length = headers_crosscheck[i].max_length;
/* We can derive the maximum node ID length for the ID column from
* the generated matrix node ID headers
*/
if (i >= 2 && headers_crosscheck[i].max_length > headers_crosscheck[1].max_length)
headers_crosscheck[1].max_length = headers_crosscheck[i].max_length;
}
printf("| %2d ", cube[i]->node_id);
printf("\n");
for (i = 0; i < name_length; i++)
printf("-");
printf("-+----");
for (i = 0; i < n; i++)
{
if (strlen(cube[i]->node_name) > headers_crosscheck[0].max_length)
{
headers_crosscheck[0].max_length = strlen(cube[i]->node_name);
}
}
print_status_header(header_count, headers_crosscheck);
printf("+----");
printf("\n");
for (i = 0; i < n; i++)
{
int column_node_ix;
printf(" %-*s | %-*i ",
headers_crosscheck[0].max_length,
printf("%*s | %2d ", name_length,
cube[i]->node_name,
headers_crosscheck[1].max_length,
cube[i]->node_id);
for (column_node_ix = 0; column_node_ix < n; column_node_ix++)
@@ -634,8 +601,6 @@ do_cluster_crosscheck(void)
int max_node_status = -2;
int node_ix = 0;
char c;
/*
* The value of entry (i,j) is equal to the maximum value of all
* the (i,j,k). Indeed:
@@ -675,14 +640,12 @@ do_cluster_crosscheck(void)
exit(ERR_INTERNAL);
}
printf("| %-*c ", headers_crosscheck[column_node_ix + 2].max_length, c);
printf("| %c ", c);
}
printf("\n");
}
pfree(headers_crosscheck);
if (warnings.head != NULL && runtime_options.terse == false)
{
log_warning(_("following problems detected:"));
@@ -739,13 +702,16 @@ do_cluster_matrix()
j = 0,
n = 0;
const char *node_header = "Name";
int name_length = strlen(node_header);
t_node_matrix_rec **matrix_rec_list;
bool connection_error_found = false;
int error_code = SUCCESS;
ItemList warnings = {NULL, NULL};
n = build_cluster_matrix(&matrix_rec_list, &warnings, &error_code);
n = build_cluster_matrix(&matrix_rec_list, &name_length, &warnings, &error_code);
if (runtime_options.output_mode == OM_CSV)
{
@@ -768,60 +734,27 @@ do_cluster_matrix()
}
else
{
/* output header contains node name, node ID and one column for each node in the cluster */
struct ColHeader *headers_matrix = NULL;
char c;
int header_count = n + 2;
int header_id = 2;
printf("%*s | Id ", name_length, node_header);
for (i = 0; i < n; i++)
printf("| %2d ", matrix_rec_list[i]->node_id);
printf("\n");
headers_matrix = palloc0(sizeof(ColHeader) * header_count);
/* Initialize column headers */
strncpy(headers_matrix[0].title, _("Name"), MAXLEN);
strncpy(headers_matrix[1].title, _("ID"), MAXLEN);
for (i = 0; i < name_length; i++)
printf("-");
printf("-+----");
for (i = 0; i < n; i++)
printf("+----");
printf("\n");
for (i = 0; i < n; i++)
{
maxlen_snprintf(headers_matrix[header_id].title, "%i", matrix_rec_list[i]->node_id);
header_id++;
}
/* Initialize column max values */
for (i = 0; i < header_count; i++)
{
headers_matrix[i].display = true;
headers_matrix[i].max_length = strlen(headers_matrix[i].title);
headers_matrix[i].cur_length = headers_matrix[i].max_length;
/* We can derive the maximum node ID length for the ID column from
* the generated matrix node ID headers
*/
if (i >= 2 && headers_matrix[i].max_length > headers_matrix[1].max_length)
headers_matrix[1].max_length = headers_matrix[i].max_length;
}
for (i = 0; i < n; i++)
{
if (strlen(matrix_rec_list[i]->node_name) > headers_matrix[0].max_length)
{
headers_matrix[0].max_length = strlen(matrix_rec_list[i]->node_name);
}
}
print_status_header(header_count, headers_matrix);
for (i = 0; i < n; i++)
{
printf(" %-*s | %-*i ",
headers_matrix[0].max_length,
printf("%*s | %2d ", name_length,
matrix_rec_list[i]->node_name,
headers_matrix[1].max_length,
matrix_rec_list[i]->node_id);
for (j = 0; j < n; j++)
{
char c;
switch (matrix_rec_list[i]->node_status_list[j]->node_status)
{
case -2:
@@ -839,13 +772,11 @@ do_cluster_matrix()
exit(ERR_INTERNAL);
}
printf("| %-*c ", headers_matrix[j + 2].max_length, c);
printf("| %c ", c);
}
printf("\n");
}
pfree(headers_matrix);
if (warnings.head != NULL && runtime_options.terse == false)
{
log_warning(_("following problems detected:"));
@@ -901,7 +832,7 @@ matrix_set_node_status(t_node_matrix_rec **matrix_rec_list, int n, int node_id,
static int
build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, ItemList *warnings, int *error_code)
build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, int *name_length, ItemList *warnings, int *error_code)
{
PGconn *conn = NULL;
int i = 0,
@@ -959,6 +890,7 @@ build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, ItemList *warnings, i
/* Initialise matrix structure for each node */
for (cell = nodes.head; cell; cell = cell->next)
{
int name_length_cur;
NodeInfoListCell *cell_j;
matrix_rec_list[i] = (t_node_matrix_rec *) pg_malloc0(sizeof(t_node_matrix_rec));
@@ -968,6 +900,13 @@ build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, ItemList *warnings, i
cell->node_info->node_name,
sizeof(matrix_rec_list[i]->node_name));
/*
* Find the maximum length of a node name
*/
name_length_cur = strlen(matrix_rec_list[i]->node_name);
if (name_length_cur > *name_length)
*name_length = name_length_cur;
matrix_rec_list[i]->node_status_list = (t_node_status_rec **) pg_malloc0(sizeof(t_node_status_rec) * nodes.node_count);
j = 0;
@@ -1048,19 +987,7 @@ build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, ItemList *warnings, i
make_remote_repmgr_path(&command, cell->node_info);
appendPQExpBufferStr(&command,
" cluster show --csv --terse");
/*
* Usually we'll want NOTICE as the log level, but if the user
* explicitly provided one with --log-level, that will be passed
* in the remote repmgr invocation.
*/
if (runtime_options.log_level[0] == '\0')
{
appendPQExpBufferStr(&command,
" -L NOTICE");
}
appendPQExpBufferChar(&command, '"');
" cluster show --csv -L NOTICE --terse\"");
log_verbose(LOG_DEBUG, "build_cluster_matrix(): executing:\n %s", command.data);
@@ -1132,7 +1059,7 @@ build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, ItemList *warnings, i
static int
build_cluster_crosscheck(t_node_status_cube ***dest_cube, ItemList *warnings, int *error_code)
build_cluster_crosscheck(t_node_status_cube ***dest_cube, int *name_length, ItemList *warnings, int *error_code)
{
PGconn *conn = NULL;
int h,
@@ -1181,12 +1108,20 @@ build_cluster_crosscheck(t_node_status_cube ***dest_cube, ItemList *warnings, in
for (cell = nodes.head; cell; cell = cell->next)
{
int name_length_cur = 0;
NodeInfoListCell *cell_i = NULL;
cube[h] = (t_node_status_cube *) pg_malloc(sizeof(t_node_status_cube));
cube[h]->node_id = cell->node_info->node_id;
strncpy(cube[h]->node_name, cell->node_info->node_name, sizeof(cube[h]->node_name));
/*
* Find the maximum length of a node name
*/
name_length_cur = strlen(cube[h]->node_name);
if (name_length_cur > *name_length)
*name_length = name_length_cur;
cube[h]->matrix_list_rec = (t_node_matrix_rec **) pg_malloc(sizeof(t_node_matrix_rec) * nodes.node_count);
i = 0;
@@ -1240,18 +1175,7 @@ build_cluster_crosscheck(t_node_status_cube ***dest_cube, ItemList *warnings, in
make_remote_repmgr_path(&command, cell->node_info);
appendPQExpBufferStr(&command,
" cluster matrix --csv --terse");
/*
* Usually we'll want NOTICE as the log level, but if the user
* explicitly provided one with --log-level, that will be passed
* in the remote repmgr invocation.
*/
if (runtime_options.log_level[0] == '\0')
{
appendPQExpBufferStr(&command,
" -L NOTICE");
}
" cluster matrix --csv -L NOTICE --terse");
initPQExpBuffer(&command_output);
@@ -1449,10 +1373,6 @@ do_cluster_cleanup(void)
log_warning(_("unable to vacuum table \"repmgr.monitoring_history\""));
log_detail("%s", PQerrorMessage(primary_conn));
}
else
{
log_info(_("vacuum of table \"repmgr.monitoring_history\" completed"));
}
if (runtime_options.keep_history == 0)
{
@@ -1554,5 +1474,4 @@ do_cluster_help(void)
printf(_(" -k, --keep-history=VALUE retain indicated number of days of history (default: 0)\n"));
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
}

View File

@@ -1,6 +1,6 @@
/*
* repmgr-action-cluster.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

View File

@@ -2,7 +2,7 @@
* repmgr-action-daemon.c
*
* Implements repmgrd actions for the repmgr command line utility
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -26,9 +26,478 @@
#include "repmgr-client-global.h"
#include "repmgr-action-daemon.h"
#define REPMGR_SERVICE_STOP_START_WAIT 15
#define REPMGR_SERVICE_STATUS_START_HINT _("use \"repmgr service status\" to confirm that repmgrd was successfully started")
#define REPMGR_SERVICE_STATUS_STOP_HINT _("use \"repmgr service status\" to confirm that repmgrd was successfully stopped")
#define REPMGR_DAEMON_STOP_START_WAIT 15
#define REPMGR_DAEMON_STATUS_START_HINT _("use \"repmgr daemon status\" to confirm that repmgrd was successfully started")
#define REPMGR_DAEMON_STATUS_STOP_HINT _("use \"repmgr daemon status\" to confirm that repmgrd was successfully stopped")
/*
* Possibly also show:
* - repmgrd start time?
* - repmgrd mode
* - priority
* - whether promotion candidate (due to zero priority/different location)
*/
typedef enum
{
STATUS_ID = 0,
STATUS_NAME,
STATUS_ROLE,
STATUS_PG,
STATUS_UPSTREAM_NAME,
STATUS_LOCATION,
STATUS_PRIORITY,
STATUS_REPMGRD,
STATUS_PID,
STATUS_PAUSED,
STATUS_UPSTREAM_LAST_SEEN
} StatusHeader;
#define STATUS_HEADER_COUNT 11
struct ColHeader headers_status[STATUS_HEADER_COUNT];
static void fetch_node_records(PGconn *conn, NodeInfoList *node_list);
static void _do_repmgr_pause(bool pause);
void
do_daemon_status(void)
{
PGconn *conn = NULL;
NodeInfoList nodes = T_NODE_INFO_LIST_INITIALIZER;
NodeInfoListCell *cell = NULL;
int i;
RepmgrdInfo **repmgrd_info;
ItemList warnings = {NULL, NULL};
bool connection_error_found = false;
/* Connect to local database to obtain cluster connection data */
log_verbose(LOG_INFO, _("connecting to database"));
if (strlen(config_file_options.conninfo))
conn = establish_db_connection(config_file_options.conninfo, true);
else
conn = establish_db_connection_by_params(&source_conninfo, true);
fetch_node_records(conn, &nodes);
repmgrd_info = (RepmgrdInfo **) pg_malloc0(sizeof(RepmgrdInfo *) * nodes.node_count);
if (repmgrd_info == NULL)
{
log_error(_("unable to allocate memory"));
exit(ERR_OUT_OF_MEMORY);
}
strncpy(headers_status[STATUS_ID].title, _("ID"), MAXLEN);
strncpy(headers_status[STATUS_NAME].title, _("Name"), MAXLEN);
strncpy(headers_status[STATUS_ROLE].title, _("Role"), MAXLEN);
strncpy(headers_status[STATUS_PG].title, _("Status"), MAXLEN);
strncpy(headers_status[STATUS_UPSTREAM_NAME].title, _("Upstream"), MAXLEN);
/* following only displayed with the --detail option */
strncpy(headers_status[STATUS_LOCATION].title, _("Location"), MAXLEN);
if (runtime_options.compact == true)
strncpy(headers_status[STATUS_PRIORITY].title, _("Prio."), MAXLEN);
else
strncpy(headers_status[STATUS_PRIORITY].title, _("Priority"), MAXLEN);
strncpy(headers_status[STATUS_REPMGRD].title, _("repmgrd"), MAXLEN);
strncpy(headers_status[STATUS_PID].title, _("PID"), MAXLEN);
strncpy(headers_status[STATUS_PAUSED].title, _("Paused?"), MAXLEN);
if (runtime_options.compact == true)
strncpy(headers_status[STATUS_UPSTREAM_LAST_SEEN].title, _("Upstr. last"), MAXLEN);
else
strncpy(headers_status[STATUS_UPSTREAM_LAST_SEEN].title, _("Upstream last seen"), MAXLEN);
for (i = 0; i < STATUS_HEADER_COUNT; i++)
{
headers_status[i].max_length = strlen(headers_status[i].title);
headers_status[i].display = true;
}
if (runtime_options.detail == false)
{
headers_status[STATUS_LOCATION].display = false;
headers_status[STATUS_PRIORITY].display = false;
}
i = 0;
for (cell = nodes.head; cell; cell = cell->next)
{
int j;
PQExpBufferData node_status;
PQExpBufferData upstream;
repmgrd_info[i] = pg_malloc0(sizeof(RepmgrdInfo));
repmgrd_info[i]->node_id = cell->node_info->node_id;
repmgrd_info[i]->pid = UNKNOWN_PID;
repmgrd_info[i]->recovery_type = RECTYPE_UNKNOWN;
repmgrd_info[i]->paused = false;
repmgrd_info[i]->running = false;
repmgrd_info[i]->pg_running = true;
repmgrd_info[i]->wal_paused_pending_wal = false;
repmgrd_info[i]->upstream_last_seen = -1;
cell->node_info->conn = establish_db_connection_quiet(cell->node_info->conninfo);
if (PQstatus(cell->node_info->conn) != CONNECTION_OK)
{
connection_error_found = true;
if (runtime_options.verbose)
{
char error[MAXLEN];
strncpy(error, PQerrorMessage(cell->node_info->conn), MAXLEN);
item_list_append_format(&warnings,
"when attempting to connect to node \"%s\" (ID: %i), following error encountered :\n\"%s\"",
cell->node_info->node_name, cell->node_info->node_id, trim(error));
}
else
{
item_list_append_format(&warnings,
"unable to connect to node \"%s\" (ID: %i)",
cell->node_info->node_name, cell->node_info->node_id);
}
repmgrd_info[i]->pg_running = false;
maxlen_snprintf(repmgrd_info[i]->repmgrd_running, "%s", _("n/a"));
maxlen_snprintf(repmgrd_info[i]->pid_text, "%s", _("n/a"));
}
else
{
cell->node_info->node_status = NODE_STATUS_UP;
cell->node_info->recovery_type = get_recovery_type(cell->node_info->conn);
repmgrd_info[i]->pid = repmgrd_get_pid(cell->node_info->conn);
repmgrd_info[i]->running = repmgrd_is_running(cell->node_info->conn);
if (repmgrd_info[i]->running == true)
{
maxlen_snprintf(repmgrd_info[i]->repmgrd_running, "%s", _("running"));
}
else
{
maxlen_snprintf(repmgrd_info[i]->repmgrd_running, "%s", _("not running"));
}
if (repmgrd_info[i]->pid == UNKNOWN_PID)
{
maxlen_snprintf(repmgrd_info[i]->pid_text, "%s", _("n/a"));
}
else
{
maxlen_snprintf(repmgrd_info[i]->pid_text, "%i", repmgrd_info[i]->pid);
}
repmgrd_info[i]->paused = repmgrd_is_paused(cell->node_info->conn);
repmgrd_info[i]->recovery_type = get_recovery_type(cell->node_info->conn);
if (repmgrd_info[i]->recovery_type == RECTYPE_STANDBY)
{
repmgrd_info[i]->wal_paused_pending_wal = is_wal_replay_paused(cell->node_info->conn, true);
if (repmgrd_info[i]->wal_paused_pending_wal == true)
{
item_list_append_format(&warnings,
_("WAL replay is paused on node \"%s\" (ID: %i) with WAL replay pending; this node cannot be manually promoted until WAL replay is resumed"),
cell->node_info->node_name, cell->node_info->node_id);
}
}
repmgrd_info[i]->upstream_last_seen = get_upstream_last_seen(cell->node_info->conn, cell->node_info->type);
if (repmgrd_info[i]->upstream_last_seen < 0)
{
maxlen_snprintf(repmgrd_info[i]->upstream_last_seen_text, "%s", _("n/a"));
}
else
{
if (runtime_options.compact == true)
{
maxlen_snprintf(repmgrd_info[i]->upstream_last_seen_text, _("%i sec(s) ago"), repmgrd_info[i]->upstream_last_seen);
}
else
{
maxlen_snprintf(repmgrd_info[i]->upstream_last_seen_text, _("%i second(s) ago"), repmgrd_info[i]->upstream_last_seen);
}
}
}
initPQExpBuffer(&node_status);
initPQExpBuffer(&upstream);
(void)format_node_status(cell->node_info, &node_status, &upstream, &warnings);
snprintf(repmgrd_info[i]->pg_running_text, sizeof(cell->node_info->details),
"%s", node_status.data);
snprintf(cell->node_info->upstream_node_name, sizeof(cell->node_info->upstream_node_name),
"%s", upstream.data);
termPQExpBuffer(&node_status);
termPQExpBuffer(&upstream);
PQfinish(cell->node_info->conn);
headers_status[STATUS_NAME].cur_length = strlen(cell->node_info->node_name);
headers_status[STATUS_ROLE].cur_length = strlen(get_node_type_string(cell->node_info->type));
headers_status[STATUS_PG].cur_length = strlen(repmgrd_info[i]->pg_running_text);
headers_status[STATUS_UPSTREAM_NAME].cur_length = strlen(cell->node_info->upstream_node_name);
if (runtime_options.detail == true)
{
PQExpBufferData buf;
headers_status[STATUS_LOCATION].cur_length = strlen(cell->node_info->location);
initPQExpBuffer(&buf);
appendPQExpBuffer(&buf, "%i", cell->node_info->priority);
headers_status[STATUS_PRIORITY].cur_length = strlen(buf.data);
termPQExpBuffer(&buf);
}
headers_status[STATUS_PID].cur_length = strlen(repmgrd_info[i]->pid_text);
headers_status[STATUS_REPMGRD].cur_length = strlen(repmgrd_info[i]->repmgrd_running);
headers_status[STATUS_UPSTREAM_LAST_SEEN].cur_length = strlen(repmgrd_info[i]->upstream_last_seen_text);
for (j = 0; j < STATUS_HEADER_COUNT; j++)
{
if (headers_status[j].cur_length > headers_status[j].max_length)
{
headers_status[j].max_length = headers_status[j].cur_length;
}
}
i++;
}
/* Print column header row (text mode only) */
if (runtime_options.output_mode == OM_TEXT)
{
print_status_header(STATUS_HEADER_COUNT, headers_status);
}
i = 0;
for (cell = nodes.head; cell; cell = cell->next)
{
if (runtime_options.output_mode == OM_CSV)
{
int running = repmgrd_info[i]->running ? 1 : 0;
int paused = repmgrd_info[i]->paused ? 1 : 0;
/* If PostgreSQL is not running, repmgrd status is unknown */
if (repmgrd_info[i]->pg_running == false)
{
running = -1;
paused = -1;
}
printf("%i,%s,%s,%i,%i,%i,%i,%i,%i,%s\n",
cell->node_info->node_id,
cell->node_info->node_name,
get_node_type_string(cell->node_info->type),
repmgrd_info[i]->pg_running ? 1 : 0,
running,
repmgrd_info[i]->pid,
paused,
cell->node_info->priority,
repmgrd_info[i]->pid == UNKNOWN_PID
? -1
: repmgrd_info[i]->upstream_last_seen,
cell->node_info->location);
}
else
{
printf(" %-*i ", headers_status[STATUS_ID].max_length, cell->node_info->node_id);
printf("| %-*s ", headers_status[STATUS_NAME].max_length, cell->node_info->node_name);
printf("| %-*s ", headers_status[STATUS_ROLE].max_length, get_node_type_string(cell->node_info->type));
printf("| %-*s ", headers_status[STATUS_PG].max_length, repmgrd_info[i]->pg_running_text);
printf("| %-*s ", headers_status[STATUS_UPSTREAM_NAME].max_length, cell->node_info->upstream_node_name);
if (runtime_options.detail == true)
{
printf("| %-*s ", headers_status[STATUS_LOCATION].max_length, cell->node_info->location);
printf("| %-*i ", headers_status[STATUS_PRIORITY].max_length, cell->node_info->priority);
}
printf("| %-*s ", headers_status[STATUS_REPMGRD].max_length, repmgrd_info[i]->repmgrd_running);
printf("| %-*s ", headers_status[STATUS_PID].max_length, repmgrd_info[i]->pid_text);
if (repmgrd_info[i]->pid == UNKNOWN_PID)
{
printf("| %-*s ", headers_status[STATUS_PAUSED].max_length, _("n/a"));
printf("| %-*s ", headers_status[STATUS_UPSTREAM_LAST_SEEN].max_length, _("n/a"));
}
else
{
printf("| %-*s ", headers_status[STATUS_PAUSED].max_length, repmgrd_info[i]->paused ? _("yes") : _("no"));
printf("| %-*s ", headers_status[STATUS_UPSTREAM_LAST_SEEN].max_length, repmgrd_info[i]->upstream_last_seen_text);
}
printf("\n");
}
pfree(repmgrd_info[i]);
i++;
}
pfree(repmgrd_info);
/* emit any warnings */
if (warnings.head != NULL && runtime_options.terse == false && runtime_options.output_mode != OM_CSV)
{
ItemListCell *cell = NULL;
printf(_("\nWARNING: following issues were detected\n"));
for (cell = warnings.head; cell; cell = cell->next)
{
printf(_(" - %s\n"), cell->string);
}
if (runtime_options.verbose == false && connection_error_found == true)
{
log_hint(_("execute with --verbose option to see connection error messages"));
}
}
}
void
do_daemon_pause(void)
{
_do_repmgr_pause(true);
}
void
do_daemon_unpause(void)
{
_do_repmgr_pause(false);
}
static void
_do_repmgr_pause(bool pause)
{
PGconn *conn = NULL;
NodeInfoList nodes = T_NODE_INFO_LIST_INITIALIZER;
NodeInfoListCell *cell = NULL;
int i;
int error_nodes = 0;
/* Connect to local database to obtain cluster connection data */
log_verbose(LOG_INFO, _("connecting to database"));
if (strlen(config_file_options.conninfo))
conn = establish_db_connection(config_file_options.conninfo, true);
else
conn = establish_db_connection_by_params(&source_conninfo, true);
fetch_node_records(conn, &nodes);
i = 0;
for (cell = nodes.head; cell; cell = cell->next)
{
log_verbose(LOG_DEBUG, "pausing node %i (%s)",
cell->node_info->node_id,
cell->node_info->node_name);
cell->node_info->conn = establish_db_connection_quiet(cell->node_info->conninfo);
if (PQstatus(cell->node_info->conn) != CONNECTION_OK)
{
log_warning(_("unable to connect to node %i"),
cell->node_info->node_id);
error_nodes++;
}
else
{
if (runtime_options.dry_run == true)
{
if (pause == true)
{
log_info(_("would pause node %i (%s) "),
cell->node_info->node_id,
cell->node_info->node_name);
}
else
{
log_info(_("would unpause node %i (%s) "),
cell->node_info->node_id,
cell->node_info->node_name);
}
}
else
{
bool success = repmgrd_pause(cell->node_info->conn, pause);
if (success == false)
error_nodes++;
log_notice(_("node %i (%s) %s"),
cell->node_info->node_id,
cell->node_info->node_name,
success == true
? pause == true ? "paused" : "unpaused"
: pause == true ? "not paused" : "not unpaused");
}
PQfinish(cell->node_info->conn);
}
i++;
}
if (error_nodes > 0)
{
if (pause == true)
{
log_error(_("unable to pause %i node(s)"), error_nodes);
}
else
{
log_error(_("unable to unpause %i node(s)"), error_nodes);
}
log_hint(_("execute \"repmgr daemon status\" to view current status"));
exit(ERR_REPMGRD_PAUSE);
}
exit(SUCCESS);
}
void
fetch_node_records(PGconn *conn, NodeInfoList *node_list)
{
bool success = get_all_node_records_with_upstream(conn, node_list);
if (success == false)
{
/* get_all_node_records() will display any error message */
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
if (node_list->node_count == 0)
{
log_error(_("no node records were found"));
log_hint(_("ensure at least one node is registered"));
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
}
void
do_daemon_start(void)
@@ -114,12 +583,12 @@ do_daemon_start(void)
if (runtime_options.no_wait == true || runtime_options.wait == 0)
{
log_hint(REPMGR_SERVICE_STATUS_START_HINT);
log_hint(REPMGR_DAEMON_STATUS_START_HINT);
}
else
{
int i = 0;
int timeout = REPMGR_SERVICE_STOP_START_WAIT;
int timeout = REPMGR_DAEMON_STOP_START_WAIT;
if (runtime_options.wait_provided)
timeout = runtime_options.wait;
@@ -129,7 +598,7 @@ do_daemon_start(void)
if (PQstatus(conn) != CONNECTION_OK)
{
log_notice(_("unable to connect to local node"));
log_hint(REPMGR_SERVICE_STATUS_START_HINT);
log_hint(REPMGR_DAEMON_STATUS_START_HINT);
exit(ERR_DB_CONN);
}
@@ -147,7 +616,7 @@ do_daemon_start(void)
PQfinish(conn);
log_error(_("repmgrd does not appear to have started after %i seconds"),
timeout);
log_hint(REPMGR_SERVICE_STATUS_START_HINT);
log_hint(REPMGR_DAEMON_STATUS_START_HINT);
exit(ERR_REPMGRD_SERVICE);
}
@@ -243,12 +712,12 @@ void do_daemon_stop(void)
if (runtime_options.no_wait == true || runtime_options.wait == 0)
{
if (have_db_connection == true)
log_hint(REPMGR_SERVICE_STATUS_STOP_HINT);
log_hint(REPMGR_DAEMON_STATUS_STOP_HINT);
}
else
{
int i = 0;
int timeout = REPMGR_SERVICE_STOP_START_WAIT;
int timeout = REPMGR_DAEMON_STOP_START_WAIT;
/*
*
*/
@@ -263,7 +732,7 @@ void do_daemon_stop(void)
log_warning(_("unable to determine repmgrd PID"));
if (have_db_connection == true)
log_hint(REPMGR_SERVICE_STATUS_STOP_HINT);
log_hint(REPMGR_DAEMON_STATUS_STOP_HINT);
exit(ERR_REPMGRD_SERVICE);
}
@@ -295,7 +764,7 @@ void do_daemon_stop(void)
timeout);
if (have_db_connection == true)
log_hint(REPMGR_SERVICE_STATUS_START_HINT);
log_hint(REPMGR_DAEMON_STATUS_START_HINT);
exit(ERR_REPMGRD_SERVICE);
}
@@ -314,30 +783,53 @@ void do_daemon_help(void)
print_help_header();
printf(_("Usage:\n"));
printf(_(" %s [OPTIONS] daemon status\n"), progname());
printf(_(" %s [OPTIONS] daemon pause\n"), progname());
printf(_(" %s [OPTIONS] daemon unpause\n"), progname());
printf(_(" %s [OPTIONS] daemon start\n"), progname());
printf(_(" %s [OPTIONS] daemon stop\n"), progname());
puts("");
printf(_("DAEMON STATUS\n"));
puts("");
printf(_(" \"daemon status\" shows the status of repmgrd on each node in the cluster\n"));
puts("");
printf(_(" --csv emit output as CSV\n"));
printf(_(" --detail show additional detail\n"));
printf(_(" --verbose show text of database connection error messages\n"));
puts("");
printf(_("DAEMON START\n"));
puts("");
printf(_(" \"daemon start\" attempts to start repmgrd on the local node\n"));
printf(_(" \"daemon start\" attempts to start repmgrd\n"));
puts("");
printf(_(" --dry-run check prerequisites but don't start repmgrd\n"));
printf(_(" -w/--wait wait for repmgrd to start (default: %i seconds)\n"), REPMGR_SERVICE_STOP_START_WAIT);
printf(_(" -w/--wait wait for repmgrd to start (default: %i seconds)\n"), REPMGR_DAEMON_STOP_START_WAIT);
printf(_(" --no-wait don't wait for repmgrd to start\n"));
puts("");
printf(_("DAEMON STOP\n"));
puts("");
printf(_(" \"daemon stop\" attempts to stop repmgrd on the local node\n"));
printf(_(" \"daemon stop\" attempts to stop repmgrd\n"));
puts("");
printf(_(" --dry-run check prerequisites but don't stop repmgrd\n"));
printf(_(" -w/--wait wait for repmgrd to stop (default: %i seconds)\n"), REPMGR_SERVICE_STOP_START_WAIT);
printf(_(" -w/--wait wait for repmgrd to stop (default: %i seconds)\n"), REPMGR_DAEMON_STOP_START_WAIT);
printf(_(" --no-wait don't wait for repmgrd to stop\n"));
puts("");
printf(_("DAEMON PAUSE\n"));
puts("");
printf(_(" \"daemon pause\" instructs repmgrd on each node to pause failover detection\n"));
puts("");
printf(_(" --dry-run check if nodes are reachable but don't pause repmgrd\n"));
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
printf(_("DAEMON UNPAUSE\n"));
puts("");
printf(_(" \"daemon unpause\" instructs repmgrd on each node to resume failover detection\n"));
puts("");
printf(_(" --dry-run check if nodes are reachable but don't unpause repmgrd\n"));
puts("");
puts("");
}

View File

@@ -1,6 +1,6 @@
/*
* repmgr-action-daemon.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -19,6 +19,10 @@
#ifndef _REPMGR_ACTION_DAEMON_H_
#define _REPMGR_ACTION_DAEMON_H_
extern void do_daemon_status(void);
extern void do_daemon_pause(void);
extern void do_daemon_unpause(void);
extern void do_daemon_start(void);
extern void do_daemon_stop(void);

File diff suppressed because it is too large Load Diff

View File

@@ -1,6 +1,6 @@
/*
* repmgr-action-node.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

View File

@@ -3,7 +3,7 @@
*
* Implements primary actions for the repmgr command line utility
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -106,7 +106,7 @@ do_primary_register(void)
current_primary_id = get_primary_node_id(conn);
if (current_primary_id != NODE_NOT_FOUND && current_primary_id != config_file_options.node_id)
{
log_debug("current active primary node ID is %i", current_primary_id);
log_debug("XXX %i", current_primary_id);
primary_conn = establish_primary_db_connection(conn, false);
if (PQstatus(primary_conn) == CONNECTION_OK)
@@ -454,9 +454,9 @@ do_primary_unregister(void)
/*
* This appears to be the cluster primary - cowardly refuse to
* delete the record, unless --force is supplied.
* delete the record
*/
if (primary_node_info.node_id == target_node_info_ptr->node_id && !runtime_options.force)
if (primary_node_info.node_id == target_node_info_ptr->node_id)
{
log_error(_("node \"%s\" (ID: %i) is the current primary node, unable to unregister"),
target_node_info_ptr->node_name,
@@ -575,5 +575,4 @@ do_primary_help(void)
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
}

View File

@@ -1,6 +1,6 @@
/*
* repmgr-action-primary.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

View File

@@ -1,548 +0,0 @@
/*
* repmgr-action-service.c
*
* Implements repmgrd actions for the repmgr command line utility
* Copyright (c) 2ndQuadrant, 2010-2020
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <signal.h>
#include <sys/stat.h> /* for stat() */
#include "repmgr.h"
#include "repmgr-client-global.h"
#include "repmgr-action-service.h"
/*
* Possibly also show:
* - repmgrd start time?
* - repmgrd mode
* - priority
* - whether promotion candidate (due to zero priority/different location)
*/
typedef enum
{
STATUS_ID = 0,
STATUS_NAME,
STATUS_ROLE,
STATUS_PG,
STATUS_UPSTREAM_NAME,
STATUS_LOCATION,
STATUS_PRIORITY,
STATUS_REPMGRD,
STATUS_PID,
STATUS_PAUSED,
STATUS_UPSTREAM_LAST_SEEN
} StatusHeader;
#define STATUS_HEADER_COUNT 11
struct ColHeader headers_status[STATUS_HEADER_COUNT];
static void fetch_node_records(PGconn *conn, NodeInfoList *node_list);
static void _do_repmgr_pause(bool pause);
void
do_service_status(void)
{
PGconn *conn = NULL;
NodeInfoList nodes = T_NODE_INFO_LIST_INITIALIZER;
NodeInfoListCell *cell = NULL;
int i;
RepmgrdInfo **repmgrd_info;
ItemList warnings = {NULL, NULL};
bool connection_error_found = false;
/* Connect to local database to obtain cluster connection data */
log_verbose(LOG_INFO, _("connecting to database"));
if (strlen(config_file_options.conninfo))
conn = establish_db_connection(config_file_options.conninfo, true);
else
conn = establish_db_connection_by_params(&source_conninfo, true);
fetch_node_records(conn, &nodes);
repmgrd_info = (RepmgrdInfo **) pg_malloc0(sizeof(RepmgrdInfo *) * nodes.node_count);
if (repmgrd_info == NULL)
{
log_error(_("unable to allocate memory"));
exit(ERR_OUT_OF_MEMORY);
}
strncpy(headers_status[STATUS_ID].title, _("ID"), MAXLEN);
strncpy(headers_status[STATUS_NAME].title, _("Name"), MAXLEN);
strncpy(headers_status[STATUS_ROLE].title, _("Role"), MAXLEN);
strncpy(headers_status[STATUS_PG].title, _("Status"), MAXLEN);
strncpy(headers_status[STATUS_UPSTREAM_NAME].title, _("Upstream"), MAXLEN);
/* following only displayed with the --detail option */
strncpy(headers_status[STATUS_LOCATION].title, _("Location"), MAXLEN);
if (runtime_options.compact == true)
strncpy(headers_status[STATUS_PRIORITY].title, _("Prio."), MAXLEN);
else
strncpy(headers_status[STATUS_PRIORITY].title, _("Priority"), MAXLEN);
strncpy(headers_status[STATUS_REPMGRD].title, _("repmgrd"), MAXLEN);
strncpy(headers_status[STATUS_PID].title, _("PID"), MAXLEN);
strncpy(headers_status[STATUS_PAUSED].title, _("Paused?"), MAXLEN);
if (runtime_options.compact == true)
strncpy(headers_status[STATUS_UPSTREAM_LAST_SEEN].title, _("Upstr. last"), MAXLEN);
else
strncpy(headers_status[STATUS_UPSTREAM_LAST_SEEN].title, _("Upstream last seen"), MAXLEN);
for (i = 0; i < STATUS_HEADER_COUNT; i++)
{
headers_status[i].max_length = strlen(headers_status[i].title);
headers_status[i].display = true;
}
if (runtime_options.detail == false)
{
headers_status[STATUS_LOCATION].display = false;
headers_status[STATUS_PRIORITY].display = false;
}
i = 0;
for (cell = nodes.head; cell; cell = cell->next)
{
int j;
PQExpBufferData node_status;
PQExpBufferData upstream;
repmgrd_info[i] = pg_malloc0(sizeof(RepmgrdInfo));
repmgrd_info[i]->node_id = cell->node_info->node_id;
repmgrd_info[i]->pid = UNKNOWN_PID;
repmgrd_info[i]->recovery_type = RECTYPE_UNKNOWN;
repmgrd_info[i]->paused = false;
repmgrd_info[i]->running = false;
repmgrd_info[i]->pg_running = true;
repmgrd_info[i]->wal_paused_pending_wal = false;
repmgrd_info[i]->upstream_last_seen = -1;
cell->node_info->conn = establish_db_connection_quiet(cell->node_info->conninfo);
if (PQstatus(cell->node_info->conn) != CONNECTION_OK)
{
connection_error_found = true;
if (runtime_options.verbose)
{
char error[MAXLEN];
strncpy(error, PQerrorMessage(cell->node_info->conn), MAXLEN);
item_list_append_format(&warnings,
"when attempting to connect to node \"%s\" (ID: %i), following error encountered :\n\"%s\"",
cell->node_info->node_name, cell->node_info->node_id, trim(error));
}
else
{
item_list_append_format(&warnings,
"unable to connect to node \"%s\" (ID: %i)",
cell->node_info->node_name, cell->node_info->node_id);
}
repmgrd_info[i]->pg_running = false;
maxlen_snprintf(repmgrd_info[i]->repmgrd_running, "%s", _("n/a"));
maxlen_snprintf(repmgrd_info[i]->pid_text, "%s", _("n/a"));
}
else
{
cell->node_info->node_status = NODE_STATUS_UP;
cell->node_info->recovery_type = get_recovery_type(cell->node_info->conn);
repmgrd_info[i]->pid = repmgrd_get_pid(cell->node_info->conn);
repmgrd_info[i]->running = repmgrd_is_running(cell->node_info->conn);
if (repmgrd_info[i]->running == true)
{
maxlen_snprintf(repmgrd_info[i]->repmgrd_running, "%s", _("running"));
}
else
{
maxlen_snprintf(repmgrd_info[i]->repmgrd_running, "%s", _("not running"));
}
if (repmgrd_info[i]->pid == UNKNOWN_PID)
{
maxlen_snprintf(repmgrd_info[i]->pid_text, "%s", _("n/a"));
}
else
{
maxlen_snprintf(repmgrd_info[i]->pid_text, "%i", repmgrd_info[i]->pid);
}
repmgrd_info[i]->paused = repmgrd_is_paused(cell->node_info->conn);
repmgrd_info[i]->recovery_type = get_recovery_type(cell->node_info->conn);
if (repmgrd_info[i]->recovery_type == RECTYPE_STANDBY)
{
repmgrd_info[i]->wal_paused_pending_wal = is_wal_replay_paused(cell->node_info->conn, true);
if (repmgrd_info[i]->wal_paused_pending_wal == true)
{
item_list_append_format(&warnings,
_("WAL replay is paused on node \"%s\" (ID: %i) with WAL replay pending; this node cannot be manually promoted until WAL replay is resumed"),
cell->node_info->node_name, cell->node_info->node_id);
}
}
repmgrd_info[i]->upstream_last_seen = get_upstream_last_seen(cell->node_info->conn, cell->node_info->type);
if (repmgrd_info[i]->upstream_last_seen < 0)
{
maxlen_snprintf(repmgrd_info[i]->upstream_last_seen_text, "%s", _("n/a"));
}
else
{
if (runtime_options.compact == true)
{
maxlen_snprintf(repmgrd_info[i]->upstream_last_seen_text, _("%i sec(s) ago"), repmgrd_info[i]->upstream_last_seen);
}
else
{
maxlen_snprintf(repmgrd_info[i]->upstream_last_seen_text, _("%i second(s) ago"), repmgrd_info[i]->upstream_last_seen);
}
}
}
initPQExpBuffer(&node_status);
initPQExpBuffer(&upstream);
(void)format_node_status(cell->node_info, &node_status, &upstream, &warnings);
snprintf(repmgrd_info[i]->pg_running_text, sizeof(cell->node_info->details),
"%s", node_status.data);
snprintf(cell->node_info->upstream_node_name, sizeof(cell->node_info->upstream_node_name),
"%s", upstream.data);
termPQExpBuffer(&node_status);
termPQExpBuffer(&upstream);
PQfinish(cell->node_info->conn);
headers_status[STATUS_NAME].cur_length = strlen(cell->node_info->node_name);
headers_status[STATUS_ROLE].cur_length = strlen(get_node_type_string(cell->node_info->type));
headers_status[STATUS_PG].cur_length = strlen(repmgrd_info[i]->pg_running_text);
headers_status[STATUS_UPSTREAM_NAME].cur_length = strlen(cell->node_info->upstream_node_name);
if (runtime_options.detail == true)
{
PQExpBufferData buf;
headers_status[STATUS_LOCATION].cur_length = strlen(cell->node_info->location);
initPQExpBuffer(&buf);
appendPQExpBuffer(&buf, "%i", cell->node_info->priority);
headers_status[STATUS_PRIORITY].cur_length = strlen(buf.data);
termPQExpBuffer(&buf);
}
headers_status[STATUS_PID].cur_length = strlen(repmgrd_info[i]->pid_text);
headers_status[STATUS_REPMGRD].cur_length = strlen(repmgrd_info[i]->repmgrd_running);
headers_status[STATUS_UPSTREAM_LAST_SEEN].cur_length = strlen(repmgrd_info[i]->upstream_last_seen_text);
for (j = 0; j < STATUS_HEADER_COUNT; j++)
{
if (headers_status[j].cur_length > headers_status[j].max_length)
{
headers_status[j].max_length = headers_status[j].cur_length;
}
}
i++;
}
/* Print column header row (text mode only) */
if (runtime_options.output_mode == OM_TEXT)
{
print_status_header(STATUS_HEADER_COUNT, headers_status);
}
i = 0;
for (cell = nodes.head; cell; cell = cell->next)
{
if (runtime_options.output_mode == OM_CSV)
{
int running = repmgrd_info[i]->running ? 1 : 0;
int paused = repmgrd_info[i]->paused ? 1 : 0;
/* If PostgreSQL is not running, repmgrd status is unknown */
if (repmgrd_info[i]->pg_running == false)
{
running = -1;
paused = -1;
}
printf("%i,%s,%s,%i,%i,%i,%i,%i,%i,%s\n",
cell->node_info->node_id,
cell->node_info->node_name,
get_node_type_string(cell->node_info->type),
repmgrd_info[i]->pg_running ? 1 : 0,
running,
repmgrd_info[i]->pid,
paused,
cell->node_info->priority,
repmgrd_info[i]->pid == UNKNOWN_PID
? -1
: repmgrd_info[i]->upstream_last_seen,
cell->node_info->location);
}
else
{
printf(" %-*i ", headers_status[STATUS_ID].max_length, cell->node_info->node_id);
printf("| %-*s ", headers_status[STATUS_NAME].max_length, cell->node_info->node_name);
printf("| %-*s ", headers_status[STATUS_ROLE].max_length, get_node_type_string(cell->node_info->type));
printf("| %-*s ", headers_status[STATUS_PG].max_length, repmgrd_info[i]->pg_running_text);
printf("| %-*s ", headers_status[STATUS_UPSTREAM_NAME].max_length, cell->node_info->upstream_node_name);
if (runtime_options.detail == true)
{
printf("| %-*s ", headers_status[STATUS_LOCATION].max_length, cell->node_info->location);
printf("| %-*i ", headers_status[STATUS_PRIORITY].max_length, cell->node_info->priority);
}
printf("| %-*s ", headers_status[STATUS_REPMGRD].max_length, repmgrd_info[i]->repmgrd_running);
printf("| %-*s ", headers_status[STATUS_PID].max_length, repmgrd_info[i]->pid_text);
if (repmgrd_info[i]->pid == UNKNOWN_PID)
{
printf("| %-*s ", headers_status[STATUS_PAUSED].max_length, _("n/a"));
printf("| %-*s ", headers_status[STATUS_UPSTREAM_LAST_SEEN].max_length, _("n/a"));
}
else
{
printf("| %-*s ", headers_status[STATUS_PAUSED].max_length, repmgrd_info[i]->paused ? _("yes") : _("no"));
printf("| %-*s ", headers_status[STATUS_UPSTREAM_LAST_SEEN].max_length, repmgrd_info[i]->upstream_last_seen_text);
}
printf("\n");
}
pfree(repmgrd_info[i]);
i++;
}
pfree(repmgrd_info);
/* emit any warnings */
if (warnings.head != NULL && runtime_options.terse == false && runtime_options.output_mode != OM_CSV)
{
ItemListCell *cell = NULL;
PQExpBufferData warning;
initPQExpBuffer(&warning);
appendPQExpBufferStr(&warning,
_("following issues were detected\n"));
for (cell = warnings.head; cell; cell = cell->next)
{
appendPQExpBuffer(&warning,
_(" - %s\n"), cell->string);
}
puts("");
log_warning("%s", warning.data);
termPQExpBuffer(&warning);
if (runtime_options.verbose == false && connection_error_found == true)
{
log_hint(_("execute with --verbose option to see connection error messages"));
}
}
}
void
do_service_pause(void)
{
_do_repmgr_pause(true);
}
void
do_service_unpause(void)
{
_do_repmgr_pause(false);
}
static void
_do_repmgr_pause(bool pause)
{
PGconn *conn = NULL;
NodeInfoList nodes = T_NODE_INFO_LIST_INITIALIZER;
NodeInfoListCell *cell = NULL;
int i;
int error_nodes = 0;
/* Connect to local database to obtain cluster connection data */
log_verbose(LOG_INFO, _("connecting to database"));
if (strlen(config_file_options.conninfo))
conn = establish_db_connection(config_file_options.conninfo, true);
else
conn = establish_db_connection_by_params(&source_conninfo, true);
fetch_node_records(conn, &nodes);
i = 0;
for (cell = nodes.head; cell; cell = cell->next)
{
log_verbose(LOG_DEBUG, "pausing node %i (%s)",
cell->node_info->node_id,
cell->node_info->node_name);
cell->node_info->conn = establish_db_connection_quiet(cell->node_info->conninfo);
if (PQstatus(cell->node_info->conn) != CONNECTION_OK)
{
log_warning(_("unable to connect to node %i"),
cell->node_info->node_id);
error_nodes++;
}
else
{
if (runtime_options.dry_run == true)
{
if (pause == true)
{
log_info(_("would pause node %i (%s) "),
cell->node_info->node_id,
cell->node_info->node_name);
}
else
{
log_info(_("would unpause node %i (%s) "),
cell->node_info->node_id,
cell->node_info->node_name);
}
}
else
{
bool success = repmgrd_pause(cell->node_info->conn, pause);
if (success == false)
error_nodes++;
log_notice(_("node %i (%s) %s"),
cell->node_info->node_id,
cell->node_info->node_name,
success == true
? pause == true ? "paused" : "unpaused"
: pause == true ? "not paused" : "not unpaused");
}
PQfinish(cell->node_info->conn);
}
i++;
}
if (error_nodes > 0)
{
if (pause == true)
{
log_error(_("unable to pause %i node(s)"), error_nodes);
}
else
{
log_error(_("unable to unpause %i node(s)"), error_nodes);
}
log_hint(_("execute \"repmgr service status\" to view current status"));
exit(ERR_REPMGRD_PAUSE);
}
exit(SUCCESS);
}
void
fetch_node_records(PGconn *conn, NodeInfoList *node_list)
{
bool success = get_all_node_records_with_upstream(conn, node_list);
if (success == false)
{
/* get_all_node_records() will display any error message */
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
if (node_list->node_count == 0)
{
log_error(_("no node records were found"));
log_hint(_("ensure at least one node is registered"));
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
}
void do_service_help(void)
{
print_help_header();
printf(_("Usage:\n"));
printf(_(" %s [OPTIONS] service status\n"), progname());
printf(_(" %s [OPTIONS] service pause\n"), progname());
printf(_(" %s [OPTIONS] service unpause\n"), progname());
puts("");
printf(_("SERVICE STATUS\n"));
puts("");
printf(_(" \"service status\" shows the status of repmgrd on each node in the cluster\n"));
puts("");
printf(_(" --csv emit output as CSV\n"));
printf(_(" --detail show additional detail\n"));
printf(_(" --verbose show text of database connection error messages\n"));
puts("");
printf(_("SERVICE PAUSE\n"));
puts("");
printf(_(" \"service pause\" instructs repmgrd on each node to pause failover detection\n"));
puts("");
printf(_(" --dry-run check if nodes are reachable but don't pause repmgrd\n"));
puts("");
printf(_("SERVICE UNPAUSE\n"));
puts("");
printf(_(" \"service unpause\" instructs repmgrd on each node to resume failover detection\n"));
puts("");
printf(_(" --dry-run check if nodes are reachable but don't unpause repmgrd\n"));
puts("");
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,6 +1,6 @@
/*
* repmgr-action-standby.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

View File

@@ -3,7 +3,7 @@
*
* Implements witness actions for the repmgr command line utility
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -74,6 +74,18 @@ do_witness_register(void)
exit(ERR_BAD_CONFIG);
}
/* check that witness node is not a BDR node */
if (is_bdr_db_quiet(witness_conn) == true)
{
log_error(_("witness node is a BDR node"));
log_hint(_("a witness node cannot be configured for a BDR cluster"));
PQfinish(witness_conn);
exit(ERR_BAD_CONFIG);
}
/* connect to primary with provided parameters */
log_info(_("connecting to primary node"));
@@ -182,6 +194,19 @@ do_witness_register(void)
}
}
/* check that primary node is not a BDR node */
if (is_bdr_db_quiet(primary_conn) == true)
{
log_error(_("primary node is a BDR node"));
log_hint(_("a witness node cannot be configured for a BDR cluster"));
PQfinish(witness_conn);
PQfinish(primary_conn);
exit(ERR_BAD_CONFIG);
}
/* create repmgr extension, if does not exist */
if (runtime_options.dry_run == false && !create_repmgr_extension(witness_conn))
{
@@ -569,5 +594,5 @@ void do_witness_help(void)
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
return;
}

View File

@@ -1,6 +1,6 @@
/*
* repmgr-action-witness.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

View File

@@ -1,6 +1,6 @@
/*
* repmgr-client-global.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -28,8 +28,6 @@
/* default value for "cluster event --limit"*/
#define CLUSTER_EVENT_LIMIT 20
typedef struct
{
/* configuration metadata */
@@ -48,7 +46,6 @@ typedef struct
bool no_wait;
bool compact;
bool detail;
bool dump_config;
/* logging options */
char log_level[MAXLEN]; /* overrides setting in repmgr.conf */
@@ -88,8 +85,7 @@ typedef struct
char replication_user[MAXLEN];
char upstream_conninfo[MAXLEN];
bool without_barman;
bool replication_conf_only;
bool verify_backup;
bool recovery_conf_only;
/* "standby clone"/"standby follow" options */
int upstream_node_id;
@@ -113,7 +109,6 @@ typedef struct
/* "node check" options */
bool archive_ready;
bool downstream;
bool upstream;
bool replication_lag;
bool role;
bool slots;
@@ -121,8 +116,6 @@ typedef struct
bool has_passfile;
bool replication_connection;
bool data_directory_config;
bool replication_config_owner;
bool db_connection;
/* "node rejoin" options */
char config_files[MAXLEN];
@@ -143,7 +136,7 @@ typedef struct
/* following options for internal use */
char config_archive_dir[MAXPGPATH];
OutputMode output_mode; /* set through provision of --csv, --nagios or --optformat */
OutputMode output_mode;
bool disable_wal_receiver;
bool enable_wal_receiver;
} t_runtime_options;
@@ -152,7 +145,7 @@ typedef struct
/* configuration metadata */ \
false, false, false, false, false, \
/* general configuration options */ \
"", false, false, "", -1, false, false, false, false, \
"", false, false, "", -1, false, false, false, \
/* logging options */ \
"", false, false, false, false, \
/* output options */ \
@@ -165,7 +158,7 @@ typedef struct
UNKNOWN_NODE_ID, "", "", UNKNOWN_NODE_ID, \
/* "standby clone" options */ \
false, CONFIG_FILE_SAMEPATH, false, false, false, "", "", "", \
false, false, false, \
false, false, \
/* "standby clone"/"standby follow" options */ \
NO_UPSTREAM_NODE, \
/* "standby register" options */ \
@@ -175,7 +168,7 @@ typedef struct
/* "node status" options */ \
false, \
/* "node check" options */ \
false, false, false, false, false, false, false, false, false, false, false, false, \
false, false, false, false, false, false, false, false, false, \
/* "node rejoin" options */ \
"", \
/* "node service" options */ \
@@ -208,30 +201,6 @@ typedef enum
} t_server_action;
typedef enum
{
USER_TYPE_UNKNOWN = -1,
REPMGR_USER,
REPLICATION_USER_OPT,
REPLICATION_USER_NODE,
SUPERUSER
} t_user_type;
typedef enum
{
JOIN_SUCCESS,
JOIN_FAIL_NO_PING,
JOIN_FAIL_NO_REPLICATION
} standy_join_status;
typedef enum
{
REMOTE_ERROR_UNKNOWN = -1,
REMOTE_ERROR_NONE,
REMOTE_ERROR_DB_CONNECTION,
REMOTE_ERROR_CONNINFO_PARSE
} t_remote_error_type;
typedef struct ColHeader
{
char title[MAXLEN];
@@ -242,17 +211,21 @@ typedef struct ColHeader
/* globally available configuration structures */
/* global configuration structures */
extern t_runtime_options runtime_options;
extern t_conninfo_param_list source_conninfo;
extern t_node_info target_node_info;
extern t_configuration_options config_file_options;
t_conninfo_param_list source_conninfo;
/* global variables */
extern bool config_file_required;
extern char pg_bindir[MAXLEN];
/* global functions */
extern t_node_info target_node_info;
extern int check_server_version(PGconn *conn, char *server_type, bool exit_on_error, char *server_version_string);
extern void check_93_config(void);
extern bool create_repmgr_extension(PGconn *conn);
extern int test_ssh_connection(char *host, char *remote_user);
@@ -263,7 +236,7 @@ extern int copy_remote_files(char *host, char *remote_user, char *remote_path,
extern void print_error_list(ItemList *error_list, int log_level);
extern void make_pg_path(PQExpBufferData *buf, const char *file);
extern char *make_pg_path(const char *file);
extern void get_superuser_connection(PGconn **conn, PGconn **superuser_conn, PGconn **privileged_conn);
@@ -282,19 +255,10 @@ extern void get_node_config_directory(char *config_dir_buf);
extern void get_node_data_directory(char *data_dir_buf);
extern void init_node_record(t_node_info *node_record);
extern bool can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *reason);
extern void make_standby_signal_path(char *buf);
extern bool write_standby_signal(void);
extern void drop_replication_slot_if_exists(PGconn *conn, int node_id, char *slot_name);
extern bool create_replication_slot(PGconn *conn, char *slot_name, t_node_info *upstream_node_record, PQExpBufferData *error_msg);
extern bool drop_replication_slot_if_exists(PGconn *conn, int node_id, char *slot_name);
extern standy_join_status check_standby_join(PGconn *primary_conn, t_node_info *primary_node_record, t_node_info *standby_node_record);
extern bool check_replication_slots_available(int node_id, PGconn* conn);
extern bool check_node_can_attach(TimeLineID local_tli, XLogRecPtr local_xlogpos, PGconn *follow_target_conn, t_node_info *follow_target_node_record, bool is_rejoin);
extern bool check_replication_config_owner(int pg_version, const char *data_directory, PQExpBufferData *error_msg, PQExpBufferData *detail_msg);
extern void check_shared_library(PGconn *conn);
extern bool is_repmgrd_running(PGconn *conn);
extern int parse_repmgr_version(const char *version_string);
#endif /* _REPMGR_CLIENT_GLOBAL_H_ */

File diff suppressed because it is too large Load Diff

View File

@@ -1,6 +1,6 @@
/*
* repmgr-client.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -34,21 +34,23 @@
#define STANDBY_SWITCHOVER 8
#define WITNESS_REGISTER 9
#define WITNESS_UNREGISTER 10
#define NODE_STATUS 11
#define NODE_CHECK 12
#define NODE_SERVICE 13
#define NODE_REJOIN 14
#define NODE_CONTROL 15
#define CLUSTER_SHOW 16
#define CLUSTER_CLEANUP 17
#define CLUSTER_MATRIX 18
#define CLUSTER_CROSSCHECK 19
#define CLUSTER_EVENT 20
#define SERVICE_STATUS 21
#define SERVICE_PAUSE 22
#define SERVICE_UNPAUSE 23
#define DAEMON_START 24
#define DAEMON_STOP 25
#define BDR_REGISTER 11
#define BDR_UNREGISTER 12
#define NODE_STATUS 13
#define NODE_CHECK 14
#define NODE_SERVICE 15
#define NODE_REJOIN 16
#define NODE_CONTROL 17
#define CLUSTER_SHOW 18
#define CLUSTER_CLEANUP 19
#define CLUSTER_MATRIX 20
#define CLUSTER_CROSSCHECK 21
#define CLUSTER_EVENT 22
#define DAEMON_STATUS 23
#define DAEMON_PAUSE 24
#define DAEMON_UNPAUSE 25
#define DAEMON_START 26
#define DAEMON_STOP 27
/* command line options without short versions */
#define OPT_HELP 1001
@@ -81,33 +83,31 @@
#define OPT_SIBLINGS_FOLLOW 1028
#define OPT_ROLE 1029
#define OPT_DOWNSTREAM 1030
#define OPT_UPSTREAM 1031
#define OPT_SLOTS 1032
#define OPT_SLOTS 1031
#define OPT_CONFIG_ARCHIVE_DIR 1032
#define OPT_HAS_PASSFILE 1033
#define OPT_WAIT_START 1034
#define OPT_REPL_CONN 1035
#define OPT_REMOTE_NODE_ID 1036
#define OPT_REPLICATION_CONF_ONLY 1037
#define OPT_RECOVERY_CONF_ONLY 1037
#define OPT_NO_WAIT 1038
#define OPT_MISSING_SLOTS 1039
#define OPT_REPMGRD_NO_PAUSE 1040
#define OPT_VERSION_NUMBER 1041
#define OPT_DATA_DIRECTORY_CONFIG 1042
#define OPT_COMPACT 1043
#define OPT_DETAIL 1044
#define OPT_REPMGRD_FORCE_UNPAUSE 1045
#define OPT_REPLICATION_CONFIG_OWNER 1046
#define OPT_DB_CONNECTION 1047
#define OPT_VERIFY_BACKUP 1048
/* These options are for internal use only */
#define OPT_CONFIG_ARCHIVE_DIR 2001
#define OPT_DISABLE_WAL_RECEIVER 2002
#define OPT_ENABLE_WAL_RECEIVER 2003
#define OPT_DUMP_CONFIG 2004
#define OPT_DISABLE_WAL_RECEIVER 1044
#define OPT_ENABLE_WAL_RECEIVER 1045
#define OPT_DETAIL 1046
#define OPT_REPMGRD_FORCE_UNPAUSE 1047
/* deprecated since 3.3 */
#define OPT_DATA_DIR 999
#define OPT_NO_CONNINFO_PASSWORD 998
#define OPT_RECOVERY_MIN_APPLY_DELAY 997
/* deprecated since 4.0 */
#define OPT_CHECK_UPSTREAM_CONFIG 999
#define OPT_CHECK_UPSTREAM_CONFIG 996
#define OPT_NODE 995
static struct option long_options[] =
@@ -126,7 +126,6 @@ static struct option long_options[] =
{"no-wait", no_argument, NULL, 'W'},
{"compact", no_argument, NULL, OPT_COMPACT},
{"detail", no_argument, NULL, OPT_DETAIL},
{"dump-config", no_argument, NULL, OPT_DUMP_CONFIG},
/* connection options */
{"dbname", required_argument, NULL, 'd'},
@@ -158,14 +157,12 @@ static struct option long_options[] =
{"copy-external-config-files", optional_argument, NULL, OPT_COPY_EXTERNAL_CONFIG_FILES},
{"fast-checkpoint", no_argument, NULL, 'c'},
{"no-upstream-connection", no_argument, NULL, OPT_NO_UPSTREAM_CONNECTION},
{"recovery-min-apply-delay", required_argument, NULL, OPT_RECOVERY_MIN_APPLY_DELAY},
{"replication-user", required_argument, NULL, OPT_REPLICATION_USER},
{"upstream-conninfo", required_argument, NULL, OPT_UPSTREAM_CONNINFO},
{"upstream-node-id", required_argument, NULL, OPT_UPSTREAM_NODE_ID},
{"without-barman", no_argument, NULL, OPT_WITHOUT_BARMAN},
{"replication-conf-only", no_argument, NULL, OPT_REPLICATION_CONF_ONLY},
{"verify-backup", no_argument, NULL, OPT_VERIFY_BACKUP },
/* deprecate this once Pg11 and earlier are unsupported */
{"recovery-conf-only", no_argument, NULL, OPT_REPLICATION_CONF_ONLY},
{"recovery-conf-only", no_argument, NULL, OPT_RECOVERY_CONF_ONLY},
/* "standby register" options */
{"wait-start", required_argument, NULL, OPT_WAIT_START},
@@ -186,7 +183,6 @@ static struct option long_options[] =
/* "node check" options */
{"archive-ready", no_argument, NULL, OPT_ARCHIVE_READY},
{"downstream", no_argument, NULL, OPT_DOWNSTREAM},
{"upstream", no_argument, NULL, OPT_UPSTREAM},
{"replication-lag", no_argument, NULL, OPT_REPLICATION_LAG},
{"role", no_argument, NULL, OPT_ROLE},
{"slots", no_argument, NULL, OPT_SLOTS},
@@ -194,8 +190,6 @@ static struct option long_options[] =
{"has-passfile", no_argument, NULL, OPT_HAS_PASSFILE},
{"replication-connection", no_argument, NULL, OPT_REPL_CONN},
{"data-directory-config", no_argument, NULL, OPT_DATA_DIRECTORY_CONFIG},
{"replication-config-owner", no_argument, NULL, OPT_REPLICATION_CONFIG_OWNER},
{"db-connection", no_argument, NULL, OPT_DB_CONNECTION},
/* "node rejoin" options */
{"config-files", required_argument, NULL, OPT_CONFIG_FILES},
@@ -221,8 +215,14 @@ static struct option long_options[] =
/* deprecated */
{"check-upstream-config", no_argument, NULL, OPT_CHECK_UPSTREAM_CONFIG},
{"no-conninfo-password", no_argument, NULL, OPT_NO_CONNINFO_PASSWORD},
/* previously used by "standby switchover" */
{"remote-config-file", required_argument, NULL, 'C'},
/* legacy alias for -D/--pgdata */
{"data-dir", required_argument, NULL, OPT_DATA_DIR},
/* replaced by --node-id */
{"node", required_argument, NULL, OPT_NODE},
{NULL, 0, NULL, 0}
};

View File

@@ -1,7 +1,7 @@
/*
* repmgr.c - repmgr extension
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) 2ndQuadrant, 2010-2019
*
* This is the actual extension code; see repmgr-client.c for the code which
* generates the repmgr binary
@@ -33,14 +33,22 @@
#include "storage/shmem.h"
#include "storage/spin.h"
#include "utils/builtins.h"
#if (PG_VERSION_NUM >= 90400)
#include "utils/pg_lsn.h"
#endif
#include "utils/timestamp.h"
#include "lib/stringinfo.h"
#include "access/xact.h"
#include "utils/snapmgr.h"
#if (PG_VERSION_NUM >= 90400)
#include "pgstat.h"
#else
#define PGSTAT_STAT_PERMANENT_DIRECTORY "pg_stat"
#endif
#include "voting.h"
@@ -76,6 +84,8 @@ typedef struct repmgrdSharedState
int current_electoral_term;
int candidate_node_id;
bool follow_new_primary;
/* BDR failover */
int bdr_failover_handler;
} repmgrdSharedState;
static repmgrdSharedState *shared_state = NULL;
@@ -121,6 +131,12 @@ PG_FUNCTION_INFO_V1(get_new_primary);
Datum reset_voting_status(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(reset_voting_status);
Datum am_bdr_failover_handler(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(am_bdr_failover_handler);
Datum unset_bdr_failover_handler(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(unset_bdr_failover_handler);
Datum set_repmgrd_pid(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(set_repmgrd_pid);
@@ -149,6 +165,8 @@ PG_FUNCTION_INFO_V1(get_wal_receiver_pid);
void
_PG_init(void)
{
elog(DEBUG1, "repmgr init");
if (!process_shared_preload_libraries_in_progress)
return;
@@ -165,6 +183,7 @@ _PG_init(void)
*/
prev_shmem_startup_hook = shmem_startup_hook;
shmem_startup_hook = repmgr_shmem_startup;
}
@@ -222,6 +241,7 @@ repmgr_shmem_startup(void)
shared_state->voting_status = VS_NO_VOTE;
shared_state->candidate_node_id = UNKNOWN_NODE_ID;
shared_state->follow_new_primary = false;
shared_state->bdr_failover_handler = UNKNOWN_NODE_ID;
}
LWLockRelease(AddinShmemInitLock);
@@ -551,6 +571,63 @@ reset_voting_status(PG_FUNCTION_ARGS)
}
Datum
am_bdr_failover_handler(PG_FUNCTION_ARGS)
{
int node_id = UNKNOWN_NODE_ID;
bool am_handler = false;
if (!shared_state)
PG_RETURN_NULL();
if (PG_ARGISNULL(0))
PG_RETURN_NULL();
node_id = PG_GETARG_INT32(0);
LWLockAcquire(shared_state->lock, LW_SHARED);
if (shared_state->bdr_failover_handler == UNKNOWN_NODE_ID)
{
LWLockRelease(shared_state->lock);
LWLockAcquire(shared_state->lock, LW_EXCLUSIVE);
shared_state->bdr_failover_handler = node_id;
am_handler = true;
}
else if (shared_state->bdr_failover_handler == node_id)
{
am_handler = true;
}
LWLockRelease(shared_state->lock);
PG_RETURN_BOOL(am_handler);
}
Datum
unset_bdr_failover_handler(PG_FUNCTION_ARGS)
{
if (!shared_state)
PG_RETURN_NULL();
LWLockAcquire(shared_state->lock, LW_SHARED);
/* only do something if local_node_id is initialised */
if (shared_state->local_node_id != UNKNOWN_NODE_ID)
{
LWLockRelease(shared_state->lock);
LWLockAcquire(shared_state->lock, LW_EXCLUSIVE);
shared_state->bdr_failover_handler = UNKNOWN_NODE_ID;
}
LWLockRelease(shared_state->lock);
PG_RETURN_VOID();
}
/*
* Returns the repmgrd pid; or NULL if none set; or -1 if set but repmgrd
* process not running (TODO!)

View File

@@ -6,13 +6,12 @@
# is noted for each item. Where no default value is shown, the
# parameter will be treated as empty or false.
#
# repmgr parses its configuration file in the same way as PostgreSQL itself
# does. In particular, strings must be enclosed in single quotes (although
# simple identifiers may be provided as-is).
# IMPORTANT: string values can be provided as-is, or enclosed in single quotes
# (but not double-quotes, which will be interpreted as part of the string),
# e.g.:
#
# For details on the configuration file format see the documentation at:
#
# https://repmgr.org/docs/current/configuration-file.html#CONFIGURATION-FILE-FORMAT
# node_name=foo
# node_name = 'foo'
#
# =============================================================================
# Required configuration items
@@ -21,7 +20,7 @@
# repmgr and repmgrd require the following items to be explicitly configured.
#node_id= # A unique integer greater than zero
#node_id= # A unique integer greater than zero
#node_name='' # An arbitrary (but unique) string; we recommend
# using the server's hostname or another identifier
# unambiguously associated with the server to avoid
@@ -29,8 +28,8 @@
# node's current role, e.g. 'primary' or 'standby1',
# as roles can change and it will be confusing if
# the current primary is called 'standby1'.
# The string's maximum length is 63 characters and it should
# contain only printable ASCII characters.
# The string's maximum length is 63 characters and it should
# contain only printable ASCII characters.
#conninfo='' # Database connection information as a conninfo string.
# All servers in the cluster must be able to connect to
@@ -71,12 +70,13 @@
#replication_user='repmgr' # User to make replication connections with, if not set
# defaults to the user defined in "conninfo".
#replication_type='physical' # Must "physical" (the default).
#replication_type=physical # Must be one of "physical" or "bdr".
# NOTE: "bdr" can only be used with BDR 2.x
#location='default' # An arbitrary string defining the location of the node; this
#location=default # An arbitrary string defining the location of the node; this
# is used during failover to check visibility of the
# current primary node. For further details see:
# https://repmgr.org/docs/current/repmgrd-network-split.html
# https://repmgr.org/docs/current/repmgrd-network-split.html
#use_replication_slots=no # whether to use physical replication slots
# NOTE: when using replication slots,
@@ -101,10 +101,10 @@
# This is mainly intended for those cases when `repmgr` is executed directly
# by `repmgrd`.
#log_level='INFO' # Log level: possible values are DEBUG, INFO, NOTICE,
#log_level=INFO # Log level: possible values are DEBUG, INFO, NOTICE,
# WARNING, ERROR, ALERT, CRIT or EMERG
#log_facility='STDERR' # Logging facility: possible values are STDERR, or for
#log_facility=STDERR # Logging facility: possible values are STDERR, or for
# syslog integration, one of LOCAL0, LOCAL1, ..., LOCAL7, USER
#log_file='' # STDERR can be redirected to an arbitrary file
@@ -159,13 +159,12 @@
#repmgr_bindir='' # Path to repmgr binary directory (location of the repmgr
# binary. Only needed if the repmgr executable is not in
# the system $PATH or the path defined in "pg_bindir".
# the system $PATH or the path defined in "pg_bindir".
#use_primary_conninfo_password=false # explicitly set "password" in "primary_conninfo"
# using the value contained in the environment variable
# PGPASSWORD
#use_primary_conninfo_password=false # explicitly set "password" in recovery.conf's
# "primary_conninfo" parameter using the value contained
# in the environment variable PGPASSWORD
#passfile='' # path to .pgpass file to include in "primary_conninfo"
#------------------------------------------------------------------------------
# external command options
#------------------------------------------------------------------------------
@@ -179,10 +178,8 @@
# rsync_options=--archive --checksum --compress --progress --rsh="ssh -o \"StrictHostKeyChecking no\""
# ssh_options=-o "StrictHostKeyChecking no"
#pg_ctl_options='' # Options to append to "pg_ctl"
#pg_ctl_options='' # Options to append to "pg_ctl"
#pg_basebackup_options='' # Options to append to "pg_basebackup"
# (Note: when cloning from Barman, repmgr will honour any
# --waldir/--xlogdir setting present in "pg_basebackup_options"
#rsync_options='' # Options to append to "rsync"
ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
@@ -196,24 +193,22 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
#
# Examples:
#
# tablespace_mapping='/path/to/original/tablespace=/path/to/new/tablespace'
# tablespace_mapping=/path/to/original/tablespace=/path/to/new/tablespace
# restore_command = 'cp /path/to/archived/wals/%f %p'
#tablespace_mapping='' # Tablespaces can be remapped from one
#tablespace_mapping='' # Tablespaces can be remapped from one
# file system location to another. This
# parameter can be provided multiple times.
#restore_command='' # This will be included in the recovery configuration
# generated by repmgr.
#restore_command='' # This will be placed in the recovery.conf file generated
# by repmgr.
#archive_cleanup_command='' # This will be included in the recovery configuration
# generated by repmgr. Note we recommend using Barman for
# managing WAL archives (see: https://www.pgbarman.org )
#archive_cleanup_command='' # This will be placed in the recovery.conf file generated
# by repmgr. Note we recommend using Barman for managing
# WAL archives (see: https://www.pgbarman.org )
#recovery_min_apply_delay= # If provided, "recovery_min_apply_delay" will be set to
# this value (PostgreSQL 9.4 and later). Value can be
# an integer representing milliseconds, or a string
# representing a period of time (e.g. '5 min').
#recovery_min_apply_delay= # If provided, "recovery_min_apply_delay" in recovery.conf
# will be set to this value (PostgreSQL 9.4 and later).
#------------------------------------------------------------------------------
@@ -240,8 +235,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# for the new primary to become available
#standby_follow_timeout=15 # The max length of time (in seconds) to wait
# for the standby to connect to the primary
#standby_follow_restart=false # Restart the standby instead of sending a SIGHUP
# (only for PostgreSQL 13 and later)
#------------------------------------------------------------------------------
# "standby switchover" settings
@@ -287,31 +281,31 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# These settings are only applied when repmgrd is running. Values shown
# are defaults.
#failover='manual' # one of 'automatic', 'manual'.
#failover=manual # one of 'automatic', 'manual'.
# determines what action to take in the event of upstream failure
#
# 'automatic': repmgrd will automatically attempt to promote the
# node or follow the new upstream node
# 'manual': repmgrd will take no action and the node will require
# manual attention to reattach it to replication
# (does not apply to BDR mode)
#priority=100 # indicates a preferred priority for promoting nodes;
# a value of zero prevents the node being promoted to primary
# (default: 100)
#connection_check_type=ping # How to check availability of the upstream node; valid options:
# 'ping': use PQping() to check if the node is accepting connections
# 'connection': execute a throwaway query on the current connection
# 'query': execute an SQL statement on the node via the existing connection
# 'ping': use PQping() to check if the node is accepting connections
# 'connection': execute a throwaway query on the current connection
#reconnect_attempts=6 # Number of attempts which will be made to reconnect to an unreachable
# primary (or other upstream node)
#reconnect_interval=10 # Interval between attempts to reconnect to an unreachable
# primary (or other upstream node)
#promote_command='' # command repmgrd executes when promoting a new primary; use something like:
#promote_command= # command repmgrd executes when promoting a new primary; use something like:
#
# repmgr standby promote -f /etc/repmgr.conf
#
#follow_command='' # command repmgrd executes when instructing a standby to follow a new primary;
#follow_command= # command repmgrd executes when instructing a standby to follow a new primary;
# use something like:
#
# repmgr standby follow -f /etc/repmgr.conf -W --upstream-node-id=%n
@@ -323,8 +317,8 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# for the the local node to restart and become ready to accept connections after
# executing "follow_command" (defaults to the value set in "standby_reconnect_timeout")
#monitoring_history=no # Whether to write monitoring data to the "montoring_history" table
#monitor_interval_secs=2 # Interval (in seconds) at which to write monitoring data
#monitoring_history=no # Whether to write monitoring data to the "montoring_history" table
#monitor_interval_secs=2 # Interval (in seconds) at which to write monitoring data
#degraded_monitoring_timeout=-1 # Interval (in seconds) after which repmgrd will terminate if the
# server(s) being monitored are no longer available. -1 (default)
# disables the timeout completely.
@@ -345,8 +339,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# WAL receivers
#primary_visibility_consensus=false # If "true", only continue with failover if no standbys have seen
# the primary node recently. *Must* be the same on all nodes.
#always_promote=false # Always promote a node, even if repmgr metadata is outdated
#failover_validation_command='' # Script to execute for an external mechanism to validate the failover
#failover_validation_command= # Script to execute for an external mechanism to validate the failover
# decision made by repmgrd. One or both of the following parameter placeholders
# should be provided, which will be replaced by repmgrd with the appropriate
# value: %n (node_id), %a (node_name). *Must* be the same on all nodes.
@@ -354,14 +347,14 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# an error, pause the specified amount of seconds before rerunning the election.
#
# The following items are relevant for repmgrd running on the primary,
# and will be ignored on non-primary nodes
# and will be ignored on non-primary nodes
#child_nodes_check_interval=5 # Interval (in seconds) to check for attached child nodes (standbys)
#child_nodes_connected_min_count=-1 # Minimum number of child nodes which must remain connected, otherwise
# disconnection command will be triggered
#child_nodes_disconnect_min_count=-1 # Minimum number of disconnected child nodes required to execute disconnection command
# (ignored if "child_nodes_connected_min_count" set)
#child_nodes_disconnect_timeout=30 # Interval between child node disconnection and disconnection command execution
#child_nodes_disconnect_command='' # Command to execute if child node disconnection detected
#child_nodes_disconnect_command= # Command to execute if child node disconnection detected
#------------------------------------------------------------------------------
# service control commands
@@ -379,7 +372,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# NOTE: These commands must be runnable on remote nodes as well for switchover
# to function correctly.
#
# If you use sudo, the user repmgr runs as (usually 'postgres') must have
# If you use sudo, the user repmgr runs as (usually 'postgres') must have
# passwordless sudo access to execute the command.
#
# For example, to use systemd, you can set
@@ -392,8 +385,8 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# # this is required when running sudo over ssh without -t:
# Defaults:postgres !requiretty
# postgres ALL = NOPASSWD: /usr/bin/systemctl stop postgresql-9.6, \
# /usr/bin/systemctl start postgresql-9.6, \
# /usr/bin/systemctl restart postgresql-9.6
# /usr/bin/systemctl start postgresql-9.6, \
# /usr/bin/systemctl restart postgresql-9.6
#
# Debian/Ubuntu users: use "sudo pg_ctlcluster" to execute service control commands.
#
@@ -409,7 +402,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# for "promote_command"; do not use "repmgr standby promote"
# (or a script which executes "repmgr standby promote") here.
# Used by "repmgr service (start|stop)" to control repmgrd
# Used by "repmgr daemon (start|stop)" to control repmgrd
#
#repmgrd_service_start_command = ''
#repmgrd_service_stop_command = ''
@@ -420,7 +413,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# Various warning/critical thresholds used by "repmgr node check".
#archive_ready_warning=16 # repmgr node check --archive-ready
#archive_ready_warning=16 # repmgr node check --archive-ready
#archive_ready_critical=128 #
# Numbers of files pending archiving via PostgreSQL's
# "archive_command" configuration parameter. If
@@ -441,3 +434,12 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# issues with shutting down the demotion candidate.
#------------------------------------------------------------------------------
# BDR monitoring options
#------------------------------------------------------------------------------
#bdr_local_monitoring_only=false # Only monitor the local node; no checks will be
# performed on the other node
#bdr_recovery_timeout # If a BDR node was offline and has become available
# maximum length of time in seconds to wait for the
# node to reconnect to the cluster

View File

@@ -1,6 +1,6 @@
# repmgr extension
comment = 'Replication manager for PostgreSQL'
default_version = '5.2'
default_version = '4.4'
module_pathname = '$libdir/repmgr'
relocatable = false
schema = repmgr

Some files were not shown because too many files have changed in this diff Show More