Compare commits

8 Commits

Author  SHA1  Message  Date
Ian Barwick  fefa43e3a6  Minor README fix  2016-10-05 16:49:01 +09:00
Ian Barwick  c1a1fe6f82  Update README  2016-10-05 16:48:40 +09:00
    `--ignore-external-config-files` deprecated
Ian Barwick  4dc3a05e8d  Update history  2016-10-05 13:58:05 +09:00
Ian Barwick  5945accd84  Add documentation for repmgrd failover process and failed node fencing  2016-10-05 13:58:01 +09:00
    Addresses GitHub #200.
Ian Barwick  15cbda9ec3  repmgr: consistent error message style  2016-10-05 13:57:57 +09:00
Ian Barwick  358559acc4  Update barman-wal-restore documentation  2016-10-03 16:04:02 +09:00
    Barman 2.0 provides this in a separate, more convenient `barman-cli` package;
    document this and add note about previous `barman-wal-restore.py` script.
Ian Barwick  0a9f8e160a  Tweak repmgr.conf.sample  2016-10-03 16:03:59 +09:00
    Put `monitor_interval_secs` at the start of the `repmgrd` section, as it's
    a very fundamental configuration item.
Ian Barwick  a2d67e85de  Bump version  2016-09-30 15:13:56 +09:00
    3.2
35 changed files with 921 additions and 2316 deletions

@@ -2,7 +2,7 @@ License and Contributions
========================= =========================
`repmgr` is licensed under the GPL v3. All of its code and documentation is `repmgr` is licensed under the GPL v3. All of its code and documentation is
Copyright 2010-2017, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for Copyright 2010-2016, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
details. details.
The development of repmgr has primarily been sponsored by 2ndQuadrant customers. The development of repmgr has primarily been sponsored by 2ndQuadrant customers.
@@ -21,11 +21,9 @@ copy of the relevant Copyright Assignment Form.
Code style Code style
---------- ----------
Code in repmgr should be formatted to the same standards as the main PostgreSQL Code in repmgr is formatted to a consistent style using the following command:
project. For more details see:
https://www.postgresql.org/docs/current/static/source-format.html astyle --style=ansi --indent=tab --suffix=none *.c *.h
Contributors should reformat their code similarly before submitting code to Contributors should reformat their code similarly before submitting code to
the project, in order to minimize merge conflicts with other work. the project, in order to minimize merge conflicts with other work.

@@ -1,4 +1,4 @@
Copyright (c) 2010-2017, 2ndQuadrant Limited Copyright (c) 2010-2016, 2ndQuadrant Limited
All rights reserved. All rights reserved.
This program is free software: you can redistribute it and/or modify This program is free software: you can redistribute it and/or modify

54
HISTORY

@@ -1,53 +1,3 @@
3.3.3 2017-06
repmgr: fix `standby register --force` when updating existing node record (Ian)
3.3.2 2017-06-01
Add support for PostgreSQL 10 (Ian)
repmgr: ensure --replication-user option is honoured when passing database
connection parameters as a conninfo string (Ian)
repmgr: improve detection of pg_rewind on remote server (Ian)
repmgr: add DETAIL log output for additional clarification of error messages (Ian)
repmgr: suppress various spurious error messages in `standby follow` and
`standby switchover` (Ian)
repmgr: add missing `-P` option (Ian)
repmgrd: monitoring statistic reporting fixes (Ian)
3.3.1 2017-03-13
repmgrd: prevent invalid apply lag value being written to the
monitoring table (Ian)
repmgrd: fix error in XLogRecPtr conversion when calculating
monitoring statistics (Ian)
repmgr: if replication slots in use, where possible delete slot on old
upstream node after following new upstream (Ian)
repmgr: improve logging of rsync actions (Ian)
repmgr: improve `standby clone` when synchronous replication in use (Ian)
repmgr: stricter checking of allowed node id values
repmgr: enable `master register --force` when there is a foreign key
dependency from a standby node (Ian)
3.3 2016-12-27
repmgr: always log to STDERR even if log facility defined (Ian)
repmgr: add --log-to-file to log repmgr output to the defined
log facility (Ian)
repmgr: improve handling of command line parameter errors (Ian)
repmgr: add option --upstream-conninfo to explicitly set
'primary_conninfo' in recovery.conf (Ian)
repmgr: enable a standby to be registered which isn't running (Ian)
repmgr: enable `standby register --force` to update a node record
with cascaded downstream node records (Ian)
repmgr: add option `--no-conninfo-password` (Abhijit, Ian)
repmgr: add initial support for PostgreSQL 10.0 (Ian)
repmgr: escape values in primary_conninfo if needed (Ian)
3.2.1 2016-10-24
repmgr: require a valid repmgr cluster name unless -F/--force
supplied (Ian)
repmgr: check master server is registered with repmgr before
cloning (Ian)
repmgr: ensure data directory defaults to that of the source node (Ian)
repmgr: various fixes to Barman cloning mode (Gianni, Ian)
repmgr: fix `repmgr cluster crosscheck` output (Ian)
3.2 2016-10-05 3.2 2016-10-05
repmgr: add support for cloning from a Barman backup (Gianni) repmgr: add support for cloning from a Barman backup (Gianni)
repmgr: add commands `standby matrix` and `standby crosscheck` (Gianni) repmgr: add commands `standby matrix` and `standby crosscheck` (Gianni)
@@ -65,12 +15,12 @@
the standby (Ian) the standby (Ian)
repmgr: add option `--copy-external-config-files` for files outside repmgr: add option `--copy-external-config-files` for files outside
of the data directory (Ian) of the data directory (Ian)
repmgr: add configuration options to override the default pg_ctl
commands (Jarkko Oranen, Ian)
repmgr: only require `wal_keep_segments` to be set in certain corner repmgr: only require `wal_keep_segments` to be set in certain corner
cases (Ian) cases (Ian)
repmgr: better support cloning from a node other than the one to repmgr: better support cloning from a node other than the one to
stream from (Ian) stream from (Ian)
repmgrd: add configuration options to override the default pg_ctl
commands (Jarkko Oranen, Ian)
repmgrd: don't start if node is inactive and failover=automatic (Ian) repmgrd: don't start if node is inactive and failover=automatic (Ian)
packaging: improve "repmgr-auto" Debian package (Gianni) packaging: improve "repmgr-auto" Debian package (Gianni)

@@ -1,16 +1,15 @@
# #
# Makefile # Makefile
# Copyright (c) 2ndQuadrant, 2010-2017 # Copyright (c) 2ndQuadrant, 2010-2016
HEADERS = $(wildcard *.h) HEADERS = $(wildcard *.h)
repmgrd_OBJS = dbutils.o config.o repmgrd.o log.o strutil.o repmgrd_OBJS = dbutils.o config.o repmgrd.o log.o strutil.o
repmgr_OBJS = dbutils.o check_dir.o config.o repmgr.o log.o strutil.o dirmod.o compat.o repmgr_OBJS = dbutils.o check_dir.o config.o repmgr.o log.o strutil.o dirmod.o
DATA = repmgr.sql uninstall_repmgr.sql DATA = repmgr.sql uninstall_repmgr.sql
REGRESS = repmgr_funcs repmgr_test
PG_CPPFLAGS = -I$(includedir_internal) -I$(libpq_srcdir) PG_CPPFLAGS = -I$(libpq_srcdir)
PG_LIBS = $(libpq_pgport) PG_LIBS = $(libpq_pgport)
@@ -18,11 +17,11 @@ all: repmgrd repmgr
$(MAKE) -C sql $(MAKE) -C sql
repmgrd: $(repmgrd_OBJS) repmgrd: $(repmgrd_OBJS)
$(CC) -o repmgrd $(CFLAGS) $(repmgrd_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(CC) -o repmgrd $(CFLAGS) $(repmgrd_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS)
$(MAKE) -C sql $(MAKE) -C sql
repmgr: $(repmgr_OBJS) repmgr: $(repmgr_OBJS)
$(CC) -o repmgr $(CFLAGS) $(repmgr_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(CC) -o repmgr $(CFLAGS) $(repmgr_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS)
# Make all objects depend on all include files. This is a bit of a # Make all objects depend on all include files. This is a bit of a
# shotgun approach, but the codebase is small enough that a complete rebuild # shotgun approach, but the codebase is small enough that a complete rebuild

253
README.md

@@ -7,8 +7,8 @@ replication capabilities with utilities to set up standby servers, monitor
replication, and perform administrative tasks such as failover or switchover replication, and perform administrative tasks such as failover or switchover
operations. operations.
The current `repmgr` version (3.3) supports all PostgreSQL versions from The current `repmgr` version, 3.2, supports all PostgreSQL versions from
9.3 to 9.6. 9.3, including the upcoming 9.6.
Overview Overview
-------- --------
@@ -121,8 +121,7 @@ views:
status for each node status for each node
The `repmgr` metadata schema can be stored in an existing database or in its own The `repmgr` metadata schema can be stored in an existing database or in its own
dedicated database. Note that the `repmgr` metadata schema cannot reside on a database dedicated database.
server which is not part of the replication cluster managed by `repmgr`.
A dedicated database superuser is required to own the meta-database as well as carry A dedicated database superuser is required to own the meta-database as well as carry
out administrative actions. out administrative actions.
@@ -189,14 +188,6 @@ system.
Instructions can be found in the APT section of the PostgreSQL Wiki Instructions can be found in the APT section of the PostgreSQL Wiki
( https://wiki.postgresql.org/wiki/Apt ). ( https://wiki.postgresql.org/wiki/Apt ).
*NOTE*: repmgr 3.3 packages are now only available via a 2ndQuadrant-hosted
repository which can be installed like this:
apt-key adv --fetch-keys http://packages.2ndquadrant.com/repmgr3/apt/0xD3FA41F6.asc
echo deb http://packages.2ndquadrant.com/repmgr3/apt/ $(lsb_release -cs)-2ndquadrant main > /etc/apt/sources.list.d/repmgr3.list
See `PACKAGES.md` for details on building .deb and .rpm packages from the See `PACKAGES.md` for details on building .deb and .rpm packages from the
`repmgr` source code. `repmgr` source code.
@@ -210,7 +201,7 @@ See `PACKAGES.md` for details on building .deb and .rpm packages from the
Release tarballs are also available: Release tarballs are also available:
https://github.com/2ndQuadrant/repmgr/releases https://github.com/2ndQuadrant/repmgr/releases
http://repmgr.org/ http://repmgr.org/downloads.php
`repmgr` is compiled in the same way as a PostgreSQL extension using the PGXS `repmgr` is compiled in the same way as a PostgreSQL extension using the PGXS
infrastructure, e.g.: infrastructure, e.g.:
@@ -238,29 +229,15 @@ The configuration file will be searched for in the following locations:
Note that if a file is explicitly specified with `-f/--config-file`, an error will Note that if a file is explicitly specified with `-f/--config-file`, an error will
be raised if it is not found or not readable and no attempt will be made to check be raised if it is not found or not readable and no attempt will be made to check
default locations; this is to prevent `repmgr` unexpectedly reading the wrong file. default locations; this is to prevent `repmgr` reading the wrong file.
For a full list of annotated configuration items, see the file `repmgr.conf.sample`. For a full list of annotated configuration items, see the file `repmgr.conf.sample`.
The following parameters in the configuration file can be overridden with The following parameters in the configuration file can be overridden with
command line options: command line options:
- `log_level` with `-L/--log-level` - `-L/--log-level`
- `pg_bindir` with `-b/--pg_bindir` - `-b/--pg_bindir`
### Logging
By default `repmgr` and `repmgrd` will log directly to `STDERR`. For `repmgrd`
we recommend capturing output in a logfile or using your system's log facility;
see `repmgr.conf.sample` for details.
As a command line utility, `repmgr` will log directly to the console by default
(this is a change in behaviour from versions before 3.3, where it would always
log to the same location as `repmgrd`). However in some circumstances, such as
when `repmgr` is executed by `repmgrd` during a failover event, it makes sense to
capture `repmgr`'s log output - this can be done by supplying the command-line
option `--log-to-file` to `repmgr`.
### Command line options and environment variables ### Command line options and environment variables
@@ -297,14 +274,14 @@ Setting up a simple replication cluster with repmgr
The following section will describe how to set up a basic replication cluster The following section will describe how to set up a basic replication cluster
with a master and a standby server using the `repmgr` command line tool. with a master and a standby server using the `repmgr` command line tool.
It is assumed PostgreSQL is installed on both servers in the cluster, It is assumed PostgreSQL is installed on both servers in the cluster,
`rsync` is available and passwordless SSH connections are possible between `rsync` is available and password-less SSH connections are possible between
both servers. both servers.
* * * * * *
> *TIP*: for testing `repmgr`, it's possible to use multiple PostgreSQL > *TIP*: for testing `repmgr`, it's possible to use multiple PostgreSQL
> instances running on different ports on the same computer, with > instances running on different ports on the same computer, with
> passwordless SSH access to `localhost` enabled. > password-less SSH access to `localhost` enabled.
* * * * * *
@@ -322,13 +299,7 @@ The following replication settings may need to be adjusted:
max_wal_senders = 10 max_wal_senders = 10
# Ensure WAL files contain enough information to enable read-only queries # Ensure WAL files contain enough information to enable read-only queries
# on the standby. # on the standby
#
# PostgreSQL 9.5 and earlier: one of 'hot_standby' or 'logical'
# PostgreSQL 9.6 and later: one of 'replica' or 'logical'
# ('hot_standby' will still be accepted as an alias for 'replica')
#
# See: https://www.postgresql.org/docs/current/static/runtime-config-wal.html#GUC-WAL-LEVEL
wal_level = 'hot_standby' wal_level = 'hot_standby'
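
Review note on the hunk above: the comment block present on only one side of this hunk says that `wal_level` takes `hot_standby` (or `logical`) up to PostgreSQL 9.5 and `replica` (or `logical`) from 9.6, with `hot_standby` still accepted as an alias. A minimal sketch of that rule follows, for anyone generating `postgresql.conf` from a template. The function name `wal_level_for` is hypothetical and is not part of `repmgr` or PostgreSQL.

```python
def wal_level_for(server_version):
    """Return the wal_level value matching the comment in the hunk above.

    server_version is a (major, minor) tuple such as (9, 5) or (9, 6).
    Assumption: the caller wants physical streaming replication with
    read-only queries on the standby, not logical decoding.
    """
    # From 9.6 the canonical name is 'replica'; earlier releases use
    # 'hot_standby'.
    return "replica" if tuple(server_version) >= (9, 6) else "hot_standby"


if __name__ == "__main__":
    for version in [(9, 3), (9, 5), (9, 6), (10, 0)]:
        print(version, "->", wal_level_for(version))
```

Since 9.6 and later still accept `hot_standby` as an alias, the unconditional `wal_level = 'hot_standby'` shown on both sides of the hunk remains valid for the versions this README covers.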
@@ -347,11 +318,10 @@ The following replication settings may need to be adjusted:
archive_command = '/bin/true' archive_command = '/bin/true'
# If cloning using rsync, or you have configured `pg_basebackup_options` # If cloning using rsync, or you have configured `pg_basebackup_options`
# in `repmgr.conf` to include the setting `--xlog-method=fetch` (from # in `repmgr.conf` to include the setting `--xlog-method=fetch`, *and*
# PostgreSQL 10 `--wal-method=fetch`), *and* you have not set # you have not set `restore_command` in `repmgr.conf`to fetch WAL files
# `restore_command` in `repmgr.conf`to fetch WAL files from another # from another source such as Barman, you'll need to set `wal_keep_segments`
# source such as Barman, you'll need to set `wal_keep_segments` to a # to a high enough value to ensure that all WAL files generated while
# high enough value to ensure that all WAL files generated while
# the standby is being cloned are retained until the standby starts up. # the standby is being cloned are retained until the standby starts up.
# wal_keep_segments = 5000 # wal_keep_segments = 5000
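
Review note on the hunk above: the sample value `wal_keep_segments = 5000` implies a sizeable amount of retained WAL on the upstream node. The sketch below is a rough estimate only, assuming PostgreSQL's default WAL segment size of 16 MB, which the diff does not state. The function name `wal_retention_bytes` is hypothetical.

```python
# Assumption: default WAL segment size of 16 MB; a server built with a
# different segment size needs a different constant.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def wal_retention_bytes(wal_keep_segments, segment_bytes=WAL_SEGMENT_BYTES):
    """Approximate disk space held back by wal_keep_segments."""
    if wal_keep_segments < 0:
        raise ValueError("wal_keep_segments must not be negative")
    return wal_keep_segments * segment_bytes


if __name__ == "__main__":
    gib = wal_retention_bytes(5000) / 1024 ** 3
    print("wal_keep_segments = 5000 retains about %.1f GiB" % gib)
```

Under that assumption, 5000 segments come to roughly 78 GiB, so it is worth checking free space in the WAL directory before copying the sample value verbatim.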
@@ -405,8 +375,7 @@ least the following parameters:
- `cluster`: an arbitrary name for the replication cluster; this must be identical - `cluster`: an arbitrary name for the replication cluster; this must be identical
on all nodes on all nodes
- `node`: a unique integer identifying the node; note this must be a positive - `node`: a unique integer identifying the node
32 bit signed integer between 1 and 2147483647
- `node_name`: a unique string identifying the node; we recommend a name - `node_name`: a unique string identifying the node; we recommend a name
specific to the server (e.g. 'server_1'); avoid names indicating the specific to the server (e.g. 'server_1'); avoid names indicating the
current replication role like 'master' or 'standby' as the server's current replication role like 'master' or 'standby' as the server's
@@ -414,8 +383,7 @@ least the following parameters:
- `conninfo`: a valid connection string for the `repmgr` database on the - `conninfo`: a valid connection string for the `repmgr` database on the
*current* server. (On the standby, the database will not yet exist, but *current* server. (On the standby, the database will not yet exist, but
`repmgr` needs to know the connection details to complete the setup `repmgr` needs to know the connection details to complete the setup
process). *NOTE* this must be a keyword/value string, not a connection process).
URI; this limitation will be removed in a future `repmgr` version.
`repmgr.conf` should not be stored inside the PostgreSQL data directory, `repmgr.conf` should not be stored inside the PostgreSQL data directory,
as it could be overwritten when setting up or reinitialising the PostgreSQL as it could be overwritten when setting up or reinitialising the PostgreSQL
@@ -440,11 +408,11 @@ to include this schema name, e.g.
### Initialise the master server ### Initialise the master server
To enable `repmgr` to support a replication cluster, the master node must To enable `repmgr` to support a replication cluster, the master node must
be registered with `repmgr`, which creates the `repmgr` metadatabase and adds be registered with `repmgr`, which creates the `repmgr` database and adds
a metadata record for the server: a metadata record for the server:
$ repmgr -f repmgr.conf master register $ repmgr -f repmgr.conf master register
NOTICE: master node correctly registered for cluster test with id 1 (conninfo: host=repmgr_node1 user=repmgr dbname=repmgr) [2016-01-07 16:56:46] [NOTICE] master node correctly registered for cluster test with id 1 (conninfo: host=repmgr_node1 user=repmgr dbname=repmgr)
The metadata record looks like this: The metadata record looks like this:
@@ -471,13 +439,13 @@ the values `node`, `node_name` and `conninfo` adjusted accordingly, e.g.:
Clone the standby with: Clone the standby with:
$ repmgr -h repmgr_node1 -U repmgr -d repmgr -D /path/to/node2/data/ -f /etc/repmgr.conf standby clone $ repmgr -h repmgr_node1 -U repmgr -d repmgr -D /path/to/node2/data/ -f /etc/repmgr.conf standby clone
NOTICE: destination directory '/path/to/node2/data/' provided [2016-01-07 17:21:26] [NOTICE] destination directory '/path/to/node2/data/' provided
NOTICE: starting backup... [2016-01-07 17:21:26] [NOTICE] starting backup...
HINT: this may take some time; consider using the -c/--fast-checkpoint option [2016-01-07 17:21:26] [HINT] this may take some time; consider using the -c/--fast-checkpoint option
NOTICE: pg_stop_backup complete, all required WAL segments have been archived NOTICE: pg_stop_backup complete, all required WAL segments have been archived
NOTICE: standby clone (using pg_basebackup) complete [2016-01-07 17:21:28] [NOTICE] standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server [2016-01-07 17:21:28] [NOTICE] you can now start your PostgreSQL server
HINT: for example : pg_ctl -D /path/to/node2/data/ start [2016-01-07 17:21:28] [HINT] for example : pg_ctl -D /path/to/node2/data/ start
This will clone the PostgreSQL data directory files from the master at `repmgr_node1` This will clone the PostgreSQL data directory files from the master at `repmgr_node1`
using PostgreSQL's `pg_basebackup` utility. A `recovery.conf` file containing the using PostgreSQL's `pg_basebackup` utility. A `recovery.conf` file containing the
@@ -518,8 +486,7 @@ place. To ensure this happens when using the default `pg_basebackup` method,
`repmgr` will set `pg_basebackup`'s `--xlog-method` parameter to `stream`, `repmgr` will set `pg_basebackup`'s `--xlog-method` parameter to `stream`,
which will ensure all WAL files generated during the cloning process are which will ensure all WAL files generated during the cloning process are
streamed in parallel with the main backup. Note that this requires two streamed in parallel with the main backup. Note that this requires two
replication connections to be available (`repmgr` will verify sufficient replication connections to be available.
connections are available before attempting to clone).
To override this behaviour, in `repmgr.conf` set `pg_basebackup`'s To override this behaviour, in `repmgr.conf` set `pg_basebackup`'s
`--xlog-method` parameter to `fetch`: `--xlog-method` parameter to `fetch`:
@@ -531,9 +498,6 @@ See the `pg_basebackup` documentation for details:
https://www.postgresql.org/docs/current/static/app-pgbasebackup.html https://www.postgresql.org/docs/current/static/app-pgbasebackup.html
> *NOTE*: From PostgreSQL 10, `pg_basebackup`'s `--xlog-method` parameter
> has been renamed to `--wal-method`.
Make any adjustments to the standby's PostgreSQL configuration files now, Make any adjustments to the standby's PostgreSQL configuration files now,
then start the server. then start the server.
@@ -576,8 +540,8 @@ Connect to the master server and execute:
Register the standby server with: Register the standby server with:
$ repmgr -f /etc/repmgr.conf standby register repmgr -f /etc/repmgr.conf standby register
NOTICE: standby node correctly registered for cluster test with id 2 (conninfo: host=repmgr_node2 user=repmgr dbname=repmgr) [2016-01-08 11:13:16] [NOTICE] standby node correctly registered for cluster test with id 2 (conninfo: host=repmgr_node2 user=repmgr dbname=repmgr)
Connect to the standby server's `repmgr` database and check the `repl_nodes` Connect to the standby server's `repmgr` database and check the `repl_nodes`
table: table:
@@ -608,21 +572,6 @@ to effectively manage cascading replication (see below).
* * * * * *
Under some circumstances you may wish to register a standby which is not
yet running; this can be the case when using provisioning tools to create
a complex replication cluster. In this case, by using the `-F/--force`
option and providing the connection parameters to the master server,
the standby can be registered.
Similarly, with cascading replication it may be necessary to register
a standby whose upstream node has not yet been registered - in this case,
using `-F/--force` will result in the creation of an inactive placeholder
record for the upstream node, which will however later need to be registered
with the `-F/--force` option too.
When used with `standby register`, care should be taken that use of the
`-F/--force` option does not result in an incorrectly configured cluster.
### Using Barman to clone a standby ### Using Barman to clone a standby
`repmgr standby clone` also supports Barman, the Backup and `repmgr standby clone` also supports Barman, the Backup and
@@ -646,12 +595,12 @@ In order to enable Barman support for `repmgr standby clone`, you must
ensure that: ensure that:
- the name of the server configured in Barman is equal to the - the name of the server configured in Barman is equal to the
`cluster` setting in `repmgr.conf`; `cluster_name` setting in `repmgr.conf`;
- the `barman_server` setting in `repmgr.conf` is set to the SSH - the `barman_server` setting in `repmgr.conf` is set to the SSH
hostname of the Barman server; hostname of the Barman server;
- the `restore_command` setting in `repmgr.conf` is configured to - the `restore_command` setting in `repmgr.conf` is configured to
use a copy of the `barman-wal-restore` script shipped with the use a copy of the `barman-wal-restore` script shipped with the
`barman-cli` package (see below); `barman-cli package` (see below);
- the Barman catalogue includes at least one valid backup for this - the Barman catalogue includes at least one valid backup for this
server. server.
@@ -691,13 +640,13 @@ specify this in `repmgr.conf` with `barman_config`:
Now we can clone a standby using the Barman server: Now we can clone a standby using the Barman server:
$ repmgr -h node1 -d repmgr -D 9.5/main -f /etc/repmgr.conf standby clone $ repmgr -h node1 -D 9.5/main -f /etc/repmgr.conf standby clone
NOTICE: destination directory '9.5/main' provided [2016-06-12 20:08:35] [NOTICE] destination directory '9.5/main' provided
NOTICE: getting backup from Barman... [2016-06-12 20:08:35] [NOTICE] getting backup from Barman...
NOTICE: standby clone (from Barman) complete [2016-06-12 20:08:36] [NOTICE] standby clone (from Barman) complete
NOTICE: you can now start your PostgreSQL server [2016-06-12 20:08:36] [NOTICE] you can now start your PostgreSQL server
HINT: for example : pg_ctl -D 9.5/data start [2016-06-12 20:08:36] [HINT] for example : pg_ctl -D 9.5/data start
HINT: After starting the server, you need to register this standby with "repmgr standby register" [2016-06-12 20:08:36] [HINT] After starting the server, you need to register this standby with "repmgr standby register"
@@ -757,22 +706,13 @@ string passed to `repmgr` with `-d/--dbname` (see above for details), and/or set
appropriate environment variables. appropriate environment variables.
Note that PostgreSQL will always set explicit defaults for `sslmode` and Note that PostgreSQL will always set explicit defaults for `sslmode` and
`sslcompression` (and from PostgreSQL 10.0 also `target_session_attrs`). `sslcompression`.
If `application_name` is set in the standby's `conninfo` parameter in If `application_name` is set in the standby's `conninfo` parameter in
`repmgr.conf`, this value will be appended to `primary_conninfo`, otherwise `repmgr.conf`, this value will be appended to `primary_conninfo`, otherwise
`repmgr` will set `application_name` to the same value as the `node_name` `repmgr` will set `application_name` to the same value as the `node_name`
parameter. parameter.
By default `repmgr` assumes the user who owns the `repmgr` metadatabase will
also be the replication user; a different replication user can be specified
with `--replication-user`.
If the upstream server requires a password, and this was provided via
`PGPASSWORD`, `.pgpass` etc., by default `repmgr` will include this in
`primary_conninfo`. Use the command line option `--no-conninfo-password` to
suppress this.
Setting up cascading replication with repmgr Setting up cascading replication with repmgr
-------------------------------------------- --------------------------------------------
@@ -806,15 +746,15 @@ created standby. Clone this standby (using the connection parameters
for the existing standby) and register it: for the existing standby) and register it:
$ repmgr -h repmgr_node2 -U repmgr -d repmgr -D /path/to/node3/data/ -f /etc/repmgr.conf standby clone $ repmgr -h repmgr_node2 -U repmgr -d repmgr -D /path/to/node3/data/ -f /etc/repmgr.conf standby clone
NOTICE: destination directory 'node_3/data/' provided [2016-01-08 13:44:52] [NOTICE] destination directory 'node_3/data/' provided
NOTICE: starting backup (using pg_basebackup)... [2016-01-08 13:44:52] [NOTICE] starting backup (using pg_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option [2016-01-08 13:44:52] [HINT] this may take some time; consider using the -c/--fast-checkpoint option
NOTICE: standby clone (using pg_basebackup) complete [2016-01-08 13:44:52] [NOTICE] standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server [2016-01-08 13:44:52] [NOTICE] you can now start your PostgreSQL server
HINT: for example : pg_ctl -D /path/to/node_3/data start [2016-01-08 13:44:52] [HINT] for example : pg_ctl -D /path/to/node_3/data start
$ repmgr -f /etc/repmgr.conf standby register $ repmgr -f /etc/repmgr.conf standby register
NOTICE: standby node correctly registered for cluster test with id 3 (conninfo: host=repmgr_node3 dbname=repmgr user=repmgr) [2016-01-08 14:04:32] [NOTICE] standby node correctly registered for cluster test with id 3 (conninfo: host=repmgr_node3 dbname=repmgr user=repmgr)
After starting the standby, the `repl_nodes` table will look like this: After starting the standby, the `repl_nodes` table will look like this:
@@ -826,15 +766,6 @@ After starting the standby, the `repl_nodes` table will look like this:
3 | standby | 2 | test | node3 | host=repmgr_node3 dbname=repmgr user=repmgr | | 100 | t 3 | standby | 2 | test | node3 | host=repmgr_node3 dbname=repmgr user=repmgr | | 100 | t
(3 rows) (3 rows)
* * *
> *TIP*: under some circumstances when setting up a cascading replication
> cluster, you may wish to clone a downstream standby whose upstream node
> does not yet exist. In this case you can clone from the master (or
> another upstream node) and provide the parameter `--upstream-conninfo`
> to explictly set the upstream's `primary_conninfo` string in `recovery.conf`.
* * *
Using replication slots with repmgr Using replication slots with repmgr
----------------------------------- -----------------------------------
@@ -920,19 +851,19 @@ Promote the first standby with:
This will produce output similar to the following: This will produce output similar to the following:
ERROR: connection to database failed: could not connect to server: Connection refused [2016-01-08 16:07:31] [ERROR] connection to database failed: could not connect to server: Connection refused
Is the server running on host "repmgr_node1" (192.161.2.1) and accepting Is the server running on host "repmgr_node1" (192.161.2.1) and accepting
TCP/IP connections on port 5432? TCP/IP connections on port 5432?
could not connect to server: Connection refused could not connect to server: Connection refused
Is the server running on host "repmgr_node1" (192.161.2.1) and accepting Is the server running on host "repmgr_node1" (192.161.2.1) and accepting
TCP/IP connections on port 5432? TCP/IP connections on port 5432?
NOTICE: promoting standby [2016-01-08 16:07:31] [NOTICE] promoting standby
NOTICE: promoting server using '/usr/bin/postgres/pg_ctl -D /path/to/node_2/data promote' [2016-01-08 16:07:31] [NOTICE] promoting server using '/usr/bin/postgres/pg_ctl -D /path/to/node_2/data promote'
server promoting server promoting
NOTICE: STANDBY PROMOTE successful [2016-01-08 16:07:33] [NOTICE] STANDBY PROMOTE successful
Note: the first `ERROR` is `repmgr` attempting to connect to the current Note: the first `[ERROR]` is `repmgr` attempting to connect to the current
master to verify that it has failed. If a valid master is found, `repmgr` master to verify that it has failed. If a valid master is found, `repmgr`
will refuse to promote a standby. will refuse to promote a standby.
@@ -964,7 +895,7 @@ end of the preceding section ("Promoting a standby server with repmgr"),
execute this: execute this:
$ repmgr -f /etc/repmgr.conf -D /path/to/node_3/data/ -h repmgr_node2 -U repmgr -d repmgr standby follow $ repmgr -f /etc/repmgr.conf -D /path/to/node_3/data/ -h repmgr_node2 -U repmgr -d repmgr standby follow
NOTICE: restarting server using '/usr/bin/postgres/pg_ctl -D /path/to/node_3/data/ -w -m fast restart' [2016-01-08 16:57:06] [NOTICE] restarting server using '/usr/bin/postgres/pg_ctl -D /path/to/node_3/data/ -w -m fast restart'
waiting for server to shut down.... done waiting for server to shut down.... done
server stopped server stopped
waiting for server to start.... done waiting for server to start.... done
@@ -1011,13 +942,6 @@ both passwordless SSH access and the path of `repmgr.conf` on that server.
> careful preparation and with adequate attention. In particular you should > careful preparation and with adequate attention. In particular you should
> be confident that your network environment is stable and reliable. > be confident that your network environment is stable and reliable.
> >
> Additionally you should be sure that the current master can be shut down
> quickly and cleanly. In particular, access from applications should be
> minimalized or preferably blocked completely. Also check that there is
> no backlog of files waiting to be archived, as PostgreSQL will not shut
> down until archiving completes, and that any standbys attached to the
> current primary don't have a significant amount of replication lag.
>
> We recommend running `repmgr standby switchover` at the most verbose > We recommend running `repmgr standby switchover` at the most verbose
> logging level (`--log-level DEBUG --verbose`) and capturing all output > logging level (`--log-level DEBUG --verbose`) and capturing all output
> to assist troubleshooting any problems. > to assist troubleshooting any problems.
@@ -1043,26 +967,26 @@ local server, as well as the normal default locations. `repmgr` will check
this file can be found before performing any further actions. this file can be found before performing any further actions.
$ repmgr -f /etc/repmgr.conf -C /etc/repmgr.conf standby switchover -v $ repmgr -f /etc/repmgr.conf -C /etc/repmgr.conf standby switchover -v
NOTICE: using configuration file "/etc/repmgr.conf" [2016-01-27 16:38:33] [NOTICE] using configuration file "/etc/repmgr.conf"
NOTICE: switching current node 2 to master server and demoting current master to standby... [2016-01-27 16:38:33] [NOTICE] switching current node 2 to master server and demoting current master to standby...
NOTICE: 5 files copied to /tmp/repmgr-node1-archive [2016-01-27 16:38:34] [NOTICE] 5 files copied to /tmp/repmgr-node1-archive
NOTICE: connection to database failed: FATAL: the database system is shutting down [2016-01-27 16:38:34] [NOTICE] connection to database failed: FATAL: the database system is shutting down
NOTICE: current master has been stopped [2016-01-27 16:38:34] [NOTICE] current master has been stopped
ERROR: connection to database failed: FATAL: the database system is shutting down [2016-01-27 16:38:34] [ERROR] connection to database failed: FATAL: the database system is shutting down
NOTICE: promoting standby [2016-01-27 16:38:34] [NOTICE] promoting standby
NOTICE: promoting server using '/usr/local/bin/pg_ctl -D /var/lib/postgresql/9.5/node_2/data promote' [2016-01-27 16:38:34] [NOTICE] promoting server using '/usr/local/bin/pg_ctl -D /var/lib/postgresql/9.5/node_2/data promote'
server promoting server promoting
NOTICE: STANDBY PROMOTE successful [2016-01-27 16:38:36] [NOTICE] STANDBY PROMOTE successful
NOTICE: Executing pg_rewind on old master server [2016-01-27 16:38:36] [NOTICE] Executing pg_rewind on old master server
NOTICE: 5 files copied to /var/lib/postgresql/9.5/data [2016-01-27 16:38:36] [NOTICE] 5 files copied to /var/lib/postgresql/9.5/data
NOTICE: restarting server using '/usr/local/bin/pg_ctl -w -D /var/lib/postgresql/9.5/node_1/data -m fast restart' [2016-01-27 16:38:36] [NOTICE] restarting server using '/usr/local/bin/pg_ctl -w -D /var/lib/postgresql/9.5/node_1/data -m fast restart'
pg_ctl: PID file "/var/lib/postgresql/9.5/node_1/data/postmaster.pid" does not exist pg_ctl: PID file "/var/lib/postgresql/9.5/node_1/data/postmaster.pid" does not exist
Is server running? Is server running?
starting server anyway starting server anyway
NOTICE: node 1 is replicating in state "streaming" [2016-01-27 16:38:37] [NOTICE] node 1 is replicating in state "streaming"
NOTICE: switchover was successful [2016-01-27 16:38:37] [NOTICE] switchover was successful
Messages containing the line `connection to database failed: FATAL: the database Messages containing the line `connection to database failed: FATAL: the database
system is shutting down` are not errors - `repmgr` is polling the old master database system is shutting down` are not errors - `repmgr` is polling the old master database
@@ -1084,7 +1008,7 @@ should have been updated to reflect this:
### Caveats ### Caveats
- The functionality provided by `repmgr standby switchover` is primarily aimed - The functionality provided `repmgr standby switchover` is primarily aimed
at a two-server master/standby replication cluster and currently does at a two-server master/standby replication cluster and currently does
not support additional standbys. not support additional standbys.
- `repmgr standby switchover` is designed to use the `pg_rewind` utility, - `repmgr standby switchover` is designed to use the `pg_rewind` utility,
@@ -1098,6 +1022,11 @@ should have been updated to reflect this:
the `repmgrd` may try and promote a standby by itself. the `repmgrd` may try and promote a standby by itself.
- Any other standbys attached to the old master will need to be manually - Any other standbys attached to the old master will need to be manually
instructed to point to the new master (e.g. with `repmgr standby follow`). instructed to point to the new master (e.g. with `repmgr standby follow`).
- You must ensure that following a server start using `pg_ctl`, log output
is not send to STDERR (the default behaviour). If logging is not configured,
we recommend setting `logging_collector=on` in `postgresql.conf` and
providing an explicit `-l/--log` setting in `repmgr.conf`'s `pg_ctl_options`
parameter.
We hope to remove some of these restrictions in future versions of `repmgr`. We hope to remove some of these restrictions in future versions of `repmgr`.
@@ -1143,9 +1072,8 @@ This will remove the standby record from `repmgr`'s internal metadata
table (`repl_nodes`). A `standby_unregister` event notification will be table (`repl_nodes`). A `standby_unregister` event notification will be
recorded in the `repl_events` table. recorded in the `repl_events` table.
Note that this command will not stop the server itself or remove it from Note that this command will not stop the server itself or remove
the replication cluster. Note that if the standby was using a replication it from the replication cluster.
slot, this will not be removed.
If the standby is not running, the command can be executed on another If the standby is not running, the command can be executed on another
node by providing the id of the node to be unregistered using node by providing the id of the node to be unregistered using
@@ -1163,22 +1091,18 @@ Automatic failover with `repmgrd`
and which can automate actions such as failover and updating standbys to and which can automate actions such as failover and updating standbys to
follow the new master. follow the new master.
To use `repmgrd` for automatic failover, `postgresql.conf` must contain the To use `repmgrd` for automatic failover, the following `repmgrd` options must
following line: be set in `repmgr.conf`:
shared_preload_libraries = 'repmgr_funcs'
(changing this setting requires a restart of PostgreSQL).
Additionally the following `repmgrd` options must be set in `repmgr.conf`:
failover=automatic failover=automatic
promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file' promote_command='repmgr standby promote -f /etc/repmgr.conf'
follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file' follow_command='repmgr standby follow -f /etc/repmgr.conf'
Note that the `--log-to-file` option will cause `repmgr`'s output to be logged to (See `repmgr.conf.sample` for further `repmgrd`-specific settings).
the destination configured to receive log output for `repmgrd`.
See `repmgr.conf.sample` for further `repmgrd`-specific settings Additionally, `postgresql.conf` must contain the following line:
shared_preload_libraries = 'repmgr_funcs'
When `failover` is set to `automatic`, upon detecting failure of the current When `failover` is set to `automatic`, upon detecting failure of the current
master, `repmgrd` will execute one of `promote_command` or `follow_command`, master, `repmgrd` will execute one of `promote_command` or `follow_command`,
@@ -1481,9 +1405,7 @@ functionality will be included in a feature release (e.g. 3.0.x to 3.1.x).
In general `repmgr` can be upgraded as-is without any further action required, In general `repmgr` can be upgraded as-is without any further action required,
however feature releases may require the `repmgr` database to be upgraded. however feature releases may require the `repmgr` database to be upgraded.
An SQL script will be provided - please check the release notes for details: An SQL script will be provided - please check the release notes for details.
* http://repmgr.org/release-notes-3.3.html#UPGRADING
Distribution-specific configuration Distribution-specific configuration
@@ -1586,7 +1508,7 @@ which contains connection details for the local database.
bootstrapping new installations. To update an existing but 'stale' bootstrapping new installations. To update an existing but 'stale'
data directory (for example belonging to a failed master), `rsync` data directory (for example belonging to a failed master), `rsync`
must be used by specifying `--rsync-only`. In this case, must be used by specifying `--rsync-only`. In this case,
passwordless SSH connections between servers are required. password-less SSH connections between servers are required.
* `standby promote` * `standby promote`
@@ -1600,13 +1522,13 @@ which contains connection details for the local database.
by using `standby follow` (see below); if `repmgrd` is active, it will by using `standby follow` (see below); if `repmgrd` is active, it will
handle this. handle this.
This command will fail with an error if the current master is still running. This command will not function if the current master is still running.
* `standby switchover` * `standby switchover`
Promotes a standby to master and demotes the existing master to a standby. Promotes a standby to master and demotes the existing master to a standby.
This command must be run on the standby to be promoted, and requires a This command must be run on the standby to be promoted, and requires a
passwordless SSH connection to the current master. Additionally the password-less SSH connection to the current master. Additionally the
location of the master's `repmgr.conf` file must be provided with location of the master's `repmgr.conf` file must be provided with
`-C/--remote-config-file`. `-C/--remote-config-file`.
@@ -1627,7 +1549,7 @@ which contains connection details for the local database.
Creates a witness server as a separate PostgreSQL instance. This instance Creates a witness server as a separate PostgreSQL instance. This instance
can be on a separate server or a server running an existing node. The can be on a separate server or a server running an existing node. The
witness server contains a copy of the repmgr metadata tables but will not witness server contain a copy of the repmgr metadata tables but will not
be set up as a standby; instead it will update its metadata copy each be set up as a standby; instead it will update its metadata copy each
time a failover occurs. time a failover occurs.
@@ -1713,7 +1635,7 @@ which contains connection details for the local database.
overview of connections between all databases in the cluster. overview of connections between all databases in the cluster.
These commands require a valid `repmgr.conf` file on each node. These commands require a valid `repmgr.conf` file on each node.
Additionally passwordless `ssh` connections are required between Additionally password-less `ssh` connections are required between
all nodes. all nodes.
Example 1 (all nodes up): Example 1 (all nodes up):
@@ -1891,7 +1813,6 @@ Thanks from the repmgr core team.
Further reading Further reading
--------------- ---------------
* http://blog.2ndquadrant.com/repmgr-3-2-is-here-barman-support-brand-new-high-availability-features/
* http://blog.2ndquadrant.com/improvements-in-repmgr-3-1-4/ * http://blog.2ndquadrant.com/improvements-in-repmgr-3-1-4/
* http://blog.2ndquadrant.com/managing-useful-clusters-repmgr/ * http://blog.2ndquadrant.com/managing-useful-clusters-repmgr/
* http://blog.2ndquadrant.com/easier_postgresql_90_clusters/ * http://blog.2ndquadrant.com/easier_postgresql_90_clusters/


@@ -1,6 +1,6 @@
/* /*
* check_dir.c - Directories management functions * check_dir.c - Directories management functions
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -1,6 +1,6 @@
/* /*
* check_dir.h * check_dir.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by

compat.c

@@ -1,107 +0,0 @@
/*
*
* compat.c
* Provides a couple of useful string utility functions adapted
* from the backend code, which are not publicly exposed. They're
* unlikely to change but it would be worth keeping an eye on them
* for any fixes/improvements
*
* Copyright (c) 2ndQuadrant, 2010-2017
*
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
*/
#include "repmgr.h"
#include "compat.h"
/*
* Append the given string to the buffer, with suitable quoting for passing
* the string as a value, in a keyword/pair value in a libpq connection
* string
*
* This function is adapted from src/fe_utils/string_utils.c (before 9.6
* located in: src/bin/pg_dump/dumputils.c)
*/
void
appendConnStrVal(PQExpBuffer buf, const char *str)
{
const char *s;
bool needquotes;
/*
* If the string is one or more plain ASCII characters, no need to quote
* it. This is quite conservative, but better safe than sorry.
*/
needquotes = true;
for (s = str; *s; s++)
{
if (!((*s >= 'a' && *s <= 'z') || (*s >= 'A' && *s <= 'Z') ||
(*s >= '0' && *s <= '9') || *s == '_' || *s == '.'))
{
needquotes = true;
break;
}
needquotes = false;
}
if (needquotes)
{
appendPQExpBufferChar(buf, '\'');
while (*str)
{
/* ' and \ must be escaped by to \' and \\ */
if (*str == '\'' || *str == '\\')
appendPQExpBufferChar(buf, '\\');
appendPQExpBufferChar(buf, *str);
str++;
}
appendPQExpBufferChar(buf, '\'');
}
else
appendPQExpBufferStr(buf, str);
}
/*
* Adapted from: src/fe_utils/string_utils.c
*/
void
appendShellString(PQExpBuffer buf, const char *str)
{
const char *p;
appendPQExpBufferChar(buf, '\'');
for (p = str; *p; p++)
{
if (*p == '\n' || *p == '\r')
{
fprintf(stderr,
_("shell command argument contains a newline or carriage return: \"%s\"\n"),
str);
exit(ERR_BAD_CONFIG);
}
if (*p == '\'')
appendPQExpBufferStr(buf, "'\"'\"'");
else
appendPQExpBufferChar(buf, *p);
}
appendPQExpBufferChar(buf, '\'');
}
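
Review note: the removed `appendShellString()` wraps an argument in single quotes, rewrites each embedded single quote as `'"'"'`, and rejects newlines and carriage returns. The block below is a minimal standalone sketch of that quoting rule, not the repmgr implementation. It assumes a fixed-size output buffer instead of `PQExpBuffer`, and the name `shell_quote` and its return-code convention are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write src into dst as a single-quoted shell argument.
 * Returns 0 on success, or -1 if src contains a newline or carriage return
 * or if dst is too small. (The removed repmgr code exits on a newline;
 * returning an error here is a choice made for this sketch.)
 */
static int
shell_quote(char *dst, size_t dstlen, const char *src)
{
	size_t		used = 0;
	const char *p;

	/* Room for at least the two quotes and the terminating NUL */
	if (dstlen < 3)
		return -1;

	dst[used++] = '\'';

	for (p = src; *p; p++)
	{
		if (*p == '\n' || *p == '\r')
			return -1;

		if (*p == '\'')
		{
			/* close the quote, emit a double-quoted ', reopen the quote */
			if (used + 5 + 2 > dstlen)
				return -1;
			memcpy(dst + used, "'\"'\"'", 5);
			used += 5;
		}
		else
		{
			/* always keep space for the closing quote and the NUL */
			if (used + 1 + 2 > dstlen)
				return -1;
			dst[used++] = *p;
		}
	}

	dst[used++] = '\'';
	dst[used] = '\0';
	return 0;
}
```

Under these assumptions, calling `shell_quote()` on the input `it's` produces the text `'it'"'"'s'`, which a POSIX shell reads back as the original three-character word plus the quote.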


@@ -1,29 +0,0 @@
/*
* compat.h
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
*/
#ifndef _COMPAT_H_
#define _COMPAT_H_
extern void
appendConnStrVal(PQExpBuffer buf, const char *str);
extern void
appendShellString(PQExpBuffer buf, const char *str);
#endif
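
Review note: `appendConnStrVal()`, declared in the removed header, leaves a value unquoted only when it consists entirely of ASCII letters, digits, `_` and `.`. Otherwise it wraps the value in single quotes and backslash-escapes `'` and `\`. The block below is a rough standalone sketch of that rule as it reads in the removed `compat.c`. The names `needs_quotes` and `conninfo_quote`, the fixed-size buffer, and the return codes are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the value must be quoted inside a libpq keyword/value string. */
static bool
needs_quotes(const char *str)
{
	const char *s;

	/* An empty value is quoted, as in the removed code */
	if (*str == '\0')
		return true;

	for (s = str; *s; s++)
	{
		if (!((*s >= 'a' && *s <= 'z') || (*s >= 'A' && *s <= 'Z') ||
			  (*s >= '0' && *s <= '9') || *s == '_' || *s == '.'))
			return true;
	}
	return false;
}

/*
 * Hypothetical helper: write src into dst, quoted if necessary.
 * Returns 0 on success, -1 if dst is too small.
 */
static int
conninfo_quote(char *dst, size_t dstlen, const char *src)
{
	size_t		used = 0;

	if (!needs_quotes(src))
	{
		size_t		len = strlen(src);

		if (len + 1 > dstlen)
			return -1;
		memcpy(dst, src, len + 1);
		return 0;
	}

	/* Room for at least the two quotes and the terminating NUL */
	if (dstlen < 3)
		return -1;

	dst[used++] = '\'';
	for (; *src; src++)
	{
		/* ' and \ are each preceded by a backslash */
		size_t		need = (*src == '\'' || *src == '\\') ? 2 : 1;

		/* always keep space for the closing quote and the NUL */
		if (used + need + 2 > dstlen)
			return -1;
		if (need == 2)
			dst[used++] = '\\';
		dst[used++] = *src;
	}
	dst[used++] = '\'';
	dst[used] = '\0';
	return 0;
}
```

In this sketch a host name such as `repmgr_node2` passes through unchanged, while a value containing a space or a quote is wrapped and escaped.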

config.c

@@ -1,7 +1,7 @@
/* /*
* config.c - Functions to parse the config file * config.c - Functions to parse the config file
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -10,11 +10,11 @@
* *
* This program is distributed in the hope that it will be useful, * This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of * but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details. * GNU General Public License for more details.
* *
* You should have received a copy of the GNU General Public License * You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>. * along with this program. If not, see <http://www.gnu.org/licenses/>.
* *
*/ */
@@ -30,7 +30,7 @@ static void tablespace_list_append(t_configuration_options *options, const char
static void exit_with_errors(ItemList *config_errors); static void exit_with_errors(ItemList *config_errors);
const static char *_progname = NULL; const static char *_progname = NULL;
static char config_file_path[MAXPGPATH] = ""; static char config_file_path[MAXPGPATH];
static bool config_file_provided = false; static bool config_file_provided = false;
bool config_file_found = false; bool config_file_found = false;
@@ -55,8 +55,8 @@ progname(void)
* *
* Returns true if a configuration file could be parsed, otherwise false. * Returns true if a configuration file could be parsed, otherwise false.
* *
* Any *repmgrd-specific* configuration options added/changed in this function must also be * Any configuration options changed in this function must also be changed in
* added/changed in reload_config() * reload_config()
* *
* NOTE: this function is called before the logger is set up, so we need * NOTE: this function is called before the logger is set up, so we need
* to handle the verbose option ourselves; also the default log level is NOTICE, * to handle the verbose option ourselves; also the default log level is NOTICE,
@@ -99,9 +99,9 @@ load_config(const char *config_file, bool verbose, t_configuration_options *opti
/* /*
* If no configuration file was provided, attempt to find a default file * If no configuration file was provided, attempt to find a default file
* in this order: * in this order:
* - current directory * - current directory
* - /etc/repmgr.conf * - /etc/repmgr.conf
* - default sysconfdir * - default sysconfdir
* *
* here we just check for the existence of the file; parse_config() * here we just check for the existence of the file; parse_config()
* will handle read errors etc. * will handle read errors etc.
@@ -181,23 +181,6 @@ load_config(const char *config_file, bool verbose, t_configuration_options *opti
} }
bool
parse_config(t_configuration_options *options)
{
/* Collate configuration file errors here for friendlier reporting */
static ItemList config_errors = { NULL, NULL };
_parse_config(options, &config_errors);
if (config_errors.head != NULL)
{
exit_with_errors(&config_errors);
}
return true;
}
/* /*
* Parse configuration file; if any errors are encountered, * Parse configuration file; if any errors are encountered,
* list them and exit. * list them and exit.
@@ -205,8 +188,8 @@ parse_config(t_configuration_options *options)
* Ensure any default values set here are synced with repmgr.conf.sample * Ensure any default values set here are synced with repmgr.conf.sample
* and any other documentation. * and any other documentation.
*/ */
void bool
_parse_config(t_configuration_options *options, ItemList *error_list) parse_config(t_configuration_options *options)
{ {
FILE *fp; FILE *fp;
char *s, char *s,
@@ -218,6 +201,9 @@ _parse_config(t_configuration_options *options, ItemList *error_list)
PQconninfoOption *conninfo_options; PQconninfoOption *conninfo_options;
char *conninfo_errmsg = NULL; char *conninfo_errmsg = NULL;
/* Collate configuration file errors here for friendlier reporting */
static ItemList config_errors = { NULL, NULL };
bool node_found = false; bool node_found = false;
/* Initialize configuration options with sensible defaults /* Initialize configuration options with sensible defaults
@@ -225,7 +211,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list)
* to be initialised here * to be initialised here
*/ */
memset(options->cluster_name, 0, sizeof(options->cluster_name)); memset(options->cluster_name, 0, sizeof(options->cluster_name));
options->node = UNKNOWN_NODE_ID; options->node = -1;
options->upstream_node = NO_UPSTREAM_NODE; options->upstream_node = NO_UPSTREAM_NODE;
options->use_replication_slots = 0; options->use_replication_slots = 0;
memset(options->conninfo, 0, sizeof(options->conninfo)); memset(options->conninfo, 0, sizeof(options->conninfo));
@@ -276,7 +262,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list)
{ {
log_verbose(LOG_NOTICE, _("no configuration file provided and no default file found - " log_verbose(LOG_NOTICE, _("no configuration file provided and no default file found - "
"continuing with default values\n")); "continuing with default values\n"));
return; return true;
} }
fp = fopen(config_file_path, "r"); fp = fopen(config_file_path, "r");
@@ -321,11 +307,11 @@ _parse_config(t_configuration_options *options, ItemList *error_list)
strncpy(options->cluster_name, value, MAXLEN); strncpy(options->cluster_name, value, MAXLEN);
else if (strcmp(name, "node") == 0) else if (strcmp(name, "node") == 0)
{ {
options->node = repmgr_atoi(value, "node", error_list, false); options->node = repmgr_atoi(value, "node", &config_errors, false);
node_found = true; node_found = true;
} }
else if (strcmp(name, "upstream_node") == 0) else if (strcmp(name, "upstream_node") == 0)
options->upstream_node = repmgr_atoi(value, "upstream_node", error_list, false); options->upstream_node = repmgr_atoi(value, "upstream_node", &config_errors, false);
else if (strcmp(name, "conninfo") == 0) else if (strcmp(name, "conninfo") == 0)
strncpy(options->conninfo, value, MAXLEN); strncpy(options->conninfo, value, MAXLEN);
else if (strcmp(name, "barman_server") == 0) else if (strcmp(name, "barman_server") == 0)
@@ -356,11 +342,11 @@ _parse_config(t_configuration_options *options, ItemList *error_list)
} }
else else
{ {
item_list_append(error_list, _("value for 'failover' must be 'automatic' or 'manual'\n")); item_list_append(&config_errors,_("value for 'failover' must be 'automatic' or 'manual'\n"));
} }
} }
else if (strcmp(name, "priority") == 0) else if (strcmp(name, "priority") == 0)
options->priority = repmgr_atoi(value, "priority", error_list, true); options->priority = repmgr_atoi(value, "priority", &config_errors, true);
else if (strcmp(name, "node_name") == 0) else if (strcmp(name, "node_name") == 0)
strncpy(options->node_name, value, MAXLEN); strncpy(options->node_name, value, MAXLEN);
else if (strcmp(name, "promote_command") == 0) else if (strcmp(name, "promote_command") == 0)
@@ -378,17 +364,17 @@ _parse_config(t_configuration_options *options, ItemList *error_list)
else if (strcmp(name, "service_promote_command") == 0) else if (strcmp(name, "service_promote_command") == 0)
strncpy(options->service_promote_command, value, MAXLEN); strncpy(options->service_promote_command, value, MAXLEN);
else if (strcmp(name, "master_response_timeout") == 0) else if (strcmp(name, "master_response_timeout") == 0)
options->master_response_timeout = repmgr_atoi(value, "master_response_timeout", error_list, false); options->master_response_timeout = repmgr_atoi(value, "master_response_timeout", &config_errors, false);
/* /*
* 'primary_response_timeout' as synonym for 'master_response_timeout' - * 'primary_response_timeout' as synonym for 'master_response_timeout' -
* we'll switch terminology in a future release (3.1?) * we'll switch terminology in a future release (3.1?)
*/ */
else if (strcmp(name, "primary_response_timeout") == 0) else if (strcmp(name, "primary_response_timeout") == 0)
options->master_response_timeout = repmgr_atoi(value, "primary_response_timeout", error_list, false); options->master_response_timeout = repmgr_atoi(value, "primary_response_timeout", &config_errors, false);
else if (strcmp(name, "reconnect_attempts") == 0) else if (strcmp(name, "reconnect_attempts") == 0)
options->reconnect_attempts = repmgr_atoi(value, "reconnect_attempts", error_list, false); options->reconnect_attempts = repmgr_atoi(value, "reconnect_attempts", &config_errors, false);
else if (strcmp(name, "reconnect_interval") == 0) else if (strcmp(name, "reconnect_interval") == 0)
options->reconnect_interval = repmgr_atoi(value, "reconnect_interval", error_list, false); options->reconnect_interval = repmgr_atoi(value, "reconnect_interval", &config_errors, false);
else if (strcmp(name, "pg_bindir") == 0) else if (strcmp(name, "pg_bindir") == 0)
strncpy(options->pg_bindir, value, MAXLEN); strncpy(options->pg_bindir, value, MAXLEN);
else if (strcmp(name, "pg_ctl_options") == 0) else if (strcmp(name, "pg_ctl_options") == 0)
@@ -398,14 +384,14 @@ _parse_config(t_configuration_options *options, ItemList *error_list)
else if (strcmp(name, "logfile") == 0) else if (strcmp(name, "logfile") == 0)
strncpy(options->logfile, value, MAXLEN); strncpy(options->logfile, value, MAXLEN);
else if (strcmp(name, "monitor_interval_secs") == 0) else if (strcmp(name, "monitor_interval_secs") == 0)
options->monitor_interval_secs = repmgr_atoi(value, "monitor_interval_secs", error_list, false); options->monitor_interval_secs = repmgr_atoi(value, "monitor_interval_secs", &config_errors, false);
else if (strcmp(name, "retry_promote_interval_secs") == 0) else if (strcmp(name, "retry_promote_interval_secs") == 0)
options->retry_promote_interval_secs = repmgr_atoi(value, "retry_promote_interval_secs", error_list, false); options->retry_promote_interval_secs = repmgr_atoi(value, "retry_promote_interval_secs", &config_errors, false);
else if (strcmp(name, "witness_repl_nodes_sync_interval_secs") == 0) else if (strcmp(name, "witness_repl_nodes_sync_interval_secs") == 0)
options->witness_repl_nodes_sync_interval_secs = repmgr_atoi(value, "witness_repl_nodes_sync_interval_secs", error_list, false); options->witness_repl_nodes_sync_interval_secs = repmgr_atoi(value, "witness_repl_nodes_sync_interval_secs", &config_errors, false);
else if (strcmp(name, "use_replication_slots") == 0) else if (strcmp(name, "use_replication_slots") == 0)
/* XXX we should have a dedicated boolean argument format */ /* XXX we should have a dedicated boolean argument format */
options->use_replication_slots = repmgr_atoi(value, "use_replication_slots", error_list, false); options->use_replication_slots = repmgr_atoi(value, "use_replication_slots", &config_errors, false);
else if (strcmp(name, "event_notification_command") == 0) else if (strcmp(name, "event_notification_command") == 0)
strncpy(options->event_notification_command, value, MAXLEN); strncpy(options->event_notification_command, value, MAXLEN);
else if (strcmp(name, "event_notifications") == 0) else if (strcmp(name, "event_notifications") == 0)
@@ -433,7 +419,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list)
_("no value provided for parameter \"%s\""), _("no value provided for parameter \"%s\""),
name); name);
item_list_append(error_list, error_message_buf); item_list_append(&config_errors, error_message_buf);
} }
} }
@@ -442,15 +428,11 @@ _parse_config(t_configuration_options *options, ItemList *error_list)
if (node_found == false) if (node_found == false)
{ {
item_list_append(error_list, _("\"node\": parameter was not found")); item_list_append(&config_errors, _("\"node\": parameter was not found"));
} }
else if (options->node == 0) else if (options->node == 0)
{ {
item_list_append(error_list, _("\"node\": must be greater than zero")); item_list_append(&config_errors, _("\"node\": must be greater than zero"));
}
else if (options->node < 0)
{
item_list_append(error_list, _("\"node\": must be a positive signed 32 bit integer, i.e. 2147483647 or less"));
} }
if (strlen(options->conninfo)) if (strlen(options->conninfo))
@@ -470,11 +452,18 @@ _parse_config(t_configuration_options *options, ItemList *error_list)
_("\"conninfo\": %s"), _("\"conninfo\": %s"),
conninfo_errmsg); conninfo_errmsg);
item_list_append(error_list, error_message_buf); item_list_append(&config_errors, error_message_buf);
} }
PQconninfoFree(conninfo_options); PQconninfoFree(conninfo_options);
} }
if (config_errors.head != NULL)
{
exit_with_errors(&config_errors);
}
return true;
} }
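
Review note: the two sides of this hunk differ in who owns the error list. One side passes a caller-supplied list to `_parse_config()`, so a caller such as `reload_config()` can inspect the errors and keep running. The other side keeps a static list inside `parse_config()` and calls `exit_with_errors()` itself. The block below is a minimal sketch of the caller-supplied pattern using the node ID checks shown above. `ErrList`, `err_list_append`, `err_list_count`, `err_list_free` and `check_node_id` are hypothetical stand-ins, not repmgr's `ItemList` API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical singly linked list of error messages */
typedef struct ErrCell
{
	struct ErrCell *next;
	char	   *message;
} ErrCell;

typedef struct ErrList
{
	ErrCell    *head;
	ErrCell    *tail;
} ErrList;

/* Append a copy of message to the end of the list */
static void
err_list_append(ErrList *list, const char *message)
{
	size_t		len = strlen(message) + 1;
	ErrCell    *cell = malloc(sizeof(*cell));

	if (cell == NULL)
		abort();
	cell->message = malloc(len);
	if (cell->message == NULL)
		abort();
	memcpy(cell->message, message, len);
	cell->next = NULL;

	if (list->tail != NULL)
		list->tail->next = cell;
	else
		list->head = cell;
	list->tail = cell;
}

/* Number of collected errors */
static int
err_list_count(const ErrList *list)
{
	int			n = 0;
	const ErrCell *cell;

	for (cell = list->head; cell != NULL; cell = cell->next)
		n++;
	return n;
}

/* Release all cells and reset the list to empty */
static void
err_list_free(ErrList *list)
{
	ErrCell    *cell = list->head;

	while (cell != NULL)
	{
		ErrCell    *next = cell->next;

		free(cell->message);
		free(cell);
		cell = next;
	}
	list->head = list->tail = NULL;
}

/* Record node ID problems in the caller's list instead of exiting */
static void
check_node_id(int node, int node_found, ErrList *errors)
{
	if (!node_found)
		err_list_append(errors, "\"node\": parameter was not found");
	else if (node == 0)
		err_list_append(errors, "\"node\": must be greater than zero");
	else if (node < 0)
		err_list_append(errors, "\"node\": must be a positive signed 32 bit integer");
}
```

With this shape, a command-line caller can print every collected message and exit, while a long-running caller can log a warning and retain its current configuration.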
@@ -564,85 +553,70 @@ parse_line(char *buf, char *name, char *value)
trim(value); trim(value);
} }
/*
* reload_config()
*
* This is only called by repmgrd after receiving a SIGHUP or when a monitoring
* loop is started up; it therefore only needs to reload options required
* by repmgrd, which are as follows:
*
* changeable options:
* - failover
* - follow_command
* - logfacility
* - logfile
* - loglevel
* - master_response_timeout
* - monitor_interval_secs
* - priority
* - promote_command
* - reconnect_attempts
* - reconnect_interval
* - retry_promote_interval_secs
* - witness_repl_nodes_sync_interval_secs
*
* non-changeable options:
* - cluster_name
* - conninfo
* - node
* - node_name
*
* extract with something like:
* grep local_options\\. repmgrd.c | perl -n -e '/local_options\.([\w_]+)/ && print qq|$1\n|;' | sort | uniq
*/
bool bool
reload_config(t_configuration_options *orig_options) reload_config(t_configuration_options *orig_options)
{ {
PGconn *conn; PGconn *conn;
t_configuration_options new_options = T_CONFIGURATION_OPTIONS_INITIALIZER; t_configuration_options new_options;
bool config_changed = false; bool config_changed = false;
bool log_config_changed = false;
static ItemList config_errors = { NULL, NULL };
/* /*
* Re-read the configuration file: repmgr.conf * Re-read the configuration file: repmgr.conf
*/ */
log_info(_("reloading configuration file\n")); log_info(_("reloading configuration file and updating repmgr tables\n"));
_parse_config(&new_options, &config_errors); parse_config(&new_options);
if (new_options.node == -1)
if (config_errors.head != NULL)
{ {
/* XXX dump errors to log */
log_warning(_("unable to parse new configuration, retaining current configuration\n")); log_warning(_("unable to parse new configuration, retaining current configuration\n"));
return false; return false;
} }
/* The following options cannot be changed */
if (strcmp(new_options.cluster_name, orig_options->cluster_name) != 0) if (strcmp(new_options.cluster_name, orig_options->cluster_name) != 0)
{ {
log_warning(_("cluster_name cannot be changed, retaining current configuration\n")); log_warning(_("unable to change cluster name, retaining current configuration\n"));
return false; return false;
} }
if (new_options.node != orig_options->node) if (new_options.node != orig_options->node)
{ {
log_warning(_("node ID cannot be changed, retaining current configuration\n")); log_warning(_("unable to change node ID, retaining current configuration\n"));
return false; return false;
} }
if (strcmp(new_options.node_name, orig_options->node_name) != 0) if (strcmp(new_options.node_name, orig_options->node_name) != 0)
{ {
log_warning(_("node_name cannot be changed, keeping current configuration\n")); log_warning(_("unable to change standby name, keeping current configuration\n"));
return false;
}
if (new_options.failover != MANUAL_FAILOVER && new_options.failover != AUTOMATIC_FAILOVER)
{
log_warning(_("new value for 'failover' must be 'automatic' or 'manual'\n"));
return false;
}
if (new_options.master_response_timeout <= 0)
{
log_warning(_("new value for 'master_response_timeout' must be greater than zero\n"));
return false;
}
if (new_options.reconnect_attempts < 0)
{
log_warning(_("new value for 'reconnect_attempts' must be zero or greater\n"));
return false;
}
if (new_options.reconnect_interval < 0)
{
log_warning(_("new value for 'reconnect_interval' must be zero or greater\n"));
return false; return false;
} }
if (strcmp(orig_options->conninfo, new_options.conninfo) != 0) if (strcmp(orig_options->conninfo, new_options.conninfo) != 0)
{ {
/* Test conninfo string works*/ /* Test conninfo string */
conn = establish_db_connection(new_options.conninfo, false); conn = establish_db_connection(new_options.conninfo, false);
if (!conn || (PQstatus(conn) != CONNECTION_OK)) if (!conn || (PQstatus(conn) != CONNECTION_OK))
{ {
@@ -659,6 +633,34 @@ reload_config(t_configuration_options *orig_options)
* to manage them * to manage them
*/ */
/* cluster_name */
if (strcmp(orig_options->cluster_name, new_options.cluster_name) != 0)
{
strcpy(orig_options->cluster_name, new_options.cluster_name);
config_changed = true;
}
/* conninfo */
if (strcmp(orig_options->conninfo, new_options.conninfo) != 0)
{
strcpy(orig_options->conninfo, new_options.conninfo);
config_changed = true;
}
/* barman_server */
if (strcmp(orig_options->barman_server, new_options.barman_server) != 0)
{
strcpy(orig_options->barman_server, new_options.barman_server);
config_changed = true;
}
/* node */
if (orig_options->node != new_options.node)
{
orig_options->node = new_options.node;
config_changed = true;
}
/* failover */ /* failover */
if (orig_options->failover != new_options.failover) if (orig_options->failover != new_options.failover)
{ {
@@ -666,27 +668,6 @@ reload_config(t_configuration_options *orig_options)
config_changed = true; config_changed = true;
} }
/* follow_command */
if (strcmp(orig_options->follow_command, new_options.follow_command) != 0)
{
strcpy(orig_options->follow_command, new_options.follow_command);
config_changed = true;
}
/* master_response_timeout */
if (orig_options->master_response_timeout != new_options.master_response_timeout)
{
orig_options->master_response_timeout = new_options.master_response_timeout;
config_changed = true;
}
/* monitor_interval_secs */
if (orig_options->monitor_interval_secs != new_options.monitor_interval_secs)
{
orig_options->monitor_interval_secs = new_options.monitor_interval_secs;
config_changed = true;
}
/* priority */ /* priority */
if (orig_options->priority != new_options.priority) if (orig_options->priority != new_options.priority)
{ {
@@ -694,6 +675,13 @@ reload_config(t_configuration_options *orig_options)
config_changed = true; config_changed = true;
} }
/* node_name */
if (strcmp(orig_options->node_name, new_options.node_name) != 0)
{
strcpy(orig_options->node_name, new_options.node_name);
config_changed = true;
}
/* promote_command */ /* promote_command */
if (strcmp(orig_options->promote_command, new_options.promote_command) != 0) if (strcmp(orig_options->promote_command, new_options.promote_command) != 0)
{ {
@@ -701,6 +689,44 @@ reload_config(t_configuration_options *orig_options)
config_changed = true; config_changed = true;
} }
/* follow_command */
if (strcmp(orig_options->follow_command, new_options.follow_command) != 0)
{
strcpy(orig_options->follow_command, new_options.follow_command);
config_changed = true;
}
/*
* XXX These ones can change with a simple SIGHUP?
*
* strcpy (orig_options->loglevel, new_options.loglevel); strcpy
* (orig_options->logfacility, new_options.logfacility);
*
* logger_shutdown(); XXX do we have progname here ? logger_init(progname,
* orig_options.loglevel, orig_options.logfacility);
*/
/* rsync_options */
if (strcmp(orig_options->rsync_options, new_options.rsync_options) != 0)
{
strcpy(orig_options->rsync_options, new_options.rsync_options);
config_changed = true;
}
/* ssh_options */
if (strcmp(orig_options->ssh_options, new_options.ssh_options) != 0)
{
strcpy(orig_options->ssh_options, new_options.ssh_options);
config_changed = true;
}
/* master_response_timeout */
if (orig_options->master_response_timeout != new_options.master_response_timeout)
{
orig_options->master_response_timeout = new_options.master_response_timeout;
config_changed = true;
}
/* reconnect_attempts */ /* reconnect_attempts */
if (orig_options->reconnect_attempts != new_options.reconnect_attempts) if (orig_options->reconnect_attempts != new_options.reconnect_attempts)
{ {
@@ -715,6 +741,27 @@ reload_config(t_configuration_options *orig_options)
config_changed = true; config_changed = true;
} }
/* pg_ctl_options */
if (strcmp(orig_options->pg_ctl_options, new_options.pg_ctl_options) != 0)
{
strcpy(orig_options->pg_ctl_options, new_options.pg_ctl_options);
config_changed = true;
}
/* pg_basebackup_options */
if (strcmp(orig_options->pg_basebackup_options, new_options.pg_basebackup_options) != 0)
{
strcpy(orig_options->pg_basebackup_options, new_options.pg_basebackup_options);
config_changed = true;
}
/* monitor_interval_secs */
if (orig_options->monitor_interval_secs != new_options.monitor_interval_secs)
{
orig_options->monitor_interval_secs = new_options.monitor_interval_secs;
config_changed = true;
}
/* retry_promote_interval_secs */ /* retry_promote_interval_secs */
if (orig_options->retry_promote_interval_secs != new_options.retry_promote_interval_secs) if (orig_options->retry_promote_interval_secs != new_options.retry_promote_interval_secs)
{ {
@@ -722,54 +769,20 @@ reload_config(t_configuration_options *orig_options)
config_changed = true; config_changed = true;
} }
/* use_replication_slots */
/* witness_repl_nodes_sync_interval_secs */ if (orig_options->use_replication_slots != new_options.use_replication_slots)
if (orig_options->witness_repl_nodes_sync_interval_secs != new_options.witness_repl_nodes_sync_interval_secs)
{ {
orig_options->witness_repl_nodes_sync_interval_secs = new_options.witness_repl_nodes_sync_interval_secs; orig_options->use_replication_slots = new_options.use_replication_slots;
config_changed = true; config_changed = true;
} }
/*
* Handle changes to logging configuration
*/
if (strcmp(orig_options->logfacility, new_options.logfacility) != 0)
{
strcpy(orig_options->logfacility, new_options.logfacility);
log_config_changed = true;
}
if (strcmp(orig_options->logfile, new_options.logfile) != 0)
{
strcpy(orig_options->logfile, new_options.logfile);
log_config_changed = true;
}
if (strcmp(orig_options->loglevel, new_options.loglevel) != 0)
{
strcpy(orig_options->loglevel, new_options.loglevel);
log_config_changed = true;
}
if (log_config_changed == true)
{
log_notice(_("restarting logging with changed parameters\n"));
logger_shutdown();
logger_init(orig_options, progname());
}
if (config_changed == true) if (config_changed == true)
{ {
log_notice(_("configuration file reloaded with changed parameters\n")); log_debug(_("reload_config(): configuration has changed\n"));
} }
/* else
* if logging configuration changed, don't say the configuration didn't
* change, as it clearly has.
*/
else if (log_config_changed == false)
{ {
log_info(_("configuration has not changed\n")); log_debug(_("reload_config(): configuration has not changed\n"));
} }
return config_changed; return config_changed;
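
The reload logic in this hunk follows a pattern: refuse the new configuration outright if an identifying setting differs or a value is out of range, and only then copy changed values across while tracking whether anything changed. The following is a minimal sketch of that pattern, not the repmgr implementation. `demo_options`, `demo_reload()` and `DEMO_MAXLEN` are hypothetical names for illustration, and the struct is a cut-down stand-in for `t_configuration_options`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAXLEN 64

/* Hypothetical, cut-down stand-in for t_configuration_options */
typedef struct
{
    char cluster_name[DEMO_MAXLEN];
    int  node;
    char conninfo[DEMO_MAXLEN];
    int  reconnect_attempts;
} demo_options;

/*
 * Apply new_opts to orig if it passes validation.
 * Returns true only if at least one reloadable setting changed;
 * orig is left untouched when new_opts is rejected.
 */
bool
demo_reload(demo_options *orig, const demo_options *new_opts)
{
    bool changed = false;

    assert(orig != NULL && new_opts != NULL);

    /* Settings which identify the node cannot be changed at runtime */
    if (strcmp(new_opts->cluster_name, orig->cluster_name) != 0)
        return false;
    if (new_opts->node != orig->node)
        return false;

    /* Range checks happen before anything is copied */
    if (new_opts->reconnect_attempts < 0)
        return false;

    if (strcmp(orig->conninfo, new_opts->conninfo) != 0)
    {
        /* Bounded copy; both buffers are DEMO_MAXLEN bytes */
        snprintf(orig->conninfo, sizeof(orig->conninfo), "%s", new_opts->conninfo);
        changed = true;
    }

    if (orig->reconnect_attempts != new_opts->reconnect_attempts)
    {
        orig->reconnect_attempts = new_opts->reconnect_attempts;
        changed = true;
    }

    return changed;
}
```

Running every check before the first copy means a rejected reload never leaves the running configuration half-updated.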
@@ -943,7 +956,7 @@ static void
parse_event_notifications_list(t_configuration_options *options, const char *arg) parse_event_notifications_list(t_configuration_options *options, const char *arg)
{ {
const char *arg_ptr; const char *arg_ptr;
char event_type_buf[MAXLEN] = ""; char event_type_buf[MAXLEN] = "";
char *dst_ptr = event_type_buf; char *dst_ptr = event_type_buf;


@@ -1,7 +1,7 @@
/* /*
* config.h * config.h
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -97,7 +97,7 @@ typedef struct
* The following will initialize the structure with a minimal set of options; * The following will initialize the structure with a minimal set of options;
* actual defaults are set in parse_config() before parsing the configuration file * actual defaults are set in parse_config() before parsing the configuration file
*/ */
#define T_CONFIGURATION_OPTIONS_INITIALIZER { "", UNKNOWN_NODE_ID, NO_UPSTREAM_NODE, "", "", "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", "", 0, 0, 0, 0, "", { NULL, NULL }, { NULL, NULL } } #define T_CONFIGURATION_OPTIONS_INITIALIZER { "", -1, NO_UPSTREAM_NODE, "", "", "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", "", 0, 0, 0, 0, "", { NULL, NULL }, { NULL, NULL } }
typedef struct ItemListCell typedef struct ItemListCell
{ {
@@ -131,11 +131,8 @@ void set_progname(const char *argv0);
const char * progname(void); const char * progname(void);
bool load_config(const char *config_file, bool verbose, t_configuration_options *options, char *argv0); bool load_config(const char *config_file, bool verbose, t_configuration_options *options, char *argv0);
void _parse_config(t_configuration_options *options, ItemList *error_list);
bool parse_config(t_configuration_options *options);
bool reload_config(t_configuration_options *orig_options); bool reload_config(t_configuration_options *orig_options);
bool parse_config(t_configuration_options *options);
void parse_line(char *buff, char *name, char *value); void parse_line(char *buff, char *name, char *value);
char *trim(char *s); char *trim(char *s);
void item_list_append(ItemList *item_list, char *error_message); void item_list_append(ItemList *item_list, char *error_message);

dbutils.c

@@ -1,7 +1,7 @@
/* /*
* dbutils.c - Database connection/management functions * dbutils.c - Database connection/management functions
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -33,15 +33,6 @@ char repmgr_schema[MAXLEN] = "";
char repmgr_schema_quoted[MAXLEN] = ""; char repmgr_schema_quoted[MAXLEN] = "";
static int _get_node_record(PGconn *conn, char *cluster, char *sqlquery, t_node_info *node_info); static int _get_node_record(PGconn *conn, char *cluster, char *sqlquery, t_node_info *node_info);
static bool _set_config(PGconn *conn, const char *config_param, const char *sqlquery);
/*
* _establish_db_connection()
*
* Connect to a database using a conninfo string.
*
* NOTE: *do not* use this for replication connections; use establish_db_connection_by_params() instead.
*/
PGconn * PGconn *
_establish_db_connection(const char *conninfo, const bool exit_on_error, const bool log_notice, const bool verbose_only) _establish_db_connection(const char *conninfo, const bool exit_on_error, const bool log_notice, const bool verbose_only)
@@ -86,19 +77,6 @@ _establish_db_connection(const char *conninfo, const bool exit_on_error, const b
} }
} }
/*
* set "synchronous_commit" to "local" in case synchronous replication is in use
*/
else if (set_config(conn, "synchronous_commit", "local") == false)
{
if (exit_on_error)
{
PQfinish(conn);
exit(ERR_DB_CON);
}
}
return conn; return conn;
} }
@@ -138,12 +116,8 @@ PGconn *
establish_db_connection_by_params(const char *keywords[], const char *values[], establish_db_connection_by_params(const char *keywords[], const char *values[],
const bool exit_on_error) const bool exit_on_error)
{ {
PGconn *conn; /* Make a connection to the database */
bool replication_connection = false; PGconn *conn = PQconnectdbParams(keywords, values, true);
int i;
/* Connect to the database using the provided parameters */
conn = PQconnectdbParams(keywords, values, true);
/* Check to see that the backend connection was successfully made */ /* Check to see that the backend connection was successfully made */
if ((PQstatus(conn) != CONNECTION_OK)) if ((PQstatus(conn) != CONNECTION_OK))
@@ -156,28 +130,6 @@ establish_db_connection_by_params(const char *keywords[], const char *values[],
exit(ERR_DB_CON); exit(ERR_DB_CON);
} }
} }
else
{
/*
* set "synchronous_commit" to "local" in case synchronous replication is in
* use (provided this is not a replication connection)
*/
for (i = 0; keywords[i]; i++)
{
if (strcmp(keywords[i], "replication") == 0)
replication_connection = true;
}
if (replication_connection == false && set_config(conn, "synchronous_commit", "local") == false)
{
if (exit_on_error)
{
PQfinish(conn);
exit(ERR_DB_CON);
}
}
}
return conn; return conn;
} }
@@ -327,6 +279,7 @@ is_pgup(PGconn *conn, int timeout)
/* Check the connection status twice in case it changes after reset */ /* Check the connection status twice in case it changes after reset */
bool twice = false; bool twice = false;
/* Check the connection status twice in case it changes after reset */
for (;;) for (;;)
{ {
if (PQstatus(conn) != CONNECTION_OK) if (PQstatus(conn) != CONNECTION_OK)
@@ -428,8 +381,6 @@ int
get_server_version(PGconn *conn, char *server_version) get_server_version(PGconn *conn, char *server_version)
{ {
PGresult *res; PGresult *res;
int server_version_num;
res = PQexec(conn, res = PQexec(conn,
"SELECT current_setting('server_version_num'), " "SELECT current_setting('server_version_num'), "
" current_setting('server_version')"); " current_setting('server_version')");
@@ -443,12 +394,9 @@ get_server_version(PGconn *conn, char *server_version)
} }
if (server_version != NULL) if (server_version != NULL)
strcpy(server_version, PQgetvalue(res, 0, 1)); strcpy(server_version, PQgetvalue(res, 0, 0));
server_version_num = atoi(PQgetvalue(res, 0, 0)); return atoi(PQgetvalue(res, 0, 0));
PQclear(res);
return server_version_num;
} }
@@ -1137,25 +1085,15 @@ drop_replication_slot(PGconn *conn, char *slot_name)
bool bool
start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint, int server_version_num) start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint)
{ {
char sqlquery[QUERY_STR_LEN]; char sqlquery[QUERY_STR_LEN];
PGresult *res; PGresult *res;
if (server_version_num >= 100000) sqlquery_snprintf(sqlquery,
{ "SELECT pg_catalog.pg_xlogfile_name(pg_catalog.pg_start_backup('repmgr_standby_clone_%ld', %s))",
sqlquery_snprintf(sqlquery, time(NULL),
"SELECT pg_catalog.pg_walfile_name(pg_catalog.pg_start_backup('repmgr_standby_clone_%ld', %s))", fast_checkpoint ? "TRUE" : "FALSE");
time(NULL),
fast_checkpoint ? "TRUE" : "FALSE");
}
else
{
sqlquery_snprintf(sqlquery,
"SELECT pg_catalog.pg_xlogfile_name(pg_catalog.pg_start_backup('repmgr_standby_clone_%ld', %s))",
time(NULL),
fast_checkpoint ? "TRUE" : "FALSE");
}
log_verbose(LOG_DEBUG, "start_backup():\n%s\n", sqlquery); log_verbose(LOG_DEBUG, "start_backup():\n%s\n", sqlquery);
@@ -1183,19 +1121,12 @@ start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint, int se
bool bool
stop_backup(PGconn *conn, char *last_wal_segment, int server_version_num) stop_backup(PGconn *conn, char *last_wal_segment)
{ {
char sqlquery[QUERY_STR_LEN]; char sqlquery[QUERY_STR_LEN];
PGresult *res; PGresult *res;
if (server_version_num >= 100000) sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_xlogfile_name(pg_catalog.pg_stop_backup())");
{
sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_walfile_name(pg_catalog.pg_stop_backup())");
}
else
{
sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_xlogfile_name(pg_catalog.pg_stop_backup())");
}
res = PQexec(conn, sqlquery); res = PQexec(conn, sqlquery);
if (PQresultStatus(res) != PGRES_TUPLES_OK) if (PQresultStatus(res) != PGRES_TUPLES_OK)
@@ -1220,12 +1151,19 @@ stop_backup(PGconn *conn, char *last_wal_segment, int server_version_num)
} }
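
One side of the `start_backup()` and `stop_backup()` hunks branches on `server_version_num >= 100000`, because PostgreSQL 10 renamed `pg_xlogfile_name()` to `pg_walfile_name()`. As a rough sketch of how that choice could be kept in one place, the helper below picks the function name and builds the stop-backup query. `demo_walfile_function()`, `demo_stop_backup_query()` and `DEMO_PG10_VERSION_NUM` are hypothetical names, not part of repmgr.

```c
#include <assert.h>
#include <stdio.h>

/* PostgreSQL 10 and later report server_version_num as 100000 or higher */
#define DEMO_PG10_VERSION_NUM 100000

/* Return the name of the WAL file name function for this server version */
const char *
demo_walfile_function(int server_version_num)
{
    return (server_version_num >= DEMO_PG10_VERSION_NUM)
        ? "pg_walfile_name"
        : "pg_xlogfile_name";
}

/*
 * Write the stop-backup query into buf.
 * Returns snprintf()'s result: the length of the full query,
 * excluding the terminating NUL byte.
 */
int
demo_stop_backup_query(char *buf, size_t buflen, int server_version_num)
{
    assert(buf != NULL && buflen > 0);

    return snprintf(buf, buflen,
                    "SELECT pg_catalog.%s(pg_catalog.pg_stop_backup())",
                    demo_walfile_function(server_version_num));
}
```

With the name selection isolated, the same helper could serve the start-backup query, so the version test would not have to be repeated in each caller.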
bool bool
_set_config(PGconn *conn, const char *config_param, const char *sqlquery) set_config_bool(PGconn *conn, const char *config_param, bool state)
{ {
char sqlquery[QUERY_STR_LEN];
PGresult *res; PGresult *res;
sqlquery_snprintf(sqlquery,
"SET %s TO %s",
config_param,
state ? "TRUE" : "FALSE");
log_verbose(LOG_DEBUG, "set_config_bool():\n%s\n", sqlquery);
res = PQexec(conn, sqlquery); res = PQexec(conn, sqlquery);
if (PQresultStatus(res) != PGRES_COMMAND_OK) if (PQresultStatus(res) != PGRES_COMMAND_OK)
@@ -1240,36 +1178,6 @@ _set_config(PGconn *conn, const char *config_param, const char *sqlquery)
return true; return true;
} }
bool
set_config(PGconn *conn, const char *config_param, const char *config_value)
{
char sqlquery[QUERY_STR_LEN];
sqlquery_snprintf(sqlquery,
"SET %s TO '%s'",
config_param,
config_value);
log_verbose(LOG_DEBUG, "set_config():\n%s\n", sqlquery);
return _set_config(conn, config_param, sqlquery);
}
bool
set_config_bool(PGconn *conn, const char *config_param, bool state)
{
char sqlquery[QUERY_STR_LEN];
sqlquery_snprintf(sqlquery,
"SET %s TO %s",
config_param,
state ? "TRUE" : "FALSE");
log_verbose(LOG_DEBUG, "set_config_bool():\n%s\n", sqlquery);
return _set_config(conn, config_param, sqlquery);
}
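
`set_config()` and `set_config_bool()` both format a `SET` statement into a fixed-size buffer before passing it on. The sketch below shows the same formatting step with an explicit truncation check. It is an illustration under stated assumptions: `demo_build_set_query()` is a hypothetical helper, and it does not escape quotes in the value (in code that talks to a server, libpq's `PQescapeLiteral()` is the usual tool for that).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Format "SET <param> TO '<value>'" into buf.
 * Returns false if the statement did not fit, in which case
 * buf holds a truncated string which must not be executed.
 */
bool
demo_build_set_query(char *buf, size_t buflen, const char *param, const char *value)
{
    int len;

    assert(buf != NULL && param != NULL && value != NULL);

    len = snprintf(buf, buflen, "SET %s TO '%s'", param, value);

    /* snprintf() returns the length it wanted to write; >= buflen means truncation */
    return len >= 0 && (size_t) len < buflen;
}
```

For example, `demo_build_set_query(q, sizeof(q), "synchronous_commit", "local")` fills `q` with the statement that one side of this diff issues after connecting.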
/* /*
* witness_copy_node_records() * witness_copy_node_records()
@@ -1529,11 +1437,10 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
bool success = true; bool success = true;
struct tm ts; struct tm ts;
/* /* Only attempt to write a record if a connection handle was provided.
* Only attempt to write a record if a connection handle was provided. Also check that the repmgr schema has been properly intialised - if
* Also check that the repmgr schema has been properly initialised - if not it means no configuration file was provided, which can happen with
* not it means no configuration file was provided, which can happen with e.g. `repmgr standby clone`, and we won't know which schema to write to.
* e.g. `repmgr standby clone`, and we won't know which schema to write to.
*/ */
if (conn != NULL && strcmp(repmgr_schema, DEFAULT_REPMGR_SCHEMA_PREFIX) != 0) if (conn != NULL && strcmp(repmgr_schema, DEFAULT_REPMGR_SCHEMA_PREFIX) != 0)
{ {
@@ -1721,110 +1628,6 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
return success; return success;
} }
void
create_checkpoint(PGconn *conn)
{
char sqlquery[MAXLEN];
PGresult *res;
sqlquery_snprintf(sqlquery, "CHECKPOINT");
log_verbose(LOG_DEBUG, "checkpoint:\n%s\n", sqlquery);
res = PQexec(conn, sqlquery);
if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
{
log_err(_("Unable to create CHECKPOINT:\n%s\n"),
PQerrorMessage(conn));
PQfinish(conn);
exit(ERR_DB_QUERY);
}
log_notice(_("CHECKPOINT created\n"));
}
bool
update_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name, bool active)
{
char sqlquery[QUERY_STR_LEN];
char upstream_node_id[MAXLEN];
char slot_name_buf[MAXLEN];
PGresult *res;
/* XXX this segment copied from create_node_record() */
if (upstream_node == NO_UPSTREAM_NODE)
{
/*
* No explicit upstream node id provided for standby - attempt to
* get primary node id
*/
if (strcmp(type, "standby") == 0)
{
int primary_node_id = get_master_node_id(conn, cluster_name);
maxlen_snprintf(upstream_node_id, "%i", primary_node_id);
}
else
{
maxlen_snprintf(upstream_node_id, "%s", "NULL");
}
}
else
{
maxlen_snprintf(upstream_node_id, "%i", upstream_node);
}
if (slot_name != NULL && slot_name[0])
{
maxlen_snprintf(slot_name_buf, "'%s'", slot_name);
}
else
{
maxlen_snprintf(slot_name_buf, "%s", "NULL");
}
/* XXX convert to placeholder query */
sqlquery_snprintf(sqlquery,
"UPDATE %s.repl_nodes SET "
" type = '%s', "
" upstream_node_id = %s, "
" cluster = '%s', "
" name = '%s', "
" conninfo = '%s', "
" slot_name = %s, "
" priority = %i, "
" active = %s "
" WHERE id = %i ",
get_repmgr_schema_quoted(conn),
type,
upstream_node_id,
cluster_name,
node_name,
conninfo,
slot_name_buf,
priority,
active == true ? "TRUE" : "FALSE",
node);
log_verbose(LOG_DEBUG, "update_node_record(): %s\n", sqlquery);
if (action != NULL)
{
log_verbose(LOG_DEBUG, "update_node_record(): action is \"%s\"\n", action);
}
res = PQexec(conn, sqlquery);
if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
{
log_err(_("Unable to update node record\n%s\n"),
PQerrorMessage(conn));
PQclear(res);
return false;
}
PQclear(res);
return true;
}
/* /*
* Update node record following change of status * Update node record following change of status
@@ -1986,16 +1789,7 @@ _get_node_record(PGconn *conn, char *cluster, char *sqlquery, t_node_info *node_
node_info->node_id = atoi(PQgetvalue(res, 0, 0)); node_info->node_id = atoi(PQgetvalue(res, 0, 0));
node_info->type = parse_node_type(PQgetvalue(res, 0, 1)); node_info->type = parse_node_type(PQgetvalue(res, 0, 1));
node_info->upstream_node_id = atoi(PQgetvalue(res, 0, 2));
if (PQgetisnull(res, 0, 2))
{
node_info->upstream_node_id = NO_UPSTREAM_NODE;
}
else
{
node_info->upstream_node_id = atoi(PQgetvalue(res, 0, 2));
}
strncpy(node_info->name, PQgetvalue(res, 0, 3), MAXLEN); strncpy(node_info->name, PQgetvalue(res, 0, 3), MAXLEN);
strncpy(node_info->conninfo_str, PQgetvalue(res, 0, 4), MAXLEN); strncpy(node_info->conninfo_str, PQgetvalue(res, 0, 4), MAXLEN);
strncpy(node_info->slot_name, PQgetvalue(res, 0, 5), MAXLEN); strncpy(node_info->slot_name, PQgetvalue(res, 0, 5), MAXLEN);
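
The `PQgetisnull()` branch matters because `PQgetvalue()` returns an empty string for a NULL field, and `atoi("")` yields 0, which is indistinguishable from a real node ID of 0. A minimal sketch of the mapping, kept free of libpq so it can be compiled on its own: `demo_parse_upstream_node_id()` is a hypothetical helper, and `DEMO_NO_UPSTREAM_NODE` is assumed to be -1, standing in for repmgr's `NO_UPSTREAM_NODE`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Assumed sentinel, standing in for repmgr's NO_UPSTREAM_NODE */
#define DEMO_NO_UPSTREAM_NODE (-1)

/*
 * Convert a text column to a node ID, mapping SQL NULL to the sentinel.
 * is_null is what PQgetisnull() would report for the field;
 * value is what PQgetvalue() would return.
 */
int
demo_parse_upstream_node_id(const char *value, bool is_null)
{
    assert(is_null || value != NULL);

    if (is_null)
        return DEMO_NO_UPSTREAM_NODE;

    return atoi(value);
}
```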


@@ -1,7 +1,7 @@
/* /*
* dbutils.h * dbutils.h
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -122,22 +122,20 @@ char *get_repmgr_schema_quoted(PGconn *conn);
bool create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg); bool create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg);
int get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record); int get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record);
bool drop_replication_slot(PGconn *conn, char *slot_name); bool drop_replication_slot(PGconn *conn, char *slot_name);
bool start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint, int server_version_num); bool start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint);
bool stop_backup(PGconn *conn, char *last_wal_segment, int server_version_num); bool stop_backup(PGconn *conn, char *last_wal_segment);
bool set_config(PGconn *conn, const char *config_param, const char *config_value);
bool set_config_bool(PGconn *conn, const char *config_param, bool state); bool set_config_bool(PGconn *conn, const char *config_param, bool state);
bool witness_copy_node_records(PGconn *masterconn, PGconn *witnessconn, char *cluster_name); bool witness_copy_node_records(PGconn *masterconn, PGconn *witnessconn, char *cluster_name);
bool create_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name, bool active); bool create_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name, bool active);
bool delete_node_record(PGconn *conn, int node, char *action); bool delete_node_record(PGconn *conn, int node, char *action);
int get_node_record(PGconn *conn, char *cluster, int node_id, t_node_info *node_info); int get_node_record(PGconn *conn, char *cluster, int node_id, t_node_info *node_info);
int get_node_record_by_name(PGconn *conn, char *cluster, const char *node_name, t_node_info *node_info); int get_node_record_by_name(PGconn *conn, char *cluster, const char *node_name, t_node_info *node_info);
bool update_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name, bool active);
bool update_node_record_status(PGconn *conn, char *cluster_name, int this_node_id, char *type, int upstream_node_id, bool active); bool update_node_record_status(PGconn *conn, char *cluster_name, int this_node_id, char *type, int upstream_node_id, bool active);
bool update_node_record_set_upstream(PGconn *conn, char *cluster_name, int this_node_id, int new_upstream_node_id); bool update_node_record_set_upstream(PGconn *conn, char *cluster_name, int this_node_id, int new_upstream_node_id);
bool create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details); bool create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
void create_checkpoint(PGconn *conn);
int get_node_replication_state(PGconn *conn, char *node_name, char *output); int get_node_replication_state(PGconn *conn, char *node_name, char *output);
t_server_type parse_node_type(const char *type); t_server_type parse_node_type(const char *type);
int get_data_checksum_version(const char *data_directory); int get_data_checksum_version(const char *data_directory);
#endif #endif


@@ -3,7 +3,7 @@
* dirmod.c * dirmod.c
* directory handling functions * directory handling functions
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California


@@ -1,6 +1,6 @@
/* /*
* dirmod.h * dirmod.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -53,7 +53,7 @@ will be carried out:
e.g. if a replication cluster is spread over multiple data centres, a split-brain e.g. if a replication cluster is spread over multiple data centres, a split-brain
situation does not occur if there is a network failure between datacentres. Note situation does not occur if there is a network failure between datacentres. Note
that if nodes are split evenly between data centres, a witness server can be that if nodes are split evenly between data centres, a witness server can be
used to establish the "majority" data centre. used to establish the "majority" daat centre.
* `repmgrd` polls all visible servers and waits for each node to return a valid LSN; * `repmgrd` polls all visible servers and waits for each node to return a valid LSN;
it updates the LSN previously stored for this node if it has increased since it updates the LSN previously stored for this node if it has increased since


@@ -22,14 +22,15 @@ of this document).
* * * * * *
In a failover situation, `repmgrd` promotes a standby to master by executing In a failover situation, `repmgrd` promotes a standby to master by
the command defined in `promote_command`. Normally this would be something like: executing the command defined in `promote_command`. Normally this
would be something like:
repmgr standby promote -f /etc/repmgr.conf repmgr standby promote -f /etc/repmgr.conf
By wrapping this in a custom script which adjusts the `pgbouncer` configuration By wrapping this in a custom script which adjusts the `pgbouncer`
on all nodes, it's possible to fence the failed master and redirect write configuration on all nodes, it's possible to fence the failed master
connections to the new master. and redirect write connections to the new master.
The script consists of three sections: The script consists of three sections:
@@ -37,27 +38,20 @@ The script consists of three sections:
* the promotion command itself * the promotion command itself
* commands to reconfigure and restart `pgbouncer` on all nodes * commands to reconfigure and restart `pgbouncer` on all nodes
Note that it requires password-less SSH access between all nodes to be able to Note that it requires password-less SSH access between all nodes to be
update the `pgbouncer` configuration files. able to update the `pgbouncer` configuration files.
For the purposes of this demonstration, we'll assume there are 3 nodes (master For the purposes of this demonstration, we'll assume there are 3 nodes
and two standbys), with `pgbouncer` listening on port 6432 handling connections (master and two standbys), with `pgbouncer` listening on port 6432
to a database called `appdb`. The `postgres` system user must have write handling connections to a database called `appdb`. The `postgres`
access to the `pgbouncer` configuration files on all nodes. We'll assume system user must have write access to the `pgbouncer` configuration
there's a main `pgbouncer` configuration file, `/etc/pgbouncer.ini`, which uses file on all nodes, assumed to be at `/etc/pgbouncer.ini`.
the `%include` directive (available from PgBouncer 1.6) to include a separate
configuration file, `/etc/pgbouncer.database.ini`, which will be modified by
`repmgr`.
* * * The script also requires a template file containing global `pgbouncer`
configuration, which should looks something like this (adjust
settings appropriately for your environment):
> *NOTE*: in this self-contained demonstration, `pgbouncer` is running on the `/var/lib/postgres/repmgr/pgbouncer.ini.template`
> database servers, however in a production environment it will make more
> sense to run `pgbouncer` on either separate nodes or the application server.
* * *
`/etc/pgbouncer.ini` should look something like this:
[pgbouncer] [pgbouncer]
@@ -86,8 +80,6 @@ configuration file, `/etc/pgbouncer.database.ini`, which will be modified by
log_disconnections = 1 log_disconnections = 1
log_pooler_errors = 1 log_pooler_errors = 1
%include /etc/pgbouncer.database.ini
The actual script is as follows; adjust the configurable items as appropriate: The actual script is as follows; adjust the configurable items as appropriate:
`/var/lib/postgres/repmgr/promote.sh` `/var/lib/postgres/repmgr/promote.sh`
@@ -99,52 +91,50 @@ The actual script is as follows; adjust the configurable items as appropriate:
# Configurable items # Configurable items
PGBOUNCER_HOSTS="node1 node2 node3" PGBOUNCER_HOSTS="node1 node2 node3"
PGBOUNCER_DATABASE_INI="/etc/pgbouncer.database.ini"
PGBOUNCER_DATABASE="appdb"
PGBOUNCER_PORT=6432
REPMGR_DB="repmgr" REPMGR_DB="repmgr"
REPMGR_USER="repmgr" REPMGR_USER="repmgr"
REPMGR_SCHEMA="repmgr_test" REPMGR_SCHEMA="repmgr_test"
PGBOUNCER_CONFIG="/etc/pgbouncer.ini"
PGBOUNCER_INI_TEMPLATE="/var/lib/postgres/repmgr/pgbouncer.ini.template"
PGBOUNCER_DATABASE="appdb"
# 1. Pause running pgbouncer instances # 1. Pause running pgbouncer instances
for HOST in $PGBOUNCER_HOSTS for HOST in $PGBOUNCER_HOSTS
do do
psql -t -c "pause" -h $HOST -p $PGBOUNCER_PORT -U postgres pgbouncer psql -t -c "pause" -h $HOST -p $PORT -U postgres pgbouncer
done done
# 2. Promote this node from standby to master # 2. Promote this node from standby to master
repmgr standby promote -f /etc/repmgr.conf repmgr standby promote -f /etc/repmgr.conf
# 3. Reconfigure pgbouncer instances # 3. Reconfigure pgbouncer instances
PGBOUNCER_DATABASE_INI_NEW="/tmp/pgbouncer.database.ini" PGBOUNCER_INI_NEW="/tmp/pgbouncer.ini.new"
for HOST in $PGBOUNCER_HOSTS for HOST in $PGBOUNCER_HOSTS
do do
# Recreate the pgbouncer config file # Recreate the pgbouncer config file
echo -e "[databases]\n" > $PGBOUNCER_DATABASE_INI_NEW echo -e "[databases]\n" > $PGBOUNCER_INI_NEW
psql -d $REPMGR_DB -U $REPMGR_USER -t -A \ psql -d $REPMGR_DB -U $REPMGR_USER -t -A \
-c "SELECT '${PGBOUNCER_DATABASE}-rw= ' || conninfo || ' application_name=pgbouncer_${HOST}' \ -c "SELECT '$PGBOUNCER_DATABASE= ' || conninfo || ' application_name=pgbouncer_$HOST' \
FROM ${REPMGR_SCHEMA}.repl_nodes \ FROM $REPMGR_SCHEMA.repl_nodes \
WHERE active = TRUE AND type='master'" >> $PGBOUNCER_DATABASE_INI_NEW WHERE active = TRUE AND type='master'" >> $PGBOUNCER_INI_NEW
psql -d $REPMGR_DB -U $REPMGR_USER -t -A \ cat $PGBOUNCER_INI_TEMPLATE >> $PGBOUNCER_INI_NEW
-c "SELECT '${PGBOUNCER_DATABASE}-ro= ' || conninfo || ' application_name=pgbouncer_${HOST}' \
FROM ${REPMGR_SCHEMA}.repl_nodes \
WHERE node_name='${HOST}'" >> $PGBOUNCER_DATABASE_INI_NEW
rsync $PGBOUNCER_DATABASE_INI_NEW $HOST:$PGBOUNCER_DATABASE_INI rsync $PGBOUNCER_INI_NEW $HOST:$PGBOUNCER_CONFIG
psql -tc "reload" -h $HOST -p $PGBOUNCER_PORT -U postgres pgbouncer psql -tc "reload" -h $HOST -U postgres pgbouncer
psql -tc "resume" -h $HOST -p $PGBOUNCER_PORT -U postgres pgbouncer psql -tc "resume" -h $HOST -U postgres pgbouncer
done done
# Clean up generated file # Clean up generated file
rm $PGBOUNCER_DATABASE_INI_NEW rm $PGBOUNCER_INI_NEW
echo "Reconfiguration of pgbouncer complete" echo "Reconfiguration of pgbouncer complete"


@@ -1,6 +1,6 @@
/* /*
* errcode.h * errcode.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -1,18 +0,0 @@
/*
* repmgr_function.sql
* Copyright (c) 2ndQuadrant, 2010-2017
*
*/
-- SET SEARCH_PATH TO 'repmgr';
CREATE FUNCTION repmgr_update_standby_location(text) RETURNS boolean
AS '$libdir/repmgr_funcs', 'repmgr_update_standby_location'
LANGUAGE C STRICT;
CREATE FUNCTION repmgr_get_last_standby_location() RETURNS text
AS '$libdir/repmgr_funcs', 'repmgr_get_last_standby_location'
LANGUAGE C STRICT;
CREATE FUNCTION repmgr_update_last_updated() RETURNS TIMESTAMP WITH TIME ZONE
AS '$libdir/repmgr_funcs', 'repmgr_update_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION repmgr_get_last_updated() RETURNS TIMESTAMP WITH TIME ZONE
AS '$libdir/repmgr_funcs', 'repmgr_get_last_updated'
LANGUAGE C STRICT;


@@ -1,24 +0,0 @@
select * from repmgr_update_standby_location('');
repmgr_update_standby_location
--------------------------------
f
(1 row)
select * from repmgr_get_last_standby_location();
repmgr_get_last_standby_location
----------------------------------
(1 row)
select * from repmgr_update_last_updated();
repmgr_update_last_updated
----------------------------
(1 row)
select * from repmgr_get_last_updated();
repmgr_get_last_updated
-------------------------
(1 row)

60
log.c

@@ -1,6 +1,6 @@
/* /*
* log.c - Logging methods * log.c - Logging methods
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
* This module is a set of methods for logging (currently only syslog) * This module is a set of methods for logging (currently only syslog)
* *
@@ -48,11 +48,6 @@ int log_level = LOG_NOTICE;
int last_log_level = LOG_NOTICE; int last_log_level = LOG_NOTICE;
int verbose_logging = false; int verbose_logging = false;
int terse_logging = false; int terse_logging = false;
/*
* Global variable to be set by the main application to ensure any log output
* emitted before logger_init is called, is output in the correct format
*/
int logger_output_mode = OM_DAEMON;
extern void extern void
stderr_log_with_level(const char *level_name, int level, const char *fmt, ...) stderr_log_with_level(const char *level_name, int level, const char *fmt, ...)
@@ -67,31 +62,22 @@ stderr_log_with_level(const char *level_name, int level, const char *fmt, ...)
static void static void
_stderr_log_with_level(const char *level_name, int level, const char *fmt, va_list ap) _stderr_log_with_level(const char *level_name, int level, const char *fmt, va_list ap)
{ {
char buf[100]; time_t t;
struct tm *tm;
char buff[100];
/* /*
* Store the requested level so that if there's a subsequent * Store the requested level so that if there's a subsequent
* log_hint() or log_detail(), we can suppress that if appropriate. * log_hint(), we can suppress that if appropriate.
*/ */
last_log_level = level; last_log_level = level;
if (log_level >= level) if (log_level >= level)
{ {
time(&t);
/* Format log line prefix with timestamp if in daemon mode */ tm = localtime(&t);
if (logger_output_mode == OM_DAEMON) strftime(buff, 100, "[%Y-%m-%d %H:%M:%S]", tm);
{ fprintf(stderr, "%s [%s] ", buff, level_name);
time_t t;
struct tm *tm;
time(&t);
tm = localtime(&t);
strftime(buf, 100, "[%Y-%m-%d %H:%M:%S]", tm);
fprintf(stderr, "%s [%s] ", buf, level_name);
}
else
{
fprintf(stderr, "%s: ", level_name);
}
vfprintf(stderr, fmt, ap); vfprintf(stderr, fmt, ap);
@@ -113,20 +99,6 @@ log_hint(const char *fmt, ...)
} }
void
log_detail(const char *fmt, ...)
{
va_list ap;
if (terse_logging == false)
{
va_start(ap, fmt);
_stderr_log_with_level("DETAIL", last_log_level, fmt, ap);
va_end(ap);
}
}
void void
log_verbose(int level, const char *fmt, ...) log_verbose(int level, const char *fmt, ...)
{ {
@@ -204,13 +176,6 @@ logger_init(t_configuration_options *opts, const char *ident)
stderr_log_warning(_("Invalid log level \"%s\" (available values: DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG)\n"), level); stderr_log_warning(_("Invalid log level \"%s\" (available values: DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG)\n"), level);
} }
/*
* STDERR only logging requested - finish here without setting up any further
* logging facility.
*/
if (logger_output_mode == OM_COMMAND_LINE)
return true;
if (facility && *facility) if (facility && *facility)
{ {
@@ -271,10 +236,9 @@ logger_init(t_configuration_options *opts, const char *ident)
stderr_log_notice(_("Redirecting logging output to '%s'\n"), opts->logfile); stderr_log_notice(_("Redirecting logging output to '%s'\n"), opts->logfile);
fd = freopen(opts->logfile, "a", stderr); fd = freopen(opts->logfile, "a", stderr);
/* /* It's possible freopen() may still fail due to e.g. a race condition;
* It's possible freopen() may still fail due to e.g. a race condition; as it's not feasible to restore stderr after a failed freopen(),
* as it's not feasible to restore stderr after a failed freopen(), we'll write to stdout as a last resort.
* we'll write to stdout as a last resort.
*/ */
if (fd == NULL) if (fd == NULL)
{ {

8
log.h

@@ -1,6 +1,6 @@
/* /*
* log.h * log.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -25,9 +25,6 @@
#define REPMGR_SYSLOG 1 #define REPMGR_SYSLOG 1
#define REPMGR_STDERR 2 #define REPMGR_STDERR 2
#define OM_COMMAND_LINE 1
#define OM_DAEMON 2
extern void extern void
stderr_log_with_level(const char *level_name, int level, const char *fmt,...) stderr_log_with_level(const char *level_name, int level, const char *fmt,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4))); __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
@@ -126,8 +123,6 @@ bool logger_shutdown(void);
void logger_set_verbose(void); void logger_set_verbose(void);
void logger_set_terse(void); void logger_set_terse(void);
void log_detail(const char *fmt, ...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
void log_hint(const char *fmt, ...) void log_hint(const char *fmt, ...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2))); __attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
void log_verbose(int level, const char *fmt, ...) void log_verbose(int level, const char *fmt, ...)
@@ -137,6 +132,5 @@ extern int log_type;
extern int log_level; extern int log_level;
extern int verbose_logging; extern int verbose_logging;
extern int terse_logging; extern int terse_logging;
extern int logger_output_mode;
#endif /* _REPMGR_LOG_H_ */ #endif /* _REPMGR_LOG_H_ */
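
The mode-dependent line prefix shown on the left-hand side of the `_stderr_log_with_level()` hunk (a timestamp in daemon mode, the bare level name otherwise) can be sketched in isolation. This is a minimal illustration rather than repmgr's own code: `format_log_prefix()` is a hypothetical helper, and only the `OM_*` values and the two format strings are taken from the diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Output modes, mirroring the OM_* constants in the log.h hunk */
#define OM_COMMAND_LINE 1
#define OM_DAEMON       2

/*
 * Hypothetical helper (not part of repmgr): write the log line prefix for
 * the given output mode into buf and return the value reported by snprintf().
 */
static int
format_log_prefix(char *buf, size_t buflen, int output_mode,
                  const char *level_name, const struct tm *tm)
{
    char ts[32];

    assert(buf != NULL && level_name != NULL);

    if (output_mode == OM_DAEMON)
    {
        /* Daemon mode: "[YYYY-MM-DD HH:MM:SS] [LEVEL] " */
        assert(tm != NULL);
        strftime(ts, sizeof(ts), "[%Y-%m-%d %H:%M:%S]", tm);
        return snprintf(buf, buflen, "%s [%s] ", ts, level_name);
    }

    /* Command-line mode: "LEVEL: " */
    return snprintf(buf, buflen, "%s: ", level_name);
}
```

Taking the broken-down time as a parameter, instead of calling `time()` and `localtime()` inside the helper, is a choice made here so the prefix can be checked against a fixed timestamp.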

1548
repmgr.c

File diff suppressed because it is too large


@@ -26,14 +26,11 @@
# the server's hostname or another identifier unambiguously # the server's hostname or another identifier unambiguously
# associated with the server to avoid confusion # associated with the server to avoid confusion
# Database connection information as a conninfo string (this must be a # Database connection information as a conninfo string
# keyword/value string, not a connection URI). # This must be accessible to all servers in the cluster; for details see:
# #
# https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING # https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING
# #
# All servers in the cluster must be able to access the database
# using this connection string.
#
#conninfo='host=192.168.204.104 dbname=repmgr user=repmgr' #conninfo='host=192.168.204.104 dbname=repmgr user=repmgr'
# #
# If repmgrd is in use, consider explicitly setting `connect_timeout` in the # If repmgrd is in use, consider explicitly setting `connect_timeout` in the
@@ -69,12 +66,6 @@
# (default: NOTICE) # (default: NOTICE)
#loglevel=NOTICE #loglevel=NOTICE
# Note that logging facility settings will only apply to `repmgrd` by default;
# `repmgr` will always write to STDERR unless the switch `--log-to-file` is
# supplied, in which case it will log to the same destination as `repmgrd`.
# This is mainly intended for those cases when `repmgr` is executed directly
# by `repmgrd`.
# Logging facility: possible values are STDERR or - for Syslog integration - one of LOCAL0, LOCAL1, ..., LOCAL7, USER # Logging facility: possible values are STDERR or - for Syslog integration - one of LOCAL0, LOCAL1, ..., LOCAL7, USER
# (default: STDERR) # (default: STDERR)
#logfacility=STDERR #logfacility=STDERR
@@ -146,15 +137,8 @@
# external command arguments. Values shown are examples. # external command arguments. Values shown are examples.
#pg_ctl_options='-s' #pg_ctl_options='-s'
#pg_basebackup_options='--label=repmgr_backup' #pg_basebackup_options='--xlog-method=s'
# This is the host name of the barman server, which is used for connecting over
# to the barman server (passwordless ssh keys should be in place)
#barman_server='backup_server'
# If you are placing the barman.conf file in a non-standard path, or using
# a name other than barman.conf, use this parameter to specify the path and
# name of the barman configuration file.
#barman_config='/path/to/barman.conf'
# Standby clone settings # Standby clone settings
# ---------------------- # ----------------------
@@ -179,8 +163,9 @@
# monitoring interval in seconds; default is 2 # monitoring interval in seconds; default is 2
#monitor_interval_secs=2 #monitor_interval_secs=2
# Maximum number of seconds to wait for a response from the primary server # Number of seconds to wait for a response from the primary server before
# before deciding it has failed. # deciding it has failed.
#master_response_timeout=60 #master_response_timeout=60
# Number of attempts at what interval (in seconds) to try and # Number of attempts at what interval (in seconds) to try and

108
repmgr.h

@@ -1,6 +1,6 @@
/* /*
* repmgr.h * repmgr.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -55,22 +55,18 @@
#define OPT_COPY_EXTERNAL_CONFIG_FILES 4 #define OPT_COPY_EXTERNAL_CONFIG_FILES 4
#define OPT_CONFIG_ARCHIVE_DIR 5 #define OPT_CONFIG_ARCHIVE_DIR 5
#define OPT_PG_REWIND 6 #define OPT_PG_REWIND 6
#define OPT_PWPROMPT 7
#define OPT_CSV 8 #define OPT_CSV 8
#define OPT_NODE 9 #define OPT_NODE 9
#define OPT_WITHOUT_BARMAN 10 #define OPT_WITHOUT_BARMAN 10
#define OPT_NO_UPSTREAM_CONNECTION 11 #define OPT_NO_UPSTREAM_CONNECTION 11
#define OPT_REGISTER_WAIT 12 #define OPT_REGISTER_WAIT 12
#define OPT_CLUSTER 13 #define OPT_CLUSTER 13
#define OPT_LOG_TO_FILE 14
#define OPT_UPSTREAM_CONNINFO 15
#define OPT_NO_CONNINFO_PASSWORD 16
#define OPT_REPLICATION_USER 17
/* deprecated command line options */ /* deprecated command line options */
#define OPT_INITDB_NO_PWPROMPT 998 #define OPT_INITDB_NO_PWPROMPT 999
#define OPT_IGNORE_EXTERNAL_CONFIG_FILES 999 #define OPT_IGNORE_EXTERNAL_CONFIG_FILES 998
/* values for --copy-external-config-files */
#define CONFIG_FILE_SAMEPATH 1 #define CONFIG_FILE_SAMEPATH 1
#define CONFIG_FILE_PGDATA 2 #define CONFIG_FILE_PGDATA 2
@@ -78,102 +74,53 @@
/* Run time options type */ /* Run time options type */
typedef struct typedef struct
{ {
/* general repmgr options */
char config_file[MAXPGPATH];
bool verbose;
bool terse;
bool force;
char pg_bindir[MAXLEN]; /* overrides setting in repmgr.conf */
/* logging parameters */
char loglevel[MAXLEN]; /* overrides setting in repmgr.conf */
bool log_to_file;
/* connection parameters */
char dbname[MAXLEN]; char dbname[MAXLEN];
char host[MAXLEN]; char host[MAXLEN];
char username[MAXLEN]; char username[MAXLEN];
char dest_dir[MAXPGPATH]; char dest_dir[MAXPGPATH];
char config_file[MAXPGPATH];
char remote_user[MAXLEN]; char remote_user[MAXLEN];
char superuser[MAXLEN]; char superuser[MAXLEN];
char masterport[MAXLEN];
bool conninfo_provided;
bool connection_param_provided;
bool host_param_provided;
/* standby clone parameters */
bool wal_keep_segments_used;
char wal_keep_segments[MAXLEN]; char wal_keep_segments[MAXLEN];
bool verbose;
bool terse;
bool force;
bool wait_for_master;
bool ignore_rsync_warn; bool ignore_rsync_warn;
bool witness_pwprompt;
bool rsync_only; bool rsync_only;
bool fast_checkpoint; bool fast_checkpoint;
bool csv_mode;
bool without_barman; bool without_barman;
bool no_upstream_connection; bool no_upstream_connection;
bool no_conninfo_password;
bool copy_external_config_files; bool copy_external_config_files;
int copy_external_config_files_destination; int copy_external_config_files_destination;
char upstream_conninfo[MAXLEN];
char replication_user[MAXLEN];
char recovery_min_apply_delay[MAXLEN];
/* standby register parameters */
bool wait_register_sync; bool wait_register_sync;
int wait_register_sync_seconds; int wait_register_sync_seconds;
char masterport[MAXLEN];
/*
* configuration file parameters which can be overridden on the
* command line
*/
char loglevel[MAXLEN];
/* witness create parameters */ /* parameter used by STANDBY SWITCHOVER */
bool witness_pwprompt;
/* standby follow parameters */
bool wait_for_master;
/* cluster {show|matrix|crosscheck} parameters */
bool csv_mode;
/* cluster cleanup parameters */
int keep_history;
/* standby switchover parameters */
char remote_config_file[MAXLEN]; char remote_config_file[MAXLEN];
bool pg_rewind_supplied;
char pg_rewind[MAXPGPATH]; char pg_rewind[MAXPGPATH];
char pg_ctl_mode[MAXLEN]; char pg_ctl_mode[MAXLEN];
/* parameter used by STANDBY {ARCHIVE_CONFIG | RESTORE_CONFIG} */
/* standby {archive_config | restore_config} parameters */
char config_archive_dir[MAXLEN]; char config_archive_dir[MAXLEN];
/* parameter used by CLUSTER CLEANUP */
/* {standby|witness} unregister parameters */ int keep_history;
/* parameter used by {STANDBY|WITNESS} UNREGISTER */
int node; int node;
char pg_bindir[MAXLEN];
char recovery_min_apply_delay[MAXLEN];
} t_runtime_options; } t_runtime_options;
#define T_RUNTIME_OPTIONS_INITIALIZER { \ #define T_RUNTIME_OPTIONS_INITIALIZER { "", "", "", "", "", "", "", DEFAULT_WAL_KEEP_SEGMENTS, false, false, false, false, false, false, false, false, false, false, false, false, CONFIG_FILE_SAMEPATH, false, 0, "", "", "", "", "fast", "", 0, UNKNOWN_NODE_ID, "", ""}
/* general repmgr options */ \
"", false, false, false, "", \
/* logging parameters */ \
"", false, \
/* connection parameters */ \
"", "", "", "", "", "", "", \
false, false, false, \
/* standby clone parameters */ \
false, DEFAULT_WAL_KEEP_SEGMENTS, false, false, false, false, false, false, \
false, CONFIG_FILE_SAMEPATH, "", "", "", \
/* standby register parameters */ \
false, 0, \
/* witness create parameters */ \
false, \
/* standby follow parameters */ \
false, \
/* cluster {show|matrix|crosscheck} parameters */ \
false, \
/* cluster cleanup parameters */ \
0, \
/* standby switchover parameters */ \
"", false, "", "fast", \
/* standby {archive_config | restore_config} parameters */ \
"", \
/* {standby|witness} unregister parameters */ \
UNKNOWN_NODE_ID }
struct BackupLabel struct BackupLabel
{ {
@@ -192,10 +139,9 @@ typedef struct
{ {
char slot[MAXLEN]; char slot[MAXLEN];
char xlog_method[MAXLEN]; char xlog_method[MAXLEN];
bool no_slot; /* from PostgreSQL 10 */
} t_basebackup_options; } t_basebackup_options;
#define T_BASEBACKUP_OPTIONS_INITIALIZER { "", "", false } #define T_BASEBACKUP_OPTIONS_INITIALIZER { "", "" }
typedef struct typedef struct
{ {
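
Both versions of `T_RUNTIME_OPTIONS_INITIALIZER` above are positional, so the initializer has to be kept in step with the struct whenever a member is added, removed or moved. As a hedged alternative, C99 designated initializers name each member and zero-initialize the rest. The sketch below uses a cut-down, hypothetical `t_demo_options` struct; the `MAXLEN` and `UNKNOWN_NODE_ID` values are assumptions for illustration, not repmgr's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAXLEN          64      /* illustrative size only */
#define UNKNOWN_NODE_ID (-1)    /* assumed value for illustration */

/* Cut-down, hypothetical stand-in for t_runtime_options */
typedef struct
{
    char config_file[MAXLEN];
    bool verbose;
    bool force;
    char pg_ctl_mode[MAXLEN];
    int  keep_history;
    int  node;
} t_demo_options;

/*
 * Only non-zero defaults are spelled out; every member not named here
 * is zero-initialized ("" for arrays, false for bools, 0 for ints).
 */
#define T_DEMO_OPTIONS_INITIALIZER { \
    .pg_ctl_mode = "fast", \
    .node = UNKNOWN_NODE_ID \
}

/* Return a struct populated with the defaults above */
static t_demo_options
demo_options_default(void)
{
    t_demo_options opts = T_DEMO_OPTIONS_INITIALIZER;

    return opts;
}
```

With this form, reordering struct members does not change which default lands in which field.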


@@ -1,7 +1,7 @@
/* /*
* repmgr.sql * repmgr.sql
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
*/ */

225
repmgrd.c

@@ -1,7 +1,7 @@
/* /*
* repmgrd.c - Replication manager daemon * repmgrd.c - Replication manager daemon
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
* This module connects to the nodes of a replication cluster and monitors * This module connects to the nodes of a replication cluster and monitors
* how far are they from master * how far are they from master
@@ -30,10 +30,18 @@
#include <stdlib.h> #include <stdlib.h>
#include <unistd.h> #include <unistd.h>
#include "repmgr.h" #include "repmgr.h"
#include "config.h"
#include "log.h" #include "log.h"
#include "strutil.h"
#include "version.h" #include "version.h"
/* Required PostgreSQL headers */
#include "access/xlogdefs.h"
#include "pqexpbuffer.h"
/* Message strings passed in repmgrSharedState->location */ /* Message strings passed in repmgrSharedState->location */
#define PASSIVE_NODE "PASSIVE_NODE" #define PASSIVE_NODE "PASSIVE_NODE"
@@ -63,7 +71,6 @@ bool failover_done = false;
bool manual_mode_upstream_disconnected = false; bool manual_mode_upstream_disconnected = false;
char *pid_file = NULL; char *pid_file = NULL;
int server_version_num = 0;
static void help(void); static void help(void);
static void usage(void); static void usage(void);
@@ -138,6 +145,8 @@ main(int argc, char **argv)
FILE *fd; FILE *fd;
int server_version_num = 0;
set_progname(argv[0]); set_progname(argv[0]);
/* Disallow running as root to prevent directory ownership problems */ /* Disallow running as root to prevent directory ownership problems */
@@ -198,13 +207,6 @@ main(int argc, char **argv)
} }
} }
/*
* Tell the logger we're a daemon - this will ensure any output logged
* before the logger is initialized will be formatted correctly
*/
logger_output_mode = OM_DAEMON;
/* /*
* Parse the configuration file, if provided. If no configuration file * Parse the configuration file, if provided. If no configuration file
* was provided, or one was but was incomplete, parse_config() will * was provided, or one was but was incomplete, parse_config() will
@@ -245,7 +247,6 @@ main(int argc, char **argv)
} }
logger_init(&local_options, progname()); logger_init(&local_options, progname());
if (verbose) if (verbose)
logger_set_verbose(); logger_set_verbose();
@@ -646,15 +647,15 @@ witness_monitor(void)
} }
else else
{ {
log_info(_("new master found with node ID: %i\n"), master_options.node); log_debug(_("new master found with node ID: %i\n"), master_options.node);
connection_ok = true; connection_ok = true;
/* /*
* Update the repl_nodes table from the new master to reflect the changed * Update the repl_nodes table from the new master to reflect the changed
* node configuration * node configuration
* *
* It would be neat to be able to handle this with e.g. table-based * XXX it would be neat to be able to handle this with e.g. table-based
* logical replication if available in core * logical replication
*/ */
witness_copy_node_records(master_conn, my_local_conn, local_options.cluster_name); witness_copy_node_records(master_conn, my_local_conn, local_options.cluster_name);
@@ -709,46 +710,26 @@ witness_monitor(void)
return; return;
} }
strncpy(monitor_witness_timestamp, PQgetvalue(res, 0, 0), MAXLEN); strcpy(monitor_witness_timestamp, PQgetvalue(res, 0, 0));
PQclear(res); PQclear(res);
/* /*
* Build the SQL to execute on master * Build the SQL to execute on master
*/ */
if (server_version_num >= 100000) sqlquery_snprintf(sqlquery,
{ "INSERT INTO %s.repl_monitor "
sqlquery_snprintf(sqlquery, " (primary_node, standby_node, "
"INSERT INTO %s.repl_monitor " " last_monitor_time, last_apply_time, "
" (primary_node, standby_node, " " last_wal_primary_location, last_wal_standby_location, "
" last_monitor_time, last_apply_time, " " replication_lag, apply_lag )"
" last_wal_primary_location, last_wal_standby_location, " " VALUES(%d, %d, "
" replication_lag, apply_lag )" " '%s'::TIMESTAMP WITH TIME ZONE, NULL, "
" VALUES(%d, %d, " " pg_catalog.pg_current_xlog_location(), NULL, "
" '%s'::TIMESTAMP WITH TIME ZONE, NULL, " " 0, 0) ",
" pg_catalog.pg_current_wal_lsn(), NULL, " get_repmgr_schema_quoted(my_local_conn),
" 0, 0) ", master_options.node,
get_repmgr_schema_quoted(my_local_conn), local_options.node,
master_options.node, monitor_witness_timestamp);
local_options.node,
monitor_witness_timestamp);
}
else
{
sqlquery_snprintf(sqlquery,
"INSERT INTO %s.repl_monitor "
" (primary_node, standby_node, "
" last_monitor_time, last_apply_time, "
" last_wal_primary_location, last_wal_standby_location, "
" replication_lag, apply_lag )"
" VALUES(%d, %d, "
" '%s'::TIMESTAMP WITH TIME ZONE, NULL, "
" pg_catalog.pg_current_xlog_location(), NULL, "
" 0, 0) ",
get_repmgr_schema_quoted(my_local_conn),
master_options.node,
local_options.node,
monitor_witness_timestamp);
}
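
The two branches on the left-hand side differ only in the WAL location function, which PostgreSQL 10 renamed from `pg_current_xlog_location()` to `pg_current_wal_lsn()`. One way to avoid repeating the whole statement, sketched here under the assumption that a helper like this would be acceptable, is to select the function name once and interpolate it. `current_wal_location_fn()` and `build_current_wal_query()` are hypothetical names, not repmgr functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* server_version_num as reported by PostgreSQL 10.0 */
#define PG10_VERSION_NUM 100000

/* Hypothetical helper: name of the "current WAL write location" function */
static const char *
current_wal_location_fn(int server_version_num)
{
    return (server_version_num >= PG10_VERSION_NUM)
        ? "pg_catalog.pg_current_wal_lsn()"
        : "pg_catalog.pg_current_xlog_location()";
}

/* Hypothetical helper: build "SELECT <fn>" for the given server version */
static int
build_current_wal_query(char *buf, size_t buflen, int server_version_num)
{
    assert(buf != NULL && buflen > 0);

    return snprintf(buf, buflen, "SELECT %s",
                    current_wal_location_fn(server_version_num));
}
```

The same pattern would apply to the `pg_last_xlog_receive_location()` / `pg_last_wal_receive_lsn()` and `pg_last_xlog_replay_location()` / `pg_last_wal_replay_lsn()` pairs that appear in the later hunks.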
/* /*
* Execute the query asynchronously, but don't check for a result. We will * Execute the query asynchronously, but don't check for a result. We will
@@ -793,6 +774,7 @@ standby_monitor(void)
PGconn *upstream_conn; PGconn *upstream_conn;
char upstream_conninfo[MAXCONNINFO]; char upstream_conninfo[MAXCONNINFO];
int upstream_node_id; int upstream_node_id;
t_node_info upstream_node;
int active_master_id; int active_master_id;
const char *upstream_node_type = NULL; const char *upstream_node_type = NULL;
@@ -974,8 +956,6 @@ standby_monitor(void)
* Failover handling is handled differently depending on whether * Failover handling is handled differently depending on whether
* the failed node is the master or a cascading standby * the failed node is the master or a cascading standby
*/ */
t_node_info upstream_node;
upstream_node = get_node_info(my_local_conn, local_options.cluster_name, upstream_node_id); upstream_node = get_node_info(my_local_conn, local_options.cluster_name, upstream_node_id);
if (upstream_node.type == MASTER) if (upstream_node.type == MASTER)
@@ -1033,8 +1013,8 @@ standby_monitor(void)
* *
* We should log a message so the user knows of the situation at hand. * We should log a message so the user knows of the situation at hand.
* *
* XXX check if the original master is still active and display a warning * XXX check if the original master is still active and display a
* XXX add event notification * warning
*/ */
log_err(_("It seems this server was promoted manually (not by repmgr) so you might be in the presence of a split-brain.\n")); log_err(_("It seems this server was promoted manually (not by repmgr) so you might be in the presence of a split-brain.\n"));
log_err(_("Check your cluster and manually fix any anomaly.\n")); log_err(_("Check your cluster and manually fix any anomaly.\n"));
@@ -1079,6 +1059,9 @@ standby_monitor(void)
* from the upstream node to write monitoring information * from the upstream node to write monitoring information
*/ */
/* XXX not used? */
upstream_node = get_node_info(my_local_conn, local_options.cluster_name, upstream_node_id);
sprintf(sqlquery, sprintf(sqlquery,
"SELECT id " "SELECT id "
" FROM %s.repl_nodes " " FROM %s.repl_nodes "
@@ -1136,42 +1119,21 @@ standby_monitor(void)
* If receive_location is less than replay location, we were streaming WAL but are * If receive_location is less than replay location, we were streaming WAL but are
* somehow disconnected and evidently in archive recovery * somehow disconnected and evidently in archive recovery
*/ */
sqlquery_snprintf(sqlquery,
" SELECT ts, "
" CASE WHEN (receive_location IS NULL OR receive_location < replay_location) "
" THEN replay_location "
" ELSE receive_location"
" END AS receive_location,"
" replay_location, "
" replay_timestamp, "
" COALESCE(receive_location, '0/0') >= replay_location AS receiving_streamed_wal "
" FROM (SELECT CURRENT_TIMESTAMP AS ts, "
" pg_catalog.pg_last_xlog_receive_location() AS receive_location, "
" pg_catalog.pg_last_xlog_replay_location() AS replay_location, "
" pg_catalog.pg_last_xact_replay_timestamp() AS replay_timestamp "
" ) q ");
if (server_version_num >= 100000)
{
sqlquery_snprintf(sqlquery,
" SELECT ts, "
" CASE WHEN (receive_location IS NULL OR receive_location < replay_location) "
" THEN replay_location "
" ELSE receive_location"
" END AS receive_location,"
" replay_location, "
" replay_timestamp, "
" COALESCE(receive_location, '0/0') >= replay_location AS receiving_streamed_wal "
" FROM (SELECT CURRENT_TIMESTAMP AS ts, "
" pg_catalog.pg_last_wal_receive_lsn() AS receive_location, "
" pg_catalog.pg_last_wal_replay_lsn() AS replay_location, "
" pg_catalog.pg_last_xact_replay_timestamp() AS replay_timestamp "
" ) q ");
}
else
{
sqlquery_snprintf(sqlquery,
" SELECT ts, "
" CASE WHEN (receive_location IS NULL OR receive_location < replay_location) "
" THEN replay_location "
" ELSE receive_location"
" END AS receive_location,"
" replay_location, "
" replay_timestamp, "
" COALESCE(receive_location, '0/0') >= replay_location AS receiving_streamed_wal "
" FROM (SELECT CURRENT_TIMESTAMP AS ts, "
" pg_catalog.pg_last_xlog_receive_location() AS receive_location, "
" pg_catalog.pg_last_xlog_replay_location() AS replay_location, "
" pg_catalog.pg_last_xact_replay_timestamp() AS replay_timestamp "
" ) q ");
}
res = PQexec(my_local_conn, sqlquery); res = PQexec(my_local_conn, sqlquery);
@@ -1183,9 +1145,9 @@ standby_monitor(void)
return; return;
} }
strncpy(monitor_standby_timestamp, PQgetvalue(res, 0, 0), MAXLEN); strncpy(monitor_standby_timestamp, PQgetvalue(res, 0, 0), MAXLEN);
strncpy(last_xlog_receive_location, PQgetvalue(res, 0, 1), MAXLEN); strncpy(last_xlog_receive_location, PQgetvalue(res, 0, 1), MAXLEN);
strncpy(last_xlog_replay_location, PQgetvalue(res, 0, 2), MAXLEN); strncpy(last_xlog_replay_location, PQgetvalue(res, 0, 2), MAXLEN);
strncpy(last_xact_replay_timestamp, PQgetvalue(res, 0, 3), MAXLEN); strncpy(last_xact_replay_timestamp, PQgetvalue(res, 0, 3), MAXLEN);
receiving_streamed_wal = (strcmp(PQgetvalue(res, 0, 4), "t") == 0) receiving_streamed_wal = (strcmp(PQgetvalue(res, 0, 4), "t") == 0)
@@ -1205,11 +1167,7 @@ standby_monitor(void)
* TODO: investigate whether pg_current_xlog_insert_location() would be a better * TODO: investigate whether pg_current_xlog_insert_location() would be a better
* choice; see: https://github.com/2ndQuadrant/repmgr/issues/189 * choice; see: https://github.com/2ndQuadrant/repmgr/issues/189
*/ */
sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_current_xlog_location()");
if (server_version_num >= 100000)
sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_current_wal_lsn()");
else
sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_current_xlog_location()");
res = PQexec(master_conn, sqlquery); res = PQexec(master_conn, sqlquery);
if (PQresultStatus(res) != PGRES_TUPLES_OK) if (PQresultStatus(res) != PGRES_TUPLES_OK)
@@ -1223,22 +1181,10 @@ standby_monitor(void)
PQclear(res); PQclear(res);
lsn_master_current_xlog_location = lsn_to_xlogrecptr(last_wal_primary_location, NULL); lsn_master_current_xlog_location = lsn_to_xlogrecptr(last_wal_primary_location, NULL);
lsn_last_xlog_receive_location = lsn_to_xlogrecptr(last_xlog_receive_location, NULL);
lsn_last_xlog_replay_location = lsn_to_xlogrecptr(last_xlog_replay_location, NULL); lsn_last_xlog_replay_location = lsn_to_xlogrecptr(last_xlog_replay_location, NULL);
lsn_last_xlog_receive_location = lsn_to_xlogrecptr(last_xlog_receive_location, NULL);
if (lsn_last_xlog_receive_location >= lsn_last_xlog_replay_location) apply_lag = (long long unsigned int)lsn_last_xlog_receive_location - lsn_last_xlog_replay_location;
{
apply_lag = (long long unsigned int)lsn_last_xlog_receive_location - lsn_last_xlog_replay_location;
}
else
{
/* This should never happen, but in case it does set apply lag to zero */
log_warning("Standby receive (%s) location appears less than standby replay location (%s)\n",
last_xlog_receive_location,
last_xlog_replay_location);
apply_lag = 0;
}
/* Calculate replication lag */ /* Calculate replication lag */
if (lsn_master_current_xlog_location >= lsn_last_xlog_receive_location) if (lsn_master_current_xlog_location >= lsn_last_xlog_receive_location)
@@ -1247,7 +1193,7 @@ standby_monitor(void)
} }
else else
{ {
/* This should never happen, but in case it does set replication lag to zero */ /* This should never happen, but in case it does set lag to zero */
log_warning("Master xlog (%s) location appears less than standby receive location (%s)\n", log_warning("Master xlog (%s) location appears less than standby receive location (%s)\n",
last_wal_primary_location, last_wal_primary_location,
last_xlog_receive_location); last_xlog_receive_location);
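
The lag figures above are differences between unsigned 64-bit WAL positions, so subtracting without a comparison first would wrap around to a very large value if the operands were ever reversed. The guard can be sketched on its own as follows. `parse_lsn()` is a hypothetical stand-in for `lsn_to_xlogrecptr()` (whose implementation is not shown in this diff), and `lsn_lag()` is likewise an illustrative name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for lsn_to_xlogrecptr(): parse the textual
 * "hi/lo" hexadecimal form of an LSN into a 64-bit position.
 */
static bool
parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;

    assert(text != NULL && lsn != NULL);

    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return false;

    *lsn = ((uint64_t) hi << 32) | lo;
    return true;
}

/* Byte distance between two LSNs, clamped to zero instead of wrapping */
static uint64_t
lsn_lag(uint64_t ahead, uint64_t behind)
{
    return (ahead >= behind) ? ahead - behind : 0;
}
```

In the terms of the hunk, apply lag would correspond to `lsn_lag(receive, replay)` and replication lag to `lsn_lag(master_current, receive)`; the warning logged in the `else` branch is left out of this sketch.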
@@ -1292,23 +1238,8 @@ standby_monitor(void)
log_verbose(LOG_DEBUG, "standby_monitor:() %s\n", sqlquery); log_verbose(LOG_DEBUG, "standby_monitor:() %s\n", sqlquery);
if (PQsendQuery(master_conn, sqlquery) == 0) if (PQsendQuery(master_conn, sqlquery) == 0)
{ log_warning(_("query could not be sent to master. %s\n"),
log_warning(_("query could not be sent to master: %s\n"),
PQerrorMessage(master_conn)); PQerrorMessage(master_conn));
}
else
{
sqlquery_snprintf(sqlquery,
"SELECT %s.repmgr_update_last_updated();",
get_repmgr_schema_quoted(my_local_conn));
res = PQexec(my_local_conn, sqlquery);
/* not critical if the above query fails*/
if (PQresultStatus(res) != PGRES_TUPLES_OK)
log_warning(_("unable to set last_updated: %s\n"), PQerrorMessage(my_local_conn));
PQclear(res);
}
} }
@@ -1500,11 +1431,7 @@ do_master_failover(void)
terminate(ERR_FAILOVER_FAIL); terminate(ERR_FAILOVER_FAIL);
} }
if (server_version_num >= 100000) sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_last_xlog_receive_location()");
sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_last_wal_receive_lsn()");
else
sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_last_xlog_receive_location()");
res = PQexec(node_conn, sqlquery); res = PQexec(node_conn, sqlquery);
if (PQresultStatus(res) != PGRES_TUPLES_OK) if (PQresultStatus(res) != PGRES_TUPLES_OK)
{ {
@@ -1536,12 +1463,7 @@ do_master_failover(void)
} }
/* last we get info about this node, and update shared memory */ /* last we get info about this node, and update shared memory */
sprintf(sqlquery, "SELECT pg_catalog.pg_last_xlog_receive_location()");
if (server_version_num >= 100000)
sprintf(sqlquery, "SELECT pg_catalog.pg_last_wal_receive_lsn()");
else
sprintf(sqlquery, "SELECT pg_catalog.pg_last_xlog_receive_location()");
res = PQexec(my_local_conn, sqlquery); res = PQexec(my_local_conn, sqlquery);
if (PQresultStatus(res) != PGRES_TUPLES_OK) if (PQresultStatus(res) != PGRES_TUPLES_OK)
{ {
@@ -1591,6 +1513,7 @@ do_master_failover(void)
*/ */
if (PQstatus(node_conn) != CONNECTION_OK) if (PQstatus(node_conn) != CONNECTION_OK)
{ {
/* XXX */
log_info(_("At this point, it could be some race conditions " log_info(_("At this point, it could be some race conditions "
"that are acceptable, assume the node is restarting " "that are acceptable, assume the node is restarting "
"and starting failover procedure\n")); "and starting failover procedure\n"));
@@ -2163,21 +2086,18 @@ check_connection(PGconn **conn, const char *type, const char *conninfo)
/* /*
* set_local_node_status() * set_local_node_status()
* *
* Attempt to connect to the current master server (as stored in the global * If failure of the local node is detected, attempt to connect
* variable `master_conn`) and set the local node's status to the result * to the current master server (as stored in the global variable
* of `is_standby(my_local_conn)`. Normally this will be used to mark * `master_conn`) and update its record to failed.
* a node as failed, but in some circumstances we may be marking it
* as recovered.
*/ */
static bool static bool
set_local_node_status(void) set_local_node_status(void)
{ {
PGresult *res; PGresult *res;
char sqlquery[QUERY_STR_LEN]; char sqlquery[QUERY_STR_LEN];
int active_master_node_id = NODE_NOT_FOUND; int active_master_node_id = NODE_NOT_FOUND;
char master_conninfo[MAXLEN]; char master_conninfo[MAXLEN];
bool local_node_status;
if (!check_connection(&master_conn, "master", NULL)) if (!check_connection(&master_conn, "master", NULL))
{ {
@@ -2236,29 +2156,24 @@ set_local_node_status(void)
/* /*
* Attempt to set the active record to the correct value. * Attempt to set the active record to the correct value.
* First
*/ */
local_node_status = (is_standby(my_local_conn) == 1);
if (!update_node_record_status(master_conn, if (!update_node_record_status(master_conn,
local_options.cluster_name, local_options.cluster_name,
node_info.node_id, node_info.node_id,
"standby", "standby",
node_info.upstream_node_id, node_info.upstream_node_id,
local_node_status)) is_standby(my_local_conn)==1))
{ {
log_err(_("unable to set local node %i as %s on master: %s\n"), log_err(_("unable to set local node %i as inactive on master: %s\n"),
node_info.node_id, node_info.node_id,
local_node_status == false ? "inactive" : "active",
PQerrorMessage(master_conn)); PQerrorMessage(master_conn));
return false; return false;
} }
log_notice(_("marking this node (%i) as %s on master\n"), log_notice(_("marking this node (%i) as inactive on master\n"), node_info.node_id);
node_info.node_id,
local_node_status == false ? "inactive" : "active");
return true; return true;
} }
@@ -2399,13 +2314,13 @@ lsn_to_xlogrecptr(char *lsn, bool *format_ok)
if (format_ok != NULL) if (format_ok != NULL)
*format_ok = true; *format_ok = true;
return (XLogRecPtr) ((uint64) xlogid) << 32 | (uint64) xrecoff; return (((XLogRecPtr) xlogid * 16 * 1024 * 1024 * 255) + xrecoff);
} }
void void
usage(void) usage(void)
{ {
log_err(_("%s: replication management daemon for PostgreSQL\n"), progname()); log_err(_("%s: Replicator manager daemon \n"), progname());
log_err(_("Try \"%s --help\" for more information.\n"), progname()); log_err(_("Try \"%s --help\" for more information.\n"), progname());
} }
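
The `lsn_to_xlogrecptr()` hunk above shows two ways of combining the two hexadecimal halves of an LSN: a plain 64-bit shift-and-OR, and a multiplication by 255 segments of 16 MB, which appears to correspond to the older split WAL pointer layout. The following is a minimal sketch of the shift-and-OR form only. It assumes the usual `XXXXXXXX/XXXXXXXX` text format; `parse_lsn` is a hypothetical name used for illustration and is not a repmgr function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of repmgr): parse a textual LSN such as
 * "16/B374D848" into a single 64-bit position. On a malformed string,
 * *format_ok is set to false (if provided) and 0 is returned.
 */
uint64_t
parse_lsn(const char *lsn, bool *format_ok)
{
	unsigned int hi = 0;
	unsigned int lo = 0;

	assert(lsn != NULL);

	/* both halves are hexadecimal, separated by a slash */
	if (sscanf(lsn, "%X/%X", &hi, &lo) != 2)
	{
		if (format_ok != NULL)
			*format_ok = false;
		return 0;
	}

	if (format_ok != NULL)
		*format_ok = true;

	/* widen to 64 bits before shifting so the high half is not lost */
	return (((uint64_t) hi) << 32) | (uint64_t) lo;
}
```

With this form, `"1/0"` maps to 4294967296 (2^32), so two positions can be compared directly as integers.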

@@ -1,7 +1,7 @@
# #
# Makefile # Makefile
# #
# Copyright (c) 2ndQuadrant, 2010-2017 # Copyright (c) 2ndQuadrant, 2010-2016
# #
MODULE_big = repmgr_funcs MODULE_big = repmgr_funcs

@@ -1,6 +1,6 @@
/* /*
* repmgr_function.sql * repmgr_function.sql
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2016
* *
*/ */

@@ -1,4 +0,0 @@
select * from repmgr_update_standby_location('');
select * from repmgr_get_last_standby_location();
select * from repmgr_update_last_updated();
select * from repmgr_get_last_updated();

@@ -1,6 +1,6 @@
/* /*
* uninstall_repmgr_funcs.sql * uninstall_repmgr_funcs.sql
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2016
* *
*/ */

@@ -1,7 +1,7 @@
/* /*
* strutil.c * strutil.c
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -90,18 +90,31 @@ maxlen_snprintf(char *str, const char *format,...)
/* /*
* Escape a string for use as a parameter in recovery.conf * Adapted from: src/fe_utils/string_utils.c
* Caller must free returned value *
* Function not publicly available before PostgreSQL 9.6.
*/ */
char * void
escape_recovery_conf_value(const char *src) appendShellString(PQExpBuffer buf, const char *str)
{ {
char *result = escape_single_quotes_ascii(src); const char *p;
if (!result) appendPQExpBufferChar(buf, '\'');
for (p = str; *p; p++)
{ {
fprintf(stderr, _("%s: out of memory\n"), progname()); if (*p == '\n' || *p == '\r')
exit(ERR_INTERNAL); {
fprintf(stderr,
_("shell command argument contains a newline or carriage return: \"%s\"\n"),
str);
exit(ERR_BAD_CONFIG);
}
if (*p == '\'')
appendPQExpBufferStr(buf, "'\"'\"'");
else
appendPQExpBufferChar(buf, *p);
} }
return result;
appendPQExpBufferChar(buf, '\'');
} }
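
The quoting rule used by `appendShellString()` above (wrap the argument in single quotes, rewrite each embedded single quote as `'"'"'`, and reject newlines and carriage returns) can be sketched without libpq's `PQExpBuffer`. This is an illustrative stand-alone version under those assumptions: `shell_quote` is a hypothetical name, and it returns `NULL` on a rejected input where the function in the diff calls `exit()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of repmgr or libpq): return a newly
 * allocated, single-quoted copy of `str` that is safe to pass as one
 * shell argument. Returns NULL if `str` contains a newline or carriage
 * return, or if memory cannot be allocated. The caller frees the result.
 */
char *
shell_quote(const char *str)
{
	size_t		quotes = 0;
	const char *p;
	char	   *out;
	char	   *q;

	assert(str != NULL);

	for (p = str; *p; p++)
	{
		if (*p == '\n' || *p == '\r')
			return NULL;
		if (*p == '\'')
			quotes++;
	}

	/* each ' grows from 1 to 5 bytes; add 2 outer quotes and the NUL */
	out = malloc(strlen(str) + quotes * 4 + 3);
	if (out == NULL)
		return NULL;

	q = out;
	*q++ = '\'';
	for (p = str; *p; p++)
	{
		if (*p == '\'')
		{
			/* close the quote, emit a double-quoted ', reopen the quote */
			memcpy(q, "'\"'\"'", 5);
			q += 5;
		}
		else
			*q++ = *p;
	}
	*q++ = '\'';
	*q = '\0';

	return out;
}
```

For example, the input `it's` becomes `'it'"'"'s'`, which a POSIX shell reads back as the original four characters.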

@@ -1,6 +1,6 @@
/* /*
* strutil.h * strutil.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
@@ -49,6 +49,6 @@ extern int
maxlen_snprintf(char *str, const char *format,...) maxlen_snprintf(char *str, const char *format,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3))); __attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3)));
extern char * extern void
escape_recovery_conf_value(const char *src); appendShellString(PQExpBuffer buf, const char *str);
#endif /* _STRUTIL_H_ */ #endif /* _STRUTIL_H_ */

@@ -1,7 +1,7 @@
/* /*
* uninstall_repmgr.sql * uninstall_repmgr.sql
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (C) 2ndQuadrant, 2010-2016
* *
*/ */

@@ -1,6 +1,6 @@
#ifndef _VERSION_H_ #ifndef _VERSION_H_
#define _VERSION_H_ #define _VERSION_H_
#define REPMGR_VERSION "3.3.2" #define REPMGR_VERSION "3.2"
#endif #endif