Compare commits

263 Commits

Author SHA1 Message Date
Ian Barwick
78be15b06c docs: update one more instance of the company name 2022-02-15 16:21:43 +09:00
Ian Barwick
f02be50118 docs: update other instances of the company name, where appropriate 2022-02-15 16:14:31 +09:00
Ian Barwick
ac62101dc0 docs: update company name on cover pages 2022-02-15 16:01:51 +09:00
Ian Barwick
78cc278639 docs: finalize release notes 2022-02-15 13:38:24 +09:00
Ian Barwick
ceb65027c6 Bump version number 2022-02-04 09:08:04 +09:00
Ian Barwick
e6caa14ea2 Bump version number
5.3.1
2022-02-03 14:48:47 +09:00
Ian Barwick
88a11f36ca Add include for pwd.h
This was previously included via the PostgreSQL source, but that
seems to have gone away in recent HEAD builds.
2022-02-03 14:32:18 +09:00
Ian Barwick
7f371b11a5 doc: update version matrix
5.2.1 is the latest release in the 5.2.x series.
2022-02-03 13:30:29 +09:00
Ian Barwick
349eacd4b7 doc: update release notes 2022-02-03 13:29:32 +09:00
Ian Barwick
9f2afe9643 Fix upgrade paths from 4.1 ~ 4.3 to 5.2 and later
A number of C functions were added in releases 4.2 to 4.4; however
these were renamed in 5.3 to prevent naming clashes with other
extensions.

This does however mean that when upgrading from one of the above
versions, the intermediate upgrade steps will attempt to create
SQL functions referencing C functions which no longer exist in
repmgr.so, and hence cause the upgrade to fail.

We can work around this by providing empty upgrade scripts
from these versions to 4.4, which skip the problematic CREATE
FUNCTION commands. The functions will be correctly created in
the 5.2--5.3 upgrade script.
2022-02-03 13:29:25 +09:00
Ian Barwick
356f65531f repmgrd: move connection pointer declaration inside relevant block
As it's used only there and nowhere else.
2022-01-04 12:46:50 +09:00
Ian Barwick
2a7579c770 doc: update release notes 2022-01-04 12:33:30 +09:00
zhouhj43183
820d972d41 repmgrd: ensure potentially open connections are closed
When recovering from degraded state in local node monitoring, in some
cases a new connection was opened to the local node without closing
the old one, which will result in memory leakage.
2022-01-04 12:24:42 +09:00
Ian Barwick
d0add49c84 doc: update repmgr.conf.sample
Minor formatting fix.
2021-12-08 09:52:27 +09:00
Ian Barwick
9a84fa84f9 doc: update repmgr.conf.sample
Remove bogus -W option in "repmgr standby follow" example invocation
for the "follow_command" parameter.

The option (which corresponds to "--no-wait") is not used by
"repmgr standby follow".

Per report from Jimmy Angelakos.
2021-12-08 09:52:22 +09:00
Ian Barwick
ff2c56f5cb doc: fix typo 2021-11-05 09:14:19 +09:00
Ian Barwick
3b860bad80 Removed temporary include file workaround 2021-10-29 15:44:16 +09:00
Ian Barwick
70c79aeaec Add dummy include file
This is a workaround required to facilitate Debian package builds
against PostgreSQL Extended.
2021-10-12 16:50:33 +09:00
Ian Barwick
afd876377c doc: repmgr 5.2 is no longer supported. 2021-10-12 14:05:41 +09:00
Ian Barwick
277910cb31 Fix version number 2021-10-12 13:50:22 +09:00
Ian Barwick
9554154677 doc: update version matrix 2021-10-12 13:49:28 +09:00
Ian Barwick
2bbfb5daa0 doc: update release notes 2021-10-12 10:13:26 +09:00
Ian Barwick
82b0e85a66 Bump version number 2021-10-12 10:11:27 +09:00
Ian Barwick
3320eb0983 repmgrd: improve node activation at startup
Commit 79d1f00 modified repmgrd to automatically set an inactive node
to "active" at startup.

However we need to avoid doing that for cases where the node role has
changed (e.g. a former primary was recloned as a standby) but the node
record was not updated.
2021-10-11 14:39:14 +09:00
Ian Barwick
7862941300 repmgrd: add %p event notification parameter for "repmgrd_failover_promote"
This enables an event notification script to identify the former primary
node.
2021-09-28 10:25:45 +09:00
Ian Barwick
f152fc3016 Add --repmgrd option to "repmgr node check"
This provides a simple way for checking whether the node's repmgrd is
running.

GitHub #719.
2021-09-28 09:46:54 +09:00
Ian Barwick
9c260e605d Update Makefile 2021-09-16 17:17:20 +09:00
Ian Barwick
7ad06530e4 doc: update 5.3.0 release notes 2021-09-16 17:15:58 +09:00
Ian Barwick
c787273f91 Add extension script for unpackaged upgrades to 5.3 2021-09-16 13:55:56 +09:00
Ian Barwick
d40055c8dd doc: update 5.3.0 release notes 2021-09-16 13:42:34 +09:00
Ian Barwick
bf0478088c standby: add missing include 2021-08-18 10:26:56 +09:00
Ian Barwick
efd5792de4 Remove redundant shared library function prototypes
From PostgreSQL 9.4 (commit e7128e8d), explicit function prototypes
are not required as they will be generated by the PG_FUNCTION_INFO_V1
macro.

(We were supporting PostgreSQL 9.3 until relatively recently, so there
was nothing particular to gain by removing these earlier).
2021-08-18 10:19:13 +09:00
Ian Barwick
17987a2690 standby switchover: detect if demotion candidate is running as a primary
This shouldn't happen, but if it does, log the fact for easier analysis.
2021-07-28 11:50:57 +09:00
Ian Barwick
5f1ba6db3d standby switchover: improve handling of node rejoin failure
Explicitly check whether the "repmgr node rejoin" command on the
demotion candidate succeeded. Due to the way SSH execution is
currently implemented, we can return either the command execution
status or the command output; to ensure any errors are available,
log them to a temporary file on the demotion candidate and note
its location in case of an error.

While we're at it, improve error message handling when the demotion
candidate fails to rejoin.
2021-07-28 11:42:40 +09:00
Ian Barwick
55efbe60ea standby switchover: optionally delay promotion
This is for testing purposes only and should not be used in production.
2021-07-27 17:01:27 +09:00
Ian Barwick
132f5ebc08 standby switchover: improve logging of repmgrd pause actions
- state how many nodes are to be operated on
- if errors were encountered with any nodes, emit the total number
  of nodes as well as the number of affected nodes
- log nodes where repmgrd was not running anyway as NOTICE, not
  WARNING
2021-07-26 17:46:58 +09:00
Ian Barwick
c901f36f81 Standardize formatting of node ID in log messages
Mostly we have "(ID: %i)", so use that rather than "(ID %i)" which has
crept into a few places.
2021-07-26 17:27:36 +09:00
Ian Barwick
edb49b2747 doc: link to main documentation section about RemoveIPC 2021-07-26 11:01:03 +09:00
Ian Barwick
a35d85ed70 doc: link to service commands section from switchover docs 2021-07-26 09:47:14 +09:00
Ian Barwick
b6b91425d9 doc: document pg_bindir setting
Per suggestion in GitHub #705.
2021-07-20 13:36:08 +09:00
Ian Barwick
32329ca55a doc: link to sample configuration file
Unfortunately it hasn't been possible yet to include all available
configuration items in the main documentation, but we should at least
make it easier to find the full list.
2021-07-20 13:18:57 +09:00
Ian Barwick
79d1f005db repmgrd: activate inactive node record at startup
If a PostgreSQL instance was shut down while repmgrd was running, and
repmgrd was subsequently restarted (this chain of events could occur
during e.g. a server reboot), the node record will have been set to
"inactive". Previously, in this case repmgrd would refuse to start up.
However, as we can determine the node is running, it should normally
be no problem to automatically set the node record to "active".

The old behaviour can be restored by setting the new parameter
"repmgrd_exit_on_inactive_node" to "true".

RM19604.
2021-07-12 17:46:09 +09:00
Ian Barwick
f64f498afb Be more flexible when parsing the output from pg_config --version
The string may not always start with "PostgreSQL" when building
against non-community versions.

Life would be much easier here if there was an option like
"pg_config --version-number" or similar.
2021-07-05 15:38:34 +09:00
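
The flexible parsing described in f64f498afb might look like the sketch below. It is not repmgr's C implementation: `parse_pg_version` is a hypothetical helper, and it assumes the output has the form "<product name> <version> ...", so the first whitespace-preceded number is taken as the version.

```python
import re


def parse_pg_version(version_string):
    """Return the version components as a tuple of ints, or None if absent.

    Assumption: the product name comes first and is followed by whitespace,
    so the name itself may contain digits without confusing the parser.
    """
    match = re.search(r"\s(\d+(?:\.\d+)*)", version_string)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))
```

Because the product name is skipped rather than matched, a string from a non-community build is handled the same way as one starting with "PostgreSQL".
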
Ian Barwick
f10c013e89 doc: note PostgreSQL 14 support
repmgr 5.3 will provide support for PostgreSQL 14.
2021-07-01 16:00:21 +09:00
Ian Barwick
2059a55a99 doc: update release notes 2021-07-01 13:44:33 +09:00
Ian Barwick
99ed17b838 doc: update release notes 2021-07-01 13:28:15 +09:00
Ian Barwick
078b4ad863 standby clone: set "slot_name" in node record if required
If executing "repmgr standby clone --replication-conf-only" on a node
which was set up without replication slots, but the repmgr configuration
was since changed to "use_replication_slots=1", repmgr will attempt to
create the replication slot. This will however fail if "slot_name"
is not set in the node's record, so have repmgr set the slot_name in
this case.

It might be preferable to preemptively create the slot name for each
node when configuring the cluster, however this would be a behavioural
change which would be better off in a major release (for example, it's
conceivable a user runs sanity checks on the node records and expects
to find the slot names empty if replication slots are not in use).
2021-07-01 13:04:23 +09:00
Ian Barwick
2af71c6426 repmgrd: ensure short option "-s" is accepted
The long option --show-pid-file was fine.
2021-06-03 18:41:11 +09:00
Ian Barwick
9349520530 Remove reference to PostgreSQL 9.3 in --help output
It is not supported by repmgr 5.2 and later.
2021-04-15 15:31:59 +09:00
Ian Barwick
14851e61de doc: clarify "connection_check_type='query'" 2021-03-03 13:12:58 +09:00
Ian Barwick
888e1d7a3b docs: update repmgr.conf.sample
Fix description for connection_check_type='connection'.
2021-03-02 11:44:14 +09:00
Ian Barwick
1b4c2a60bb docs: update README
Note latest version number.
2021-03-02 11:26:15 +09:00
Ian Barwick
da163e811c doc: update README
Fix and update broken link.
2021-03-02 11:14:16 +09:00
Ian Barwick
80d1beef7e doc: update GitHub links to new location 2021-03-02 08:58:49 +09:00
Ian Barwick
4f009548f6 doc: remove generated .fo files 2021-03-01 11:06:41 +09:00
Ian Barwick
d266df3143 Change copyright information to "EnterpriseDB Corporation"
RM20485.
2021-03-01 11:03:52 +09:00
Ian Barwick
dd8204e013 Rename various shared library functions
Some of the more generically named functions are at risk of colliding
with functions defined in other libraries. To mitigate that risk,
prefix with "repmgr_", unless the name already has some reference
to repmgr.

This requires an extension version bump.

RM20471.
2021-02-23 10:14:28 +09:00
Ian Barwick
d34b4e71a6 Fix incorrect comment 2021-02-22 13:28:27 +09:00
Ian Barwick
12749c3f63 doc: fix XML markup
An incorrect column count was causing PDF builds to fail.
2021-02-19 20:39:41 +09:00
Ian Barwick
ce59d92731 doc: update repmgr.conf.sample 2021-01-14 15:27:24 +09:00
Ian Barwick
cfbeed50d6 node rejoin: emit rejoin target node information as NOTICE
As it's possible to specify the connection information for any available
node, but currently not possible to rejoin to a node other than the
primary, explicitly mention what the rejoin target will be.
2021-01-06 14:11:37 +09:00
Ian Barwick
da3eaee127 doc: "repmgr node rejoin" clarifications
- make it clearer a node can only be joined to the primary
- update patch status
2021-01-06 12:36:11 +09:00
Ian Barwick
b37a599fc6 Update copyright notices to 2021 2021-01-04 12:54:54 +09:00
Ian Barwick
f011e552d0 Add missing PQconninfoFree() call 2020-12-24 18:07:18 +09:00
Ian Barwick
d1cc05faf9 repmgrd: edit code comment for clarity 2020-12-22 13:58:34 +09:00
Ian Barwick
7ceba84e32 doc: minor grammar tweak 2020-12-22 13:57:31 +09:00
Josh Soref
842c67ca18 doc: various spelling fixes
Via GitHub #687.
2020-12-22 13:47:56 +09:00
Josh Soref
f619c3a8ff Fix various typos in code comments.
Via GitHub #687.
2020-12-22 13:43:06 +09:00
Josh Soref
5a88858596 repmgr: various log output typo fixes
Via GitHub #687.
2020-12-22 13:18:11 +09:00
Josh Soref
02bc143c75 repmgr: fix typo in "repmgr node --help" output
Via GitHub #687.
2020-12-22 13:07:16 +09:00
Ian Barwick
c480d01f9c Improve HINT about upgrading the repmgr extension
Per feedback in GitHub #685.
2020-12-15 08:41:46 +09:00
Ian Barwick
e762200a12 doc: update README
Link to recent-ish EDB blog article.
2020-12-08 13:31:21 +09:00
Ian Barwick
2133e1097e standby switchover: remove extraneous space in log message 2020-12-08 13:14:45 +09:00
Ian Barwick
77d7a098a1 doc: add 5.2.1 release date 2020-12-08 12:41:46 +09:00
Ian Barwick
4e9cdf0267 doc: update 5.2.1 release notes 2020-12-04 14:49:20 +09:00
Ian Barwick
0d8bf2a935 Minor string formatting optimization 2020-12-04 10:16:21 +09:00
Ian Barwick
debbda6074 standby clone: tweak error message
Probably a remnant from the 9.1 era, where it was not possible to
take a base backup from a standby.
2020-12-04 10:13:45 +09:00
Ian Barwick
d5b94431f2 standby follow: fix standby.signal generation
Oversight from previous a93c6dfc.
2020-12-02 09:11:20 +09:00
Ian Barwick
93187e9743 Add missing connection close
In a corner-case situation where a standby is unable to attach to
the new primary due to a mismatch in the WAL stream, the connection
used to verify the recovery status of the new primary was not being
closed, leading to a risk of connection exhaustion on the new primary.

Addresses GitHub #682.
2020-12-01 21:33:07 +09:00
Ian Barwick
f7e45863ad standby clone: fix data directory permissions handling for Pg11 and later
Previously, repmgr would forcibly change the permissions on a data
directory to 0700. However from PostgreSQL 11, 0750 is also valid,
so that value should not be changed.
2020-12-01 11:48:22 +09:00
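
The permission rule in f7e45863ad can be sketched as a small check. This is an illustration only: `data_directory_mode_ok` is a hypothetical name, and the function reports whether a mode is acceptable rather than changing anything on disk.

```python
import stat


def data_directory_mode_ok(mode, pg_major_version):
    """Return True if the data directory permissions may be left untouched.

    `mode` is a value such as os.stat(path).st_mode; only the permission
    bits are considered.
    """
    permissions = stat.S_IMODE(mode)
    if permissions == 0o700:
        return True
    # From PostgreSQL 11, group read/execute access (0750) is also valid.
    return pg_major_version >= 11 and permissions == 0o750
```

A caller would reset the directory to 0700 only when this check fails, which leaves a valid 0750 directory alone on PostgreSQL 11 and later.
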
Ian Barwick
89556d6488 standby clone: add --recovery-min-apply-delay to help output 2020-11-30 16:44:48 +09:00
Ian Barwick
4ad868d119 doc: update 5.2.1 release notes 2020-11-30 16:44:48 +09:00
Ian Barwick
a93c6dfca7 Ensure standby.signal is set correctly if -D/--data-directory supplied
When cloning a standby, it's possible to do a "raw" clone by providing
-D/--data-directory but no repmgr.conf file. However the code which
creates "standby.signal" was assuming the presence of a valid
repmgr.conf complete with "data_directory" configuration.

This is very much a niche use case.
2020-11-27 11:24:15 +09:00
Ian Barwick
4d8bc63834 repmgrd: fix issue with incorrect reconnect_interval
Addresses GitHub #673.
2020-11-25 20:40:28 +09:00
Ian Barwick
7bca9df223 Update Makefile
We don't actually need $(LIBS) in there; this was cargo-culted in
from somewhere.
2020-11-24 17:37:48 +09:00
Ian Barwick
1ac62a4352 Avoid compiler warnings for various strncpy() operations
Here the compiler may complain that the source length is being used,
though in all cases the source length was previously used to
define the length of the destination buffer, so it's not actually
a problem.
2020-11-24 15:42:49 +09:00
Ian Barwick
8f7a32a9a2 repmgr: prevent termination in corner-case situation
If neither the local node nor the upstream are available, and
"standby_disconnect_on_failover" is set, attempting to fetch
the walreceiver PID will result in repmgrd terminating.

Add a check that the connection is valid before attempting to
fetch the walreceiver PID.

Addresses GitHub #675.
2020-11-17 16:34:55 +09:00
Ian Barwick
9c04de11fc standby clone: various clarifications for --replication-conf-only option
In particular, the emitted HINT was not really appropriate for Pg13 and
later.
2020-11-17 09:58:51 +09:00
Ian Barwick
040b1ae4e3 Update corner-case error message
Not possible to build repmgr compatible with Pg12+ against Pg11
and earlier due to the addition of FullTransactionId.
2020-11-17 09:39:41 +09:00
Ian Barwick
703aed3fa3 doc: tweak "repmgr standby clone" reference
As recovery.conf starts to fade away, mention that last.
2020-11-10 16:07:22 +09:00
Ian Barwick
7ee0098771 standby clone: add option --recovery-min-apply-delay
This overrides the equivalent setting in repmgr.conf, if present.

Note this option was available in repmgr versions prior to 4.0, but
was assumed to be redundant. However recently a use-case was made
for its reintroduction.
2020-11-10 15:55:04 +09:00
Ian Barwick
430d12b870 Fix typo 2020-11-10 13:40:38 +09:00
Romain Jacquier
c8b2d23361 Fix help witness
Fix the `repmgr witness --help` command where at the "Unregister" section the message shown was
```
"witness register" unregisters a witness node.
```
instead of
```
"witness unregister" unregisters a witness node.
```

GitHub #676.
2020-11-09 13:34:08 +09:00
Ian Barwick
8543c0bcf6 standby clone: emit pg_basebackup command in --dry-run mode 2020-11-04 12:00:42 +09:00
Ian Barwick
674c06d01c Decouple extension version check from binary version
Until now the extension version has always moved in lock-step
with the binary version, but that doesn't always need to be
the case, so make it possible to have an extension version
which does not match the binary version.
2020-10-30 14:42:58 +09:00
Ian Barwick
970d7a136f Fix return value of pg_reload_conf() database utility function
Would always return "false", but as the value wasn't used anywhere,
the issue was inconsequential.

However while we're at it, actually check the return value in the
two places it's called, to help diagnose any issues in the unlikely
event they occur.

Per issue reported via GitHub PR #671 from user duzhgg.
2020-10-30 14:25:11 +09:00
Ian Barwick
7bde686796 standby clone: handle missing "postgresql.auto.conf"
In PostgreSQL 12 and later we need to append replication configuration
to "postgresql.auto.conf" to guarantee it will be read last, and hence
override any preceding replication configuration which may be haunting
the configuration files.

We've been assuming that "postgresql.auto.conf" will always be present,
but at least one corner case has been observed where that was not the
case on the node being cloned from. Moreover it's perfectly acceptable
that this file does not exist (it will be recreated the next time
ALTER SYSTEM is executed), so we should be prepared to handle that case.

In passing, improve handling of more unlikely errors which might be
encountered when processing "postgresql.auto.conf".
2020-10-30 12:25:03 +09:00
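
As a rough sketch of the behaviour 7bde686796 describes, the helper below appends settings to "postgresql.auto.conf" and simply creates the file when it is missing. `append_auto_conf` and its arguments are hypothetical, and the real code may do more (for example, cleaning up earlier entries), which is not shown here.

```python
import os


def append_auto_conf(data_directory, settings):
    """Append name/value pairs to postgresql.auto.conf and return its path.

    Opening in append mode creates the file if it does not exist, so a
    missing file is not treated as an error.
    """
    path = os.path.join(data_directory, "postgresql.auto.conf")
    with open(path, "a", encoding="utf-8") as conf:
        for name, value in settings.items():
            # Single quotes inside a value are escaped by doubling them.
            escaped = value.replace("'", "''")
            conf.write(f"{name} = '{escaped}'\n")
    return path
```

Appending, rather than rewriting, keeps the new settings last in the file, which is what makes them override any earlier replication configuration.
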
Ian Barwick
ab1447aeca Standardize code style 2020-10-30 11:06:15 +09:00
Ian Barwick
293e37688f config: fix parsing of "replication_type"
This is a legacy parameter which can currently only contain one value,
"physical" (the default).

It can be safely omitted.

Addresses GitHub #672.
2020-10-30 10:14:04 +09:00
Ian Barwick
96718151a6 doc: update README
- remove partial sentence
- remove links to very dated blog entries
2020-10-27 13:58:07 +09:00
Ian Barwick
65ffe51bb4 doc: update README
Link to release notes as a simple way of providing the latest release
information.
2020-10-27 13:52:19 +09:00
Ian Barwick
b6d0288a82 Finalize release date 2020-10-22 21:23:50 +09:00
Ian Barwick
f888407ad8 Additional fix to upgrade script
Drop old "repl_events" table.
2020-10-22 21:22:52 +09:00
Ian Barwick
1512c7b761 Fix extension script for unpackaged upgrades to 5.2
Apparently "ALTER TABLE" (which we were using to convert the
"repl_events" table) does not mark the table as being part of the
extension. Instead, we need to create the new table and copy the
data, as is done with the other tables.
2020-10-22 21:22:47 +09:00
Ian Barwick
8877d4d508 doc: add missing "unpackaged" reference 2020-10-22 21:22:43 +09:00
Ian Barwick
f18b2e900d standby clone: improve Barman source server check
Use "remote_command()" to execute the remote psql command, and
provide the -X option to psql to ensure it doesn't read ~/.psqlrc.
2020-10-22 21:22:04 +09:00
Ian Barwick
5c4aa1856c doc: update README 2020-10-22 21:20:17 +09:00
Ian Barwick
091a2df167 doc: update release notes 2020-10-20 14:09:44 +09:00
Ian Barwick
17a1732eb0 Bump master branch to 5.3dev
Also update the minimum version check to PostgreSQL 9.4.
2020-10-20 13:41:49 +09:00
Ian Barwick
397e0ed5be Silence potential compiler complaint
*We* know the target buffer is sufficiently sized to accept the
source string, but the compiler doesn't.
2020-10-20 09:51:49 +09:00
Ian Barwick
8f3994b071 doc: update "repmgr standby clone" reference
Clarify which replication configuration parameters will be written
for which PostgreSQL version.
2020-10-20 09:35:22 +09:00
Ian Barwick
f7c232b393 Only write "recovery_target_timeline" for PostgreSQL 11 and earlier
"recovery_target_timeline" defaults to "latest" from PostgreSQL 12
(see core commit 2dedf4d9) so no need to write it explicitly.
2020-10-20 09:18:29 +09:00
Ian Barwick
ac2feba380 Improve capture of pg_rewind stderr output
As it seems redirecting stderr to stdout (2>&1) when executing
system commands results in a SIGPIPE (141) return code, making
it impossible to determine the actual return code, redirect
stderr to a temporary file and collate the output from that.

There are possibly better ways of doing this which could
be revisited at a future date.
2020-10-15 14:01:18 +09:00
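
The approach in ac2feba380 (send stderr to a temporary file, then read it back, so the command's own return code is preserved) can be illustrated as follows. This is a sketch in Python rather than repmgr's C code, and `run_capture_stderr` is a hypothetical helper.

```python
import subprocess
import tempfile


def run_capture_stderr(argv):
    """Run a command; return (return_code, stderr_text).

    stderr is written to a temporary file instead of being merged into
    stdout, so the exit status reported is the command's own.
    """
    with tempfile.TemporaryFile(mode="w+") as err_file:
        completed = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=err_file
        )
        err_file.seek(0)  # rewind before collecting what the command wrote
        return completed.returncode, err_file.read()
```

The temporary file is removed automatically when the `with` block exits, so nothing is left behind on success or failure.
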
Ian Barwick
725e9f9851 Remove more unused code 2020-10-15 11:03:18 +09:00
Ian Barwick
00bf4d61fa Remove unused code 2020-10-15 10:54:18 +09:00
Ian Barwick
338c4d3f8a node rejoin: rename variable for clarity 2020-10-14 17:57:20 +09:00
Ian Barwick
250c6291df node rejoin: fix Pg13+ "standby.signal" handling with pg_rewind 2020-10-14 17:51:57 +09:00
Ian Barwick
773159a9e8 standby clone: move check for --waldir pg_basebackup option
When cloning from Barman, and --no-upstream-connection was supplied,
the server version number will not be available at this point in the
code. It will however later be extracted from the Barman metadata,
so move the check for the --waldir pg_basebackup option to after
this point.

Also add an explicit check that a server version number has been
obtained (and fall back to extracting it from the cloned data
directory), as subsequent operations depend on knowing this to
be performed correctly.
2020-10-13 14:27:46 +09:00
Ian Barwick
5f986bc981 node rejoin: handle unclean shutdown in Pg13
From PostgreSQL 13, pg_rewind will automatically handle an unclean
shutdown itself, so as long as --force-rewind was provided, there
is no need to fail with an error.

Note that pg_rewind handles the unclean shutdown by starting PostgreSQL
in single user mode, which it does before performing any checks as
to whether a rewind is actually necessary.

However pg_rewind doesn't take into account the possible presence
of a standby.signal file, so we remove that and recreate it after
pg_rewind was executed.
2020-10-13 10:18:55 +09:00
Ian Barwick
d62743ddf4 Make repmgr metadata tables dumpable
This makes it easier to extract data for troubleshooting.
2020-10-12 10:02:52 +09:00
Ian Barwick
b195547525 Remove hack to create pre-9.4 monitoring history table
The pg_lsn datatype was introduced in PostgreSQL 9.4, and we no
longer need to support earlier versions.
2020-10-12 09:32:37 +09:00
Ian Barwick
758c18985c node rejoin: better document unrecoverable situation
If two diverged nodes are on the same timeline, currently there's
no way of establishing the divergence point and pg_rewind
is ineffective.

Clarify the log messages to make this clearer.
2020-10-08 21:09:51 +09:00
Ian Barwick
7969dc4800 Enable "node rejoin" to join a target with a lower timeline
This has been possible since PostgreSQL 9.6, but the node rejoin/follow
check did not consider this possibility.
2020-10-08 16:51:16 +09:00
Ian Barwick
0fc8c6c79c Add some break statements to silence compiler warnings 2020-10-08 13:10:00 +09:00
Ian Barwick
e7acb6809b standby clone: improve error logging
When executing "repmgr standby clone" in Barman mode, and --waldir
is set in pg_basebackup options, properly report an error if the
target WAL directory could not be created or is not empty.
2020-10-08 10:46:20 +09:00
Ian Barwick
4b524c52b6 standby clone: honour --waldir setting when cloning from Barman
By setting --waldir in "pg_basebackup_options", standbys cloned using
pg_basebackup would have their WAL directory set to the specified
location and symlinked from the data directory.

This commit causes repmgr to honour that setting even when cloning
from Barman.
2020-10-07 15:13:52 +09:00
Ian Barwick
b3b9281253 Parse pg_basebackup option --waldir/--xlogdir 2020-10-07 15:13:50 +09:00
Ian Barwick
679cfe0852 doc: update release notes
Note PostgreSQL 13 support as a general feature.
2020-10-06 14:17:16 +09:00
Ian Barwick
e10d9fd393 EXPERIMENTAL: synchronise try_primary_reconnect()'s reconnection loop
Per proposal in GitHub #662, this patch attempts to synchronise each
repmgrd's primary reconnection attempts to prevent potential race
conditions. This relies on each node's clock being correctly
synchronised.

Currently this change is experimental and is not enabled by default.
It can be enabled by setting the repmgr.conf parameter
"reconnect_loop_sync".
2020-10-06 13:35:49 +09:00
Ian Barwick
467d19bcd4 Use atoll() to parse system_identifier on 32bit systems
Addresses issue in GitHub #665.
2020-10-06 09:44:31 +09:00
Ian Barwick
be244e2155 Fix typo
s/paremeter/parameter/
2020-10-05 17:06:52 +09:00
Ian Barwick
42283bf344 repmgrd: check local connection after promoting local node
In theory the local connection should not be affected by the node's
promotion. However we're handing over control to an external command
which is usually just "repmgr standby promote", but could potentially
be a user-defined script with unknowable side effects. So it's
better to be safe than sorry.
2020-10-05 16:50:41 +09:00
Ian Barwick
5b254a1be9 repmgrd: add parameter "failover_delay"
This parameter is not documented and intended for use during testing.
It should not be used in production.
2020-10-05 16:43:06 +09:00
Ian Barwick
9aaf7d79a2 standby switchover: in Pg13 and later, promotion overrides paused WAL replay
Preventing a switchover in this case no longer makes sense, so we apply
the checks to PostgreSQL 12 and earlier only.
2020-09-30 15:15:11 +09:00
Ian Barwick
e86c035242 standby promote: in Pg13 and later, promotion overrides paused WAL replay
Aborting in this case no longer makes sense, so we apply the checks
to PostgreSQL 12 and earlier only.
2020-09-30 15:15:07 +09:00
Ian Barwick
73d2088a85 standby follow: don't restart server (PostgreSQL 13 and later)
As of PostgreSQL 13, changes to the fundamental replication
configuration can be applied with a simple SIGHUP, no restart
required.

In case the old behaviour is desired, i.e. a full restart to apply
the configuration changes, the new configuration parameter
"standby_follow_restart" can be set. This parameter has no effect
in PostgreSQL 12 and earlier.
2020-09-29 17:53:51 +09:00
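
The decision described in 73d2088a85 reduces to a version check plus one override. The sketch below is illustrative: `follow_action` is a hypothetical name, and the version is assumed to be in PostgreSQL's numeric "server_version_num" form (130000 for 13.0).

```python
def follow_action(server_version_num, standby_follow_restart=False):
    """Return "reload" or "restart" for applying new replication settings."""
    # From PostgreSQL 13 a reload (SIGHUP) is sufficient, unless a full
    # restart was explicitly requested via "standby_follow_restart".
    if server_version_num >= 130000 and not standby_follow_restart:
        return "reload"
    # PostgreSQL 12 and earlier always need a restart; the parameter has
    # no effect there.
    return "restart"
```
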
Ian Barwick
48f95f9a39 Fix typo in comment 2020-09-29 15:26:28 +09:00
Ian Barwick
ce229beff8 repmgrd: add configuration option "always_promote"
In certain corner cases, it's possible repmgrd may end up monitoring
a standby which was a former primary, but the node record has not
yet been updated.

Previously repmgrd would abort the promotion with a cryptic message
about being unable to find a node record for node_id -1 (the
default value for an unknown node id).

This commit adds a new configuration option "always_promote", which
determines whether repmgrd should promote the node in this case.
The default is "false", to effectively maintain the existing behaviour.

Logging output has also been improved to make it clearer what has
happened when this situation occurs.
2020-09-29 14:18:00 +09:00
Ian Barwick
16eeae700c repmgrd: minor log message tweak 2020-09-29 10:23:31 +09:00
Ian Barwick
70061c51aa Further improve handling of possible pg_control read errors
Builds on changes in commit 147f454, and ensures appropriate
action is taken if a value cannot be read from pg_control.
2020-09-28 13:59:34 +09:00
houzj.fnst
3ffeffbd8b Remove redundant condition
GitHub #655.
2020-09-28 13:08:18 +09:00
Ian Barwick
147f454d32 Minor sanity check for control file extraction functions
If the control file couldn't be parsed for whatever reason, return
the default value for the requested parameter.

It'd be better to have the caller pass in a pointer to the parameter
and have the function return bool so the caller doesn't assume the
control file was read successfully. This is important for handling
DBState, where no "value unknown" default is available.
2020-09-28 10:47:56 +09:00
Ian Barwick
26b5664741 repmgr: enable "primary unregister --force" to unregister an active primary
The primary must have no registered standby nodes.

Also document usage when unregistering a primary node which is actually
running as a standby.
2020-09-23 15:12:19 +09:00
Ian Barwick
cb86180f4f doc: document include directives 2020-09-18 16:27:52 +09:00
Ian Barwick
1f3e098104 Add option "--dump-config"
This is initially intended for verifying the configuration parsing
mechanism and is currently undocumented.
2020-09-18 15:12:22 +09:00
Ian Barwick
4670515285 config: fix parsing of event_notifications list 2020-09-18 14:36:56 +09:00
Ian Barwick
bccc2673b6 doc: update compatibility matrix 2020-09-18 11:40:47 +09:00
Ian Barwick
158008c5c5 doc: update release notes 2020-09-18 11:29:07 +09:00
Ian Barwick
5f3d1cdeb6 doc: note removal of PostgreSQL 9.3 support 2020-09-17 16:05:16 +09:00
Ian Barwick
82515a9733 doc: document new parameters for "failover_validation_command" 2020-09-17 15:48:18 +09:00
Stanislav Paskalev
73e8373337 Add %v, %u and %t parameters to "failover_validation_command"
These indicate:
 - the number of visible nodes sharing the current upstream
 - the number of nodes on the current upstream
 - the total number of nodes in the entire repmgr cluster.

This allows the failover_validation_command to be used to perform
more thorough validations, including cross-referencing external
cluster management state (e.g. if managed by kubernetes).

GitHub #651.
2020-09-17 15:48:12 +09:00
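
A validation script using the new parameters might look like the sketch below. Everything in it is an assumption made for illustration: the majority policy, the function names, the convention that a zero exit status accepts the promotion, and an invocation that passes %v and %u as the first two arguments.

```python
def majority_visible(visible_nodes, upstream_nodes):
    """Hypothetical policy: accept only if more than half of the nodes
    attached to the current upstream are visible."""
    return visible_nodes * 2 > upstream_nodes


def main(argv):
    """Return an exit status: 0 accepts the promotion, non-zero rejects it."""
    try:
        visible_nodes, upstream_nodes = int(argv[1]), int(argv[2])
    except (IndexError, ValueError):
        return 2  # malformed invocation: reject rather than guess
    return 0 if majority_visible(visible_nodes, upstream_nodes) else 1
```

Run as a script, this would finish with `sys.exit(main(sys.argv))`. A stricter policy could also take %t into account, or consult an external cluster manager as the commit message suggests.
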
Ian Barwick
f1bdb09512 doc: note existing pg_rewind corner-case bug 2020-09-15 14:21:14 +09:00
Ian Barwick
028e3ab48d doc: rearrange "repmgr node rejoin" reference for clarity
The <important> section looked like an actual subsection, so convert
that and the following example section into <refsect2> sections.
2020-09-15 13:42:18 +09:00
Ian Barwick
b5b7d635ad doc: fix "release-current" tag 2020-09-04 15:12:33 +09:00
Ian Barwick
146654bf0e Clarify code comment 2020-09-04 15:10:54 +09:00
Ian Barwick
4c3aed2573 rsync: exclude log/pg_log directory depending on PostgreSQL version
This is more for completeness as the data source here is Barman, which
shouldn't contain log files anyway.
2020-09-04 15:03:27 +09:00
Ian Barwick
3945314e65 Remove PostgreSQL 9.3 support
PostgreSQL 9.3 community support ended in November 2018.
2020-09-04 11:37:12 +09:00
Ian Barwick
f4938a4a42 standby clone: tweak --help output wording 2020-09-03 10:41:04 +09:00
Ian Barwick
9a836b3c04 Add option --verify-backup
This causes pg_verifybackup to be executed immediately after
pg_basebackup completes.

PostgreSQL 13 and later.
2020-09-03 10:40:32 +09:00
Ian Barwick
c8e52e486f Have make_pg_path() output to a PQexpBuffer
Calling functions are all using one anyway, so there's no point keeping
static buffers around.
2020-09-02 15:30:57 +09:00
Ian Barwick
1f7ac843fd Consolidate role availability checking code 2020-09-01 14:37:33 +09:00
Ian Barwick
8d57d7e001 is_downstream_node_attached(): avoid false negative
If the provided connection does not have sufficient permission to read
"pg_stat_replication.state", and there is an entry for the node in
"pg_stat_replication", assume it's connected. Finer-grained detection
requires additional user permissions, nothing we can do about that.
2020-09-01 14:28:40 +09:00
Ian Barwick
13e7c679cd Minor coding style fixes 2020-09-01 13:35:13 +09:00
Ian Barwick
466590af28 Fix comment 2020-09-01 13:23:46 +09:00
Ian Barwick
a88c80248c repmgrd: minor tweaks to witness node synchronisation
Explicitly roll back if any operation fails, and add debugging output
to track elapsed time between synchronisation intervals.
2020-09-01 09:58:14 +09:00
Ian Barwick
1131e3aad2 PostgreSQL 13: support "wal_keep_size"
Renamed from "wal_keep_segments" in core commit f5dff459.
2020-08-31 17:18:41 +09:00
Ian Barwick
0630d9644e Improve replication connection check
Previously the check verifying that a node has connected to its upstream
merely assumed the presence of a record in pg_stat_replication indicates
a successful replication connection. However the record may contain a
state other than "streaming", typically "startup" (which will occur when
a node has diverged from its upstream and will therefore never
transition to "streaming"), which needs to be taken into account when
considering the state of the replication connection to avoid false
positives.
2020-08-27 16:09:25 +09:00
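The distinction drawn in the commit above can be sketched in C as follows. This is a minimal illustration, not repmgr's actual code: the `AttachStatus` enum and `classify_attachment()` are hypothetical names, and the sketch assumes the caller has already queried `pg_stat_replication` and holds the row's `state` value as a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not repmgr's own enum. */
typedef enum
{
    ATTACH_NO_RECORD,       /* no row in pg_stat_replication */
    ATTACH_NOT_STREAMING,   /* row present, but state is not "streaming" */
    ATTACH_STREAMING        /* row present and state is "streaming" */
} AttachStatus;

/*
 * record_found: whether pg_stat_replication holds a row for the node.
 * state:        contents of that row's "state" column (may be NULL).
 */
AttachStatus
classify_attachment(int record_found, const char *state)
{
    /* No row at all: the node has not connected. */
    if (!record_found)
        return ATTACH_NO_RECORD;

    /* Only "streaming" counts as a working replication connection. */
    if (state != NULL && strcmp(state, "streaming") == 0)
        return ATTACH_STREAMING;

    /* e.g. "startup": a row exists, but treating it as attached
     * would be a false positive. */
    return ATTACH_NOT_STREAMING;
}
```

With this shape, a row whose state is "startup" is reported separately from both a missing row and a healthy streaming connection, so the caller can decide how to handle it.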
Ian Barwick
c50a2d049c doc: note use of wildcards in .pgpass file 2020-08-19 10:32:34 +09:00
Ian Barwick
20100f5aaa docs: link to PostgreSQL roadmap 2020-08-06 09:50:23 +09:00
Ian Barwick
2d20c110bf doc: update "repmgr witness register" description
Add missing "Options" section.
2020-08-06 09:50:19 +09:00
Ian Barwick
aed0045c3a docs: reformat additional config file upgrade notes into a new section
It's easier to link to the information that way.
2020-08-06 09:50:16 +09:00
Martín Marqués
cd81046c26 doc: add two notes on section related to configuration files
Add notes to the documentation mentioning that after postgres or repmgr
upgrades (postgres major upgrades), there are some changes that need
to be taken care of.

Signed-off-by: Martín Marqués <martin.marques@2ndquadrant.com>
2020-08-06 09:50:11 +09:00
Ian Barwick
fb32284cdc remove superfluous debugging output 2020-08-06 09:05:55 +09:00
Ian Barwick
893044f1e9 Add missing pfree() calls 2020-07-16 11:08:20 +09:00
Ian Barwick
271d407c7c doc: update sample configuration file
Clarify parameters for recovery_min_apply_delay
2020-07-16 11:07:49 +09:00
Ian Barwick
1d0103ca44 cluster matrix/crosscheck: improve text mode output formatting
Previously these actions were hard-wired to assume node IDs would only
ever have two digits at most.

Refactor to use the same table generation code as other actions, which
properly handles variable column sizes.
2020-07-06 13:52:28 +09:00
Ian Barwick
51d56684a7 node status: clarify "archive_mode" message on standbys
"archive_mode = 'always'" available from PostgreSQL 9.5.
2020-07-06 10:21:00 +09:00
Ian Barwick
4d88f177a7 doc: clarify "node rejoin" usage
Emphasize that conninfo must be provided for a running node.
2020-07-06 09:55:06 +09:00
Ian Barwick
de3f0802b4 Update source comments to clarify data directory modifications 2020-06-19 13:51:00 +09:00
Ian Barwick
2985b9d91f Remove vestiges of deprecated "--node" option
The option was never actually set anywhere.

By removing it, readline will now produce a reasonably helpful message
in the off chance it is provided, e.g.:

   option '--node=foo' is ambiguous; possibilities: '--node-id' '--node-name'
2020-06-10 13:11:22 +09:00
Ian Barwick
300e11eb76 Rearrange some command line option handling definitions for clarity 2020-06-10 13:02:13 +09:00
Ian Barwick
53546b1c88 node rejoin: remove unneeded PQfinish() 2020-06-10 10:36:55 +09:00
Ian Barwick
547bbb06d8 doc: note downstream node (dis)connection monitoring in more places 2020-06-09 16:21:36 +09:00
Ian Barwick
0d0ffc675c standby clone: add a strategic Assert 2020-06-09 14:31:49 +09:00
Ian Barwick
11dc923a20 standby clone: minor code cleanup 2020-06-09 14:31:44 +09:00
Ian Barwick
e97319f01d Fix typo in comment 2020-06-09 14:31:40 +09:00
Ian Barwick
db1cb1433f Rename the TablespaceDataListCell element "f" to "fptr" for clarity
And add a few more comments to make it clearer what's going on.
2020-06-09 14:31:36 +09:00
Ian Barwick
c1428a3ecd standby clone: fixes for Barman tablespace handling.
repmgr creates a file with a list of tablespace files to fetch from
Barman, however the file may not actually have been flushed to disk
at the point the rsync operation was executed, so may be incomplete
or empty.

Also fix handling of tablespace remapping.

Addresses GitHub #650.
2020-06-09 10:52:10 +09:00
Ian Barwick
fc568a9101 run_file_backup(): fix comments
Explicitly document use-case for this function, and fix a comment
which probably got munged by pg_indent.
2020-06-08 12:45:38 +09:00
Ian Barwick
e65738c989 Explicitly unset search path when connecting to database 2020-05-22 16:11:55 +09:00
Ian Barwick
aaea24b58b repmgrd: log reconnection of paired connection 2020-05-22 14:21:17 +09:00
Ian Barwick
d75a35a788 repmgrd: clarify why node is not configured for automatic failover 2020-05-22 11:21:48 +09:00
Ian Barwick
8233560629 repmgrd: ensure cascaded standby reconnects to primary
If the primary connection went away, and the upstream is not the
primary, attempt to reconnect if the monitoring update fails.

If the upstream is the primary, the reconnection will happen on
the next connection check.
2020-05-22 11:11:58 +09:00
Ian Barwick
cf60844c45 repmgrd: ensure primary connection is reset if same as upstream
Addresses GitHub #633.
2020-05-22 11:11:54 +09:00
Ian Barwick
a0d3fae7ab standby register: ensure location field is compared during record check 2020-05-21 14:35:03 +09:00
Ian Barwick
f05978e2e1 doc: update example log output to match current code 2020-05-15 10:33:47 +09:00
Ian Barwick
b7475792e7 configuration: add maximum nesting depth
As with ff4771ab, this is the same behaviour as core PostgreSQL.
2020-05-14 16:27:35 +09:00
Ian Barwick
5c16e94672 configuration: clean up unneeded code and comments 2020-05-14 16:27:22 +09:00
Ian Barwick
ff4771ab02 configuration: reject direct recursion
Ensure a configuration file can't include itself.

This is the same behaviour as core PostgreSQL.
2020-05-14 15:32:04 +09:00
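A minimal sketch of the two include safeguards named in this log (direct recursion here, maximum nesting depth in b7475792e7) might look like the following. The function name, the result codes and the limit of 10 are assumptions made for illustration, and the sketch assumes both paths have already been resolved to a comparable absolute form.

```c
#include <assert.h>
#include <string.h>

/* Assumed limit for this sketch; the real value is not stated in the log. */
#define MAX_INCLUDE_DEPTH 10

/* Hypothetical result codes. */
typedef enum
{
    INCLUDE_OK,
    INCLUDE_SELF_REFERENCE,
    INCLUDE_TOO_DEEP
} IncludeCheck;

/*
 * current_file:  path of the file containing the include directive
 * included_file: resolved path named by the directive
 * depth:         nesting level of current_file (0 for the top-level file)
 */
IncludeCheck
check_include(const char *current_file, const char *included_file, int depth)
{
    /* Direct recursion: a file must not include itself. */
    if (strcmp(current_file, included_file) == 0)
        return INCLUDE_SELF_REFERENCE;

    /* The included file would sit one level deeper than its parent. */
    if (depth + 1 > MAX_INCLUDE_DEPTH)
        return INCLUDE_TOO_DEEP;

    return INCLUDE_OK;
}
```

The depth check also catches indirect cycles (A includes B, B includes A), since such a chain eventually exceeds the limit.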
Ian Barwick
d59cadd5f6 Remove old configuration handling code
This expunges two large and cumbersome sets of if/else statements and
the T_CONFIGURATION_OPTIONS_INITIALIZER macro, all of which needed to
be kept in sync when adding/modifying configuration file parameters.
2020-05-14 11:57:16 +09:00
Ian Barwick
04aee7b406 Set defaults before loading configuration file 2020-05-14 11:57:07 +09:00
Ian Barwick
029164a817 Make configuration default value usage more consistent 2020-05-14 11:57:04 +09:00
Ian Barwick
3dde8f1386 "retire" old configuration handling code 2020-05-14 11:57:00 +09:00
Ian Barwick
94fbf76b2e be quiet, griping compiler! 2020-05-14 11:56:56 +09:00
Ian Barwick
4a1855fabe Place configuration settings struct in separate file 2020-05-14 11:56:45 +09:00
Ian Barwick
d79d4c50b2 handle tablespace mapping 2020-05-14 11:56:42 +09:00
Ian Barwick
2071fa8c7e Initial implementation of an iterable configuration item list
This implements storing the configuration file parameter definitions in
an iterable list. This will replace the existing way of populating the
configuration struct, which is a long and cumbersome if/else structure,
and will make it possible to later dump the imported configuration.
2020-05-14 11:56:38 +09:00
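The idea of an iterable list of parameter definitions can be sketched as below. This is a deliberately reduced illustration, not the structure the commit introduces: `ConfigItem`, `options`, `apply_defaults()` and `find_item()` are hypothetical names, and only two value types are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { ITEM_INT, ITEM_BOOL } ItemType;

/* One entry per parameter: name, type, where to store it, its default. */
typedef struct
{
    const char *name;
    ItemType    type;
    union { int *intptr; bool *boolptr; } ptr;
    union { int intdefault; bool booldefault; } def;
} ConfigItem;

/* Hypothetical parsed-options struct. */
static struct { int node_id; bool use_slots; } options;

/* The definition list; a NULL name terminates it. */
static ConfigItem items[] = {
    { "node_id", ITEM_INT,
      { .intptr = &options.node_id }, { .intdefault = -1 } },
    { "use_replication_slots", ITEM_BOOL,
      { .boolptr = &options.use_slots }, { .booldefault = false } },
    { NULL, ITEM_INT, { .intptr = NULL }, { .intdefault = 0 } }
};

/* Set every option to its default by walking the list once. */
void
apply_defaults(void)
{
    for (ConfigItem *item = items; item->name != NULL; item++)
    {
        switch (item->type)
        {
            case ITEM_INT:
                *item->ptr.intptr = item->def.intdefault;
                break;
            case ITEM_BOOL:
                *item->ptr.boolptr = item->def.booldefault;
                break;
        }
    }
}

/* Look up a definition by name; returns NULL if the name is unknown. */
ConfigItem *
find_item(const char *name)
{
    for (ConfigItem *item = items; item->name != NULL; item++)
    {
        if (strcmp(item->name, name) == 0)
            return item;
    }
    return NULL;
}
```

Because defaults, lookup and (later) dumping all walk the same array, adding a parameter means adding one entry rather than editing several if/else chains in step.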
Ian Barwick
9945a3a4a8 Handle "include_dir" 2020-05-14 11:56:35 +09:00
Ian Barwick
0df0db1281 Handle "include_if_exists" 2020-05-14 11:56:30 +09:00
Ian Barwick
682ec9184a Handle "include" 2020-05-14 11:56:26 +09:00
Ian Barwick
fdc6f61257 Pass base configuration file directory to configuration parser
If provided, the parser will use this to process include directives
with unqualified filenames.
2020-05-14 11:56:23 +09:00
Ian Barwick
f5018e42f3 Initial refactoring of configuration file parsing
Have the configuration file parsing routine itself open the respective
configuration file, rather than passing a file pointer from the original
caller. This is required for handling include directives, which we'll
want to do for sanity-checking the PostgreSQL configuration on a freshly
cloned, unstarted standby.
2020-05-14 11:56:19 +09:00
Ian Barwick
26689871dc Update filename in comment 2020-05-14 10:50:36 +09:00
Ian Barwick
a863dc7f6c repmgrd: additional check for the upstream connection
It's possible the upstream server was intermittently unavailable in
the interval between checks, invalidating the upstream connection.
With check types "ping" and "connection", the connection would not be
restored, so if the availability check was successful, additionally
verify the upstream connection and restore if necessary.

Addresses GitHub #633.
2020-05-14 10:26:57 +09:00
Ian Barwick
9b6fe6858a doc: update repmgr.conf.sample
Was missing "query" option for "connection_check_type".
2020-05-12 17:05:22 +09:00
Ian Barwick
2f667116d8 repmgrd: include node name in log output
Missed in commit fd52df0.
2020-05-12 15:31:47 +09:00
Ian Barwick
8ee4fac5bb repmgrd: minor refactoring of try_primary_reconnect() 2020-05-12 14:52:14 +09:00
Ian Barwick
bb56387aaa repmgrd: consolidate connection closing code
PQfinish() should only be called on local PGconn pointers which
will not be reused.
2020-05-12 14:48:39 +09:00
Ian Barwick
5d00094936 repmgrd: ensure "close_connection()" always called after connection failure 2020-05-12 14:41:33 +09:00
Ian Barwick
ebdfdc530d repmgrd: ensure PQfinish() always executed on failed connections in NodeInfoLists
clear_node_info_list() will clean up any remaining active connections,
but we need to ensure all failed connections are cleaned up at the point
of failure to prevent leaks.

Per report in GitHub #643.
2020-05-12 14:22:08 +09:00
Ian Barwick
e5d3285d02 repmgrd: remove redundant log message 2020-05-11 16:59:32 +09:00
Ian Barwick
fd52df0fab repmgrd: include node name in log output in more places
Still a few places where only the node ID was reported, but it's always
useful to have the node name as well.
2020-05-11 16:55:31 +09:00
Ian Barwick
1b5ad743b5 standby clone: explicitly set closed connection pointers to NULL
We omitted to do this with the connections used when checking the system
identifier, which means libpq calls by the teardown function using the
pointer risk using unallocated memory.

Addresses issue reported in GitHub #644.
2020-05-11 13:52:10 +09:00
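The pattern this commit describes (reset the pointer once the connection is closed, so later cleanup code cannot touch freed memory) can be sketched as follows. `Handle` is a stand-in for a libpq connection object so that the example compiles without libpq; all names here are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a connection object; not a libpq type. */
typedef struct Handle { int fd; } Handle;

/* Counts open handles so the effect of closing is observable. */
static int live_handles = 0;

Handle *
handle_open(void)
{
    Handle *h = malloc(sizeof(Handle));

    if (h != NULL)
    {
        h->fd = 0;
        live_handles++;
    }
    return h;
}

/* Frees the handle; the caller's pointer is left dangling. */
void
handle_finish(Handle *h)
{
    if (h == NULL)
        return;
    free(h);
    live_handles--;
}

/* Frees the handle AND resets the caller's pointer to NULL,
 * so a second call (e.g. from a teardown function) is a no-op. */
void
handle_close(Handle **hp)
{
    if (hp == NULL || *hp == NULL)
        return;
    handle_finish(*hp);
    *hp = NULL;
}
```

A teardown routine that calls `handle_close(&conn)` unconditionally is then safe whether or not the connection was already closed earlier in the function.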
Ian Barwick
389c0ab9c0 Clarify use of function parameter 2020-05-11 13:37:42 +09:00
Ian Barwick
72dfe28e81 doc: clarify usage of "-f /etc/repmgr.conf" in examples 2020-05-08 10:22:52 +09:00
Ian Barwick
bc566f7a42 standby check: ignore upstream/downstream connections if node is witness
Per report in GitHub #641.
2020-05-08 09:37:30 +09:00
Ian Barwick
d1ab6ce28b standby clone: emit warning, not error if server is 9.3 and tablespace_mapping provided 2020-05-07 10:23:07 +09:00
Ian Barwick
bcc284cac9 Refactor configuration file reload handling
Rather than parse the configuration file into a new structure and
copy changed values from that into the main structure, we'll copy
the existing structure before parsing the changed configuration
file directly into the main structure, and revert using the copy
if any issues are encountered.

This is necessary as preparation for further reworking of the
configuration file structure handling. It also makes the reload
idempotent.

While we're at it, make some general improvements to the reload
handling, particularly:

 - improve logging to show "before" and "after" values
 - collate change notifications and only display if no errors
   were found
 - remove unnecessary double-logging of errors
 - various bugfixes
2020-05-05 15:29:07 +09:00
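The copy, parse-in-place, revert-on-error approach described in this commit can be sketched as below. It is a simplified illustration under stated assumptions: the `Options` struct, `parse_into()` and `reload_options()` are hypothetical, and the "parser" takes two integers instead of reading a file, failing part-way through to show why the saved copy is needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical options struct with two illustrative fields. */
typedef struct
{
    int monitor_interval_secs;
    int reconnect_attempts;
} Options;

/*
 * Hypothetical parser: writes directly into *opts and may fail after
 * having already changed some fields.
 */
static bool
parse_into(Options *opts, int interval, int attempts)
{
    opts->monitor_interval_secs = interval;   /* written before validation */

    if (attempts < 0)
        return false;                         /* leaves *opts half-updated */

    opts->reconnect_attempts = attempts;
    return true;
}

/* Reload: keep a copy, parse in place, restore the copy on failure. */
bool
reload_options(Options *opts, int interval, int attempts)
{
    Options saved = *opts;      /* plain struct copy of the current state */

    if (!parse_into(opts, interval, attempts))
    {
        *opts = saved;          /* revert every field, including partial writes */
        return false;
    }
    return true;
}
```

A failed reload therefore leaves the running configuration exactly as it was, and repeating a successful reload with the same input yields the same result.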
Ian Barwick
d37513312a Move the main configfile structure into configfile.c
This is required for a later refactoring of the configuration file
handling.
2020-05-05 14:43:55 +09:00
Ian Barwick
3ca642fee1 repmgrd: log receipt of SIGHUP at log level NOTICE
PostgreSQL itself logs it at log level LOG, which we don't have,
but NOTICE seems reasonable, especially as we log SIGTERM as that.
2020-05-05 13:41:23 +09:00
Ian Barwick
be8e5b45fa Add utility function validate_conninfo_string() 2020-05-05 13:41:18 +09:00
Ian Barwick
5ee4540640 Fix typo in comment 2020-05-01 12:13:06 +09:00
Ian Barwick
6507a374f7 doc: update legacy upgrade procedure
From PostgreSQL 13, the "CREATE EXTENSION ... FROM" syntax is no longer
available. It's unlikely at this point that someone will find themselves
with a PostgreSQL 13 database and the legacy repmgr schema, but we'll
cover all bases just in case.
2020-05-01 10:27:33 +09:00
Ian Barwick
c884f21a58 Tidy up log message 2020-04-28 14:23:39 +09:00
Ian Barwick
11e1b7a2c5 Check function return values in modify_auto_conf() 2020-04-27 14:26:04 +09:00
Ian Barwick
d0c5dffe91 standby clone: explicitly log that replication slots not in use
Helps with diagnosing output.
2020-04-27 13:57:18 +09:00
Ian Barwick
38b3447bd3 Add repmgr home page to --help output
Per PostgreSQL commit 1933ae629e7b706c6c23673a381e778819db307d it seems
to be all the rage these days.
2020-04-24 09:41:56 +09:00
Ian Barwick
3200c8c4e4 doc: clarify usage of the "passfile" parameter. 2020-04-23 15:03:40 +09:00
Ian Barwick
971309c830 Fix parsing of database connection check results in "standby clone" 2020-04-23 13:19:20 +09:00
Ian Barwick
1628bfb846 Update references to "recovery.conf" in _do_create_replication_conf() 2020-04-23 11:42:13 +09:00
Ian Barwick
8adcb1348d repmgrd: improve logging of promote_command failure
- log failure *before* we check if the primary has reappeared
- log the error code
2020-04-21 15:02:15 +09:00
Ian Barwick
025e66ea46 standby switchover: check superuser connection on demotion candidate
Add a sanity check that repmgr, when remotely executed on the demotion
candidate, is able to connect as superuser. If not, emit a diagnostic
command as a hint.
2020-04-21 11:25:01 +09:00
Ian Barwick
4e48301d78 standby switchover: note database name for superuser connections
It's useful to have a confirmation of which database repmgr is trying
to connect to when the -S/--superuser connection is provided.

It will always be the database defined in the repmgr.conf "conninfo"
parameter, but having the name available is useful when e.g.
troubleshooting issues with .pgpass configuration.
2020-04-20 16:49:47 +09:00
Ian Barwick
f8b214f721 doc: have Makefile clean up generated html files 2020-04-20 15:40:13 +09:00
Ian Barwick
96acd3f915 doc: clarify .pgpass usage with -S/--superuser option 2020-04-20 15:34:29 +09:00
Ian Barwick
c1584d587c doc: remove DEBUG output from example 2020-04-20 12:14:47 +09:00
Ian Barwick
2f26a02b5c doc: clarify usage of -F/--force with "standby promote"
Per GitHub #632.
2020-04-20 12:11:49 +09:00
Ian Barwick
cb2fb53556 Fix debug logging
Per GitHub #630.
2020-04-20 11:07:51 +09:00
Ian Barwick
e1953742a1 doc: note additional diagnostic options provided by "node check" 2020-04-17 11:53:56 +09:00
Ian Barwick
97d83bd443 standby switchover: add hint for diagnosing remote DB connection failure
Output a command, which when executed on the local node (promotion
candidate) will attempt to remotely connect to the demotion candidate
and display both the connection message encountered and the connection
parameters used.

This is useful for corner cases where the connection normally succeeds
because a particular environment variable (e.g. PGPORT) is set, but that
variable is not set in the environment where SSH is executed.
2020-04-17 11:20:02 +09:00
Ian Barwick
45e96f21a5 node check: add option --db-connection
This is intended for diagnostic purposes, primarily when diagnosing
the connection parameters used when repmgr is being executed on a
remote node.
2020-04-15 17:48:23 +09:00
Ian Barwick
9df1731fb8 doc: update release notes 2020-04-15 14:50:51 +09:00
Ian Barwick
cfd35852b7 standby switchover: improve archive check error handling
Explicitly log if a database connection failure caused the check
to fail.

It's unlikely this situation will be encountered, as the data directory
check will already have run and checked for connection failure, however
there's a small chance the connection could fail between checks.
2020-04-15 14:08:33 +09:00
Ian Barwick
32dde4eaaf standby switchover: improve directory check failure handling
It's possible that the remote data directory check will fail if e.g.
connection configuration is not consistent across all nodes. This
modification ensures a database connection error is reported, rather
than a spurious issue with the data directory configuration.
2020-04-15 14:08:29 +09:00
Ian Barwick
78f89a4d47 node check: report connection error if --optformat provided
The --optformat option is intended for use when repmgr is being invoked
remotely by another repmgr instance, typically during a switchover
operation.

Previously no output was returned if the local repmgr was unable to
connect to its local PostgreSQL instance, which made diagnosing
various corner-case problems trickier than it should be.
2020-04-15 14:08:25 +09:00
Ian Barwick
410dd40526 standby switchover: standardize log message 2020-04-15 10:24:44 +09:00
Ian Barwick
3d85d8f5ff doc: update link to Debian package archive
See also https://www.df7cb.de/blog/2020/apt-archive.postgresql.org.html
2020-04-14 12:32:59 +09:00
Ian Barwick
d20711c267 Update Makefile for 5.2dev 2020-04-13 17:49:58 +09:00
Ian Barwick
e4a7da0132 Add upgrade route for repmgr 3.x to repmgr 5.1
The removal of some extension functions means it's not possible to
follow the conventional incremental upgrade path; instead we'll
create a script for direct upgrades to 5.1.
2020-04-13 17:45:19 +09:00
Ian Barwick
4f0e8a503a doc: fix typo in 5.1.0 release section ID 2020-04-13 17:17:06 +09:00
Ian Barwick
c39289d570 doc: finalize release notes 2020-04-13 17:17:01 +09:00
Ian Barwick
662b603770 doc: update release notes 2020-04-13 17:16:55 +09:00
Ian Barwick
fa4f37ddb9 Bump master branch to 5.2dev 2020-04-08 14:22:51 +09:00
97 changed files with 7689 additions and 2466 deletions

View File

@@ -2,7 +2,7 @@ License and Contributions
=========================
`repmgr` is licensed under the GPL v3. All of its code and documentation is
Copyright 2010-2020, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
Copyright 2010-2021, EnterpriseDB Corporation. See the files COPYRIGHT and LICENSE for
details.
The development of repmgr has primarily been sponsored by 2ndQuadrant customers.
@@ -12,10 +12,10 @@ which has received funding from the European Union's Seventh Framework Programme
(FP7/2007-2013) under grant agreement 258862.
Contributions to `repmgr` are welcome, and will be listed in the file `CREDITS`.
2ndQuadrant Limited requires that any contributions provide a copyright
EnterpriseDB Corporation requires that any contributions provide a copyright
assignment and a disclaimer of any work-for-hire ownership claims from the
employer of the developer. This lets us make sure that all of the repmgr
distribution remains free code. Please contact info@2ndQuadrant.com for a
distribution remains free code. Please contact info@enterprise.com for a
copy of the relevant Copyright Assignment Form.
Code style

View File

@@ -1,4 +1,4 @@
Copyright (c) 2010-2020, 2ndQuadrant Limited
Copyright (c) 2010-2021, EnterpriseDB Corporation
All rights reserved.
This program is free software: you can redistribute it and/or modify

2
FAQ.md
View File

@@ -5,6 +5,6 @@ The repmgr 4 FAQ is located here: [repmgr FAQ (Frequently Asked Questions)](http
The repmgr 3.x FAQ can be found here:
https://github.com/2ndQuadrant/repmgr/blob/REL3_3_STABLE/FAQ.md
https://github.com/EnterpriseDB/repmgr/blob/REL3_3_STABLE/FAQ.md
Note that repmgr 3.x is no longer supported.

55
HISTORY
View File

@@ -1,3 +1,54 @@
5.3.1 2022-??-??
repmgrd: fixes for potential connection leaks (hslightdb)
5.3.0 2021-10-12
standby switchover: improve handling of node rejoin failure (Ian)
repmgrd: prefix all shared library functions with "repmgr_" to
minimize the risk of clashes with other shared libraries (Ian)
repmgrd: at startup, if node record is marked as "inactive", attempt
to set it to "active" (Ian)
standby clone: set "slot_name" in node record if required (Ian)
node rejoin: emit rejoin target note information as NOTICE (Ian)
repmgrd: ensure short option "-s" is accepted (Ian)
5.2.1 2020-12-07
config: fix parsing of "replication_type"; GitHub #672 (Ian)
standby clone: handle missing "postgresql.auto.conf" (Ian)
standby clone: add option --recovery-min-apply-delay (Ian)
standby clone: fix data directory permissions handling for
PostgreSQL 11 and later (Ian)
repmgrd: prevent termination when local node not available and
standby_disconnect_on_failover; GitHub #675 (Ian)
repmgrd: ensure reconnect_interval" is correctly handled;
GitHub #673 (Ian)
5.2.0 2020-10-22
general: add support for PostgreSQL 13 (Ian)
general: remove support for PostgreSQL 9.3 (Ian)
config: add support for file inclusion directives (Ian)
repmgr: "primary unregister --force" will unregister an active primary
with no registered standby nodes (Ian)
repmgr: add option --verify-backup to "standby clone" (Ian)
repmgr: "standby clone" honours --waldir option if set in
"pg_basebackup_options" (Ian)
repmgr: add option --db-connection to "node check" (Ian)
repmgr: report database connection error if the --optformat option was
provided to "node check" (Ian)
repmgr: improve "node rejoin" checks (Ian)
repmgr: enable "node rejoin" to join a target with a lower timeline (Ian)
repmgr: support pg_rewind's automatic crash recovery in Pg13 and later (Ian)
repmgr: improve output formatting for cluster matrix/crosscheck (Ian)
repmgr: improve database connection failure error checking on the
demotion candidate during "standby switchover" (Ian)
repmgr: make repmgr metadata tables dumpable (Ian)
repmgr: fix issue with tablespace mapping when cloning from Barman;
GitHub #650 (Ian)
repmgr: improve handling of pg_control read errors (Ian)
repmgrd: add additional optional parameters to "failover_validation command"
(spaskalev; GitHub #651)
repmgrd: ensure primary connection is reset if same as upstream;
GitHub #633 (Ian)
5.1.0 2020-04-13
repmgr: remove BDR 2.x support
repmgr: don't query upstream's data directory (Ian)
@@ -5,11 +56,11 @@
repmgr: ensure postgresql.auto.conf is created with correct permissions (Ian)
repmgr: minimize requirement to check upstream data directory location
during "standby clone" (Ian)
repmgr: warn about missing pg_rewind prerequisites when excuting
repmgr: warn about missing pg_rewind prerequisites when executing
"standby clone" (Ian)
repmgr: add --upstream option to "node check"
repmgr: report error code on follow/rejoin failure due to non-available
replication slot (Ian)
0 replication slot (Ian)
repmgr: ensure "node rejoin" checks for available replication slots (Ian)
repmgr: improve "standby switchover" completion checks (Ian)
repmgr: add replication configuration file ownership check to

View File

@@ -12,6 +12,8 @@ EXTENSION = repmgr
DATA = \
repmgr--unpackaged--4.0.sql \
repmgr--unpackaged--5.1.sql \
repmgr--unpackaged--5.2.sql \
repmgr--unpackaged--5.3.sql \
repmgr--4.0.sql \
repmgr--4.0--4.1.sql \
repmgr--4.1.sql \
@@ -24,7 +26,11 @@ DATA = \
repmgr--4.4--5.0.sql \
repmgr--5.0.sql \
repmgr--5.0--5.1.sql \
repmgr--5.1.sql
repmgr--5.1.sql \
repmgr--5.1--5.2.sql \
repmgr--5.2.sql \
repmgr--5.2--5.3.sql \
repmgr--5.3.sql
REGRESS = repmgr_extension
@@ -57,8 +63,11 @@ $(info Building against PostgreSQL $(MAJORVERSION))
REPMGR_CLIENT_OBJS = repmgr-client.o \
repmgr-action-primary.o repmgr-action-standby.o repmgr-action-witness.o \
repmgr-action-cluster.o repmgr-action-node.o repmgr-action-service.o repmgr-action-daemon.o \
configfile.o configfile-scan.o log.o strutil.o controldata.o dirutil.o compat.o dbutils.o sysutils.o
REPMGRD_OBJS = repmgrd.o repmgrd-physical.o configfile.o configfile-scan.o log.o dbutils.o strutil.o controldata.o compat.o sysutils.o
configdata.o configfile.o configfile-scan.o log.o strutil.o controldata.o dirutil.o compat.o \
dbutils.o sysutils.o
REPMGRD_OBJS = repmgrd.o repmgrd-physical.o configdata.o configfile.o configfile-scan.o log.o \
dbutils.o strutil.o controldata.o compat.o sysutils.o
DATE=$(shell date "+%Y-%m-%d")
repmgr_version.h: repmgr_version.h.in
@@ -70,10 +79,10 @@ configfile-scan.c: configfile-scan.l
$(REPMGR_CLIENT_OBJS): repmgr-client.h repmgr_version.h
repmgr: $(REPMGR_CLIENT_OBJS)
$(CC) $(CFLAGS) $(REPMGR_CLIENT_OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
$(CC) $(CFLAGS) $(REPMGR_CLIENT_OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) -o $@$(X)
repmgrd: $(REPMGRD_OBJS)
$(CC) $(CFLAGS) $(REPMGRD_OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
$(CC) $(CFLAGS) $(REPMGRD_OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) -o $@$(X)
$(REPMGR_CLIENT_OBJS): $(HEADERS)
$(REPMGRD_OBJS): $(HEADERS)

View File

@@ -7,10 +7,10 @@ replication capabilities with utilities to set up standby servers, monitor
replication, and perform administrative tasks such as failover or switchover
operations.
PostgreSQL 12, 11, 10, 9.6 and 9.5 are fully supported.
PostgreSQL 9.4 and 9.3 are supported, with some restrictions.
The most recent `repmgr` version (5.2.1) supports all PostgreSQL versions from
9.5 to 13. PostgreSQL 9.4 is also supported, with some restrictions.
`repmgr` is distributed under the GNU GPL 3 and maintained by 2ndQuadrant.
`repmgr` is distributed under the GNU GPL 3 and maintained by EnterpriseDB.
Documentation
-------------
@@ -21,10 +21,11 @@ The full `repmgr` documentation is available here:
The old `README` file for `repmgr` 3.x is available here:
> https://github.com/2ndQuadrant/repmgr/blob/REL3_3_STABLE/README.md
> https://github.com/EnterpriseDB/repmgr/blob/REL3_3_STABLE/README.md
Note that the `repmgr` 3.x series is no longer supported and contains known bugs;
please upgrade to the current `repmgr` version as soon as possible.
please upgrade to the [current repmgr version](https://repmgr.org/docs/current/appendix-release-notes.html)
as soon as possible.
Versions
--------
@@ -54,11 +55,11 @@ Directories
Support and Assistance
----------------------
2ndQuadrant provides 24x7 production support for `repmgr`, including
EnterpriseDB provides 24x7 production support for `repmgr`, including
configuration assistance, installation verification and training for
running a robust replication cluster. For further details see:
* https://2ndquadrant.com/en/support/
* [EDB Support Services](https://www.enterprisedb.com/support/postgresql-support-overview-get-the-most-out-of-postgresql)
There is a mailing list/forum to discuss contributions or issues:
@@ -68,23 +69,12 @@ The IRC channel #repmgr is registered with freenode.
Please report bugs and other issues to:
* https://github.com/2ndQuadrant/repmgr
See
* https://github.com/EnterpriseDB/repmgr
Further information is available at https://repmgr.org/
We'd love to hear from you about how you use repmgr. Case studies and
news are always welcome. Send us an email at info@2ndQuadrant.com, or
send a postcard to
repmgr
c/o 2ndQuadrant
7200 The Quorum
Oxford Business Park North
Oxford
OX4 2JZ
United Kingdom
news are always welcome.
Thanks from the repmgr core team.
@@ -100,7 +90,4 @@ Further reading
* [repmgr documentation](https://repmgr.org/docs/current/index.html)
* [How to Automate PostgreSQL 12 Replication and Failover with repmgr - Part 1](https://www.2ndquadrant.com/en/blog/how-to-automate-postgresql-12-replication-and-failover-with-repmgr-part-1/)
* [How to Automate PostgreSQL 12 Replication and Failover with repmgr - Part 2](https://www.2ndquadrant.com/en/blog/how-to-automate-postgresql-12-replication-and-failover-with-repmgr-part-2/)
* https://blog.2ndquadrant.com/repmgr-3-2-is-here-barman-support-brand-new-high-availability-features/
* https://blog.2ndquadrant.com/improvements-in-repmgr-3-1-4/
* https://blog.2ndquadrant.com/managing-useful-clusters-repmgr/
* https://blog.2ndquadrant.com/easier_postgresql_90_clusters/
* [How to implement repmgr for PostgreSQL automatic failover](https://www.enterprisedb.com/postgres-tutorials/how-implement-repmgr-postgresql-automatic-failover)

View File

@@ -1,7 +1,7 @@
TODO
====
This file contains a list of improvements which are desireable and/or have
This file contains a list of improvements which are desirable and/or have
been requested, and which we aim to address/implement when time and resources
permit.
@@ -17,4 +17,4 @@ repmgrd nodes to prevent unintended failover; this is obviously inconvenient.
We'll need to implement some way of notifying each repmgrd to suspend automatic
failover until further notice.
Requested in GitHub #410 ( https://github.com/2ndQuadrant/repmgr/issues/410 )
Requested in GitHub #410 ( https://github.com/EnterpriseDB/repmgr/issues/410 )

View File

@@ -6,7 +6,7 @@
* supported PostgreSQL versions. They're unlikely to change but
* it would be worth keeping an eye on them for any fixes/improvements.
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California

View File

@@ -1,6 +1,6 @@
/*
* compat.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California

946
configdata.c Normal file
View File

@@ -0,0 +1,946 @@
/*
* configdata.c - contains structs with parsed configuration data
*
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "repmgr.h"
#include "configfile.h"
/*
* Parsed configuration settings are stored here
*/
t_configuration_options config_file_options;
/*
* Configuration settings are defined here
*/
struct ConfigFileSetting config_file_settings[] =
{
/* ================
* node information
* ================
*/
/* node_id */
{
"node_id",
CONFIG_INT,
{ .intptr = &config_file_options.node_id },
{ .intdefault = UNKNOWN_NODE_ID },
{ .intminval = MIN_NODE_ID },
{},
{}
},
/* node_name */
{
"node_name",
CONFIG_STRING,
{ .strptr = config_file_options.node_name },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.node_name) },
{}
},
/* conninfo */
{
"conninfo",
CONFIG_STRING,
{ .strptr = config_file_options.conninfo },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.conninfo) },
{}
},
/* replication_user */
{
"replication_user",
CONFIG_STRING,
{ .strptr = config_file_options.replication_user },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.replication_user) },
{}
},
/* data_directory */
{
"data_directory",
CONFIG_STRING,
{ .strptr = config_file_options.data_directory },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.data_directory) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* config_directory */
{
"config_directory",
CONFIG_STRING,
{ .strptr = config_file_options.config_directory },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.config_directory) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* pg_bindir */
{
"pg_bindir",
CONFIG_STRING,
{ .strptr = config_file_options.pg_bindir },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_bindir) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* repmgr_bindir */
{
"repmgr_bindir",
CONFIG_STRING,
{ .strptr = config_file_options.repmgr_bindir },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgr_bindir) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* replication_type */
{
"replication_type",
CONFIG_REPLICATION_TYPE,
{ .replicationtypeptr = &config_file_options.replication_type },
{ .replicationtypedefault = DEFAULT_REPLICATION_TYPE },
{},
{},
{}
},
/* ================
* logging settings
* ================
*/
/*
* log_level
* NOTE: the default for "log_level" is set in log.c and does not need
* to be initialised here
*/
{
"log_level",
CONFIG_STRING,
{ .strptr = config_file_options.log_level },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.log_level) },
{}
},
/* log_facility */
{
"log_facility",
CONFIG_STRING,
{ .strptr = config_file_options.log_facility },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.log_facility) },
{}
},
/* log_file */
{
"log_file",
CONFIG_STRING,
{ .strptr = config_file_options.log_file },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.log_file) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* log_status_interval */
{
"log_status_interval",
CONFIG_INT,
{ .intptr = &config_file_options.log_status_interval },
{ .intdefault = DEFAULT_LOG_STATUS_INTERVAL, },
{ .intminval = 0 },
{},
{}
},
/* ======================
* standby clone settings
* ======================
*/
/* use_replication_slots */
{
"use_replication_slots",
CONFIG_BOOL,
{ .boolptr = &config_file_options.use_replication_slots },
{ .booldefault = DEFAULT_USE_REPLICATION_SLOTS },
{},
{},
{}
},
/* pg_basebackup_options */
{
"pg_basebackup_options",
CONFIG_STRING,
{ .strptr = config_file_options.pg_basebackup_options },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_basebackup_options) },
{}
},
/* restore_command */
{
"restore_command",
CONFIG_STRING,
{ .strptr = config_file_options.restore_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.restore_command) },
{}
},
/* tablespace_mapping */
{
"tablespace_mapping",
CONFIG_TABLESPACE_MAPPING,
{ .tablespacemappingptr = &config_file_options.tablespace_mapping },
{},
{},
{},
{}
},
/* recovery_min_apply_delay */
{
"recovery_min_apply_delay",
CONFIG_STRING,
{ .strptr = config_file_options.recovery_min_apply_delay },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.recovery_min_apply_delay) },
{
.process_func = &parse_time_unit_parameter,
.providedptr = &config_file_options.recovery_min_apply_delay_provided
}
},
/* archive_cleanup_command */
{
"archive_cleanup_command",
CONFIG_STRING,
{ .strptr = config_file_options.archive_cleanup_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.archive_cleanup_command) },
{}
},
/* use_primary_conninfo_password */
{
"use_primary_conninfo_password",
CONFIG_BOOL,
{ .boolptr = &config_file_options.use_primary_conninfo_password },
{ .booldefault = DEFAULT_USE_PRIMARY_CONNINFO_PASSWORD },
{},
{},
{}
},
/* passfile */
{
"passfile",
CONFIG_STRING,
{ .strptr = config_file_options.passfile },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.passfile) },
{}
},
/* ========================
* standby promote settings
* ========================
*/
/* promote_check_timeout */
{
"promote_check_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.promote_check_timeout },
{ .intdefault = DEFAULT_PROMOTE_CHECK_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* promote_check_interval */
{
"promote_check_interval",
CONFIG_INT,
{ .intptr = &config_file_options.promote_check_interval },
{ .intdefault = DEFAULT_PROMOTE_CHECK_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* =======================
* standby follow settings
* =======================
*/
/* primary_follow_timeout */
{
"primary_follow_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.primary_follow_timeout },
{ .intdefault = DEFAULT_PRIMARY_FOLLOW_TIMEOUT, },
{ .intminval = 1 },
{},
{}
},
/* standby_follow_timeout */
{
"standby_follow_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.standby_follow_timeout },
{ .intdefault = DEFAULT_STANDBY_FOLLOW_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* standby_follow_restart */
{
"standby_follow_restart",
CONFIG_BOOL,
{ .boolptr = &config_file_options.standby_follow_restart },
{ .booldefault = DEFAULT_STANDBY_FOLLOW_RESTART },
{},
{},
{}
},
/* ===========================
* standby switchover settings
* ===========================
*/
/* shutdown_check_timeout */
{
"shutdown_check_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.shutdown_check_timeout },
{ .intdefault = DEFAULT_SHUTDOWN_CHECK_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* standby_reconnect_timeout */
{
"standby_reconnect_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.standby_reconnect_timeout },
{ .intdefault = DEFAULT_STANDBY_RECONNECT_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* wal_receive_check_timeout */
{
"wal_receive_check_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.wal_receive_check_timeout },
{ .intdefault = DEFAULT_WAL_RECEIVE_CHECK_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* ====================
* node rejoin settings
* ====================
*/
/* node_rejoin_timeout */
{
"node_rejoin_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.node_rejoin_timeout },
{ .intdefault = DEFAULT_NODE_REJOIN_TIMEOUT },
{ .intminval = 1 },
{},
{}
},
/* ===================
* node check settings
* ===================
*/
/* archive_ready_warning */
{
"archive_ready_warning",
CONFIG_INT,
{ .intptr = &config_file_options.archive_ready_warning },
{ .intdefault = DEFAULT_ARCHIVE_READY_WARNING },
{ .intminval = 1 },
{},
{}
},
/* archive_ready_critical */
{
"archive_ready_critical",
CONFIG_INT,
{ .intptr = &config_file_options.archive_ready_critical },
{ .intdefault = DEFAULT_ARCHIVE_READY_CRITICAL },
{ .intminval = 1 },
{},
{}
},
/* replication_lag_warning */
{
"replication_lag_warning",
CONFIG_INT,
{ .intptr = &config_file_options.replication_lag_warning },
{ .intdefault = DEFAULT_REPLICATION_LAG_WARNING },
{ .intminval = 1 },
{},
{}
},
/* replication_lag_critical */
{
"replication_lag_critical",
CONFIG_INT,
{ .intptr = &config_file_options.replication_lag_critical },
{ .intdefault = DEFAULT_REPLICATION_LAG_CRITICAL },
{ .intminval = 1 },
{},
{}
},
/* ================
* witness settings
* ================
*/
/* witness_sync_interval */
{
"witness_sync_interval",
CONFIG_INT,
{ .intptr = &config_file_options.witness_sync_interval },
{ .intdefault = DEFAULT_WITNESS_SYNC_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* ================
* repmgrd settings
* ================
*/
/* failover */
{
"failover",
CONFIG_FAILOVER_MODE,
{ .failovermodeptr = &config_file_options.failover },
{ .failovermodedefault = FAILOVER_MANUAL },
{},
{},
{}
},
/* location */
{
"location",
CONFIG_STRING,
{ .strptr = config_file_options.location },
{ .strdefault = DEFAULT_LOCATION },
{},
{ .strmaxlen = sizeof(config_file_options.location) },
{}
},
/* priority */
{
"priority",
CONFIG_INT,
{ .intptr = &config_file_options.priority },
{ .intdefault = DEFAULT_PRIORITY, },
{ .intminval = 0 },
{},
{}
},
/* promote_command */
{
"promote_command",
CONFIG_STRING,
{ .strptr = config_file_options.promote_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.promote_command) },
{}
},
/* follow_command */
{
"follow_command",
CONFIG_STRING,
{ .strptr = config_file_options.follow_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.follow_command) },
{}
},
/* monitor_interval_secs */
{
"monitor_interval_secs",
CONFIG_INT,
{ .intptr = &config_file_options.monitor_interval_secs },
{ .intdefault = DEFAULT_MONITORING_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* reconnect_attempts */
{
"reconnect_attempts",
CONFIG_INT,
{ .intptr = &config_file_options.reconnect_attempts },
{ .intdefault = DEFAULT_RECONNECTION_ATTEMPTS },
{ .intminval = 0 },
{},
{}
},
/* reconnect_interval */
{
"reconnect_interval",
CONFIG_INT,
{ .intptr = &config_file_options.reconnect_interval },
{ .intdefault = DEFAULT_RECONNECTION_INTERVAL },
{ .intminval = 0 },
{},
{}
},
/* monitoring_history */
{
"monitoring_history",
CONFIG_BOOL,
{ .boolptr = &config_file_options.monitoring_history },
{ .booldefault = DEFAULT_MONITORING_HISTORY },
{},
{},
{}
},
/* degraded_monitoring_timeout */
{
"degraded_monitoring_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.degraded_monitoring_timeout },
{ .intdefault = DEFAULT_DEGRADED_MONITORING_TIMEOUT },
{ .intminval = -1 },
{},
{}
},
/* async_query_timeout */
{
"async_query_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.async_query_timeout },
{ .intdefault = DEFAULT_ASYNC_QUERY_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* primary_notification_timeout */
{
"primary_notification_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.primary_notification_timeout },
{ .intdefault = DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* repmgrd_standby_startup_timeout */
{
"repmgrd_standby_startup_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.repmgrd_standby_startup_timeout },
{ .intdefault = DEFAULT_REPMGRD_STANDBY_STARTUP_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* repmgrd_pid_file */
{
"repmgrd_pid_file",
CONFIG_STRING,
{ .strptr = config_file_options.repmgrd_pid_file },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgrd_pid_file) },
{ .postprocess_func = &repmgr_canonicalize_path }
},
/* repmgrd_exit_on_inactive_node */
{
"repmgrd_exit_on_inactive_node",
CONFIG_BOOL,
{ .boolptr = &config_file_options.repmgrd_exit_on_inactive_node },
{ .booldefault = DEFAULT_REPMGRD_EXIT_ON_INACTIVE_NODE },
{},
{},
{}
},
/* standby_disconnect_on_failover */
{
"standby_disconnect_on_failover",
CONFIG_BOOL,
{ .boolptr = &config_file_options.standby_disconnect_on_failover },
{ .booldefault = DEFAULT_STANDBY_DISCONNECT_ON_FAILOVER },
{},
{},
{}
},
/* sibling_nodes_disconnect_timeout */
{
"sibling_nodes_disconnect_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.sibling_nodes_disconnect_timeout },
{ .intdefault = DEFAULT_SIBLING_NODES_DISCONNECT_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* connection_check_type */
{
"connection_check_type",
CONFIG_CONNECTION_CHECK_TYPE,
{ .checktypeptr = &config_file_options.connection_check_type },
{ .checktypedefault = DEFAULT_CONNECTION_CHECK_TYPE },
{},
{},
{}
},
/* primary_visibility_consensus */
{
"primary_visibility_consensus",
CONFIG_BOOL,
{ .boolptr = &config_file_options.primary_visibility_consensus },
{ .booldefault = DEFAULT_PRIMARY_VISIBILITY_CONSENSUS },
{},
{},
{}
},
/* always_promote */
{
"always_promote",
CONFIG_BOOL,
{ .boolptr = &config_file_options.always_promote },
{ .booldefault = DEFAULT_ALWAYS_PROMOTE },
{},
{},
{}
},
/* failover_validation_command */
{
"failover_validation_command",
CONFIG_STRING,
{ .strptr = config_file_options.failover_validation_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.failover_validation_command) },
{}
},
/* election_rerun_interval */
{
"election_rerun_interval",
CONFIG_INT,
{ .intptr = &config_file_options.election_rerun_interval },
{ .intdefault = DEFAULT_ELECTION_RERUN_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* child_nodes_check_interval */
{
"child_nodes_check_interval",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_check_interval },
{ .intdefault = DEFAULT_CHILD_NODES_CHECK_INTERVAL },
{ .intminval = 1 },
{},
{}
},
/* child_nodes_disconnect_min_count */
{
"child_nodes_disconnect_min_count",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_disconnect_min_count },
{ .intdefault = DEFAULT_CHILD_NODES_DISCONNECT_MIN_COUNT },
{ .intminval = -1 },
{},
{}
},
/* child_nodes_connected_min_count */
{
"child_nodes_connected_min_count",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_connected_min_count },
{ .intdefault = DEFAULT_CHILD_NODES_CONNECTED_MIN_COUNT },
{ .intminval = -1 },
{},
{}
},
/* child_nodes_connected_include_witness */
{
"child_nodes_connected_include_witness",
CONFIG_BOOL,
{ .boolptr = &config_file_options.child_nodes_connected_include_witness },
{ .booldefault = DEFAULT_CHILD_NODES_CONNECTED_INCLUDE_WITNESS },
{},
{},
{}
},
/* child_nodes_disconnect_timeout */
{
"child_nodes_disconnect_timeout",
CONFIG_INT,
{ .intptr = &config_file_options.child_nodes_disconnect_timeout },
{ .intdefault = DEFAULT_CHILD_NODES_DISCONNECT_TIMEOUT },
{ .intminval = 0 },
{},
{}
},
/* child_nodes_disconnect_command */
{
"child_nodes_disconnect_command",
CONFIG_STRING,
{ .strptr = config_file_options.child_nodes_disconnect_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.child_nodes_disconnect_command) },
{}
},
/* ================
* service settings
* ================
*/
/* pg_ctl_options */
{
"pg_ctl_options",
CONFIG_STRING,
{ .strptr = config_file_options.pg_ctl_options },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.pg_ctl_options) },
{}
},
/* service_start_command */
{
"service_start_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_start_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_start_command) },
{}
},
/* service_stop_command */
{
"service_stop_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_stop_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_stop_command) },
{}
},
/* service_restart_command */
{
"service_restart_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_restart_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_restart_command) },
{}
},
/* service_reload_command */
{
"service_reload_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_reload_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_reload_command) },
{}
},
/* service_promote_command */
{
"service_promote_command",
CONFIG_STRING,
{ .strptr = config_file_options.service_promote_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.service_promote_command) },
{}
},
/* ========================
* repmgrd service settings
* ========================
*/
/* repmgrd_service_start_command */
{
"repmgrd_service_start_command",
CONFIG_STRING,
{ .strptr = config_file_options.repmgrd_service_start_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgrd_service_start_command) },
{}
},
/* repmgrd_service_stop_command */
{
"repmgrd_service_stop_command",
CONFIG_STRING,
{ .strptr = config_file_options.repmgrd_service_stop_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.repmgrd_service_stop_command) },
{}
},
/* ===========================
* event notification settings
* ===========================
*/
/* event_notification_command */
{
"event_notification_command",
CONFIG_STRING,
{ .strptr = config_file_options.event_notification_command },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.event_notification_command) },
{}
},
/* event_notifications */
{
"event_notifications",
CONFIG_EVENT_NOTIFICATION_LIST,
{ .notificationlistptr = &config_file_options.event_notifications },
{},
{},
{},
{}
},
/* ===============
* barman settings
* ===============
*/
/* barman_host */
{
"barman_host",
CONFIG_STRING,
{ .strptr = config_file_options.barman_host },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.barman_host) },
{}
},
/* barman_server */
{
"barman_server",
CONFIG_STRING,
{ .strptr = config_file_options.barman_server },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.barman_server) },
{}
},
/* barman_config */
{
"barman_config",
CONFIG_STRING,
{ .strptr = config_file_options.barman_config },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.barman_config) },
{}
},
/* ==================
* rsync/ssh settings
* ==================
*/
/* rsync_options */
{
"rsync_options",
CONFIG_STRING,
{ .strptr = config_file_options.rsync_options },
{ .strdefault = "" },
{},
{ .strmaxlen = sizeof(config_file_options.rsync_options) },
{}
},
/* ssh_options */
{
"ssh_options",
CONFIG_STRING,
{ .strptr = config_file_options.ssh_options },
{ .strdefault = DEFAULT_SSH_OPTIONS },
{},
{ .strmaxlen = sizeof(config_file_options.ssh_options) },
{}
},
/* ==================================
* undocumented experimental settings
* ==================================
*/
/* reconnect_loop_sync */
{
"reconnect_loop_sync",
CONFIG_BOOL,
{ .boolptr = &config_file_options.reconnect_loop_sync },
{ .booldefault = false },
{},
{},
{}
},
/* ==========================
* undocumented test settings
* ==========================
*/
/* promote_delay */
{
"promote_delay",
CONFIG_INT,
{ .intptr = &config_file_options.promote_delay },
{ .intdefault = 0 },
{ .intminval = 1 },
{},
{}
},
/* failover_delay */
{
"failover_delay",
CONFIG_INT,
{ .intptr = &config_file_options.failover_delay },
{ .intdefault = 0 },
{ .intminval = 1 },
{},
{}
},
/* connection_check_query */
{
"connection_check_query",
CONFIG_STRING,
{ .strptr = config_file_options.connection_check_query },
{ .strdefault = "SELECT 1" },
{},
{ .strmaxlen = sizeof(config_file_options.connection_check_query) },
{}
},
/* End-of-list marker */
{
NULL, CONFIG_INT, {}, {}, {}, {}, {}
}
};
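The table above ends with an entry whose name is NULL, so a consumer can walk it without knowing its length. The following is a minimal sketch of that pattern, assuming cut-down stand-in types: `DemoSetting`, `demo_apply_defaults`, the `demo_*` variables and the default values 100 and "default" are all hypothetical and are not taken from repmgr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-ins for the types used by the real table */
typedef enum { DEMO_BOOL, DEMO_INT, DEMO_STRING } DemoItemType;

typedef struct DemoSetting
{
	const char *name;
	DemoItemType type;
	union { int *intptr; char *strptr; bool *boolptr; } val;
	union { int intdefault; const char *strdefault; bool booldefault; } defval;
	size_t strmaxlen;	/* destination buffer size, strings only */
} DemoSetting;

/* Illustrative targets and table; names and values are made up */
static int demo_priority;
static bool demo_use_slots;
static char demo_location[16];

static DemoSetting demo_settings[] = {
	{ "priority", DEMO_INT, { .intptr = &demo_priority }, { .intdefault = 100 }, 0 },
	{ "use_replication_slots", DEMO_BOOL, { .boolptr = &demo_use_slots }, { .booldefault = false }, 0 },
	{ "location", DEMO_STRING, { .strptr = demo_location }, { .strdefault = "default" }, sizeof(demo_location) },
	/* End-of-list marker */
	{ NULL, DEMO_INT, { NULL }, { 0 }, 0 }
};

/* Copy each entry's default into its target; returns the number of entries */
static int
demo_apply_defaults(DemoSetting *settings)
{
	int count = 0;

	assert(settings != NULL);

	for (DemoSetting *s = settings; s->name != NULL; s++)
	{
		switch (s->type)
		{
			case DEMO_BOOL:
				*s->val.boolptr = s->defval.booldefault;
				break;
			case DEMO_INT:
				*s->val.intptr = s->defval.intdefault;
				break;
			case DEMO_STRING:
				/* strncpy may not terminate, so terminate explicitly */
				strncpy(s->val.strptr, s->defval.strdefault, s->strmaxlen - 1);
				s->val.strptr[s->strmaxlen - 1] = '\0';
				break;
		}
		count++;
	}
	return count;
}
```

Calling `demo_apply_defaults(demo_settings)` fills the three targets. Keeping the destination pointer, default and size limit in one entry means a new setting can be added in a single place.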


@@ -5,6 +5,8 @@
%{
#include <setjmp.h>
#include <sys/stat.h>
#include <dirent.h>
#include "repmgr.h"
#include "configfile.h"
@@ -38,7 +40,13 @@ static sigjmp_buf *CONF_flex_fatal_jmp;
static char *CONF_scanstr(const char *s);
static int CONF_flex_fatal(const char *msg);
static bool ProcessConfigFile(FILE *fp, const char *config_file, KeyValueList *contents, t_configuration_options *options, ItemList *error_list, ItemList *warning_list);
static bool ProcessConfigFile(const char *base_dir, const char *config_file, const char *calling_file, bool strict, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
static bool ProcessConfigFp(FILE *fp, const char *config_file, const char *calling_file, int depth, const char *base_dir, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
static bool ProcessConfigDirectory(const char *base_dir, const char *includedir, const char *calling_file, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
static char *AbsoluteConfigLocation(const char *base_dir, const char *location, const char *calling_file);
%}
@@ -90,20 +98,91 @@ STRING \'([^'\\\n]|\\.|\'\')*\'
%%
extern bool
ProcessRepmgrConfigFile(FILE *fp, const char *config_file, t_configuration_options *options, ItemList *error_list, ItemList *warning_list)
{
return ProcessConfigFile(fp, config_file, NULL, options, error_list, warning_list);
}
extern bool
ProcessPostgresConfigFile(FILE *fp, const char *config_file, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
ProcessRepmgrConfigFile(const char *config_file, const char *base_dir, ItemList *error_list, ItemList *warning_list)
{
return ProcessConfigFile(fp, config_file, contents, NULL, error_list, warning_list);
return ProcessConfigFile(base_dir, config_file, NULL, true, 0, NULL, error_list, warning_list);
}
extern bool
ProcessPostgresConfigFile(const char *config_file, const char *base_dir, bool strict, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
return ProcessConfigFile(base_dir, config_file, NULL, strict, 0, contents, error_list, warning_list);
}
static bool
ProcessConfigFile(FILE *fp, const char *config_file, KeyValueList *contents, t_configuration_options *options, ItemList *error_list, ItemList *warning_list)
ProcessConfigFile(const char *base_dir, const char *config_file, const char *calling_file, bool strict, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
char *abs_path;
bool success = true;
FILE *fp;
/*
* Reject file name that is all-blank (including empty), as that leads to
* confusion --- we'd try to read the containing directory as a file.
*/
if (strspn(config_file, " \t\r\n") == strlen(config_file))
{
return false;
}
/*
* Reject too-deep include nesting depth. This is just a safety check to
* avoid dumping core due to stack overflow if an include file loops back
* to itself. The maximum nesting depth is pretty arbitrary.
*/
if (depth > 10)
{
item_list_append_format(error_list,
_("could not open configuration file \"%s\": maximum nesting depth exceeded"),
config_file);
return false;
}
abs_path = AbsoluteConfigLocation(base_dir, config_file, calling_file);
/* Reject direct recursion */
if (calling_file && strcmp(abs_path, calling_file) == 0)
{
item_list_append_format(error_list,
_("configuration file recursion in \"%s\""),
calling_file);
free(abs_path);
return false;
}
fp = fopen(abs_path, "r");
if (!fp)
{
if (strict == false)
{
item_list_append_format(error_list,
"skipping configuration file \"%s\"",
abs_path);
}
else
{
item_list_append_format(error_list,
"could not open configuration file \"%s\": %s",
abs_path,
strerror(errno));
success = false;
}
}
else
{
success = ProcessConfigFp(fp, abs_path, calling_file, depth + 1, base_dir, contents, error_list, warning_list);
}
free(abs_path);
return success;
}
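The two guards in the function above (direct self-inclusion and a nesting limit) can be tried in isolation. This is a simplified sketch that assumes an in-memory table in place of real files: `DemoFile`, `demo_process`, `DEMO_MAX_DEPTH` and the sample tables are hypothetical, and each file here includes at most one other file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_DEPTH 10

/* Hypothetical in-memory "file": a name plus the single file it includes */
typedef struct DemoFile
{
	const char *name;
	const char *includes;	/* NULL if the file includes nothing */
} DemoFile;

static const DemoFile demo_chain[] = { { "a.conf", "b.conf" }, { "b.conf", NULL } };
static const DemoFile demo_self[] = { { "a.conf", "a.conf" } };
static const DemoFile demo_loop[] = { { "a.conf", "b.conf" }, { "b.conf", "a.conf" } };

static const DemoFile *
demo_find(const DemoFile *files, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(files[i].name, name) == 0)
			return &files[i];
	return NULL;
}

/*
 * Returns 1 if "name" and everything it includes can be processed, or 0 if
 * a file is missing, includes itself directly, or nesting exceeds the limit.
 */
static int
demo_process(const DemoFile *files, size_t n, const char *name,
			 const char *calling_file, int depth)
{
	const DemoFile *f;

	assert(files != NULL && name != NULL);

	/* Safety net for indirect loops such as a -> b -> a */
	if (depth > DEMO_MAX_DEPTH)
		return 0;

	/* Cheap check for a file that includes itself */
	if (calling_file != NULL && strcmp(name, calling_file) == 0)
		return 0;

	f = demo_find(files, n, name);
	if (f == NULL)
		return 0;
	if (f->includes == NULL)
		return 1;

	return demo_process(files, n, f->includes, name, depth + 1);
}
```

The direct check only catches a file naming itself; the two-file loop in `demo_loop` passes that check on every call and is stopped by the depth limit instead.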
static bool
ProcessConfigFp(FILE *fp, const char *config_file, const char *calling_file, int depth, const char *base_dir, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
volatile bool OK = true;
volatile YY_BUFFER_STATE lex_buffer = NULL;
@@ -179,22 +258,62 @@ ProcessConfigFile(FILE *fp, const char *config_file, KeyValueList *contents, t_c
ConfigFileLineno++;
}
/* OK, process the option name and value */
if (contents != NULL)
/* Handle include files */
if (base_dir != NULL && strcasecmp(opt_name, "include_dir") == 0)
{
key_value_list_replace_or_set(contents,
opt_name,
opt_value);
/*
* An include_dir directive isn't a variable and should be
* processed immediately.
*/
if (!ProcessConfigDirectory(base_dir, opt_value, config_file,
depth + 1, contents,
error_list, warning_list))
OK = false;
yy_switch_to_buffer(lex_buffer);
pfree(opt_name);
pfree(opt_value);
}
else if (base_dir != NULL && strcasecmp(opt_name, "include_if_exists") == 0)
{
if (!ProcessConfigFile(base_dir, opt_value, config_file,
false, depth + 1, contents,
error_list, warning_list))
OK = false;
yy_switch_to_buffer(lex_buffer);
pfree(opt_name);
pfree(opt_value);
}
else if (base_dir != NULL && strcasecmp(opt_name, "include") == 0)
{
if (!ProcessConfigFile(base_dir, opt_value, config_file,
true, depth + 1, contents,
error_list, warning_list))
OK = false;
yy_switch_to_buffer(lex_buffer);
pfree(opt_name);
pfree(opt_value);
}
else
{
/* OK, process the option name and value */
if (contents != NULL)
{
key_value_list_replace_or_set(contents,
opt_name,
opt_value);
}
else
{
parse_configuration_item(error_list,
warning_list,
opt_name,
opt_value);
}
}
if (options != NULL)
{
parse_configuration_item(options,
error_list,
warning_list,
opt_name,
opt_value);
}
/* break out of loop if read EOF, else loop for next line */
if (token == 0)
@@ -253,6 +372,132 @@ cleanup:
return OK;
}
/*
* Read and parse all config files in a subdirectory in alphabetical order
*
* includedir is the absolute or relative path to the subdirectory to scan.
*
* See ProcessConfigFp for further details.
*/
static bool
ProcessConfigDirectory(const char *base_dir, const char *includedir, const char *calling_file, int depth, KeyValueList *contents, ItemList *error_list, ItemList *warning_list)
{
char *directory;
DIR *d;
struct dirent *de;
char **filenames;
int num_filenames;
int size_filenames;
bool status;
/*
* Reject directory name that is all-blank (including empty), as that
* leads to confusion --- we'd read the containing directory, typically
* resulting in recursive inclusion of the same file(s).
*/
if (strspn(includedir, " \t\r\n") == strlen(includedir))
{
item_list_append_format(error_list,
_("empty configuration directory name: \"%s\""),
includedir);
return false;
}
directory = AbsoluteConfigLocation(base_dir, includedir, calling_file);
d = opendir(directory);
if (d == NULL)
{
item_list_append_format(error_list,
_("could not open configuration directory \"%s\": %s"),
directory,
strerror(errno));
status = false;
goto cleanup;
}
/*
* Read the directory and put the filenames in an array, so we can sort
* them prior to processing the contents.
*/
size_filenames = 32;
filenames = (char **) palloc(size_filenames * sizeof(char *));
num_filenames = 0;
while ((de = readdir(d)) != NULL)
{
struct stat st;
char filename[MAXPGPATH];
/*
* Only parse files with names ending in ".conf". Explicitly reject
* files starting with ".". This excludes things like "." and "..",
* as well as typical hidden files, backup files, and editor debris.
*/
if (strlen(de->d_name) < 6)
continue;
if (de->d_name[0] == '.')
continue;
if (strcmp(de->d_name + strlen(de->d_name) - 5, ".conf") != 0)
continue;
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
if (stat(filename, &st) == 0)
{
if (!S_ISDIR(st.st_mode))
{
/* Add file to array, increasing its size in blocks of 32 */
if (num_filenames >= size_filenames)
{
size_filenames += 32;
filenames = (char **) repalloc(filenames,
size_filenames * sizeof(char *));
}
filenames[num_filenames] = pstrdup(filename);
num_filenames++;
}
}
else
{
/*
* stat does not care about permissions, so the most likely reason
* a file can't be accessed now is if it was removed between the
* directory listing and now.
*/
item_list_append_format(error_list,
_("could not stat file \"%s\": %s"),
filename, strerror(errno));
status = false;
goto cleanup;
}
}
if (num_filenames > 0)
{
int i;
qsort(filenames, num_filenames, sizeof(char *), pg_qsort_strcmp);
for (i = 0; i < num_filenames; i++)
{
if (!ProcessConfigFile(base_dir, filenames[i], calling_file,
true, depth, contents,
error_list, warning_list))
{
status = false;
goto cleanup;
}
}
}
status = true;
cleanup:
if (d)
closedir(d);
free(directory);
return status;
}
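The filename filter and ordering used when scanning an include directory can be sketched without touching the filesystem. This is an illustration only: `demo_is_config_filename`, `demo_select_and_sort` and the sample names are hypothetical, and plain `strcmp` stands in for the comparator used in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Sample directory listing; the names are made up for illustration */
static const char *demo_names[] = {
	"20-b.conf", ".swp.conf", "10-a.conf", "notes.txt", "x.conf~", ".conf"
};

/* Accept names ending in ".conf" that do not start with "." */
static bool
demo_is_config_filename(const char *name)
{
	size_t len;

	assert(name != NULL);
	len = strlen(name);

	if (len < 6)		/* need at least one character before ".conf" */
		return false;
	if (name[0] == '.')
		return false;
	return strcmp(name + len - 5, ".conf") == 0;
}

static int
demo_cmp_names(const void *a, const void *b)
{
	return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/* Keep accepted names (compacting in place), sort them, return the count */
static size_t
demo_select_and_sort(const char **names, size_t n)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++)
		if (demo_is_config_filename(names[i]))
			names[kept++] = names[i];

	qsort(names, kept, sizeof(names[0]), demo_cmp_names);
	return kept;
}
```

Sorting before processing makes the result independent of the order in which the directory happens to return its entries, which is why numeric prefixes such as "10-" and "20-" are a common way to control precedence.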
/*
* scanstr
@@ -348,6 +593,39 @@ CONF_scanstr(const char *s)
return newStr;
}
/*
* Given a configuration file or directory location that may be a relative
* path, return an absolute one. We consider the location to be relative to
* the directory holding the calling file, or to base_dir if there is no
* calling file. If neither is available, the location is returned unchanged.
*/
static char *
AbsoluteConfigLocation(const char *base_dir, const char *location, const char *calling_file)
{
char abs_path[MAXPGPATH];
if (is_absolute_path(location))
return strdup(location);
if (calling_file != NULL)
{
strlcpy(abs_path, calling_file, sizeof(abs_path));
get_parent_directory(abs_path);
join_path_components(abs_path, abs_path, location);
canonicalize_path(abs_path);
}
else if (base_dir != NULL)
{
join_path_components(abs_path, base_dir, location);
canonicalize_path(abs_path);
}
else
{
strlcpy(abs_path, location, sizeof(abs_path));
}
return strdup(abs_path);
}
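The resolution order in the function above (absolute path, then the calling file's directory, then the base directory) can be reproduced with standard C only. This is a rough sketch assuming POSIX-style paths and no "." or ".." normalisation: `demo_resolve`, `demo_resolve_static` and the example paths are hypothetical, and `snprintf` replaces the PostgreSQL path helpers used in the diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Resolve "location" against the directory of "calling_file" if given,
 * else against "base_dir" if given; absolute locations are copied as-is.
 * Returns 0 on success, -1 if the result does not fit in "out".
 */
static int
demo_resolve(const char *base_dir, const char *location,
			 const char *calling_file, char *out, size_t outlen)
{
	int n;

	assert(location != NULL && out != NULL);

	if (location[0] == '/')
		n = snprintf(out, outlen, "%s", location);
	else if (calling_file != NULL)
	{
		const char *slash = strrchr(calling_file, '/');
		/* length of the directory part, including the trailing slash */
		int dirlen = (slash != NULL) ? (int) (slash - calling_file) + 1 : 0;

		n = snprintf(out, outlen, "%.*s%s", dirlen, calling_file, location);
	}
	else if (base_dir != NULL)
		n = snprintf(out, outlen, "%s/%s", base_dir, location);
	else
		n = snprintf(out, outlen, "%s", location);

	return (n >= 0 && (size_t) n < outlen) ? 0 : -1;
}

/* Convenience wrapper using a static buffer; not reentrant */
static const char *
demo_resolve_static(const char *base_dir, const char *location,
					const char *calling_file)
{
	static char buf[1024];

	return demo_resolve(base_dir, location, calling_file, buf, sizeof(buf)) == 0
		? buf : NULL;
}
```

Resolving relative to the calling file first means an include directive keeps working when a whole configuration directory is moved, since the paths inside it stay relative to each other.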
/*
* Flex fatal errors bring us here. Stash the error message and jump back to


@@ -1,7 +1,7 @@
/*
* configfile.h
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
*
* This program is free software: you can redistribute it and/or modify
@@ -29,7 +29,7 @@
#define TARGET_TIMELINE_LATEST 0
/*
* This is defined src/include/utils.h, however it's not practical
* This is defined in src/include/utils.h, however it's not practical
* to include that from a frontend application.
*/
#define PG_AUTOCONF_FILENAME "postgresql.auto.conf"
@@ -50,6 +50,11 @@ typedef enum
CHECK_CONNECTION
} ConnectionCheckType;
typedef enum
{
REPLICATION_TYPE_PHYSICAL
} ReplicationType;
typedef struct EventNotificationListCell
{
struct EventNotificationListCell *next;
@@ -78,6 +83,58 @@ typedef struct TablespaceList
} TablespaceList;
typedef enum
{
CONFIG_BOOL,
CONFIG_INT,
CONFIG_STRING,
CONFIG_FAILOVER_MODE,
CONFIG_CONNECTION_CHECK_TYPE,
CONFIG_EVENT_NOTIFICATION_LIST,
CONFIG_TABLESPACE_MAPPING,
CONFIG_REPLICATION_TYPE
} ConfigItemType;
typedef struct ConfigFileSetting
{
const char *name;
ConfigItemType type;
union
{
int *intptr;
char *strptr;
bool *boolptr;
failover_mode_opt *failovermodeptr;
ConnectionCheckType *checktypeptr;
EventNotificationList *notificationlistptr;
TablespaceList *tablespacemappingptr;
ReplicationType *replicationtypeptr;
} val;
union {
int intdefault;
const char *strdefault;
bool booldefault;
failover_mode_opt failovermodedefault;
ConnectionCheckType checktypedefault;
ReplicationType replicationtypedefault;
} defval;
union {
int intminval;
} minval;
union {
int strmaxlen;
} maxval;
struct {
void (*process_func)(const char *, const char *, char *, ItemList *errors);
void (*postprocess_func)(const char *, const char *, char *, ItemList *errors);
bool *providedptr;
} process;
} ConfigFileSetting;
/* Declare the main configfile structure for client applications */
extern ConfigFileSetting config_file_settings[];
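The `intminval` member suggests that integer settings are checked against a lower bound when parsed. How repmgr performs that check is not shown in this diff, so the following is only a sketch of one way to do it with `strtol`; `demo_parse_int_min` and `demo_result` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Scratch target for the examples; hypothetical */
static int demo_result;

/*
 * Parse "value" as a base-10 integer and require it to be >= minval.
 * On success store the number in *result and return true; on any error
 * (empty input, trailing characters, overflow, below minimum) return
 * false and leave *result untouched.
 */
static bool
demo_parse_int_min(const char *value, int minval, int *result)
{
	char *end = NULL;
	long parsed;

	assert(value != NULL && result != NULL);

	errno = 0;
	parsed = strtol(value, &end, 10);

	/* no digits consumed, or junk after the number */
	if (end == value || *end != '\0')
		return false;
	/* out of range for long or for int */
	if (errno == ERANGE || parsed > INT_MAX || parsed < INT_MIN)
		return false;
	if (parsed < minval)
		return false;

	*result = (int) parsed;
	return true;
}
```

With a minimum of -1, as used by several settings in the table, `demo_parse_int_min("-1", -1, &demo_result)` succeeds, while a minimum of 1 rejects "0".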
typedef struct
{
/* node information */
@@ -89,7 +146,7 @@ typedef struct
char config_directory[MAXPGPATH];
char pg_bindir[MAXPGPATH];
char repmgr_bindir[MAXPGPATH];
int replication_type;
ReplicationType replication_type;
/* log settings */
char log_level[MAXLEN];
@@ -115,6 +172,7 @@ typedef struct
/* standby follow settings */
int primary_follow_timeout;
int standby_follow_timeout;
bool standby_follow_restart;
/* standby switchover settings */
int shutdown_check_timeout;
@@ -148,10 +206,12 @@ typedef struct
int primary_notification_timeout;
int repmgrd_standby_startup_timeout;
char repmgrd_pid_file[MAXPGPATH];
bool repmgrd_exit_on_inactive_node;
bool standby_disconnect_on_failover;
int sibling_nodes_disconnect_timeout;
ConnectionCheckType connection_check_type;
bool primary_visibility_consensus;
bool always_promote;
char failover_validation_command[MAXPGPATH];
int election_rerun_interval;
int child_nodes_check_interval;
@@ -187,77 +247,35 @@ typedef struct
char rsync_options[MAXLEN];
char ssh_options[MAXLEN];
/* undocumented test settings */
/*
* undocumented settings
*
* These settings are for testing or experimental features
* and may be changed without notice.
*/
/* experimental settings */
bool reconnect_loop_sync;
/* test settings */
int promote_delay;
int failover_delay;
char connection_check_query[MAXLEN];
} t_configuration_options;
/*
* The following will initialize the structure with a minimal set of options;
* actual defaults are set in parse_config() before parsing the configuration file
*/
#define T_CONFIGURATION_OPTIONS_INITIALIZER { \
/* node information */ \
UNKNOWN_NODE_ID, "", "", "", "", "", "", "", REPLICATION_TYPE_PHYSICAL, \
/* log settings */ \
"", "", "", DEFAULT_LOG_STATUS_INTERVAL, \
/* standby clone settings */ \
false, "", "", { NULL, NULL }, "", false, "", false, "", \
/* standby promote settings */ \
DEFAULT_PROMOTE_CHECK_TIMEOUT, DEFAULT_PROMOTE_CHECK_INTERVAL, \
/* standby follow settings */ \
DEFAULT_PRIMARY_FOLLOW_TIMEOUT, \
DEFAULT_STANDBY_FOLLOW_TIMEOUT, \
/* standby switchover settings */ \
DEFAULT_SHUTDOWN_CHECK_TIMEOUT, \
DEFAULT_STANDBY_RECONNECT_TIMEOUT, \
DEFAULT_WAL_RECEIVE_CHECK_TIMEOUT, \
/* node rejoin settings */ \
DEFAULT_NODE_REJOIN_TIMEOUT, \
/* node check settings */ \
DEFAULT_ARCHIVE_READY_WARNING, DEFAULT_ARCHIVE_READY_CRITICAL, \
DEFAULT_REPLICATION_LAG_WARNING, DEFAULT_REPLICATION_LAG_CRITICAL, \
/* witness settings */ \
DEFAULT_WITNESS_SYNC_INTERVAL, \
/* repmgrd settings */ \
FAILOVER_MANUAL, DEFAULT_LOCATION, DEFAULT_PRIORITY, "", "", \
DEFAULT_MONITORING_INTERVAL, \
DEFAULT_RECONNECTION_ATTEMPTS, \
DEFAULT_RECONNECTION_INTERVAL, \
false, -1, \
DEFAULT_ASYNC_QUERY_TIMEOUT, \
DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT, \
-1, "", false, DEFAULT_SIBLING_NODES_DISCONNECT_TIMEOUT, \
CHECK_PING, true, "", DEFAULT_ELECTION_RERUN_INTERVAL, \
DEFAULT_CHILD_NODES_CHECK_INTERVAL, \
DEFAULT_CHILD_NODES_DISCONNECT_MIN_COUNT, \
DEFAULT_CHILD_NODES_CONNECTED_MIN_COUNT, \
DEFAULT_CHILD_NODES_CONNECTED_INCLUDE_WITNESS, \
DEFAULT_CHILD_NODES_DISCONNECT_TIMEOUT, "", \
/* service settings */ \
"", "", "", "", "", "", \
/* repmgrd service settings */ \
"", "", \
/* event notification settings */ \
"", "", { NULL, NULL }, \
/* barman settings */ \
"", "", "", \
/* rsync/ssh settings */ \
"", "", \
/* undocumented test settings */ \
0 \
}
/* Declare the main configfile structure for client applications */
extern t_configuration_options config_file_options;
typedef struct
{
char slot[MAXLEN];
char wal_method[MAXLEN];
char waldir[MAXPGPATH];
bool no_slot; /* from PostgreSQL 10 */
} t_basebackup_options;
#define T_BASEBACKUP_OPTIONS_INITIALIZER { "", "", false }
#define T_BASEBACKUP_OPTIONS_INITIALIZER { "", "", "", false }
typedef enum
@@ -314,10 +332,11 @@ typedef struct
void set_progname(const char *argv0);
const char *progname(void);
void load_config(const char *config_file, bool verbose, bool terse, t_configuration_options *options, char *argv0);
bool reload_config(t_configuration_options *orig_options, t_server_type server_type);
void load_config(const char *config_file, bool verbose, bool terse, char *argv0);
bool reload_config(t_server_type server_type);
void dump_config(void);
void parse_configuration_item(t_configuration_options *options, ItemList *error_list, ItemList *warning_list, const char *name, const char *value);
void parse_configuration_item(ItemList *error_list, ItemList *warning_list, const char *name, const char *value);
bool parse_recovery_conf(const char *data_dir, t_recovery_conf *conf);
@@ -330,6 +349,9 @@ int repmgr_atoi(const char *s,
ItemList *error_list,
int minval);
void parse_time_unit_parameter(const char *name, const char *value, char *dest, ItemList *errors);
void repmgr_canonicalize_path(const char *name, const char *value, char *config_item, ItemList *errors);
bool parse_pg_basebackup_options(const char *pg_basebackup_options,
t_basebackup_options *backup_options,
int server_version_num,
@@ -337,17 +359,21 @@ bool parse_pg_basebackup_options(const char *pg_basebackup_options,
int parse_output_to_argv(const char *string, char ***argv_array);
void free_parsed_argv(char ***argv_array);
const char *format_failover_mode(failover_mode_opt failover);
/* called by repmgr-client and repmgrd */
void exit_with_cli_errors(ItemList *error_list, const char *repmgr_command);
void print_item_list(ItemList *item_list);
const char *print_replication_type(ReplicationType type);
const char *print_connection_check_type(ConnectionCheckType type);
char *print_event_notification_list(EventNotificationList *list);
char *print_tablespace_mapping(TablespaceList *tablespacemappingptr);
extern bool modify_auto_conf(const char *data_dir, KeyValueList *items);
extern bool ProcessRepmgrConfigFile(FILE *fp, const char *config_file, t_configuration_options *options, ItemList *error_list, ItemList *warning_list);
extern bool ProcessRepmgrConfigFile(const char *config_file, const char *base_dir, ItemList *error_list, ItemList *warning_list);
extern bool ProcessPostgresConfigFile(FILE *fp, const char *config_file, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
extern bool ProcessPostgresConfigFile(const char *config_file, const char *base_dir, bool strict, KeyValueList *contents, ItemList *error_list, ItemList *warning_list);
#endif /* _REPMGR_CONFIGFILE_H_ */

30
configure vendored

@@ -1,6 +1,6 @@
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.69 for repmgr 5.1.0.
# Generated by GNU Autoconf 2.69 for repmgr 5.3.0.
#
# Report bugs to <repmgr@googlegroups.com>.
#
@@ -11,7 +11,7 @@
# This configure script is free software; the Free Software Foundation
# gives unlimited permission to copy, distribute and modify it.
#
# Copyright (c) 2010-2020, 2ndQuadrant Ltd.
# Copyright (c) 2010-2021, EnterpriseDB Corporation
## -------------------- ##
## M4sh Initialization. ##
## -------------------- ##
@@ -582,8 +582,8 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='repmgr'
PACKAGE_TARNAME='repmgr'
PACKAGE_VERSION='5.1.0'
PACKAGE_STRING='repmgr 5.1.0'
PACKAGE_VERSION='5.3.0'
PACKAGE_STRING='repmgr 5.3.0'
PACKAGE_BUGREPORT='repmgr@googlegroups.com'
PACKAGE_URL='https://repmgr.org/'
@@ -1181,7 +1181,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures repmgr 5.1.0 to adapt to many kinds of systems.
\`configure' configures repmgr 5.3.0 to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
@@ -1242,7 +1242,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of repmgr 5.1.0:";;
short | recursive ) echo "Configuration of repmgr 5.3.0:";;
esac
cat <<\_ACEOF
@@ -1316,14 +1316,14 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
repmgr configure 5.1.0
repmgr configure 5.3.0
generated by GNU Autoconf 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
This configure script is free software; the Free Software Foundation
gives unlimited permission to copy, distribute and modify it.
Copyright (c) 2010-2020, 2ndQuadrant Ltd.
Copyright (c) 2010-2021, EnterpriseDB Corporation
_ACEOF
exit
fi
@@ -1335,7 +1335,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by repmgr $as_me 5.1.0, which was
It was created by repmgr $as_me 5.3.0, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@
@@ -1811,11 +1811,11 @@ fi
pgac_pg_config_version=$($PG_CONFIG --version 2>/dev/null)
major_version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^PostgreSQL \([0-9]\{1,2\}\).*$/\1/')
$SED -e 's/^[^0-9]\+ \([0-9]\{1,2\}\).*$/\1/')
if test "$major_version_num" -lt '10'; then
version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^PostgreSQL \([0-9]*\)\.\([0-9]*\)\([a-zA-Z0-9.]*\)$/\1.\2/')
$SED -e 's/^[^0-9]\+ \([0-9]*\)\.\([0-9]*\)\([a-zA-Z0-9.]*\)$/\1.\2/')
if test -z "$version_num"; then
as_fn_error $? "could not detect the PostgreSQL version, wrong or broken pg_config?" "$LINENO" 5
@@ -1824,12 +1824,12 @@ if test "$major_version_num" -lt '10'; then
version_num_int=$(echo "$version_num"|
$SED -e 's/^\([0-9]*\)\.\([0-9]*\)$/\1\2/')
if test "$version_num_int" -lt '93'; then
if test "$version_num_int" -lt '94'; then
as_fn_error $? "repmgr is not compatible with detected PostgreSQL version: $version_num" "$LINENO" 5
fi
else
version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^PostgreSQL \(.\+\)$/\1/')
$SED -e 's/^[^0-9]\+ \(.\+\)$/\1/')
if test -z "$version_num"; then
as_fn_error $? "could not detect the PostgreSQL version, wrong or broken pg_config?" "$LINENO" 5
@@ -2487,7 +2487,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by repmgr $as_me 5.1.0, which was
This file was extended by repmgr $as_me 5.3.0, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
@@ -2550,7 +2550,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\
repmgr config.status 5.1.0
repmgr config.status 5.3.0
configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\"

View File

@@ -1,6 +1,6 @@
AC_INIT([repmgr], [5.1.0], [repmgr@googlegroups.com], [repmgr], [https://repmgr.org/])
AC_INIT([repmgr], [5.3.1], [repmgr@googlegroups.com], [repmgr], [https://repmgr.org/])
AC_COPYRIGHT([Copyright (c) 2010-2020, 2ndQuadrant Ltd.])
AC_COPYRIGHT([Copyright (c) 2010-2021, EnterpriseDB Corporation])
AC_CONFIG_HEADER(config.h)
@@ -19,11 +19,11 @@ fi
pgac_pg_config_version=$($PG_CONFIG --version 2>/dev/null)
major_version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^PostgreSQL \([[0-9]]\{1,2\}\).*$/\1/')
$SED -e 's/^[[^0-9]]\+ \([[0-9]]\{1,2\}\).*$/\1/')
if test "$major_version_num" -lt '10'; then
version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^PostgreSQL \([[0-9]]*\)\.\([[0-9]]*\)\([[a-zA-Z0-9.]]*\)$/\1.\2/')
$SED -e 's/^[[^0-9]]\+ \([[0-9]]*\)\.\([[0-9]]*\)\([[a-zA-Z0-9.]]*\)$/\1.\2/')
if test -z "$version_num"; then
AC_MSG_ERROR([could not detect the PostgreSQL version, wrong or broken pg_config?])
@@ -32,12 +32,12 @@ if test "$major_version_num" -lt '10'; then
version_num_int=$(echo "$version_num"|
$SED -e 's/^\([[0-9]]*\)\.\([[0-9]]*\)$/\1\2/')
if test "$version_num_int" -lt '93'; then
if test "$version_num_int" -lt '94'; then
AC_MSG_ERROR([repmgr is not compatible with detected PostgreSQL version: $version_num])
fi
else
version_num=$(echo "$pgac_pg_config_version"|
$SED -e 's/^PostgreSQL \(.\+\)$/\1/')
$SED -e 's/^[[^0-9]]\+ \(.\+\)$/\1/')
if test -z "$version_num"; then
AC_MSG_ERROR([could not detect the PostgreSQL version, wrong or broken pg_config?])
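
The version-detection change above swaps the literal "PostgreSQL" prefix for "[^0-9]\+ ", so any product name followed by a space and a version number is accepted. As a rough illustration only, the same matching rule for the major version can be written in C as below. The function name demo_major_version is hypothetical and not part of repmgr, and returning -1 for non-matching input is an assumption of this sketch (sed would pass such a line through unchanged).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of repmgr. Mirrors the expression
 * 's/^[^0-9]\+ \([0-9]\{1,2\}\).*$/\1/' for the major version number.
 * Returns -1 where the expression would not match.
 */
int
demo_major_version(const char *version_output)
{
    size_t i = 0;
    int digits = 0;
    int major = 0;

    assert(version_output != NULL);

    /* skip the leading run of non-digit characters (the product name) */
    while (version_output[i] != '\0' &&
           !isdigit((unsigned char) version_output[i]))
        i++;

    /* that run must be at least two characters long and end in a space */
    if (i < 2 || version_output[i - 1] != ' ')
        return -1;

    /* read at most two digits, as \{1,2\} does */
    while (digits < 2 && isdigit((unsigned char) version_output[i]))
    {
        major = major * 10 + (version_output[i] - '0');
        i++;
        digits++;
    }

    return digits > 0 ? major : -1;
}
```

Called with the output of "pg_config --version", this yields the value that configure compares against 10 to choose between the pre-10 and post-10 parsing branches.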

controldata.c

@@ -2,11 +2,11 @@
* controldata.c - functions for reading the pg_control file
*
* The functions provided here enable repmgr to read a pg_control file
* in a version-indepent way, even if the PostgreSQL instance is not
* in a version-independent way, even if the PostgreSQL instance is not
* running. For that reason we can't rely on the pg_control_*() functions
* provided in PostgreSQL 9.6 and later.
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -90,7 +90,9 @@ get_system_identifier(const char *data_directory)
uint64 system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
control_file_info = get_controlfile(data_directory);
system_identifier = control_file_info->system_identifier;
if (control_file_info->control_file_processed == true)
system_identifier = control_file_info->system_identifier;
pfree(control_file_info);
@@ -98,19 +100,21 @@ get_system_identifier(const char *data_directory)
}
DBState
get_db_state(const char *data_directory)
bool
get_db_state(const char *data_directory, DBState *state)
{
ControlFileInfo *control_file_info = NULL;
DBState state;
bool control_file_processed;
control_file_info = get_controlfile(data_directory);
control_file_processed = control_file_info->control_file_processed;
state = control_file_info->state;
if (control_file_processed == true)
*state = control_file_info->state;
pfree(control_file_info);
return state;
return control_file_processed;
}
@@ -122,7 +126,8 @@ get_latest_checkpoint_location(const char *data_directory)
control_file_info = get_controlfile(data_directory);
checkPoint = control_file_info->checkPoint;
if (control_file_info->control_file_processed == true)
checkPoint = control_file_info->checkPoint;
pfree(control_file_info);
@@ -134,11 +139,12 @@ int
get_data_checksum_version(const char *data_directory)
{
ControlFileInfo *control_file_info = NULL;
int data_checksum_version = -1;
int data_checksum_version = UNKNOWN_DATA_CHECKSUM_VERSION;
control_file_info = get_controlfile(data_directory);
data_checksum_version = (int) control_file_info->data_checksum_version;
if (control_file_info->control_file_processed == true)
data_checksum_version = (int) control_file_info->data_checksum_version;
pfree(control_file_info);
@@ -277,8 +283,19 @@ get_controlfile(const char *DataDir)
return control_file_info;
}
if (version_num >= 90500)
if (version_num >= 120000)
{
#if PG_ACTUAL_VERSION_NUM >= 120000
expected_size = sizeof(ControlFileData12);
ControlFileDataPtr = palloc0(expected_size);
#endif
}
else if (version_num >= 110000)
{
expected_size = sizeof(ControlFileData11);
ControlFileDataPtr = palloc0(expected_size);
}
else if (version_num >= 90500)
{
expected_size = sizeof(ControlFileData95);
ControlFileDataPtr = palloc0(expected_size);
@@ -288,12 +305,6 @@ get_controlfile(const char *DataDir)
expected_size = sizeof(ControlFileData94);
ControlFileDataPtr = palloc0(expected_size);
}
else if (version_num >= 90300)
{
expected_size = sizeof(ControlFileData93);
ControlFileDataPtr = palloc0(expected_size);
}
if (read(fd, ControlFileDataPtr, expected_size) != expected_size)
{
@@ -322,7 +333,7 @@ get_controlfile(const char *DataDir)
control_file_info->minRecoveryPointTLI = ptr->minRecoveryPointTLI;
control_file_info->minRecoveryPoint = ptr->minRecoveryPoint;
#else
fprintf(stderr, "ERROR: please use a repmgr version built for PostgreSQL 12\n");
fprintf(stderr, "ERROR: please use a repmgr version built for PostgreSQL 12 or later\n");
exit(ERR_BAD_CONFIG);
#endif
}
@@ -359,17 +370,6 @@ get_controlfile(const char *DataDir)
control_file_info->minRecoveryPointTLI = ptr->minRecoveryPointTLI;
control_file_info->minRecoveryPoint = ptr->minRecoveryPoint;
}
else if (version_num >= 90300)
{
ControlFileData93 *ptr = (struct ControlFileData93 *)ControlFileDataPtr;
control_file_info->system_identifier = ptr->system_identifier;
control_file_info->state = ptr->state;
control_file_info->checkPoint = ptr->checkPoint;
control_file_info->data_checksum_version = ptr->data_checksum_version;
control_file_info->timeline = ptr->checkPointCopy.ThisTimeLineID;
control_file_info->minRecoveryPointTLI = ptr->minRecoveryPointTLI;
control_file_info->minRecoveryPoint = ptr->minRecoveryPoint;
}
pfree(ControlFileDataPtr);
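
The get_db_state() hunk above changes the function from returning the state directly to returning a success flag and writing the state through a pointer, so a caller can tell "control file could not be read" apart from a real state value. A minimal sketch of that pattern follows; DemoDBState, DemoControlFileInfo and demo_get_db_state are hypothetical stand-ins rather than repmgr's own types, and the sketch skips reading any file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for repmgr's DBState and ControlFileInfo */
typedef enum
{
    DEMO_DB_SHUTDOWNED,
    DEMO_DB_IN_PRODUCTION
} DemoDBState;

typedef struct
{
    bool control_file_processed;
    DemoDBState state;
} DemoControlFileInfo;

/*
 * Returns true and fills in *state only if the control file was
 * processed; otherwise returns false and leaves *state untouched.
 */
bool
demo_get_db_state(const DemoControlFileInfo *info, DemoDBState *state)
{
    assert(info != NULL && state != NULL);

    if (info->control_file_processed)
        *state = info->state;

    return info->control_file_processed;
}
```

A caller checks the return value first and only then reads the state, which is the calling convention the new prototype in controldata.h implies.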

controldata.h

@@ -1,6 +1,6 @@
/*
* controldata.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -31,9 +31,7 @@ typedef struct
} ControlFileInfo;
/* Same for 9.3, 9.4 */
typedef struct CheckPoint93
typedef struct CheckPoint94
{
XLogRecPtr redo; /* next RecPtr available when we began to
* create CheckPoint (i.e. REDO start point) */
@@ -53,7 +51,7 @@ typedef struct CheckPoint93
pg_time_t time; /* time stamp of checkpoint */
TransactionId oldestActiveXid;
} CheckPoint93;
} CheckPoint94;
/* Same for 9.5, 9.6, 10, 11 */
@@ -128,65 +126,6 @@ typedef struct CheckPoint12
} CheckPoint12;
#endif
typedef struct ControlFileData93
{
uint64 system_identifier;
uint32 pg_control_version; /* PG_CONTROL_VERSION */
uint32 catalog_version_no; /* see catversion.h */
DBState state; /* see enum above */
pg_time_t time; /* time stamp of last pg_control update */
XLogRecPtr checkPoint; /* last check point record ptr */
XLogRecPtr prevCheckPoint; /* previous check point record ptr */
CheckPoint93 checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
XLogRecPtr minRecoveryPoint;
TimeLineID minRecoveryPointTLI;
XLogRecPtr backupStartPoint;
XLogRecPtr backupEndPoint;
bool backupEndRequired;
int wal_level;
int MaxConnections;
int max_prepared_xacts;
int max_locks_per_xact;
uint32 maxAlign; /* alignment requirement for tuples */
double floatFormat; /* constant 1234567.0 */
uint32 blcksz; /* data block size for this DB */
uint32 relseg_size; /* blocks per segment of large relation */
uint32 xlog_blcksz; /* block size within WAL files */
uint32 xlog_seg_size; /* size of each WAL segment */
uint32 nameDataLen; /* catalog name field width */
uint32 indexMaxKeys; /* max number of columns in an index */
uint32 toast_max_chunk_size; /* chunk size in TOAST tables */
/* flag indicating internal format of timestamp, interval, time */
bool enableIntTimes; /* int64 storage enabled? */
/* flags indicating pass-by-value status of various types */
bool float4ByVal; /* float4 pass-by-value? */
bool float8ByVal; /* float8, int8, etc pass-by-value? */
/* Are data pages protected by checksums? Zero if no checksum version */
uint32 data_checksum_version;
} ControlFileData93;
/*
* Following field added since 9.3:
*
* int max_worker_processes;
*/
typedef struct ControlFileData94
{
@@ -200,7 +139,7 @@ typedef struct ControlFileData94
XLogRecPtr checkPoint; /* last check point record ptr */
XLogRecPtr prevCheckPoint; /* previous check point record ptr */
CheckPoint93 checkPointCopy; /* copy of last check point record */
CheckPoint94 checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
@@ -438,7 +377,7 @@ typedef struct ControlFileData12
#endif
extern int get_pg_version(const char *data_directory, char *version_string);
extern DBState get_db_state(const char *data_directory);
extern bool get_db_state(const char *data_directory, DBState *state);
extern const char *describe_db_state(DBState state);
extern int get_data_checksum_version(const char *data_directory);
extern uint64 get_system_identifier(const char *data_directory);

347
dbutils.c

@@ -1,7 +1,7 @@
/*
* dbutils.c - Database connection/management functions
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
*
* This program is free software: you can redistribute it and/or modify
@@ -153,6 +153,9 @@ _establish_db_connection(const char *conninfo, const bool exit_on_error, const b
if (param_get(&conninfo_params, "replication") != NULL)
is_replication_connection = true;
/* use a secure search_path */
param_set(&conninfo_params, "options", "-csearch_path=");
connection_string = param_list_to_string(&conninfo_params);
log_debug(_("connecting to: \"%s\""), connection_string);
@@ -303,6 +306,9 @@ establish_db_connection_by_params(t_conninfo_param_list *param_list,
param_set_ine(param_list, "connect_timeout", "2");
param_set_ine(param_list, "fallback_application_name", "repmgr");
/* use a secure search_path */
param_set(param_list, "options", "-csearch_path=");
/* Connect to the database using the provided parameters */
conn = PQconnectdbParams((const char **) param_list->keywords, (const char **) param_list->values, true);
@@ -705,10 +711,37 @@ param_get(t_conninfo_param_list *param_list, const char *param)
}
/*
* Validate a conninfo string by attempting to parse it.
*
* "errmsg": passed to PQconninfoParse(), may be NULL
*
* NOTE: PQconninfoParse() verifies the string format and checks for
* valid options but does not sanity check values.
*/
bool
validate_conninfo_string(const char *conninfo_str, char **errmsg)
{
PQconninfoOption *connOptions = NULL;
connOptions = PQconninfoParse(conninfo_str, errmsg);
if (connOptions == NULL)
return false;
PQconninfoFree(connOptions);
return true;
}
/*
* Parse a conninfo string into a t_conninfo_param_list
*
* See conn_to_param_list() to do the same for a PGconn
* See conn_to_param_list() to do the same for a PGconn.
*
* "errmsg": passed to PQconninfoParse(), may be NULL
*
* "ignore_local_params": ignores those parameters specific
* to a local installation, i.e. when parsing an upstream
@@ -729,8 +762,7 @@ parse_conninfo_string(const char *conninfo_str, t_conninfo_param_list *param_lis
for (option = connOptions; option && option->keyword; option++)
{
/* Ignore non-set or blank parameter values */
if ((option->val == NULL) ||
(option->val != NULL && option->val[0] == '\0'))
if (option->val == NULL || option->val[0] == '\0')
continue;
/* Ignore settings specific to the upstream node */
@@ -775,8 +807,7 @@ conn_to_param_list(PGconn *conn, t_conninfo_param_list *param_list)
for (option = connOptions; option && option->keyword; option++)
{
/* Ignore non-set or blank parameter values */
if ((option->val == NULL) ||
(option->val != NULL && option->val[0] == '\0'))
if (option->val == NULL || option->val[0] == '\0')
continue;
/* Ignore "password" */
@@ -1226,7 +1257,7 @@ bool
pg_reload_conf(PGconn *conn)
{
PGresult *res = NULL;
bool success = false;
bool success = true;
res = PQexec(conn, "SELECT pg_catalog.pg_reload_conf()");
@@ -1643,7 +1674,12 @@ identify_system(PGconn *repl_conn, t_system_identification *identification)
return false;
}
#if defined(__i386__) || defined(__i386)
identification->system_identifier = atoll(PQgetvalue(res, 0, 0));
#else
identification->system_identifier = atol(PQgetvalue(res, 0, 0));
#endif
identification->timeline = atoi(PQgetvalue(res, 0, 1));
identification->xlogpos = parse_lsn(PQgetvalue(res, 0, 2));
@@ -1680,7 +1716,11 @@ system_identifier(PGconn *conn)
}
else
{
#if defined(__i386__) || defined(__i386)
system_identifier = atoll(PQgetvalue(res, 0, 0));
#else
system_identifier = atol(PQgetvalue(res, 0, 0));
#endif
}
PQclear(res);
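
The two hunks above pick atoll() on i386, where long is only 32 bits wide and atol() cannot hold a 64-bit system identifier. One possible alternative, which is not what repmgr does, is to parse the value with strtoull() on every platform and report conversion failures explicitly. The sketch below assumes a hypothetical helper name, demo_parse_system_identifier.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of repmgr: parse a decimal system
 * identifier into a 64-bit value regardless of the width of "long".
 */
bool
demo_parse_system_identifier(const char *text, uint64_t *identifier)
{
    char *end = NULL;
    unsigned long long value;

    assert(text != NULL && identifier != NULL);

    errno = 0;
    value = strtoull(text, &end, 10);

    /* reject empty input, trailing garbage and out-of-range values */
    if (end == text || *end != '\0' || errno == ERANGE)
        return false;

    *identifier = (uint64_t) value;
    return true;
}
```

Unlike atol()/atoll(), this variant lets the caller distinguish a malformed string from a valid identifier.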
@@ -1799,7 +1839,7 @@ can_execute_pg_promote(PGconn *conn)
bool has_pg_promote = false;
/* pg_promote() available from PostgreSQL 12 */
if(PQserverVersion(conn) < 120000)
if (PQserverVersion(conn) < 120000)
return false;
initPQExpBuffer(&query);
@@ -1827,48 +1867,59 @@ can_execute_pg_promote(PGconn *conn)
}
/*
* Determine if the user associated with the current connection is
* a member of the "pg_monitor" default role, or optionally one
* of its three constituent "subroles".
*/
bool
connection_has_pg_settings(PGconn *conn)
connection_has_pg_monitor_role(PGconn *conn, const char *subrole)
{
bool has_pg_settings = false;
PQExpBufferData query;
PGresult *res;
bool has_pg_monitor_role = false;
/* superusers can always read pg_settings */
/* superusers can read anything, no role check needed */
if (is_superuser_connection(conn, NULL) == true)
return true;
/* pg_monitor and associated "subroles" introduced in PostgreSQL 10 */
if (PQserverVersion(conn) < 100000)
return false;
initPQExpBuffer(&query);
appendPQExpBufferStr(&query,
" SELECT CASE "
" WHEN pg_catalog.pg_has_role('pg_monitor','MEMBER') "
" THEN TRUE ");
if (subrole != NULL)
{
has_pg_settings = true;
}
/* from PostgreSQL 10, a non-superuser may have been granted access */
else if(PQserverVersion(conn) >= 100000)
{
PQExpBufferData query;
PGresult *res;
initPQExpBuffer(&query);
appendPQExpBufferStr(&query,
" SELECT CASE "
" WHEN pg_catalog.pg_has_role('pg_monitor','MEMBER') "
" THEN TRUE "
" WHEN pg_catalog.pg_has_role('pg_read_all_settings','MEMBER') "
" THEN TRUE "
" ELSE FALSE "
" END AS has_pg_settings");
res = PQexec(conn, query.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
log_db_error(conn, query.data,
_("connection_has_pg_settings(): unable to query user roles"));
}
else
{
has_pg_settings = atobool(PQgetvalue(res, 0, 0));
}
termPQExpBuffer(&query);
PQclear(res);
appendPQExpBuffer(&query,
" WHEN pg_catalog.pg_has_role('%s','MEMBER') "
" THEN TRUE ",
subrole);
}
return has_pg_settings;
appendPQExpBufferStr(&query,
" ELSE FALSE "
" END AS has_pg_monitor");
res = PQexec(conn, query.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
log_db_error(conn, query.data,
_("connection_has_pg_monitor_role(): unable to query user roles"));
}
else
{
has_pg_monitor_role = atobool(PQgetvalue(res, 0, 0));
}
termPQExpBuffer(&query);
PQclear(res);
return has_pg_monitor_role;
}
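
The rewritten connection_has_pg_monitor_role() above always tests membership of "pg_monitor" and adds a second WHEN branch only when a subrole is supplied. The sketch below shows just that query assembly using snprintf() instead of PQExpBuffer. The name demo_build_role_query is hypothetical, and the single-quote check is an extra guard added for this sketch, since the subrole is interpolated into an SQL literal; the repmgr function does not perform it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of repmgr: build the role-membership
 * query text. Returns 0 on success, -1 if the subrole contains a
 * single quote or the buffer is too small.
 */
int
demo_build_role_query(char *buf, size_t buflen, const char *subrole)
{
    int written;

    assert(buf != NULL && buflen > 0);

    if (subrole != NULL)
    {
        /* guard added for this sketch only */
        if (strchr(subrole, '\'') != NULL)
            return -1;

        written = snprintf(buf, buflen,
                           "SELECT CASE"
                           " WHEN pg_catalog.pg_has_role('pg_monitor','MEMBER') THEN TRUE"
                           " WHEN pg_catalog.pg_has_role('%s','MEMBER') THEN TRUE"
                           " ELSE FALSE END AS has_pg_monitor",
                           subrole);
    }
    else
    {
        written = snprintf(buf, buflen,
                           "SELECT CASE"
                           " WHEN pg_catalog.pg_has_role('pg_monitor','MEMBER') THEN TRUE"
                           " ELSE FALSE END AS has_pg_monitor");
    }

    return (written >= 0 && (size_t) written < buflen) ? 0 : -1;
}
```

Passing NULL corresponds to a plain "pg_monitor" check; passing "pg_read_all_stats" corresponds to the call made from is_downstream_node_attached().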
@@ -2497,7 +2548,10 @@ resume_wal_replay(PGconn *conn)
/* Node record functions */
/* ===================== */
/*
* Note: init_defaults may only be false when the caller is refreshing a previously
* populated record.
*/
static RecordStatus
_get_node_record(PGconn *conn, char *sqlquery, t_node_info *node_info, bool init_defaults)
{
@@ -2528,6 +2582,10 @@ _get_node_record(PGconn *conn, char *sqlquery, t_node_info *node_info, bool init
}
/*
* Note: init_defaults may only be false when the caller is refreshing a previously
* populated record.
*/
static void
_populate_node_record(PGresult *res, t_node_info *node_info, int row, bool init_defaults)
{
@@ -2858,6 +2916,37 @@ get_all_node_records(PGconn *conn, NodeInfoList *node_list)
return success;
}
bool
get_all_nodes_count(PGconn *conn, int *count)
{
PQExpBufferData query;
PGresult *res = NULL;
bool success = true;
initPQExpBuffer(&query);
appendPQExpBufferStr(&query,
" SELECT count(*) "
" FROM repmgr.nodes n ");
log_verbose(LOG_DEBUG, "get_all_nodes_count():\n%s", query.data);
res = PQexec(conn, query.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
log_db_error(conn, query.data, _("get_all_nodes_count(): unable to execute query"));
success = false;
}
else
{
*count = atoi(PQgetvalue(res, 0, 0));
}
PQclear(res);
termPQExpBuffer(&query);
return success;
}
void
get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *node_list)
@@ -2951,7 +3040,7 @@ get_child_nodes(PGconn *conn, int node_id, NodeInfoList *node_list)
" WHERE n.upstream_node_id = %i ",
node_id);
log_verbose(LOG_DEBUG, "get_active_sibling_node_records():\n%s", query.data);
log_verbose(LOG_DEBUG, "get_child_nodes():\n%s", query.data);
res = PQexec(conn, query.data);
@@ -3514,11 +3603,21 @@ witness_copy_node_records(PGconn *primary_conn, PGconn *witness_conn)
return false;
}
get_all_node_records(primary_conn, &nodes);
if (get_all_node_records(primary_conn, &nodes) == false)
{
rollback_transaction(witness_conn);
return false;
}
for (cell = nodes.head; cell; cell = cell->next)
{
create_node_record(witness_conn, NULL, cell->node_info);
if (create_node_record(witness_conn, NULL, cell->node_info) == false)
{
rollback_transaction(witness_conn);
return false;
}
}
/* and done */
@@ -4143,7 +4242,7 @@ _create_event(PGconn *conn, t_configuration_options *options, int node_id, char
}
break;
case 'p':
/* %p: primary id ("standby_switchover": former primary id) */
/* %p: primary id ("standby_switchover"/"repmgrd_failover_promote": former primary id) */
src_ptr++;
if (event_info->node_id != UNKNOWN_NODE_ID)
{
@@ -4728,7 +4827,7 @@ cancel_query(PGconn *conn, int timeout)
* Wait until current query finishes, ignoring any results.
* Usually this will be an async query or query cancellation.
*
* Returns 1 for success; 0 if any error ocurred; -1 if timeout reached.
* Returns 1 for success; 0 if any error occurred; -1 if timeout reached.
*/
int
wait_connection_availability(PGconn *conn, int timeout)
@@ -5479,21 +5578,9 @@ get_replication_info(PGconn *conn, t_server_type node_type, ReplInfo *replicatio
}
else
{
if (PQserverVersion(conn) >= 90400)
{
appendPQExpBufferStr(&query,
" COALESCE(pg_catalog.pg_last_xlog_receive_location(), '0/0'::PG_LSN) AS last_wal_receive_lsn, "
" COALESCE(pg_catalog.pg_last_xlog_replay_location(), '0/0'::PG_LSN) AS last_wal_replay_lsn, ");
}
else
{
/* 9.3 does not have "pg_lsn" datatype */
appendPQExpBufferStr(&query,
" COALESCE(pg_catalog.pg_last_xlog_receive_location(), '0/0') AS last_wal_receive_lsn, "
" COALESCE(pg_catalog.pg_last_xlog_replay_location(), '0/0') AS last_wal_replay_lsn, ");
}
appendPQExpBufferStr(&query,
" COALESCE(pg_catalog.pg_last_xlog_receive_location(), '0/0'::PG_LSN) AS last_wal_receive_lsn, "
" COALESCE(pg_catalog.pg_last_xlog_replay_location(), '0/0'::PG_LSN) AS last_wal_replay_lsn, "
" CASE WHEN pg_catalog.pg_is_in_recovery() IS FALSE "
" THEN FALSE "
" ELSE pg_catalog.pg_is_xlog_replay_paused() "
@@ -5664,28 +5751,11 @@ get_node_replication_stats(PGconn *conn, t_node_info *node_info)
appendPQExpBufferStr(&query,
" SELECT pg_catalog.current_setting('max_wal_senders')::INT AS max_wal_senders, "
" (SELECT pg_catalog.count(*) FROM pg_catalog.pg_stat_replication) AS attached_wal_receivers, ");
/* no replication slots in PostgreSQL 9.3 */
if (PQserverVersion(conn) < 90400)
{
appendPQExpBufferStr(&query,
" 0 AS max_replication_slots, "
" 0 AS total_replication_slots, "
" 0 AS active_replication_slots, "
" 0 AS inactive_replication_slots, ");
}
else
{
appendPQExpBufferStr(&query,
" current_setting('max_replication_slots')::INT AS max_replication_slots, "
" (SELECT pg_catalog.count(*) FROM pg_catalog.pg_replication_slots WHERE slot_type='physical') AS total_replication_slots, "
" (SELECT pg_catalog.count(*) FROM pg_catalog.pg_replication_slots WHERE active IS TRUE AND slot_type='physical') AS active_replication_slots, "
" (SELECT pg_catalog.count(*) FROM pg_catalog.pg_replication_slots WHERE active IS FALSE AND slot_type='physical') AS inactive_replication_slots, ");
}
appendPQExpBufferStr(&query,
" (SELECT pg_catalog.count(*) FROM pg_catalog.pg_stat_replication) AS attached_wal_receivers, "
" current_setting('max_replication_slots')::INT AS max_replication_slots, "
" (SELECT pg_catalog.count(*) FROM pg_catalog.pg_replication_slots WHERE slot_type='physical') AS total_replication_slots, "
" (SELECT pg_catalog.count(*) FROM pg_catalog.pg_replication_slots WHERE active IS TRUE AND slot_type='physical') AS active_replication_slots, "
" (SELECT pg_catalog.count(*) FROM pg_catalog.pg_replication_slots WHERE active IS FALSE AND slot_type='physical') AS inactive_replication_slots, "
" pg_catalog.pg_is_in_recovery() AS in_recovery");
log_verbose(LOG_DEBUG, "get_node_replication_stats():\n%s", query.data);
@@ -5720,16 +5790,15 @@ get_node_replication_stats(PGconn *conn, t_node_info *node_info)
NodeAttached
is_downstream_node_attached(PGconn *conn, char *node_name)
is_downstream_node_attached(PGconn *conn, char *node_name, char **node_state)
{
PQExpBufferData query;
PGresult *res = NULL;
int c = 0;
initPQExpBuffer(&query);
appendPQExpBuffer(&query,
" SELECT pg_catalog.count(*) "
" SELECT pid, state "
" FROM pg_catalog.pg_stat_replication "
" WHERE application_name = '%s'",
node_name);
@@ -5748,31 +5817,68 @@ is_downstream_node_attached(PGconn *conn, char *node_name)
return NODE_ATTACHED_UNKNOWN;
}
if (PQntuples(res) != 1)
{
log_verbose(LOG_WARNING, _("unexpected number of tuples (%i) returned"), PQntuples(res));
termPQExpBuffer(&query);
/*
* If there's more than one entry in pg_stat_replication, there's no
* way we can reliably determine which one belongs to the node we're
* checking, so there's nothing more we can do.
*/
if (PQntuples(res) > 1)
{
log_error(_("multiple entries with \"application_name\" set to \"%s\" found in \"pg_stat_replication\""),
node_name);
log_hint(_("verify that a unique node name is configured for each node"));
termPQExpBuffer(&query);
PQclear(res);
return NODE_ATTACHED_UNKNOWN;
}
c = atoi(PQgetvalue(res, 0, 0));
termPQExpBuffer(&query);
PQclear(res);
if (c == 0)
if (PQntuples(res) == 0)
{
log_verbose(LOG_WARNING, _("node \"%s\" not found in \"pg_stat_replication\""), node_name);
log_warning(_("node \"%s\" not found in \"pg_stat_replication\""), node_name);
PQclear(res);
return NODE_DETACHED;
}
if (c > 1)
log_verbose(LOG_WARNING, _("multiple entries with \"application_name\" set to \"%s\" found in \"pg_stat_replication\""),
node_name);
/*
* If the connection is not a superuser or member of pg_read_all_stats, we
* won't be able to retrieve the "state" column, so we'll assume
* the node is attached.
*/
if (connection_has_pg_monitor_role(conn, "pg_read_all_stats"))
{
const char *state = PQgetvalue(res, 0, 1);
if (node_state != NULL)
{
int state_len = strlen(state);
*node_state = palloc0(state_len + 1);
strncpy(*node_state, state, state_len);
}
if (strcmp(state, "streaming") != 0)
{
log_warning(_("node \"%s\" attached in state \"%s\""),
node_name,
state);
PQclear(res);
return NODE_NOT_ATTACHED;
}
}
else if (node_state != NULL)
{
*node_state = palloc0(1);
(*node_state)[0] = '\0';
}
PQclear(res);
return NODE_ATTACHED;
}
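
The new is_downstream_node_attached() above decides between four outcomes from the number of matching "pg_stat_replication" rows and, when it is readable, the "state" column. The decision table can be sketched without any database access as follows. DemoNodeAttached and demo_classify_attachment are hypothetical names; a NULL state stands for "state column not readable by this user", and the query-failure path that also yields the unknown result is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the NodeAttached enum */
typedef enum
{
    DEMO_NODE_ATTACHED_UNKNOWN = -1,
    DEMO_NODE_ATTACHED,
    DEMO_NODE_NOT_ATTACHED,
    DEMO_NODE_DETACHED
} DemoNodeAttached;

/*
 * row_count: rows in pg_stat_replication with the node's application_name
 * state:     value of the "state" column, or NULL if it cannot be read
 */
DemoNodeAttached
demo_classify_attachment(int row_count, const char *state)
{
    assert(row_count >= 0);

    /* more than one row: cannot tell which one belongs to the node */
    if (row_count > 1)
        return DEMO_NODE_ATTACHED_UNKNOWN;

    /* no row at all: the node is not connected */
    if (row_count == 0)
        return DEMO_NODE_DETACHED;

    /* one row, but the WAL sender is not in "streaming" state */
    if (state != NULL && strcmp(state, "streaming") != 0)
        return DEMO_NODE_NOT_ATTACHED;

    /* one row, streaming (or state unreadable and assumed attached) */
    return DEMO_NODE_ATTACHED;
}
```

Keeping the classification separate from the query makes each branch easy to exercise in isolation, for example a node that is connected but still in the "catchup" state.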
@@ -5902,6 +6008,43 @@ is_wal_replay_paused(PGconn *conn, bool check_pending_wal)
return is_paused;
}
/* repmgrd status functions */
CheckStatus
get_repmgrd_status(PGconn *conn)
{
PQExpBufferData query;
PGresult *res = NULL;
CheckStatus repmgrd_status = CHECK_STATUS_CRITICAL;
initPQExpBuffer(&query);
appendPQExpBufferStr(&query,
" SELECT "
" CASE "
" WHEN repmgr.repmgrd_is_running() "
" THEN "
" CASE "
" WHEN repmgr.repmgrd_is_paused() THEN 1 ELSE 0 "
" END "
" ELSE 2 "
" END AS repmgrd_status");
res = PQexec(conn, query.data);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
log_db_error(conn, query.data, _("unable to execute repmgrd status query"));
}
else
{
repmgrd_status = atoi(PQgetvalue(res, 0, 0));
}
termPQExpBuffer(&query);
PQclear(res);
return repmgrd_status;
}
/* miscellaneous debugging functions */
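
get_repmgrd_status() above has the SQL CASE expression return 0 (running), 1 (running but paused) or 2 (not running) and assigns atoi() of that value straight to a CheckStatus. The sketch below performs the same mapping with an explicit check on the returned text. DemoCheckStatus and demo_status_from_text are hypothetical, and the numeric values given to the enum members are an assumption, since the definition of CheckStatus is not part of this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of CheckStatus; the numeric values are assumed */
typedef enum
{
    DEMO_CHECK_STATUS_OK = 0,
    DEMO_CHECK_STATUS_WARNING = 1,
    DEMO_CHECK_STATUS_CRITICAL = 2
} DemoCheckStatus;

/*
 * Map the single-column query result to a status value.
 * Returns false for any text other than "0", "1" or "2".
 */
bool
demo_status_from_text(const char *value, DemoCheckStatus *status)
{
    assert(value != NULL && status != NULL);

    if (strcmp(value, "0") == 0)
        *status = DEMO_CHECK_STATUS_OK;        /* running, not paused */
    else if (strcmp(value, "1") == 0)
        *status = DEMO_CHECK_STATUS_WARNING;   /* running, paused */
    else if (strcmp(value, "2") == 0)
        *status = DEMO_CHECK_STATUS_CRITICAL;  /* not running */
    else
        return false;

    return true;
}
```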

dbutils.h

@@ -1,7 +1,7 @@
/*
* dbutils.h
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -119,9 +119,14 @@ typedef enum
typedef enum
{
/* unable to query "pg_stat_replication" or other error */
NODE_ATTACHED_UNKNOWN = -1,
NODE_DETACHED,
NODE_ATTACHED
/* node has record in "pg_stat_replication" and state is "streaming" */
NODE_ATTACHED,
/* node has record in "pg_stat_replication" but state is not "streaming" */
NODE_NOT_ATTACHED,
/* node has no record in "pg_stat_replication" */
NODE_DETACHED
} NodeAttached;
typedef enum
@@ -413,6 +418,7 @@ void conn_to_param_list(PGconn *conn, t_conninfo_param_list *param_list);
void param_set(t_conninfo_param_list *param_list, const char *param, const char *value);
void param_set_ine(t_conninfo_param_list *param_list, const char *param, const char *value);
char *param_get(t_conninfo_param_list *param_list, const char *param);
bool validate_conninfo_string(const char *conninfo_str, char **errmsg);
bool parse_conninfo_string(const char *conninfo_str, t_conninfo_param_list *param_list, char **errmsg, bool ignore_local_params);
char *param_list_to_string(t_conninfo_param_list *param_list);
char *normalize_conninfo_string(const char *conninfo_str);
@@ -447,7 +453,7 @@ TimeLineHistoryEntry *get_timeline_history(PGconn *repl_conn, TimeLineID tli);
/* user/role information functions */
bool can_execute_pg_promote(PGconn *conn);
bool connection_has_pg_settings(PGconn *conn);
bool connection_has_pg_monitor_role(PGconn *conn, const char *subrole);
bool is_replication_role(PGconn *conn, char *rolname);
bool is_superuser_connection(PGconn *conn, t_connection_user *userinfo);
@@ -490,6 +496,7 @@ bool get_local_node_record(PGconn *conn, int node_id, t_node_info *node_info);
bool get_primary_node_record(PGconn *conn, t_node_info *node_info);
bool get_all_node_records(PGconn *conn, NodeInfoList *node_list);
bool get_all_nodes_count(PGconn *conn, int *count);
void get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *nodes);
void get_active_sibling_node_records(PGconn *conn, int node_id, int upstream_node_id, NodeInfoList *node_list);
bool get_child_nodes(PGconn *conn, int node_id, NodeInfoList *node_list);
@@ -589,12 +596,15 @@ bool get_replication_info(PGconn *conn, t_server_type node_type, ReplInfo *repl
int get_replication_lag_seconds(PGconn *conn);
TimeLineID get_node_timeline(PGconn *conn, char *timeline_id_str);
void get_node_replication_stats(PGconn *conn, t_node_info *node_info);
NodeAttached is_downstream_node_attached(PGconn *conn, char *node_name);
NodeAttached is_downstream_node_attached(PGconn *conn, char *node_name, char **node_state);
void set_upstream_last_seen(PGconn *conn, int upstream_node_id);
int get_upstream_last_seen(PGconn *conn, t_server_type node_type);
bool is_wal_replay_paused(PGconn *conn, bool check_pending_wal);
/* repmgrd status functions */
CheckStatus get_repmgrd_status(PGconn *conn);
/* miscellaneous debugging functions */
const char *print_node_status(NodeStatus node_status);
const char *print_pqping_status(PGPing ping_status);
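The extended NodeAttached enum in the dbutils.h hunks above distinguishes four outcomes, and the new third argument to is_downstream_node_attached() lets the caller see the reported replication state. A rough sketch of that classification, without any database access, might look as follows. The function name, its parameters and the SKETCH_ prefixed enum are hypothetical; the real function queries "pg_stat_replication" over a PGconn, which is not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the NodeAttached enum shown in the diff. */
typedef enum
{
	SKETCH_NODE_ATTACHED_UNKNOWN = -1,
	SKETCH_NODE_ATTACHED,
	SKETCH_NODE_NOT_ATTACHED,
	SKETCH_NODE_DETACHED
} SketchNodeAttached;

/*
 * Hypothetical classifier.
 *   query_ok: whether the lookup in "pg_stat_replication" succeeded
 *   state:    the "state" column for the node, or NULL if no row exists
 */
SketchNodeAttached
sketch_classify_downstream(bool query_ok, const char *state)
{
	if (!query_ok)
		return SKETCH_NODE_ATTACHED_UNKNOWN;

	/* no record at all */
	if (state == NULL)
		return SKETCH_NODE_DETACHED;

	/* record present and actively streaming */
	if (strcmp(state, "streaming") == 0)
		return SKETCH_NODE_ATTACHED;

	/* record present but in some other state, e.g. "catchup" */
	return SKETCH_NODE_NOT_ATTACHED;
}
```

Separating "not attached" from "detached" lets a caller keep waiting for a node that has connected but is not yet streaming, instead of treating it as absent.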

View File

@@ -3,7 +3,7 @@
* dirmod.c
* directory handling functions
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -109,9 +109,56 @@ create_dir(const char *path)
bool
set_dir_permissions(const char *path)
set_dir_permissions(const char *path, int server_version_num)
{
return (chmod(path, 0700) != 0) ? false : true;
struct stat stat_buf;
bool no_group_access =
(server_version_num != UNKNOWN_SERVER_VERSION_NUM) &&
(server_version_num < 110000);
/*
* At this point the path should exist, so this check is very
* much just-in-case.
*/
if (stat(path, &stat_buf) != 0)
{
if (errno == ENOENT)
{
log_warning(_("directory \"%s\" does not exist"), path);
}
else
{
log_warning(_("could not read permissions of directory \"%s\""),
path);
log_detail("%s", strerror(errno));
}
return false;
}
/*
* If mode is not 0700 or 0750, attempt to change.
*/
if ((no_group_access == true && (stat_buf.st_mode & (S_IRWXG | S_IRWXO)))
|| (no_group_access == false && (stat_buf.st_mode & (S_IWGRP | S_IRWXO))))
{
/*
* Currently we default to 0700.
* There is no facility to override this directly,
* but the user can manually create the directory with
* the desired permissions.
*/
if (chmod(path, 0700) != 0) {
log_error(_("unable to change permissions of directory \"%s\""), path);
log_detail("%s", strerror(errno));
return false;
}
return true;
}
/* Leave as-is */
return true;
}
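The mode test in the new set_dir_permissions() can be read in isolation: before PostgreSQL 11 any group or other permission bit triggers a chmod, while for PostgreSQL 11 and later (or an unknown version) only group write or any other bit does, so 0750 is left alone. A small sketch of just that predicate follows. The helper name is hypothetical, and the value -1 for the unknown-version constant is an assumption, since its definition is not part of this diff.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Assumed value; the real UNKNOWN_SERVER_VERSION_NUM is defined elsewhere. */
#define SKETCH_UNKNOWN_SERVER_VERSION_NUM (-1)

/*
 * Hypothetical helper: return true if a directory with the given mode
 * would have its permissions reset to 0700 by the logic in the diff.
 */
bool
sketch_needs_chmod(mode_t mode, int server_version_num)
{
	bool		no_group_access =
		(server_version_num != SKETCH_UNKNOWN_SERVER_VERSION_NUM) &&
		(server_version_num < 110000);

	/* PostgreSQL 10 and earlier: no group or other access permitted */
	if (no_group_access)
		return (mode & (S_IRWXG | S_IRWXO)) != 0;

	/* PostgreSQL 11 and later, or version unknown: group read/execute is allowed */
	return (mode & (S_IWGRP | S_IRWXO)) != 0;
}
```

With an unknown version the check takes the permissive branch, which is why the later create_pg_dir() comment asks the caller to run the check again once the server version is known.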
@@ -158,7 +205,7 @@ mkdir_p(char *path, mode_t omode)
/*
* POSIX 1003.2: For each dir operand that does not name an
* existing directory, effects equivalent to those caused by the
* following command shall occcur:
* following command shall occur:
*
* mkdir -p -m $(umask -S),u+wx $(dirname dir) && mkdir [-m mode]
* dir
@@ -242,7 +289,7 @@ is_pg_running(const char *path)
{
/*
* No PID file - PostgreSQL shouldn't be running. From 9.3 (the
* earliesty version we care about) removal of the PID file will
* earliest version we care about) removal of the PID file will
* cause the postmaster to shut down, so it's highly unlikely
* that PostgreSQL will still be running.
*/
@@ -303,7 +350,7 @@ create_pg_dir(const char *path, bool force)
switch (check_dir(path))
{
case DIR_NOENT:
/* directory does not exist, attempt to create it */
/* Directory does not exist, attempt to create it. */
log_info(_("creating directory \"%s\"..."), path);
if (!create_dir(path))
@@ -314,14 +361,23 @@ create_pg_dir(const char *path, bool force)
}
break;
case DIR_EMPTY:
/* exists but empty, fix permissions and use it */
/*
* Directory exists but is empty, fix permissions and use it.
*
* Note that at this point the caller might not know the server
* version number, so in this case "set_dir_permissions()" will
* accept 0750 as a valid setting. As this is invalid in Pg10 and
* earlier, the caller should call "set_dir_permissions()" again
* when it has the number.
*
* We need to do the permissions check here in any case to catch
* fatal permissions early.
*/
log_info(_("checking and correcting permissions on existing directory \"%s\""),
path);
if (!set_dir_permissions(path))
if (!set_dir_permissions(path, UNKNOWN_SERVER_VERSION_NUM))
{
log_error(_("unable to change permissions of directory \"%s\""), path);
log_detail("%s", strerror(errno));
return false;
}
break;

View File

@@ -1,6 +1,6 @@
/*
* dirutil.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -35,7 +35,7 @@ typedef enum
} PgDirState;
extern int mkdir_p(char *path, mode_t omode);
extern bool set_dir_permissions(const char *path);
extern bool set_dir_permissions(const char *path, int server_version_num);
extern DataDirState check_dir(const char *path);
extern bool create_dir(const char *path);

View File

@@ -95,6 +95,8 @@ clean:
rm -f repmgr.html
rm -f repmgr-A4.pdf
rm -f repmgr-US.pdf
rm -f *.fo
rm -f html/*
maintainer-clean:
rm -rf html

View File

@@ -22,6 +22,9 @@
&repmgr; 5 is fundamentally the same code base as &repmgr; 4, but provides
support for the revised replication configuration mechanism in PostgreSQL 12.
</para>
<para>
Support for PostgreSQL 9.3 is no longer available from &repmgr; 5.2.
</para>
</note>
<para>
&repmgr; 3.x builds on the improved replication facilities added
@@ -59,7 +62,7 @@
<tip>
<para>
2ndQuadrant's recommended configuration is to configure
Our recommended configuration is to configure
<ulink url="https://www.pgbarman.org/">Barman</ulink> as a fallback
source of WAL files, rather than maintain replication slots for
each standby. See also: <link linkend="cloning-from-barman-restore-command">Using Barman as a WAL file source</link>.
@@ -124,7 +127,7 @@
filesystem layouts.
</para>
<para>
Either use PostgreSQL packages provided by the community or 2ndQuadrant; if this
Either use PostgreSQL packages provided by the community or EnterpriseDB; if this
is not possible, contact your vendor for assistance.
</para>
</sect2>
@@ -167,7 +170,7 @@
<para>
If different &quot;minor&quot; &repmgr; versions (e.g. 4.1.1 and 4.1.6) are installed,
&repmgr; will function, but we strongly recommend always running the same version
to ensure there are no unexpected suprises, e.g. a newer version behaving slightly
to ensure there are no unexpected surprises, e.g. a newer version behaving slightly
differently to the older version.
</para>
<para>
@@ -209,11 +212,11 @@
</para>
</sect2>
<sect2 id="faq-third-party-packages" xreflabel="Compatability with third party vendor packages">
<sect2 id="faq-third-party-packages" xreflabel="Compatibility with third party vendor packages">
<title>Are &repmgr; packages compatible with <literal>$third_party_vendor</literal>'s packages?</title>
<para>
&repmgr; packages provided by 2ndQuadrant are compatible with the community-provided PostgreSQL
packages and any software provided by 2ndQuadrant.
&repmgr; packages provided by EnterpriseDB are compatible with the community-provided PostgreSQL
packages and specified software provided by EnterpriseDB.
</para>
<para>
A number of other vendors provide their own versions of PostgreSQL packages, often with different
@@ -308,7 +311,7 @@
</para>
</sect2>
<sect2 id="faq-repmgr-shared-preload-libaries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
<sect2 id="faq-repmgr-shared-preload-libraries-no-repmgrd" xreflabel="shared_preload_libraries without repmgrd">
<title>Do I need to include <literal>shared_preload_libraries = 'repmgr'</literal>
in <filename>postgresql.conf</filename> if I'm not using &repmgrd;?</title>
<para>
@@ -347,6 +350,9 @@
and earlier) with the absolute path to the WAL directory in <varname>pg_basebackup_options</varname>.
For more details see <xref linkend="cloning-advanced-pg-basebackup-options"/>.
</para>
<para>
In &repmgr; 5.2 and later, this setting will also be honoured when cloning from Barman.
</para>
</sect2>
<sect2 id="faq-repmgr-events-no-fkey" xreflabel="No foreign key on node_id in repmgr.events">
@@ -373,6 +379,22 @@
</para>
</sect2>
<sect2 id="faq-repmgr-exclude-metadata-from-dump" xreflabel="Excluding repmgr metadata from pg_dump output">
<title>How can I exclude &repmgr; metadata from <application>pg_dump</application> output?</title>
<para>
Beginning with &repmgr; 5.2, the metadata tables associated with the &repmgr; extension
(<literal>repmgr.nodes</literal>, <literal>repmgr.events</literal> and <literal>repmgr.monitoring_history</literal>)
have been marked as dumpable as they contain configuration and user-generated data.
</para>
<para>
To exclude these from <application>pg_dump</application> output, add the flag <option>--exclude-schema=repmgr</option>.
</para>
<para>
To exclude individual &repmgr; metadata tables from <application>pg_dump</application> output, add the flag
e.g. <option>--exclude-table=repmgr.monitoring_history</option>. This flag can be provided multiple times
to exclude individual tables.
</para>
</sect2>
</sect1>
@@ -437,7 +459,7 @@
</title>
<para>
<varname>promote_command</varname> or <varname>follow_command</varname> can be user-defined scripts,
so &repmgr; will not apply <option>pg_bindir</option> even if excuting &repmgr;. Always provide the full
so &repmgr; will not apply <option>pg_bindir</option> even if executing &repmgr;. Always provide the full
path; see <xref linkend="repmgrd-automatic-failover-configuration"/> for more details.
</para>
</sect2>
@@ -457,7 +479,7 @@
is out-of-date, which may lead to incorrect failover behaviour.
</para>
<para>
The onus is therefore on the adminstrator to manually set the cluster to a stable, healthy state before
The onus is therefore on the administrator to manually set the cluster to a stable, healthy state before
starting &repmgrd;.
</para>
</sect2>

View File

@@ -50,19 +50,18 @@
<title>CentOS repositories</title>
<para>
&repmgr; packages are available from the public 2ndQuadrant repository, and also the
PostgreSQL community repository. The 2ndQuadrant repository is updated immediately
after each
&repmgr; release.
&repmgr; packages are available from the public EDB repository, and also the
PostgreSQL community repository. The EDB repository is updated immediately
after each &repmgr; release.
</para>
<table id="centos-2ndquadrant-repository">
<title>2ndQuadrant public repository</title>
<title>EDB public repository</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://dl.2ndquadrant.com/">https://dl.2ndquadrant.com/</ulink></entry>
<entry><ulink url="https://dl.enterprisedb.com/">https://dl.enterprisedb.com/</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
@@ -253,7 +252,7 @@
</indexterm>
<para>
&repmgr; <literal>.deb</literal> packages are provided by 2ndQuadrant as well as the
&repmgr; <literal>.deb</literal> packages are provided by EDB as well as the
PostgreSQL Community APT repository, and are available for each community-supported
PostgreSQL version, currently supported Debian releases, and currently supported
Ubuntu LTS releases.
@@ -263,12 +262,12 @@
<title>APT repositories</title>
<table id="apt-2ndquadrant-repository">
<title>2ndQuadrant public repository</title>
<title>EDB public repository</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://dl.2ndquadrant.com/">https://dl.2ndquadrant.com/</ulink></entry>
<entry><ulink url="https://dl.enterprisedb.com/">https://dl.enterprisedb.com/</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
@@ -398,11 +397,11 @@
</indexterm>
<indexterm>
<primary>packages</primary>
<secondary>snaphots</secondary>
<secondary>snapshots</secondary>
</indexterm>
<para>
For testing new features and bug fixes, from time to time 2ndQuadrant provides
For testing new features and bug fixes, from time to time EDB provides
so-called &quot;snapshot packages&quot; via its public repository. These packages
are built from the &repmgr; source at a particular point in time, and are not formal
releases.
@@ -414,22 +413,22 @@
</para>
</note>
<para>
To install a snapshot package, it's necessary to install the 2ndQuadrant public snapshot repository,
following the instructions here: <ulink url="https://dl.2ndquadrant.com/default/release/site/">https://dl.2ndquadrant.com/default/release/site/</ulink> but replace <literal>release</literal> with <literal>snapshot</literal>
To install a snapshot package, it's necessary to install the EDB public snapshot repository,
following the instructions here: <ulink url="https://dl.enterprisedb.com/default/release/site/">https://dl.enterprisedb.com/default/release/site/</ulink> but replace <literal>release</literal> with <literal>snapshot</literal>
in the appropriate URL.
</para>
<para>
For example, to install the snapshot RPM repository for PostgreSQL 9.6, execute (as <literal>root</literal>):
<programlisting>
curl https://dl.2ndquadrant.com/default/snapshot/get/9.6/rpm | bash</programlisting>
curl https://dl.enterprisedb.com/default/snapshot/get/9.6/rpm | bash</programlisting>
or as a normal user with root sudo access:
<programlisting>
curl https://dl.2ndquadrant.com/default/snapshot/get/9.6/rpm | sudo bash</programlisting>
curl https://dl.enterprisedb.com/default/snapshot/get/9.6/rpm | sudo bash</programlisting>
</para>
<para>
Alternatively you can browse the repository here:
<ulink url="https://dl.2ndquadrant.com/default/snapshot/browse/">https://dl.2ndquadrant.com/default/snapshot/browse/</ulink>.
<ulink url="https://dl.enterprisedb.com/default/snapshot/browse/">https://dl.enterprisedb.com/default/snapshot/browse/</ulink>.
</para>
<para>
Once the repository is installed, installing or updating &repmgr; will result in the latest snapshot
@@ -439,7 +438,7 @@ curl https://dl.2ndquadrant.com/default/snapshot/get/9.6/rpm | sudo bash</progra
The package name will be formatted like this:
<programlisting>
repmgr96-4.1.1-0.0git320.g5113ab0.1.el7.x86_64.rpm</programlisting>
containg the snapshot build number (here: <literal>320</literal>) and the hash
containing the snapshot build number (here: <literal>320</literal>) and the hash
of the <application>git</application> commit it was built from (here: <literal>g5113ab0</literal>).
</para>
@@ -471,7 +470,7 @@ repmgr96-4.1.1-0.0git320.g5113ab0.1.el7.x86_64.rpm</programlisting>
<title>Debian/Ubuntu</title>
<para>
An archive of old packages (<literal>3.3.2</literal> and later) for Debian/Ubuntu-based systems is available here:
<ulink url="http://atalia.postgresql.org/morgue/r/repmgr/">http://atalia.postgresql.org/morgue/r/repmgr/</ulink>
<ulink url="https://apt-archive.postgresql.org/">https://apt-archive.postgresql.org/</ulink>
</para>
</sect2>
@@ -494,32 +493,6 @@ repmgr96-4.1.1-0.0git320.g5113ab0.1.el7.x86_64.rpm</programlisting>
yum install repmgr96-4.0.6-1.rhel6</programlisting>
</para>
<sect3 id="packages-old-versions-rhel-centos-repmgr3">
<title>repmgr 3 packages</title>
<para>
Old &repmgr; 3 RPM packages (<literal>3.2</literal> and later) can be retrieved from the
(deprecated) 2ndQuadrant repository at
<ulink url="http://packages.2ndquadrant.com/repmgr/yum/">http://packages.2ndquadrant.com/repmgr/yum/</ulink>
by installing the appropriate repository RPM:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<ulink url="http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-fedora-1.0-1.noarch.rpm">http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-fedora-1.0-1.noarch.rpm</ulink>
</simpara>
</listitem>
<listitem>
<simpara>
<ulink url="http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm">http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm</ulink>
</simpara>
</listitem>
</itemizedlist>
</sect3>
</sect2>
</sect1>

View File

@@ -16,8 +16,426 @@
</para>
<!-- remember to update the release date in ../repmgr_version.h.in -->
<sect1 id="release-5.3.1">
<title id="release-current">Release 5.3.1</title>
<para><emphasis>Tue 15 February, 2022</emphasis></para>
<para>
&repmgr; 5.3.1 is a minor release.
</para>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
Fix upgrade path from &repmgr; 4.2 and 4.3 to &repmgr; 5.3.
</para>
</listitem>
<listitem>
<para>
&repmgrd;: ensure potentially open connections are closed.
</para>
<para>
In some cases, when recovering from degraded state in local node monitoring,
a new connection was opened to the local node without closing
the old one, resulting in a memory leak.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-5.1-0">
<sect1 id="release-5.3.0">
<title>Release 5.3.0</title>
<para><emphasis>Tue 12 October, 2021</emphasis></para>
<para>
&repmgr; 5.3.0 is a major release.
</para>
<para>
This release provides support for <ulink url="https://www.postgresql.org/docs/14/release-14.html">PostgreSQL 14</ulink>,
released in September 2021.
</para>
<sect2>
<title>Improvements</title>
<para>
<itemizedlist>
<listitem>
<para>
<link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>:
Improve handling of node rejoin failure on the demotion candidate.
</para>
<para>
Previously &repmgr; did not check whether <command>repmgr node rejoin</command> actually
succeeded on the demotion candidate, and would always wait up to <varname>node_rejoin_timeout</varname>
seconds for it to attach to the promotion candidate, even if this would never happen.
</para>
<para>
This makes it easier to identify unexpected events during a switchover operation, such as
the demotion candidate being unexpectedly restarted by an external process.
</para>
<para>
Note that the output of the <link linkend="repmgr-node-rejoin"><command>repmgr node rejoin</command></link>
operation on the demotion candidate will now be logged to a temporary file on that node;
the location of the file will be reported in the error message, if one is emitted.
</para>
</listitem>
<listitem>
<para>
&repmgrd;: at startup, if the node record is marked as "inactive", attempt
to set it to "active".
</para>
<para>
This behaviour can be overridden by setting the configuration parameter
<varname>repmgrd_exit_on_inactive_node</varname> to <literal>true</literal>.
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-node-rejoin">repmgr node rejoin</link></command>:
emit rejoin target node information as <literal>NOTICE</literal>.
</para>
<para>
This makes it clearer what &repmgr; is trying to do.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-check">repmgr node check</link>:
option <option>--repmgrd</option> added to check &repmgrd;
status.
</para>
</listitem>
<listitem>
<para>
Add <literal>%p</literal> <link linkend="event-notifications">event notification parameter</link>
providing the node ID of the former primary for the <literal>repmgrd_failover_promote</literal> event.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
<command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>:
if using <option>--replication-conf-only</option> on a node
which was set up without replication slots, but the &repmgr; configuration
was since changed to <option>use_replication_slots=1</option>,
&repmgr; will now set <varname>slot_name</varname> in the
node record, if it was previously empty.
</para>
</listitem>
<listitem>
<para>
&repmgrd;: rename internal shared library functions to minimize the
risk of clashes with other shared libraries.
</para>
<para>
This does not affect user-facing SQL functions. However an upgrade
of the installed extension version is required.
</para>
</listitem>
<listitem>
<para>
&repmgrd;: ensure short option <option>-s</option> is accepted.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-5.2.1">
<title>Release 5.2.1</title>
<para><emphasis>Mon 7 December, 2020</emphasis></para>
<para>
&repmgr; 5.2.1 is a minor release.
</para>
<sect2>
<title>Improvements</title>
<para>
<itemizedlist>
<listitem>
<para>
<link linkend="repmgr-standby-clone">repmgr standby clone</link>:
option <option>--recovery-min-apply-delay</option> added, overriding any
setting present in <filename>repmgr.conf</filename>.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
Configuration: fix parsing of <option>replication_type</option> configuration parameter. GitHub #672.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-clone">repmgr standby clone</link>:
handle case where <filename>postgresql.auto.conf</filename> is absent on the
source node.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-clone">repmgr standby clone</link>:
in PostgreSQL 11 and later, an existing data directory's permissions
will not be changed to <option>0700</option> if they are already set to
<option>0750</option>.
</para>
</listitem>
<listitem>
<para>
&repmgrd;: prevent termination when local node not available and
<option>standby_disconnect_on_failover</option> is set. GitHub #675.
</para>
</listitem>
<listitem>
<para>
&repmgrd;: ensure <option>reconnect_interval</option> is correctly handled.
GitHub #673.
</para>
</listitem>
<listitem>
<para>
<command>repmgr witness --help</command>: fix <command>witness unregister</command>
description. GitHub #676.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-5.2.0">
<title>Release 5.2.0</title>
<para><emphasis>Thu 22 October, 2020</emphasis></para>
<para>
&repmgr; 5.2.0 is a major release.
</para>
<para>
This release provides support for <ulink url="https://www.postgresql.org/docs/13/release-13.html">PostgreSQL 13</ulink>, released in September 2020.
</para>
<para>
This release removes support for PostgreSQL 9.3, which was
<ulink url="https://www.postgresql.org/docs/9.3/release-9-3-25.html">designated EOL in November 2018</ulink>.
</para>
<sect2>
<title>General improvements</title>
<para>
<itemizedlist>
<listitem>
<para>
Configuration: support <command>include</command>, <command>include_dir</command> and
<command>include_if_exists</command> directives (see <xref linkend="configuration-file-include-directives"/>).
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link>:
Improve sanity check failure log output from the demotion candidate.
</para>
<para>
If database connection configuration is not consistent across all nodes, it's
possible remote &repmgr; invocations (e.g. during switchover, from the promotion candidate
to the demotion candidate) will not be able to connect to the database. This will
now be explicitly reported as a database connection failure, rather than as a failure
of the respective sanity check.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link> /
<link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link>:
improve text mode output format, in particular so that node identifiers of arbitrary length are
displayed correctly.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-primary-unregister">repmgr primary unregister</link>:
the <option>--force</option> option can be provided to unregister an active primary node, provided
it has no registered standby nodes.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-clone">repmgr standby clone</link>: new option
<option>--verify-backup</option> to run PostgreSQL's
<ulink url="https://www.postgresql.org/docs/13/app-pgverifybackup.html">pg_verifybackup</ulink>
utility after cloning a standby to verify the integrity of the copied data
(PostgreSQL 13 and later).
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-clone">repmgr standby clone</link>:
when cloning from Barman, setting <option>--waldir</option>
(PostgreSQL 9.6 and earlier: <option>--xlogdir</option>) in
<option>pg_basebackup_options</option> will cause &repmgr; to create
a WAL directory outside of the main data directory and symlink
it from there, in the same way as would happen when cloning
using <application>pg_basebackup</application>.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-standby-follow">repmgr standby follow</link>:
In PostgreSQL 13 and later, a standby no longer requires a restart to
follow a new upstream node.
</para>
<para>
The old behaviour (always restarting the standby to follow a new node)
can be restored by setting the configuration file parameter
<varname>standby_follow_restart</varname> to <literal>true</literal>.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-rejoin">repmgr node rejoin</link>:
enable a node to attach to a target node even if the target node
has a lower timeline (PostgreSQL 9.6 and later).
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-rejoin">repmgr node rejoin</link>:
in PostgreSQL 13 and later, support <application>pg_rewind</application>'s
ability to automatically run crash recovery on a PostgreSQL instance
which was not shut down cleanly.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-check">repmgr node check</link>:
option <option>--db-connection</option> added to check if &repmgr;
can connect to the database on the local node.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-check">repmgr node check</link>:
report database connection error if the <option>--optformat</option> option was provided.
</para>
</listitem>
<listitem>
<para>
Improve handling of pg_control read errors.
</para>
</listitem>
<listitem>
<para>
It is now possible to dump the contents of &repmgr; metadata tables with
<application>pg_dump</application>.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>repmgrd enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
The following additional parameters can be provided to <varname>failover_validation_command</varname>:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><literal>%n</literal>: node ID</simpara>
</listitem>
<listitem>
<simpara><literal>%a</literal>: node name</simpara>
</listitem>
<listitem>
<simpara><literal>%v</literal>: number of visible nodes</simpara>
</listitem>
<listitem>
<simpara><literal>%u</literal>: number of shared upstream nodes</simpara>
</listitem>
<listitem>
<simpara><literal>%t</literal>: total number of nodes</simpara>
</listitem>
</itemizedlist>
</para>
</listitem>
<listitem>
<para>
Configuration option <varname>always_promote</varname> (default: <literal>false</literal>)
to control whether a node should be promoted if the &repmgr; metadata is not up-to-date
on that node.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
<link linkend="repmgr-standby-clone">repmgr standby clone</link>:
fix issue with cloning from Barman where the tablespace mapping file was
not flushed to disk before attempting to retrieve files from Barman. GitHub #650.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-rejoin">repmgr node rejoin</link>:
ensure that when verifying a standby node has attached to its upstream, the
node has started streaming before confirming the success of the rejoin operation.
</para>
</listitem>
<listitem>
<para>
&repmgrd;: ensure primary connection is reset if same as upstream. GitHub #633.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-5.1.0">
<title>Release 5.1.0</title>
<para><emphasis>Mon 13 April, 2020</emphasis></para>
@@ -211,7 +629,7 @@
</sect1>
<sect1 id="release-5.0">
<title id="release-current">Release 5.0</title>
<title>Release 5.0</title>
<para><emphasis>Tue 15 October, 2019</emphasis></para>
<para>
@@ -921,7 +1339,7 @@ REPMGRD_OPTS="--daemonize=false"</programlisting>
</para>
<para>
Possible values are <literal>ping</literal> (default; uses <command>PQping()</command> to
determine server availability), <literal>connection</literal> (attempst to make a new connection to
determine server availability), <literal>connection</literal> (attempts to make a new connection to
the upstream node), and <literal>query</literal> (determines server availability
by executing an SQL statement on the node via the existing connection).
</para>

View File

@@ -7,20 +7,20 @@
</indexterm>
<para>
<ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides 24x7
<ulink url="https://www.enterprisedb.com/">EDB</ulink> provides 24x7
production support for &repmgr; and other PostgreSQL
products, including configuration assistance, installation
verification and training for running a robust replication cluster.
</para>
<para>
For further details see: <ulink url="https://2ndquadrant.com/en/support/">https://2ndquadrant.com/en/support/</ulink>
For further details see: <ulink url="https://www.enterprisedb.com/support/postgresql-support-overview-get-the-most-out-of-postgresql">Support Center</ulink>
</para>
<para>
A mailing list/forum is provided via Google groups to discuss contributions or issues: <ulink url="https://groups.google.com/group/repmgr">https://groups.google.com/group/repmgr</ulink>.
</para>
<para>
Please report bugs and other issues to: <ulink url="https://github.com/2ndQuadrant/repmgr">https://github.com/2ndQuadrant/repmgr</ulink>.
Please report bugs and other issues to: <ulink url="https://github.com/EnterpriseDB/repmgr">https://github.com/EnterpriseDB/repmgr</ulink>.
</para>
<important>
@@ -64,7 +64,7 @@
<listitem>
<simpara>
<filename>repmpgr.conf</filename> files (suitably anonymized if necessary)
<filename>repmgr.conf</filename> files (suitably anonymized if necessary)
</simpara>
</listitem>

View File

@@ -15,7 +15,7 @@
<para>
<xref linkend="repmgr-standby-clone"/> can use
<ulink url="https://www.2ndquadrant.com/">2ndQuadrant</ulink>'s
<ulink url="https://www.enterprisedb.com/">EDB</ulink>'s
<ulink url="https://www.pgbarman.org/">Barman</ulink> application
to clone a standby (and also as a fallback source for WAL files).
</para>
@@ -46,6 +46,7 @@
<para>
WAL management on the primary becomes much easier as there's no need
to use replication slots, and <varname>wal_keep_segments</varname>
(PostgreSQL 13 and later: <varname>wal_keep_size</varname>)
does not need to be set.
</para>
</listitem>
@@ -147,6 +148,15 @@ description = "Main cluster"
section in <command>man 5 ssh_config</command> for more details.
</simpara>
</tip>
<para>
If you wish to place WAL files in a location outside the main
PostgreSQL data directory, set <option>--waldir</option>
(PostgreSQL 9.6 and earlier: <option>--xlogdir</option>) in
<option>pg_basebackup_options</option> to the target directory
(must be an absolute filepath). &repmgr; will create and
symlink to this directory in exactly the same way
<application>pg_basebackup</application> would.
</para>
<para>
It's now possible to clone a standby from Barman, e.g.:
<programlisting>
@@ -234,7 +244,8 @@ description = "Main cluster"
that any standby connected to the primary using a replication slot will always
be able to retrieve the required WAL files. This removes the need to manually
manage WAL file retention by estimating the number of WAL files that need to
be maintained on the primary using <varname>wal_keep_segments</varname>.
be maintained on the primary using <varname>wal_keep_segments</varname>
(PostgreSQL 13 and later: <varname>wal_keep_size</varname>).
Do however be aware that if a standby is disconnected, WAL will continue to
accumulate on the primary until either the standby reconnects or the replication
slot is dropped.
@@ -288,7 +299,7 @@ description = "Main cluster"
build up indefinitely, possibly leading to server failure.
</simpara>
<simpara>
As an alternative we recommend using 2ndQuadrant's <ulink url="https://www.pgbarman.org/">Barman</ulink>,
As an alternative we recommend using EDB's <ulink url="https://www.pgbarman.org/">Barman</ulink>,
which offloads WAL management to a separate server, removing the requirement to use a replication
slot for each individual standby to reserve WAL. See section <xref linkend="cloning-from-barman"/>
for more details on using &repmgr; together with Barman.
@@ -308,7 +319,7 @@ description = "Main cluster"
Cascading replication, introduced with PostgreSQL 9.2, enables a standby server
to replicate from another standby server rather than directly from the primary,
meaning replication changes "cascade" down through a hierarchy of servers. This
can be used to reduce load on the primary and minimize bandwith usage between
can be used to reduce load on the primary and minimize bandwidth usage between
sites. For more details, see the
<ulink url="https://www.postgresql.org/docs/current/warm-standby.html#CASCADING-REPLICATION">
PostgreSQL cascading replication documentation</ulink>.
@@ -380,7 +391,7 @@ description = "Main cluster"
cluster, you may wish to clone a downstream standby whose upstream node
does not yet exist. In this case you can clone from the primary (or
another upstream node); provide the parameter <literal>--upstream-conninfo</literal>
to explictly set the upstream's <varname>primary_conninfo</varname> string
to explicitly set the upstream's <varname>primary_conninfo</varname> string
in <filename>recovery.conf</filename>.
</simpara>
</tip>
@@ -438,6 +449,13 @@ description = "Main cluster"
WAL directory. Any WALs generated during the cloning process will be copied here, and
a symlink will automatically be created from the main data directory.
</para>
<tip>
<para>
The <literal>--waldir</literal> (<literal>--xlogdir</literal>) option,
if present in <varname>pg_basebackup_options</varname>, will be honoured by &repmgr;
when cloning from Barman (&repmgr; 5.2 and later).
</para>
</tip>
<para>
See the <ulink url="https://www.postgresql.org/docs/current/app-pgbasebackup.html">PostgreSQL pg_basebackup documentation</ulink>
for more details of available options.

View File

@@ -7,6 +7,14 @@
<secondary>optional settings</secondary>
</indexterm>
<note>
<simpara>
This section documents a subset of optional configuration settings; for a full
and annotated view of all configuration options, see the
<ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/master/repmgr.conf.sample">sample repmgr.conf file</ulink>.
</simpara>
</note>
<variablelist>
@@ -132,5 +140,50 @@ ssh_options='-q -o ConnectTimeout=10'</programlisting>
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-pg-bindir" xreflabel="pg_bindir">
<term><varname>pg_bindir</varname> (<type>string</type>)
<indexterm>
<primary><varname>pg_bindir</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Path to the PostgreSQL binary directory (location of <application>pg_ctl</application>,
<application>pg_basebackup</application> etc.). Only required
if these are not in the system <varname>PATH</varname>.
</para>
<tip>
<para>
When &repmgr; is executed via <application>SSH</application> (e.g. when running
<command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>,
<command><link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link></command> or
<command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command>,
or if it is executed as a cron job), a login shell will not be used and only the
default system <varname>PATH</varname> will be set. Therefore it's recommended to set
<varname>pg_bindir</varname> so &repmgr; can correctly invoke binaries on a remote
system and avoid potential path issues.
</para>
</tip>
<para>
Debian/Ubuntu users: you will probably need to set this to the directory where
<application>pg_ctl</application> is located, e.g. <filename>/usr/lib/postgresql/9.6/bin/</filename>.
</para>
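<para>
As an illustration, using the Debian/Ubuntu PostgreSQL 9.6 location mentioned above
(adjust the version component to match the installed PostgreSQL release), the
<filename>repmgr.conf</filename> entry might look like this:
<programlisting>
# example location for PostgreSQL 9.6 on Debian/Ubuntu; adjust for your installation
pg_bindir='/usr/lib/postgresql/9.6/bin/'</programlisting>
</para>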
<para>
<emphasis>NOTE</emphasis>: <varname>pg_bindir</varname> is only used when &repmgr; directly
executes PostgreSQL binaries; any user-defined scripts
<emphasis>must</emphasis> be specified with the full path.
</para>
</listitem>
</varlistentry>
</variablelist>
<tip>
<simpara>
See the <ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/master/repmgr.conf.sample">sample repmgr.conf file</ulink>
for a full and annotated view of all configuration options.
</simpara>
</tip>
</sect1>

View File

@@ -96,6 +96,9 @@
</variablelist>
</para>
<para>
See <xref linkend="configuration-file-optional-settings"/> for further configuration options.
</para>
</sect1>

View File

@@ -27,7 +27,9 @@
<note>
<para>
If using <application>systemd</application>, ensure you have <varname>RemoveIPC</varname> set to <literal>off</literal>.
See the <ulink url="https://wiki.postgresql.org/wiki/Systemd">systemd</ulink>
See the <ulink url="https://www.postgresql.org/docs/current/index.html">PostgreSQL documentation</ulink> section
<ulink url="https://www.postgresql.org/docs/current/kernel-resources.html#SYSTEMD-REMOVEIPC">systemd RemoveIPC</ulink>
and also the <ulink url="https://wiki.postgresql.org/wiki/Systemd">systemd</ulink>
entry in the <ulink url="https://wiki.postgresql.org/wiki/Main_Page">PostgreSQL wiki</ulink> for details.
</para>
</note>

View File

@@ -80,6 +80,51 @@ conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlistin
</para>
</note>
<sect3 id="configuration-file-include-directives" xreflabel="configuration file include directives">
<title>Configuration file include directives</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>include directives</secondary>
</indexterm>
<para>
From &repmgr; 5.2, the configuration file can contain the following include directives:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>include</option>: include the specified file,
either as an absolute path or path relative to the current file
</simpara>
</listitem>
<listitem>
<simpara>
<option>include_if_exists</option>: include the specified file.
The file is specified as an absolute path or path relative to the current file.
However, if it does not exist, an error will not be raised.
</simpara>
</listitem>
<listitem>
<simpara>
<option>include_dir</option>: include files in the specified directory
which have the <filename>.conf</filename> suffix.
The directory is specified either as an absolute path or path
relative to the current file
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
These behave in exactly the same way as the PostgreSQL configuration file processing;
see the <ulink url="https://www.postgresql.org/docs/current/config-setting.html#CONFIG-INCLUDES">PostgreSQL documentation</ulink>
for additional details.
</para>
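<para>
As a rough illustration, and assuming the PostgreSQL-style syntax described in the linked
documentation, a <filename>repmgr.conf</filename> using these directives might look like
the following; all file and directory names here are hypothetical:
<programlisting>
# hypothetical file and directory names, shown for illustration only

# path relative to the current file; an error is raised if the file is missing
include 'common.conf'

# absolute path; no error is raised if the file is missing
include_if_exists '/etc/repmgr/local.conf'

# directory relative to the current file; files with the .conf suffix are included
include_dir 'conf.d'</programlisting>
</para>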
</sect3>
</sect2>
@@ -119,7 +164,7 @@ conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlistin
</para>
<para>
For a full list of annotated configuration items, see the file
<ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>.
<ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>.
</para>
<para>
For &repmgrd;-specific settings, see <xref linkend="repmgrd-configuration"/>.
@@ -182,6 +227,14 @@ conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlistin
</itemizedlist>
</para>
<para>
In examples provided in this documentation, it is assumed the configuration file is located
at <filename>/etc/repmgr.conf</filename>. If &repmgr; is installed from a package, the
configuration file will probably be in a different location specified by the packager;
see appendix <xref linkend="appendix-packages"/> for configuration file locations in
different packaging systems.
</para>
<para>
Note that if a file is explicitly specified with <literal>-f/--config-file</literal>,
an error will be raised if it is not found or not readable, and no attempt will be made to
@@ -202,6 +255,61 @@ conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'</programlistin
<filename>/path/to/repmgr.conf</filename>).
</para>
</note>
</sect2>
<sect2 id="configuration-file-postgresql-major-upgrades" xreflabel="configuration file and PostgreSQL major version upgrades">
<title>Configuration file and PostgreSQL major version upgrades</title>
<indexterm>
<primary>repmgr.conf</primary>
<secondary>PostgreSQL major version upgrades</secondary>
</indexterm>
<para>
When upgrading the PostgreSQL cluster to a new major version, <filename>repmgr.conf</filename>
will probably need to be updated.
</para>
<para>
Usually <option>pg_bindir</option> and <option>data_directory</option> will need to be modified,
particularly if the default package locations are used, as these usually change.
</para>
<para>
It's also possible the location of <filename>repmgr.conf</filename> itself will change
(e.g. from <filename>/etc/repmgr/11/repmgr.conf</filename> to <filename>/etc/repmgr/12/repmgr.conf</filename>).
This is stored as part of the &repmgr; metadata and is used by &repmgr; to execute &repmgr; remotely
(e.g. during a <link linkend="performing-switchover">switchover operation</link>).
</para>
<para>
If the content and/or location of <filename>repmgr.conf</filename> has changed, the &repmgr; metadata
needs to be updated to reflect this. The &repmgr; metadata can be updated on each node with:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<link linkend="repmgr-primary-register">
<command>repmgr primary register --force -f /path/to/repmgr.conf</command>
</link>
</simpara>
</listitem>
<listitem>
<simpara>
<link linkend="repmgr-standby-register">
<command>repmgr standby register --force -f /path/to/repmgr.conf</command>
</link>
</simpara>
</listitem>
<listitem>
<simpara>
<link linkend="repmgr-witness-register">
<command>repmgr witness register --force -f /path/to/repmgr.conf -h primary_host</command>
</link>
</simpara>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>

View File

@@ -127,8 +127,31 @@ node2:5432:repmgr:repmgr:foo
node2:5432:replication:repluser:foo
node3:5432:repmgr:repmgr:foo
node3:5432:replication:repluser:foo</programlisting>
If you are planning to use the <option>-S</option>/<option>--superuser</option> option,
there must also be an entry enabling the superuser to connect to the &repmgr; database.
Assuming the superuser is <literal>postgres</literal>, the file would look like this:
<programlisting>
node1:5432:repmgr:repmgr:foo
node1:5432:repmgr:postgres:foo
node1:5432:replication:repluser:foo
node2:5432:repmgr:repmgr:foo
node2:5432:repmgr:postgres:foo
node2:5432:replication:repluser:foo
node3:5432:repmgr:repmgr:foo
node3:5432:repmgr:postgres:foo
node3:5432:replication:repluser:foo</programlisting>
</para>
<para>
The <filename>~/.pgpass</filename> file can be simplified with the use of wildcards if
there is no requirement to restrict provision of passwords to particular hosts, ports
or databases. The preceding file could then be formatted like this:
<programlisting>
*:*:*:repmgr:foo
*:*:*:postgres:foo
</programlisting>
</para>
<note>
<para>
It's possible to specify an alternative location for the <filename>~/.pgpass</filename> file, either via
@@ -140,6 +163,11 @@ node3:5432:replication:repluser:foo</programlisting>
location on all nodes, as when connecting to a remote node, the file referenced is the one on the
local node.
</para>
<para>
Additionally, you <emphasis>must</emphasis> specify the passfile location in <filename>repmgr.conf</filename>
with the <option>passfile</option> option so &repmgr; can write the correct path when creating the
<option>primary_conninfo</option> parameter for replication configuration on standbys.
</para>
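<para>
For example, assuming the password file has been placed at the hypothetical location
<filename>/var/lib/pgsql/.pgpass_repmgr</filename>, the corresponding
<filename>repmgr.conf</filename> entry might look like this:
<programlisting>
# hypothetical path; use the actual location of the password file
passfile='/var/lib/pgsql/.pgpass_repmgr'</programlisting>
</para>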
</note>
</sect2>

View File

@@ -270,7 +270,7 @@
<varlistentry>
<term><option>wal_keep_segments</option></term>
<term><option>wal_keep_segments</option> / <option>wal_keep_size</option></term>
<listitem>
@@ -279,25 +279,36 @@
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<indexterm>
<primary>wal_keep_size</primary>
<secondary>PostgreSQL configuration</secondary>
</indexterm>
<para>
Normally there is no need to set <option>wal_keep_segments</option> (default: <literal>0</literal>), as it
is <emphasis>not</emphasis> a reliable way of ensuring that all required WAL segments are available to standbys.
Replication slots and/or an archiving solution such as Barman are recommended to ensure standbys have a reliable
Normally there is no need to set <option>wal_keep_segments</option>
(PostgreSQL 13 and later: <varname>wal_keep_size</varname>; default: <literal>0</literal>),
as it is <emphasis>not</emphasis> a reliable way of ensuring that all required WAL
segments are available to standbys. Replication slots and/or an archiving solution
such as Barman are recommended to ensure standbys have a reliable
source of WAL segments at all times.
</para>
<para>
The only reason ever to set <option>wal_keep_segments</option> is you have
you have configured <option>pg_basebackup_options</option>
The only reason ever to set <option>wal_keep_segments</option> / <option>wal_keep_size</option>
is if you have configured <option>pg_basebackup_options</option>
in <filename>repmgr.conf</filename> to include the setting <literal>--wal-method=fetch</literal>
(PostgreSQL 9.6 and earlier: <literal>--xlog-method=fetch</literal>)
<emphasis>and</emphasis> you have <emphasis>not</emphasis> set <option>restore_command</option>
in <filename>repmgr.conf</filename> to fetch WAL files from a reliable source such as Barman,
in which case you'll need to set <option>wal_keep_segments</option>
to a sufficiently high number to ensure that all WAL files required by the standby
are retained. However we do not recommend managing replication in this way.
are retained. However, we do not recommend managing WAL retention in this way.
</para>
<para>
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-WAL-KEEP-SEGMENTS">wal_keep_segments</ulink>.
<!--
PostgreSQL documentation: <ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-WAL-KEEP-SIZE">wal_keep_size</ulink>.
-->
</para>
</listitem>
</varlistentry>

View File

@@ -95,7 +95,8 @@
</para>
<para>
The following parameters are provided for a subset of event notifications:
The following parameters are provided for a subset of event notifications; their meaning may
change according to context:
</para>
<variablelist>
@@ -108,6 +109,9 @@
<para>
node ID of the demoted primary (<xref linkend="repmgr-standby-switchover"/> only)
</para>
<para>
node ID of the former primary (<literal>repmgrd_failover_promote</literal> only)
</para>
</listitem>
</varlistentry>
<varlistentry>
@@ -133,7 +137,7 @@
<para>
The values provided for <literal>%c</literal> and <literal>%a</literal>
will probably contain spaces, so should always be quoted.
may contain spaces, so should always be quoted.
</para>
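<para>
A sketch of an <option>event_notification_command</option> setting in
<filename>repmgr.conf</filename> with both values quoted; the script path is
hypothetical, and the placeholders <literal>%n</literal> and <literal>%e</literal>
are included only to show unquoted values alongside the quoted ones:
<programlisting>
# hypothetical script path; %c and %a are double-quoted as they may contain spaces
event_notification_command='/path/to/some/script %n %e "%c" "%a"'</programlisting>
</para>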
<para>

View File

@@ -22,16 +22,15 @@
<para>
&repmgr; RPM packages for RedHat/CentOS variants and Fedora are available from the
<ulink url="https://2ndquadrant.com">2ndQuadrant</ulink>
<ulink url="https://dl.2ndquadrant.com/">public repository</ulink>; see following
<ulink url="https://www.enterprisedb.com">EDB</ulink>
<ulink url="https://dl.enterprisedb.com/">public repository</ulink>; see following
section for details.
</para>
<note>
<para>
Currently the <ulink url="https://2ndquadrant.com">2ndQuadrant</ulink>
<ulink url="https://dl.2ndquadrant.com/">public repository</ulink> provides
support for RedHat/CentOS versions 5, 6 and 7. Support for version 8 is
available via the PGDG repository; see below for details.
Currently the <ulink url="https://www.enterprisedb.com">EDB</ulink>
<ulink url="https://dl.enterprisedb.com/">public repository</ulink> provides
support for RedHat/CentOS versions 6, 7 and 8.
</para>
</note>
<para>
@@ -45,7 +44,7 @@
<note>
<para>
&repmgr; RPM packages are designed to be compatible with the community-provided PostgreSQL packages
and 2ndQuadrant's <ulink url="https://www.2ndquadrant.com/en/resources/2ndqpostgres/">2ndQPostgres</ulink>.
and EDB's PostgreSQL Extended Server (formerly 2ndQPostgres).
They may not work with vendor-specific packages such as those provided by RedHat for RHEL
customers, as the PostgreSQL filesystem layout may be different to the community RPMs.
Please contact your support vendor for assistance.
@@ -64,16 +63,16 @@
<sect3 id="installation-packages-redhat-2ndq">
<title>2ndQuadrant public RPM yum repository</title>
<title>EDB public RPM yum repository</title>
<para>
<ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides a dedicated <literal>yum</literal>
<ulink url="https://dl.2ndquadrant.com/">public repository</ulink> for 2ndQuadrant software,
<ulink url="https://www.enterprisedb.com/">EDB</ulink> provides a dedicated <literal>yum</literal>
<ulink url="https://dl.enterprisedb.com/">public repository</ulink> for EDB software,
including &repmgr;. We recommend using this for all future &repmgr; releases.
</para>
<para>
General instructions for using this repository can be found on its
<ulink url="https://dl.2ndquadrant.com/">homepage</ulink>. Specific instructions
<ulink url="https://dl.enterprisedb.com/">homepage</ulink>. Specific instructions
for installing &repmgr; follow below.
</para>
<para>
@@ -83,25 +82,25 @@
<listitem>
<para>
Locate the repository RPM for your PostgreSQL version from the list at:
<ulink url="https://dl.2ndquadrant.com/">https://dl.2ndquadrant.com/</ulink>
<ulink url="https://dl.enterprisedb.com/">https://dl.enterprisedb.com/</ulink>
</para>
</listitem>
<listitem>
<para>
Install the repository definition for your distribution and PostgreSQL version
(this enables the 2ndQuadrant repository as a source of &repmgr; packages).
<listitem>
<para>
Install the repository definition for your distribution and PostgreSQL version
(this enables the EDB repository as a source of &repmgr; packages).
</para>
<para>
For example, for PostgreSQL 11 on CentOS, execute:
<programlisting>
curl https://dl.2ndquadrant.com/default/release/get/11/rpm | sudo bash</programlisting>
curl https://dl.enterprisedb.com/default/release/get/11/rpm | sudo bash</programlisting>
</para>
<para>
For PostgreSQL 9.6 on CentOS, execute:
<programlisting>
curl https://dl.2ndquadrant.com/default/release/get/9.6/rpm | sudo bash</programlisting>
curl https://dl.enterprisedb.com/default/release/get/9.6/rpm | sudo bash</programlisting>
</para>
@@ -145,7 +144,7 @@ yum search repmgr</programlisting>
<emphasis>Compatibility with PGDG Repositories</emphasis>
</para>
<para>
The 2ndQuadrant &repmgr; yum repository packages use the same definitions and file system layout as the
The EDB &repmgr; yum repository packages use the same definitions and file system layout as the
main PGDG repository.
</para>
<para>
@@ -154,7 +153,7 @@ yum search repmgr</programlisting>
the packages are installed from.
</para>
<para>
To ensure the 2ndQuadrant repository is always prioritised, install <literal>yum-plugin-priorities</literal>
To ensure the EDB repository is always prioritised, install <literal>yum-plugin-priorities</literal>
and set the repository priorities accordingly.
</para>
@@ -217,16 +216,16 @@ repmgr11.x86_64 4.4-1.el7 2nd
</para>
<sect3 id="installation-packages-debian-ubuntu-2ndq">
<title>2ndQuadrant public apt repository for Debian/Ubuntu</title>
<title>EDB public apt repository for Debian/Ubuntu</title>
<para>
<ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides a
<ulink url="https://dl.2ndquadrant.com/">public apt repository</ulink> for 2ndQuadrant software,
<ulink url="https://www.enterprisedb.com/">EDB</ulink> provides a
<ulink url="https://dl.enterprisedb.com/">public apt repository</ulink> for EDB software,
including &repmgr;.
</para>
<para>
General instructions for using this repository can be found on its
<ulink url="https://dl.2ndquadrant.com/">homepage</ulink>. Specific instructions
<ulink url="https://dl.enterprisedb.com/">homepage</ulink>. Specific instructions
for installing &repmgr; follow below.
</para>
@@ -239,9 +238,9 @@ repmgr11.x86_64 4.4-1.el7 2nd
<listitem>
<para>
Install the repository definition for your distribution and PostgreSQL version
(this enables the 2ndQuadrant repository as a source of &repmgr; packages) by executing:
(this enables the EDB repository as a source of &repmgr; packages) by executing:
<programlisting>
curl https://dl.2ndquadrant.com/default/release/get/deb | sudo bash</programlisting>
curl https://dl.enterprisedb.com/default/release/get/deb | sudo bash</programlisting>
</para>
<note>
<para>

View File

@@ -14,7 +14,7 @@
</para>
<para>
&repmgr; &repmgrversion; is compatible with all PostgreSQL versions from 9.3. See
&repmgr; &repmgrversion; is compatible with all PostgreSQL versions from 9.4. See
section <link linkend="install-compatibility-matrix">&repmgr; compatibility matrix</link>
for an overview of version compatibility.
</para>
@@ -93,7 +93,7 @@
<table id="repmgr-compatibility-matrix">
<title>&repmgr; compatibility matrix</title>
<tgroup cols="3">
<tgroup cols="4">
<thead>
<row>
<entry>
@@ -112,10 +112,9 @@
</thead>
<tbody>
<row>
<entry>
&repmgr; 5.x
&repmgr; 5.3
</entry>
<entry>
YES
@@ -123,11 +122,57 @@
<entry>
<link linkend="release-current">&repmgrversion;</link> (&releasedate;)
</entry>
<entry>
9.4, 9.5, 9.6, 10, 11, 12, 13, 14
</entry>
</row>
<row>
<entry>
&repmgr; 5.2
</entry>
<entry>
NO
</entry>
<entry>
<link linkend="release-5.2.1">5.2.1</link> (2020-12-07)
</entry>
<entry>
9.4, 9.5, 9.6, 10, 11, 12, 13
</entry>
</row>
<row>
<entry>
&repmgr; 5.1
</entry>
<entry>
NO
</entry>
<entry>
<link linkend="release-5.1.0">5.1.0</link> (2020-04-13)
</entry>
<entry>
9.3, 9.4, 9.5, 9.6, 10, 11, 12
</entry>
</row>
<row>
<entry>
&repmgr; 5.0
</entry>
<entry>
NO
</entry>
<entry>
<link linkend="release-5.0">5.0</link> (2019-10-15)
</entry>
<entry>
9.3, 9.4, 9.5, 9.6, 10, 11, 12
</entry>
</row>
<row>
<entry>
&repmgr; 4.x
@@ -193,52 +238,49 @@
<sect2 id="install-postgresql-93-94">
<title>PostgreSQL 9.3 and 9.4 support</title>
<title>PostgreSQL 9.4 support</title>
<indexterm>
<primary>PostgreSQL 9.3</primary>
<primary>PostgreSQL 9.4</primary>
<secondary>repmgr support</secondary>
</indexterm>
<para>
Note that some &repmgr; functionality is not available in PostgreSQL 9.3 and PostgreSQL 9.4:
Note that some &repmgr; functionality is not available in PostgreSQL 9.4:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
PostgreSQL 9.3 does not support replication slots, so corresponding &repmgr; functionality
is not available.
</para>
</listitem>
<listitem>
<para>
In PostgreSQL 9.3 and PostgreSQL 9.4, <command>pg_rewind</command> is not part of the core
In PostgreSQL 9.4, <command>pg_rewind</command> is not part of the core
distribution. <command>pg_rewind</command> will need to be compiled separately to be able
to use any &repmgr; functionality which takes advantage of it.
</para>
</listitem>
</itemizedlist>
<important>
<warning>
<para>
PostgreSQL 9.3 has reached the end of its community support period (final release was
<ulink url="https://www.postgresql.org/docs/9.3/release-9-3-25.html">9.3.25</ulink>
in November 2018) and will no longer be updated with security or bugfixes.
</para>
<para>
Beginning with &repmgr; 5.2, &repmgr; no longer supports PostgreSQL 9.3.
</para>
<para>
PostgreSQL 9.4 has reached the end of its community support period (final release was
<ulink url="https://www.postgresql.org/docs/9.4/release-9-4-26.html">9.4.26</ulink>
in February 2020) and will no longer be updated with security or bugfixes.
</para>
<para>
We recommend that users of these versions migrate to a recent PostgreSQL version
We recommend that users of these versions migrate to a supported PostgreSQL version
as soon as possible.
</para>
<para>
For further details, see the <ulink url="https://www.postgresql.org/support/versioning/">PostgreSQL Versioning Policy</ulink>.
</para>
</important>
</warning>
</sect2>

View File

@@ -178,18 +178,18 @@ deb-src https://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main</programlist
<para>
The source for &repmgr; is maintained at
<ulink url="https://github.com/2ndQuadrant/repmgr">https://github.com/2ndQuadrant/repmgr</ulink>.
<ulink url="https://github.com/EnterpriseDB/repmgr">https://github.com/EnterpriseDB/repmgr</ulink>.
</para>
<para>
There are also tags for each <ulink url="https://github.com/2ndQuadrant/repmgr/releases">&repmgr; release</ulink>, e.g.
<literal><ulink url="https://github.com/2ndQuadrant/repmgr/releases/tag/v4.4.0">v4.4.0</ulink></literal>.
There are also tags for each <ulink url="https://github.com/EnterpriseDB/repmgr/releases">&repmgr; release</ulink>, e.g.
<literal><ulink url="https://github.com/EnterpriseDB/repmgr/releases/tag/v4.4.0">v4.4.0</ulink></literal>.
</para>
<para>
Clone the source code using <application>git</application>:
<programlisting>
git clone https://github.com/2ndQuadrant/repmgr</programlisting>
git clone https://github.com/EnterpriseDB/repmgr</programlisting>
</para>
<para>

View File

@@ -1,18 +1,18 @@
<!-- doc/legal.xml -->
<date>2017</date>
<date>2022</date>
<copyright>
<year>2010-2020</year>
<holder>2ndQuadrant, Ltd.</holder>
<year>2010-2022</year>
<holder>EDB</holder>
</copyright>
<legalnotice id="legalnotice">
<title>Legal Notice</title>
<para>
<productname>repmgr</productname> is Copyright &copy; 2010-2020
by 2ndQuadrant, Ltd. All rights reserved.
<productname>repmgr</productname> is Copyright &copy; 2010-2022
by EDB. All rights reserved.
</para>
<para>

View File

@@ -284,7 +284,7 @@
<tip>
<simpara>
For Debian-based distributions we recommend explictly setting
For Debian-based distributions we recommend explicitly setting
<option>pg_bindir</option> to the directory where <command>pg_ctl</command> and other binaries
not in the standard path are located. For PostgreSQL 9.6 this would be <filename>/usr/lib/postgresql/9.6/bin/</filename>.
</simpara>
@@ -302,7 +302,7 @@
<para>
See the file
<ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>
<ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>
for details of all available configuration parameters.
</para>

View File

@@ -25,7 +25,7 @@
</para>
<para>
By default, &repmgr; will wait for up to 15 seconds to confirm that &repmgrd;
started. This behaviour can be overridden by specifying a diffent value using the <option>--wait</option>
started. This behaviour can be overridden by specifying a different value using the <option>--wait</option>
option, or disabled altogether with the <option>--no-wait</option> option.
</para>

View File

@@ -26,7 +26,7 @@
<para>
By default, &repmgr; will wait for up to 15 seconds to confirm that &repmgrd;
stopped. This behaviour can be overridden by specifying a diffent value using the <option>--wait</option>
stopped. This behaviour can be overridden by specifying a different value using the <option>--wait</option>
option, or disabled altogether with the <option>--no-wait</option> option.
</para>
<note>

View File

@@ -125,12 +125,69 @@
is correctly configured.
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>repmgrd</title>
<para>
A separate check is available to verify whether &repmgrd; is running.
This is not included in the general output, as it does not
per se constitute a check of the node's replication status.
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>--repmgrd</option>: checks whether &repmgrd; is running.
If &repmgrd; is running but paused, status <literal>1</literal>
(<literal>WARNING</literal>) is returned.
</simpara>
</listitem>
</itemizedlist>
</refsect1>
<refsect1>
<title>Additional checks</title>
<para>
Several checks are provided for diagnostic purposes and are not
included in the general output:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>--db-connection</option>: checks if &repmgr; can connect to the
database on the local node.
</simpara>
<simpara>
This option is particularly useful in combination with <command>SSH</command>, as
it can be used to troubleshoot connection issues encountered when &repmgr; is
executed remotely (e.g. during a switchover operation).
</simpara>
</listitem>
<listitem>
<simpara>
<option>--replication-config-owner</option>: checks if the file containing replication
configuration (PostgreSQL 12 and later: <filename>postgresql.auto.conf</filename>;
PostgreSQL 11 and earlier: <filename>recovery.conf</filename>) is
owned by the same user who owns the data directory.
</simpara>
<simpara>
Incorrect ownership of these files (e.g. if they are owned by <literal>root</literal>)
will cause operations which need to update the replication configuration
(e.g. <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>
or <link linkend="repmgr-standby-promote"><command>repmgr standby promote</command></link>)
to fail.
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Connection options</title>
<para>

View File

@@ -22,6 +22,10 @@
This can optionally use <application>pg_rewind</application> to re-integrate
a node which has diverged from the rest of the cluster, typically a failed primary.
</para>
<para>
Note that <command>repmgr node rejoin</command> can only be used to attach
a standby to the current primary, not another standby.
</para>
<tip>
<para>
@@ -43,7 +47,12 @@
<programlisting>
repmgr node rejoin -d '$conninfo'</programlisting>
where <literal>$conninfo</literal> is the conninfo string of any reachable node in the cluster.
where <literal>$conninfo</literal> is the PostgreSQL <literal>conninfo</literal> string of the
<emphasis>current</emphasis> primary node (or that of any reachable node in the cluster, but
<emphasis>not</emphasis> the local node). This is so that &repmgr; can fetch up-to-date information
about the current state of the cluster.
</para>
<para>
<filename>repmgr.conf</filename> for the stopped node <emphasis>must</emphasis> be supplied explicitly if not
otherwise available.
</para>
@@ -71,7 +80,7 @@
</para>
<para>
It is only necessary to provide the <application>pg_rewind</application> path
if using PostgreSQL 9.3 or 9.4, and <application>pg_rewind</application>
if using PostgreSQL 9.4, and <application>pg_rewind</application>
is not installed in the PostgreSQL <filename>bin</filename> directory.
</para>
</listitem>
@@ -207,9 +216,18 @@
a standby to the current primary, not another standby.
</para>
<para>
The node must have been shut down cleanly; if this was not the case, it will
need to be manually started (remove any existing <filename>recovery.conf</filename> file first)
until it has reached a consistent recovery point, then shut down cleanly.
The node's PostgreSQL instance must have been shut down cleanly. If this was not the
case, it will need to be started up until it has reached a consistent recovery point,
then shut down cleanly.
</para>
<para>
In PostgreSQL 13 and later, this will be done automatically
if the <option>--force-rewind</option> option is provided (even if an actual rewind
is not necessary).
</para>
<para>
With PostgreSQL 12 and earlier, PostgreSQL will need to
be started and shut down manually; see below for the best way to do this.
</para>
<tip>
<para>
@@ -221,11 +239,14 @@
rm -f /var/lib/pgsql/data/recovery.conf
postgres --single -D /var/lib/pgsql/data/ &lt; /dev/null</programlisting>
</para>
<para>
Note that <filename>standby.signal</filename> (PostgreSQL 11 and earlier:
<filename>recovery.conf</filename>) <emphasis>must</emphasis> be removed
from the data directory for PostgreSQL to be able to start in single
user mode.
</para>
</tip>
<para>
&repmgr; will attempt to verify whether the node can rejoin as-is, or whether
<command>pg_rewind</command> must be used (see following section).
</para>
</refsect1>
<refsect1 id="repmgr-node-rejoin-pg-rewind" xreflabel="Using pg_rewind">
@@ -241,7 +262,7 @@
<command>repmgr node rejoin</command> can optionally use <command>pg_rewind</command> to re-integrate a
node which has diverged from the rest of the cluster, typically a failed primary.
<command>pg_rewind</command> is available in PostgreSQL 9.5 and later as part of the core distribution,
and can be installed from external sources for PostgreSQL 9.3 and 9.4.
and can be installed from external sources for PostgreSQL 9.4.
</para>
<note>
<para>
@@ -264,6 +285,7 @@
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node3 dbname=repmgr user=repmgr' \
--force-rewind --config-files=postgresql.local.conf,postgresql.conf --verbose --dry-run
NOTICE: rejoin target is node "node3" (node ID: 3)
INFO: replication connection to the rejoin target node was successful
INFO: local and rejoin target system identifiers match
DETAIL: system identifier is 6652184002263212600
@@ -283,7 +305,15 @@
to execute <command>pg_rewind</command> to ensure the node can be rejoined successfully.
</para>
<important>
<refsect2 id="repmgr-node-rejoin-pg-rewind-config-files" xreflabel="pg_rewind and configuration files">
<title><command>pg_rewind</command> and configuration file retention</title>
<indexterm>
<primary>pg_rewind</primary>
<secondary>configuration file retention</secondary>
</indexterm>
<para>
Be aware that if <command>pg_rewind</command> is executed and actually performs a
rewind operation, any configuration files in the PostgreSQL data directory will be
@@ -291,19 +321,30 @@
</para>
<para>
To prevent this happening, provide a comma-separated list of files to retain
using the <literal>--config-file</literal> command line option; the specified files
using the <option>--config-files</option> command line option; the specified files
will be archived in a temporary directory (whose parent directory can be specified with
<literal>--config-archive-dir</literal>) and restored once the rewind operation is
complete.
<option>--config-archive-dir</option>, default: <filename>/tmp</filename>)
and restored once the rewind operation is complete.
</para>
</important>
</refsect2>
<para>
Example, first using <literal>--dry-run</literal>, then actually executing the
<literal>node rejoin command</literal>.
<programlisting>
<refsect2 id="repmgr-node-rejoin-pg-rewind-example" xreflabel="example using repmgr node rejoin and pg_rewind">
<title>Example using <command>repmgr node rejoin</command> and <command>pg_rewind</command></title>
<indexterm>
<primary>pg_rewind</primary>
<secondary>configuration file retention</secondary>
</indexterm>
<para>
Example, first using <option>--dry-run</option>, then actually executing the
<command>repmgr node rejoin</command> command.
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node3 dbname=repmgr user=repmgr' \
--config-files=postgresql.local.conf,postgresql.conf --verbose --force-rewind --dry-run
NOTICE: rejoin target is node "node3" (node ID: 3)
INFO: replication connection to the rejoin target node was successful
INFO: local and rejoin target system identifiers match
DETAIL: system identifier is 6652460429293670710
@@ -317,17 +358,17 @@
pg_rewind -D '/var/lib/postgresql/data' --source-server='host=node3 dbname=repmgr user=repmgr'
INFO: prerequisites for executing NODE REJOIN are met</programlisting>
<note>
<para>
If <option>--force-rewind</option> is used with the <option>--dry-run</option> option,
this checks the prerequisites for using <application>pg_rewind</application>, but is
not an absolute guarantee that actually executing <application>pg_rewind</application>
will succeed. See also section <xref linkend="repmgr-node-rejoin-caveats"/> below.
</para>
<note>
<para>
If <option>--force-rewind</option> is used with the <option>--dry-run</option> option,
this checks the prerequisites for using <application>pg_rewind</application>, but is
not an absolute guarantee that actually executing <application>pg_rewind</application>
will succeed. See also section <xref linkend="repmgr-node-rejoin-caveats"/> below.
</para>
</note>
</note>
<programlisting>
<programlisting>
$ repmgr node rejoin -f /etc/repmgr.conf -d 'host=node3 dbname=repmgr user=repmgr' \
--config-files=postgresql.local.conf,postgresql.conf --verbose --force-rewind
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 3
@@ -339,8 +380,8 @@
NOTICE: starting server using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/pgsql/data' start"
NOTICE: NODE REJOIN successful
DETAIL: node 2 is now attached to node 3</programlisting>
</para>
</para>
</refsect2>
</refsect1>
<refsect1 id="repmgr-node-rejoin-caveats" xreflabel="Caveats">
@@ -369,6 +410,11 @@
the current standby's PostgreSQL log will contain entries with the text
&quot;<literal>record with incorrect prev-link</literal>&quot;.
</para>
<para>
In PostgreSQL 9.5 and earlier, it is <emphasis>not</emphasis> possible to use
<application>pg_rewind</application> to attach to a target node with a lower
timeline than the local node.
</para>
<para>
We strongly recommend running <command>repmgr node rejoin</command> with the
<option>--dry-run</option> option first. Additionally it might be a good idea
@@ -378,6 +424,52 @@
is running in <option>--dry-run</option> mode.
</para>
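<para>
The recommendation above can be wrapped in a small shell helper. The following is a
minimal sketch and not part of &repmgr;: the function name <literal>dry_run_then_execute</literal>
is hypothetical, and it assumes only that the wrapped command accepts <option>--dry-run</option>
and returns a non-zero exit code when its prerequisite checks fail. A successful dry run
is still not a guarantee that the actual operation will succeed.
</para>

```shell
# Hypothetical helper: run a command with --dry-run appended first,
# and execute it for real only if the dry run exits with status 0.
dry_run_then_execute() {
    if "$@" --dry-run; then
        # Dry run reported no problems: run the same command without --dry-run.
        "$@"
    else
        echo "dry run failed, not executing: $*"
        return 1
    fi
}

# Possible usage (connection string is illustrative only):
#   dry_run_then_execute repmgr node rejoin -f /etc/repmgr.conf \
#       -d 'host=node3 dbname=repmgr user=repmgr' --force-rewind
```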
<warning>
<para>
In all current PostgreSQL versions (as of September 2020), <application>pg_rewind</application>
contains a corner-case bug which affects standbys in a very specific situation.
</para>
<para>
This situation occurs when a standby was shut down <emphasis>before</emphasis> its
primary node, and an attempt is made to attach this standby to another primary
in the same cluster (following a &quot;split brain&quot; situation where the standby
was connected to the wrong primary). In this case, &repmgr; will correctly determine
that <application>pg_rewind</application> should be executed, however
<application>pg_rewind</application> incorrectly decides that no action is necessary.
</para>
<para>
In this situation, &repmgr; will report something like:
<programlisting>
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 1
DETAIL: rejoin target server's timeline 3 forked off current database system timeline 2 before current recovery point 0/7019C10</programlisting>
but when executed, <application>pg_rewind</application> will report:
<programlisting>
pg_rewind: servers diverged at WAL location 0/7015540 on timeline 2
pg_rewind: no rewind required</programlisting>
and if an attempt is made to attach the standby to the new primary, PostgreSQL logs on the standby
will contain errors like:
<programlisting>
[2020-09-07 15:01:41 UTC] LOG: 00000: replication terminated by primary server
[2020-09-07 15:01:41 UTC] DETAIL: End of WAL reached on timeline 2 at 0/7015540.
[2020-09-07 15:01:41 UTC] LOG: 00000: new timeline 3 forked off current database system timeline 2 before current recovery point 0/7019C10</programlisting>
</para>
<para>
Currently it is not possible to resolve this situation using <application>pg_rewind</application>.
A <ulink url="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2b4f3130382fe2f8705863e4d38589d4d69cd695">patch</ulink>
has been successfully submitted and will be included in the next PostgreSQL minor release round, scheduled for
February 2021.
</para>
<para>
As a workaround, start the primary server the standby was previously attached to,
and ensure the standby can be attached to it. If <application>pg_rewind</application> was actually executed,
it will have copied in the <filename>.history</filename> file from the target primary server; this must
be removed. <command>repmgr node rejoin</command> can then be used to attach the standby to the original
primary. Ensure any changes pending on the primary have propagated to the standby. Then shut down the primary
server <emphasis>first</emphasis>, before shutting down the standby. It should then be possible to
use <command>repmgr node rejoin</command> to attach the standby to the new primary.
</para>
</warning>
</refsect1>
<refsect1>

View File

@@ -60,6 +60,17 @@
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force</option></term>
<listitem>
<para>
Forcibly unregister the node if it is registered as an active primary,
as long as it has no registered standbys; or if it is registered as
a primary but running as a standby.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>

View File

@@ -104,14 +104,6 @@
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><varname>standby_mode</varname> (always <literal>'on'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>recovery_target_timeline</varname> (always <literal>'latest'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>primary_conninfo</varname></simpara>
</listitem>
@@ -122,6 +114,21 @@
</itemizedlist>
<para>
For PostgreSQL 11 and earlier, these parameters will also be set:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><varname>standby_mode</varname> (always <literal>'on'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>recovery_target_timeline</varname> (always <literal>'latest'</literal>)</simpara>
</listitem>
</itemizedlist>
<para>
The following additional parameters can be specified in <filename>repmgr.conf</filename>
for inclusion in the replication configuration:
@@ -175,7 +182,10 @@
<programlisting>
pg_basebackup_options='--wal-method=fetch'</programlisting>
and ensure that <literal>wal_keep_segments</literal> is set to an appropriately high value.
and ensure that <literal>wal_keep_segments</literal> (PostgreSQL 13 and later:
<literal>wal_keep_size</literal>) is set to an appropriately high value. Note
however that this is not a particularly reliable way of ensuring sufficient
WAL is retained and is not recommended.
See the <ulink url="https://www.postgresql.org/docs/current/app-pgbasebackup.html">
pg_basebackup</ulink> documentation for details.
</para>
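<para>
Note that the two parameters use different units: <literal>wal_keep_segments</literal>
is a number of WAL segment files, whereas <literal>wal_keep_size</literal> is a size
(interpreted as megabytes if no unit is given). The following sketch shows the
arithmetic for converting one to the other. The helper name <literal>wal_keep_size_mb</literal>
is hypothetical, and the default WAL segment size of 16MB is assumed unless a different
size is passed as the second argument.
</para>

```shell
# Hypothetical helper: convert a wal_keep_segments value into the
# equivalent wal_keep_size value in megabytes.
# Assumption: the WAL segment size is 16MB (the PostgreSQL default)
# unless a different size in MB is given as the second argument.
wal_keep_size_mb() {
    segments="$1"
    segment_mb="${2:-16}"
    echo $((segments * segment_mb))
}

wal_keep_size_mb 64    # prints 1024 (64 segments of 16MB each)
```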
@@ -188,6 +198,25 @@
</note>
</refsect1>
<refsect1 id="repmgr-standby-clone-wal-directory">
<title>Placing WAL files into a different directory</title>
<para>
To ensure that WAL files are placed in a directory outside of the main data
directory (e.g. to keep them on a separate disk for performance reasons),
specify the location with <option>--waldir</option>
(PostgreSQL 9.6 and earlier: <option>--xlogdir</option>) in
the <filename>repmgr.conf</filename> parameter <option>pg_basebackup_options</option>,
e.g.:
<programlisting>
pg_basebackup_options='--waldir=/path/to/wal-directory'</programlisting>
This setting will also be honored by &repmgr; when cloning from Barman
(&repmgr; 5.2 and later).
</para>
</refsect1>
<!-- don't rename this id as it may be used in external links -->
<refsect1 id="repmgr-standby-create-recovery-conf">
@@ -231,16 +260,29 @@
upstream node if required.
</para>
<para>
Note that the upstream node must be running. In PostgreSQL 11 and earlier, an existing
<filename>recovery.conf</filename> will not be overwritten unless the
<option>-F/--force</option> option is provided.
The upstream node must be running so the correct replication configuration can be obtained.
</para>
<para>
Execute <command>repmgr standby clone --replication-conf-only --dry-run</command>
to check the prerequisites for creating the recovery configuration,
and display the contents of the configuration which would be added without actually
making any changes.
If the standby is running, the replication configuration will not be written unless the
<option>-F/--force</option> option is provided.
</para>
<tip>
<para>
Execute <command>repmgr standby clone --replication-conf-only --dry-run</command>
to check the prerequisites for creating the recovery configuration,
and display the configuration changes which would be made without actually
making any changes.
</para>
</tip>
<para>
In PostgreSQL 13 and later, the PostgreSQL configuration must be reloaded for replication
configuration changes to take effect.
</para>
<para>
In PostgreSQL 12 and earlier, the PostgreSQL instance must be restarted for replication
configuration changes to take effect.
</para>
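<para>
For scripts which need to handle multiple PostgreSQL versions, the rule described above
can be expressed as a simple version check. This is a sketch only; the function name
<literal>replication_conf_action</literal> is hypothetical, and it assumes the PostgreSQL
major version is passed as a plain integer (e.g. <literal>9</literal> for any 9.x release).
</para>

```shell
# Hypothetical helper: given the PostgreSQL major version as an integer,
# print the action required for replication configuration changes
# to take effect.
replication_conf_action() {
    major="$1"
    if [ "$major" -ge 13 ]; then
        # PostgreSQL 13 and later: a configuration reload is sufficient.
        echo "reload"
    else
        # PostgreSQL 12 and earlier: the instance must be restarted.
        echo "restart"
    fi
}

replication_conf_action 13    # prints "reload"
replication_conf_action 12    # prints "restart"
```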
</refsect1>
@@ -302,6 +344,25 @@
</listitem>
</varlistentry>
<varlistentry>
<term><option>--recovery-min-apply-delay</option></term>
<listitem>
<para>
Set the PostgreSQL configuration parameter <option>recovery_min_apply_delay</option>
to the provided value.
</para>
<para>
This overrides any <option>recovery_min_apply_delay</option> provided via
<filename>repmgr.conf</filename>.
</para>
<para>
For more details on this parameter, see:
<ulink url="https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-RECOVERY-MIN-APPLY-DELAY">recovery_min_apply_delay</ulink>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-R, --remote-user=USERNAME</option></term>
<listitem>
@@ -317,14 +378,14 @@
<para>
Create recovery configuration for a previously cloned instance.
</para>
<para>
In PostgreSQL 11 and earlier, the replication configuration will be
written to <filename>recovery.conf</filename>.
</para>
<para>
In PostgreSQL 12 and later, the replication configuration will be
written to <filename>postgresql.auto.conf</filename>.
</para>
<para>
In PostgreSQL 11 and earlier, the replication configuration will be
written to <filename>recovery.conf</filename>.
</para>
</listitem>
</varlistentry>
@@ -370,6 +431,23 @@
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--verify-backup</option></term>
<listitem>
<para>
<!-- update link after Pg13 release -->
Verify a cloned node using the
<ulink url="https://www.postgresql.org/docs/13/app-pgverifybackup.html">pg_verifybackup</ulink>
utility (PostgreSQL 13 and later).
</para>
<para>
This option can currently only be used when cloning directly from an upstream
node.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--without-barman</option></term>
<listitem>

View File

@@ -47,7 +47,15 @@
</para>
<para>
This command will force a restart of PostgreSQL on the standby node.
In PostgreSQL 12 and earlier, this command will force a restart of PostgreSQL on the standby node.
</para>
<para>
In PostgreSQL 13 and later, by default this command will signal PostgreSQL to reload its
configuration, which will cause PostgreSQL to follow the new upstream without
a restart. If this behaviour is not desired for whatever reason, the configuration
file parameter <varname>standby_follow_restart</varname> can be set to <literal>true</literal>
to always force a restart.
</para>
<para>

View File

@@ -66,10 +66,10 @@
Both values can be defined in <filename>repmgr.conf</filename>.
</para>
<note>
<warning>
<para>
If WAL replay is paused on the standby, and not all WAL files on the standby have been
replayed, &repmgr; will not attempt to promote it.
In PostgreSQL 12 and earlier, if WAL replay is paused on the standby, and not all
WAL files on the standby have been replayed, &repmgr; will not attempt to promote it.
</para>
<para>
This is because if WAL replay is paused, PostgreSQL itself will not react to a promote command
@@ -81,7 +81,10 @@
Note that if the standby is in archive recovery, &repmgr; will not be able to determine
if more WAL is pending replay, and will abort the promotion attempt if WAL replay is paused.
</para>
</note>
<para>
This restriction does <emphasis>not</emphasis> apply to PostgreSQL 13 and later.
</para>
</warning>
</refsect1>
@@ -95,7 +98,6 @@
NOTICE: promoting standby to primary
DETAIL: promoting server "node2" (ID: 2) using "pg_ctl -l /var/log/postgres/startup.log -w -D '/var/lib/postgres/data' promote"
server promoting
DEBUG: setting node 2 as primary and marking existing primary as failed
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node2" (ID: 2) was successfully promoted to primary</programlisting>
</para>
@@ -170,6 +172,42 @@
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option></term>
<term><option>--force</option></term>
<listitem>
<para>
Ignore warnings and continue anyway.
</para>
<para>
This option is relevant in the following situations if <option>--siblings-follow</option> was specified:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
If one or more sibling nodes were not reachable via SSH, the standby will be promoted anyway.
</simpara>
</listitem>
<listitem>
<simpara>
If the promotion candidate has insufficient free walsenders to accommodate the standbys which will
be attached to it, the standby will be promoted anyway.
</simpara>
</listitem>
<listitem>
<simpara>
If replication slots are in use but the promotion candidate has insufficient free replication slots
to accommodate the standbys which will be attached to it, the standby will be promoted anyway.
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
Note that if the <option>-F</option>/<option>--force</option> option is used when any of the above
situations is encountered, the onus is on the user to manually resolve any resulting issues.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>

View File

@@ -147,7 +147,7 @@
<para>
Use <application>pg_rewind</application> to reintegrate the old primary if necessary
(and the prerequisites for using <application>pg_rewind</application> are met).
If using PostgreSQL 9.3 or 9.4, and the <application>pg_rewind</application>
If using PostgreSQL 9.4, and the <application>pg_rewind</application>
binary is not installed in the PostgreSQL <filename>bin</filename> directory,
provide its full path. For more details see also <xref linkend="switchover-pg-rewind"/>.
</para>

View File

@@ -63,6 +63,34 @@
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually register the witness.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option>/<option>--force</option></term>
<listitem>
<para>
Overwrite an existing node record.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1 id="repmgr-witness-register-events">
<title>Event notifications</title>
<para>

View File

@@ -18,7 +18,7 @@
<title>repmgr &repmgrversion; Documentation</title>
<bookinfo>
<corpauthor>2ndQuadrant Ltd</corpauthor>
<corpauthor>EDB</corpauthor>
<productname>repmgr</productname>
<productnumber>&repmgrversion;</productnumber>
&legal;
@@ -26,7 +26,7 @@
<abstract>
<para>
This is the official documentation of &repmgr; &repmgrversion; for
use with PostgreSQL 9.3 - PostgreSQL 12.
use with PostgreSQL 9.4 - PostgreSQL 14.
</para>
<para>
&repmgr; is being continually developed and we strongly recommend using the
@@ -38,20 +38,19 @@
<para>
&repmgr; is developed by
<ulink url="https://2ndquadrant.com">2ndQuadrant</ulink>
<ulink url="https://www.enterprisedb.com/">EDB</ulink>
along with contributions from other individuals and organisations.
Contributions from the community are appreciated and welcome - get
in touch via <ulink url="https://github.com/2ndQuadrant/repmgr">github</ulink>
in touch via <ulink url="https://github.com/EnterpriseDB/repmgr">github</ulink>
or <ulink url="https://groups.google.com/group/repmgr">the mailing list/forum</ulink>.
Multiple 2ndQuadrant customers contribute funding
to make repmgr development possible.
Multiple EDB customers contribute funding to make &repmgr; development possible.
</para>
<para>
&repmgr; is fully supported by 2ndQuadrant's
<ulink url="https://www.2ndquadrant.com/en/support/support-postgresql/">24/7 Production Support</ulink>.
2ndQuadrant, a Major Sponsor of the PostgreSQL project, continues to develop and maintain &repmgr;.
Other organisations as well as individual developers are welcome to participate in the efforts.
&repmgr; is fully supported by EDB's
<ulink url="https://www.enterprisedb.com/support/postgresql-support-overview-get-the-most-out-of-postgresql">24/7 Production Support</ulink>.
EDB, a Major Sponsor of the PostgreSQL project, continues to maintain &repmgr;.
We welcome participation from other organisations and individual developers.
</para>
</abstract>

View File

@@ -331,11 +331,12 @@
To use this, set <option>failover_validation_command</option> in <filename>repmgr.conf</filename>
to a script executable by the <literal>postgres</literal> system user, e.g.:
<programlisting>
failover_validation_command=/path/to/script.sh %n %a</programlisting>
failover_validation_command=/path/to/script.sh %n</programlisting>
</para>
<para>
The <literal>%n</literal> parameter will be replaced with the node ID, and the
<literal>%a</literal> parameter will be replaced by the node name when the script is executed.
The <literal>%n</literal> parameter will be replaced with the node ID when the script is
executed. A number of other parameters are also available; see section
&quot;<xref linkend="repmgrd-automatic-failover-configuration-optional"/>&quot; for details.
</para>
<para>
This script must return an exit code of <literal>0</literal> to indicate the node should promote itself.
@@ -585,7 +586,7 @@ INFO: node 3 received notification to rerun promotion candidate election
<sect2 id="repmgrd-primary-child-disconnection-caveats">
<title>Standby disconnections monitoring caveats</title>
<para>
The follwing caveats should be considered if you are intending to use this functionality.
The following caveats should be considered if you are intending to use this functionality.
</para>
<para>
<itemizedlist mark="bullet">

View File

@@ -15,9 +15,13 @@
</para>
<para>
&repmgrd; can be configured to provide failover
capability in case the primary upstream node becomes unreachable, and/or
capability in case the primary or upstream node becomes unreachable, and/or
provide monitoring data to the &repmgr; metadatabase.
</para>
<para>
From &repmgr; 4.4, when running on the primary node, &repmgrd; can also monitor
standby disconnections/reconnections (see <xref linkend="repmgrd-primary-child-disconnection"/>).
</para>
<sect1 id="repmgrd-basic-configuration">
<title>repmgrd configuration</title>
@@ -85,6 +89,10 @@
<literal>query</literal> - determines server availability
by executing an SQL statement on the node via the existing connection
</simpara>
<simpara>
The query is a minimal throwaway query - <command>SELECT 1</command> -
which is used to determine that the server can accept queries.
</simpara>
</listitem>
</itemizedlist>
@@ -148,7 +156,7 @@
</variablelist>
<para>
See also <filename><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink></filename> for an annotated sample configuration file.
See also <filename><ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink></filename> for an annotated sample configuration file.
</para>
<sect2 id="repmgrd-automatic-failover-configuration">
@@ -321,11 +329,11 @@
</sect2>
<sect2 id="repmgrd-automatic-failover-configuration-optional">
<sect2 id="repmgrd-automatic-failover-configuration-optional" xreflabel="Optional configuration for automatic failover">
<title>Optional configuration for automatic failover</title>
<para>
The following configuraton options can be use to fine-tune automatic failover:
The following configuration options can be used to fine-tune automatic failover:
</para>
<variablelist>
@@ -366,8 +374,8 @@
</para>
</note>
<para>
One or both of the following parameter placeholders
should be provided, which will be replaced by repmgrd with the appropriate
One or more of the following parameter placeholders
may be provided, which will be replaced by repmgrd with the appropriate
value:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
@@ -376,6 +384,15 @@
<listitem>
<simpara><literal>%a</literal>: node name</simpara>
</listitem>
<listitem>
<simpara><literal>%v</literal>: number of visible nodes</simpara>
</listitem>
<listitem>
<simpara><literal>%u</literal>: number of shared upstream nodes</simpara>
</listitem>
<listitem>
<simpara><literal>%t</literal>: total number of nodes</simpara>
</listitem>
</itemizedlist>
</para>
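<para>
As an illustration of how these placeholders might be used, the following sketch shows
validation logic which permits promotion only if the node can see more than half of all
nodes. The function name <literal>should_promote</literal>, the majority rule and the
argument order (<literal>%n %v %t</literal>) are assumptions made for this example and
not behaviour defined by &repmgr;. Whether the visible node count includes the local node
is not stated here, so the threshold may need adjusting.
</para>

```shell
# Hypothetical validation logic for failover_validation_command.
# Assumed setting in repmgr.conf:
#   failover_validation_command='/path/to/script.sh %n %v %t'
# so that $1 = node ID, $2 = number of visible nodes, $3 = total number of nodes.
should_promote() {
    node_id="$1"
    visible_nodes="$2"
    total_nodes="$3"

    # Allow promotion only if a strict majority of nodes is visible.
    if [ $((visible_nodes * 2)) -gt "$total_nodes" ]; then
        echo "node $node_id: $visible_nodes of $total_nodes nodes visible, promotion allowed"
        return 0
    fi

    echo "node $node_id: $visible_nodes of $total_nodes nodes visible, promotion refused"
    return 1
}
```

<para>
In an actual script the final line would be <literal>should_promote "$@"</literal>, so that
the script's exit code is the function's return value; an exit code of <literal>0</literal>
indicates that the node should promote itself.
</para>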
<para>
@@ -406,6 +423,33 @@
</listitem>
</varlistentry>
<varlistentry>
<term><option>always_promote</option></term>
<listitem>
<indexterm>
<primary>always_promote</primary>
</indexterm>
<para>
Default: <literal>false</literal>.
</para>
<para>
If <literal>true</literal>, promote the local node even if its
&repmgr; metadata is not up-to-date.
</para>
<para>
Normally &repmgr; expects its metadata (stored in the <varname>repmgr.nodes</varname>
table) to be up-to-date so &repmgrd; can take the correct action during a failover.
However it's possible that updates made on the primary may not
have propagated to the standby (promotion candidate). In this case &repmgrd; will
default to not promoting the standby. This behaviour can be overridden by setting
<option>always_promote</option> to <literal>true</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>standby_disconnect_on_failover</option></term>
@@ -441,6 +485,32 @@
</listitem>
</varlistentry>
<varlistentry>
<term><option>repmgrd_exit_on_inactive_node</option></term>
<listitem>
<indexterm>
<primary>repmgrd_exit_on_inactive_node</primary>
</indexterm>
<para>
This parameter is available in &repmgr; 5.3 and later.
</para>
<para>
If a node was marked as inactive but is running, and this option is set to
<literal>true</literal>, &repmgrd; will abort on startup.
</para>
<para>
By default, <option>repmgrd_exit_on_inactive_node</option> is set
to <literal>false</literal>, in which case &repmgrd; will set the
node record to active on startup.
</para>
<para>
Setting this parameter to <literal>true</literal> causes &repmgrd;
to behave in the same way it did in &repmgr; 5.2 and earlier.
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
@@ -499,7 +569,7 @@
</indexterm>
<para>
For further details and a reference implementation, see the separate document
<ulink url="https://github.com/2ndQuadrant/repmgr/blob/master/doc/repmgrd-node-fencing.md">Fencing a failed master node with repmgrd and PgBouncer</ulink>.
<ulink url="https://github.com/EnterpriseDB/repmgr/blob/master/doc/repmgrd-node-fencing.md">Fencing a failed master node with repmgrd and PgBouncer</ulink>.
</para>
</sect2>
@@ -583,7 +653,8 @@ repmgrd_service_stop_command='sudo systemctl repmgr12 stop'
the option <option>monitor_interval_secs</option> (see above).
</para>
<para>
For more details on monitoring, see <xref linkend="repmgrd-monitoring"/>.
For more details on monitoring, see <xref linkend="repmgrd-monitoring"/>. For information on
monitoring standby disconnections, see <xref linkend="repmgrd-primary-child-disconnection"/>.
</para>
</sect2>
@@ -751,6 +822,12 @@ repmgrd_service_stop_command='sudo systemctl repmgr12 stop'
</simpara>
</listitem>
<listitem>
<simpara>
<varname>always_promote</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname>promote_command</varname>
@@ -922,7 +999,7 @@ repmgrd_service_stop_command='sudo systemctl repmgr12 stop'
</para>
<para>
If none of the above apply, &repmgrd; will create a PID file
in the operating system's temporary directory (as setermined by the environment variable
in the operating system's temporary directory (as determined by the environment variable
<varname>TMPDIR</varname>, or if that is not set, will use <filename>/tmp</filename>).
</para>
<para>
@@ -1002,6 +1079,29 @@ REPMGRD_OPTS="--daemonize=false"
</para>
</sect2>
<sect2 id="repmgrd-daemon-monitoring">
<title>repmgrd daemon monitoring</title>
<indexterm>
<primary>repmgrd</primary>
<secondary>monitoring</secondary>
</indexterm>
<indexterm>
<primary>monitoring</primary>
<secondary>repmgrd</secondary>
</indexterm>
<para>
The command <command><link linkend="repmgr-service-status">repmgr service status</link></command>
provides an overview of the &repmgrd; daemon status (including pause status)
on all nodes in the cluster.
</para>
<para>
From &repmgr; 5.3, <command><link linkend="repmgr-node-check">repmgr node check --repmgrd</link></command>
can be used to check the status of &repmgrd; (including pause status)
on the local node.
</para>
</sect2>
</sect1>
<sect1 id="repmgrd-connection-settings">

View File

@@ -137,7 +137,7 @@ NOTICE: node 3 (node3) paused</programlisting>
If the primary becomes available again (e.g. following a software upgrade), &repmgrd;
will automatically reconnect, e.g.:
<programlisting>
[2019-08-28 12:25:41] [NOTICE] reconnected to upstream node 1 after 8 seconds, resuming monitoring</programlisting>
[2019-08-28 12:25:41] [NOTICE] reconnected to upstream node "node1" (ID: 1) after 8 seconds, resuming monitoring</programlisting>
</para>
<para>
@@ -295,7 +295,7 @@ NOTICE: node 3 (node3) unpaused</programlisting>
[2017-08-29 10:59:37] [HINT] use "repmgr standby promote" to manually promote this node
[2017-08-29 10:59:37] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in degraded state (automatic failover disabled)
[2017-08-29 10:59:53] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in degraded state (automatic failover disabled)
[2017-08-29 11:00:45] [NOTICE] reconnected to upstream node 1 after 68 seconds, resuming monitoring
[2017-08-29 11:00:45] [NOTICE] reconnected to upstream node "node1" (ID: 1) after 68 seconds, resuming monitoring
[2017-08-29 11:00:57] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in normal state (automatic failover disabled)</programlisting>
</para>

View File

@@ -120,16 +120,19 @@
</important>
<note>
<simpara>
<para>
On <literal>systemd</literal> systems we strongly recommend using the appropriate
<command>systemctl</command> commands (typically run via <command>sudo</command>) to ensure
<literal>systemd</literal> is informed about the status of the PostgreSQL service.
</simpara>
<simpara>
</para>
<para>
If using <command>sudo</command> for the <command>systemctl</command> calls, make sure the
<command>sudo</command> specification doesn't require a real tty for the user. If not set
this way, <command>repmgr</command> will fail to stop the primary.
</simpara>
</para>
<para>
See the <xref linkend="configuration-file-service-commands"/> documentation section for further details.
</para>
</note>
<para>
@@ -244,16 +247,15 @@
</para>
<para>
<application>pg_rewind</application> has been part of the core PostgreSQL distribution since
version 9.5. Users of versions 9.3 and 9.4 will need to manually install it; the source code is available here:
version 9.5. Users of PostgreSQL 9.4 will need to manually install it; the source code is available here:
<ulink url="https://github.com/vmware/pg_rewind">https://github.com/vmware/pg_rewind</ulink>.
If the <application>pg_rewind</application>
binary is not installed in the PostgreSQL <filename>bin</filename> directory, provide
its full path on the demotion candidate with <option>--force-rewind</option>.
</para>
<para>
Note that building the 9.3/9.4 version of <application>pg_rewind</application> requires the PostgreSQL
source code. Also, PostgreSQL 9.3 does not provide <varname>wal_log_hints</varname>,
meaning data checksums must have been enabled when the database was initialized.
Note that building the 9.4 version of <application>pg_rewind</application> requires the PostgreSQL
source code.
</para>
</sect2>
@@ -343,7 +345,7 @@
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
If using PostgreSQL 9.3 or 9.4, you should ensure that the shutdown command
If using PostgreSQL 9.4, you should ensure that the shutdown command
is configured to use PostgreSQL's <varname>fast</varname> shutdown mode (the default in 9.5
and later). If relying on <command>pg_ctl</command> to perform database server operations,
you should include <literal>-m fast</literal> in <varname>pg_ctl_options</varname>


@@ -201,9 +201,13 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
</para>
<tip>
<para>
If the &repmgr; upgrade requires a PostgreSQL restart, combine the &repmgr; upgrade
with a PostgreSQL minor version upgrade, which will require a restart in any case.
New PostgreSQL minor version are usually released every couple of months.
If the &repmgr; upgrade requires a PostgreSQL restart, combine the &repmgr; upgrade
with a PostgreSQL minor version upgrade, which will require a restart in any case.
</para>
<para>
New PostgreSQL minor versions are usually released every couple of months;
see the <ulink url="https://www.postgresql.org/developer/roadmap/">Roadmap</ulink>
for the current schedule.
</para>
</tip>
</sect2>
@@ -308,12 +312,12 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<orderedlist>
<listitem>
<simpara>
converting the repmgr.conf configuration files
converting the <filename>repmgr.conf</filename> configuration files
</simpara>
</listitem>
<listitem>
<simpara>
upgrading the repmgr schema using <command>CREATE EXTENSION</command>
upgrading the repmgr schema using <command>CREATE EXTENSION</command> (PostgreSQL 12 and earlier)
</simpara>
</listitem>
</orderedlist>
@@ -457,22 +461,31 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<para>
Please note that the conversion script will add an empty
placeholder parameter for <varname>data_directory</varname>, which
is a required parameter from &repmgr; 4.
is a required parameter from &repmgr; 4. This must be manually modified to contain
the correct data directory.
</para>
</sect3>
</sect2>
<sect2>
<title>Upgrading the repmgr schema</title>
<title>Upgrading the repmgr schema (PostgreSQL 12 and earlier)</title>
<para>
Ensure &repmgrd; is not running, or any cron jobs which execute the
<command>repmgr</command> binary.
</para>
<para>
Install <literal>repmgr 4</literal> packages; any <literal>repmgr 3.x</literal> packages
Install the latest &repmgr; package; any <literal>repmgr 3.x</literal> packages
should be uninstalled (if not automatically uninstalled already by your packaging system).
</para>
<sect3>
<title>Upgrading from repmgr 3.1.1 or earlier</title>
<tip>
<simpara>
If you don't care about any data from the existing &repmgr; installation,
(e.g. the contents of the <structname>events</structname> and <structname>monitoring</structname>
tables), the following steps can be skipped; proceed to <xref linkend="upgrade-reregister-nodes"/>.
</simpara>
</tip>
<para>
If your repmgr version is 3.1.1 or earlier, you will need to update
the schema to the latest version in the 3.x series (3.3.2) before
@@ -484,10 +497,10 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/REL3_3_STABLE/sql/repmgr3.0_repmgr3.1.sql">repmgr3.0_repmgr3.1.sql</ulink></simpara>
<ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/REL3_3_STABLE/sql/repmgr3.0_repmgr3.1.sql">repmgr3.0_repmgr3.1.sql</ulink></simpara>
</listitem>
<listitem>
<simpara><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/REL3_3_STABLE/sql/repmgr3.1.1_repmgr3.1.2.sql">repmgr3.1.1_repmgr3.1.2.sql</ulink></simpara>
<simpara><ulink url="https://raw.githubusercontent.com/EnterpriseDB/repmgr/REL3_3_STABLE/sql/repmgr3.1.1_repmgr3.1.2.sql">repmgr3.1.1_repmgr3.1.2.sql</ulink></simpara>
</listitem>
</itemizedlist>
</para>
@@ -501,19 +514,37 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<para>
In the database used by the existing &repmgr; installation, execute:
<programlisting>
CREATE EXTENSION repmgr FROM unpackaged;</programlisting>
CREATE EXTENSION repmgr FROM unpackaged</programlisting>
</para>
<para>
This will move and convert all objects from the existing schema
into the new, standard <literal>repmgr</literal> schema.
</para>
<note>
<simpara>there must be only one schema matching <literal>repmgr_%</literal> in the
<simpara>There must be only one schema matching <literal>repmgr_%</literal> in the
database, otherwise this step may not work.
</simpara>
</note>
</sect3>
<sect3>
</sect2>
<sect2>
<title>Upgrading the repmgr schema (PostgreSQL 13 and later)</title>
<para>
Beginning with PostgreSQL 13, the <command>CREATE EXTENSION ... FROM unpackaged</command>
syntax is no longer available. In the unlikely event you have ended up with an
installation running PostgreSQL 13 or later and containing the legacy &repmgr;
schema, there is no convenient way of upgrading this; instead you'll just need
to re-register the nodes as detailed in <link linkend="upgrade-reregister-nodes">the following section</link>,
which will create the &repmgr; extension automatically.
</para>
<para>
Any historical data you wish to retain (e.g. the contents of the <structname>events</structname>
and <structname>monitoring</structname> tables) will need to be exported manually.
</para>
</sect2>
<sect2 id="upgrade-reregister-nodes">
<title>Re-register each node</title>
<para>
This is necessary to update the <literal>repmgr</literal> metadata with some additional items.
@@ -523,6 +554,10 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<programlisting>
repmgr primary register -f /etc/repmgr.conf --force</programlisting>
</para>
<para>
If not already present (e.g. after executing <command>CREATE EXTENSION repmgr FROM unpackaged</command>),
the &repmgr; extension will be automatically created by <command>repmgr primary register</command>.
</para>
<para>
On each standby node, execute e.g.
<programlisting>
@@ -535,18 +570,20 @@ ALTER EXTENSION repmgr UPDATE</programlisting>
<para>
The original <literal>repmgr_$cluster</literal> schema can be dropped at any time.
</para>
<tip>
<simpara>
If you don't care about any data from the existing &repmgr; installation,
(e.g. the contents of the <structname>events</structname> and <structname>monitoring</structname>
tables), the manual <command>CREATE EXTENSION</command> step can be skipped; just re-register
each node, starting with the primary node, and the <literal>repmgr</literal> extension will be
automatically created.
</simpara>
</tip>
</sect3>
</sect2>
<sect2 id="upgrade-drop-repmgr-cluster-schema">
<title>Drop the legacy repmgr schema</title>
<para>
Once the cluster has been registered with the current &repmgr; version, the legacy
<literal>repmgr_$cluster</literal> schema can be dropped at any time with:
<programlisting>
DROP SCHEMA repmgr_$cluster CASCADE</programlisting>
(substitute <literal>$cluster</literal> with the value of the <varname>clustername</varname>
variable used in &repmgr; 3.x).
</para>
</sect2>
</sect1>
</chapter>
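The upgrade notes above depend on exactly one legacy schema being present, and on history being exported by hand where `FROM unpackaged` is unavailable. A minimal sketch of both checks follows. The schema name `repmgr_test` is a hypothetical stand-in for `repmgr_$cluster`, and CSV output is an assumption, not something the repmgr documentation prescribes.

```sql
-- List candidate legacy schemas using the same pattern as the conversion script.
-- "_" in LIKE matches any single character, so review the result by eye;
-- exactly one row is expected before CREATE EXTENSION ... FROM unpackaged.
SELECT nspname
  FROM pg_catalog.pg_namespace
 WHERE nspname LIKE 'repmgr_%'
 ORDER BY nspname;

-- Export the legacy event history as CSV.
-- "repmgr_test" is a hypothetical schema name; substitute your own.
COPY (SELECT node_id, event, successful, event_timestamp, details
        FROM repmgr_test.repl_events
       ORDER BY event_timestamp)
  TO STDOUT WITH (FORMAT csv, HEADER);

-- After re-registering the nodes, confirm the extension exists
-- and note which version was created.
SELECT extname, extversion
  FROM pg_catalog.pg_extension
 WHERE extname = 'repmgr';
```

These statements are read-only and can be run from psql in the database used by repmgr; capturing the COPY output in a file is left to the client.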


@@ -1,6 +1,6 @@
/*
* errcode.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

2
log.c

@@ -1,6 +1,6 @@
/*
* log.c - Logging methods
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

2
log.h

@@ -1,6 +1,6 @@
/*
* log.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by


@@ -1,17 +1,10 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
\echo Use "ALTER EXTENSION repmgr UPDATE" to load this file. \quit
CREATE FUNCTION set_upstream_last_seen()
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_last_seen'
LANGUAGE C STRICT;
-- This script is intentionally empty and exists to skip the CREATE FUNCTION
-- commands contained in the 4.2--4.3 and 4.3--4.4 extension upgrade scripts,
-- which reference C functions which no longer exist in 5.3 and later.
--
-- These functions will be explicitly created in the 5.2--5.3 extension
-- upgrade step with the correct C function references.
CREATE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_wal_receiver_pid'
LANGUAGE C STRICT;


@@ -1,19 +1,9 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
\echo Use "ALTER EXTENSION repmgr UPDATE" to load this file. \quit
DROP FUNCTION set_upstream_last_seen();
CREATE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_node_id'
LANGUAGE C STRICT;
-- This script is intentionally empty and exists to skip the CREATE FUNCTION
-- commands contained in the 4.3--4.4 extension upgrade script, which reference
-- C functions which no longer exist in 5.3 and later.
--
-- These functions will be explicitly created in the 5.2--5.3 extension
-- upgrade step with the correct C function references.

7
repmgr--5.1--5.2.sql Normal file

@@ -0,0 +1,7 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
SELECT pg_catalog.pg_extension_config_dump('repmgr.nodes', '');
SELECT pg_catalog.pg_extension_config_dump('repmgr.events', '');
SELECT pg_catalog.pg_extension_config_dump('repmgr.monitoring_history', '');

64
repmgr--5.2--5.3.sql Normal file

@@ -0,0 +1,64 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
CREATE OR REPLACE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_set_local_node_id'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION repmgr.get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_local_node_id'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'repmgr_standby_set_last_updated'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'repmgr_standby_get_last_updated'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_upstream_node_id'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_set_upstream_node_id'
LANGUAGE C STRICT;
/* failover functions */
CREATE OR REPLACE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_notify_follow_primary'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_new_primary'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_reset_voting_status'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_wal_receiver_pid'
LANGUAGE C STRICT;
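The upgrade script above repoints existing SQL functions at C symbols carrying a `repmgr_` prefix. A minimal sketch for checking the result after the upgrade is to read the link symbol of each C-language function from the system catalogs; the assumption here is that the extension's functions live in the `repmgr` schema.

```sql
-- For C-language functions, prosrc holds the link symbol and probin the library file.
SELECT p.proname AS sql_function,
       p.prosrc  AS c_symbol,
       p.probin  AS library
  FROM pg_catalog.pg_proc p
  JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
  JOIN pg_catalog.pg_language  l ON l.oid = p.prolang
 WHERE n.nspname = 'repmgr'
   AND l.lanname = 'c'
 ORDER BY p.proname;
```

Functions not redefined by this upgrade script (for example `get_repmgrd_pid`) would be expected to keep their original symbol names, so an unprefixed row is not by itself an error.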

192
repmgr--5.2.sql Normal file

@@ -0,0 +1,192 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
CREATE TABLE repmgr.nodes (
node_id INTEGER PRIMARY KEY,
upstream_node_id INTEGER NULL REFERENCES nodes (node_id) DEFERRABLE,
active BOOLEAN NOT NULL DEFAULT TRUE,
node_name TEXT NOT NULL,
type TEXT NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
location TEXT NOT NULL DEFAULT 'default',
priority INT NOT NULL DEFAULT 100,
conninfo TEXT NOT NULL,
repluser VARCHAR(63) NOT NULL,
slot_name TEXT NULL,
config_file TEXT NOT NULL
);
SELECT pg_catalog.pg_extension_config_dump('repmgr.nodes', '');
CREATE TABLE repmgr.events (
node_id INTEGER NOT NULL,
event TEXT NOT NULL,
successful BOOLEAN NOT NULL DEFAULT TRUE,
event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
details TEXT NULL
);
SELECT pg_catalog.pg_extension_config_dump('repmgr.events', '');
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location PG_LSN NOT NULL,
last_wal_standby_location PG_LSN,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
);
CREATE INDEX idx_monitoring_history_time
ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
SELECT pg_catalog.pg_extension_config_dump('repmgr.monitoring_history', '');
CREATE VIEW repmgr.show_nodes AS
SELECT n.node_id,
n.node_name,
n.active,
n.upstream_node_id,
un.node_name AS upstream_node_name,
n.type,
n.priority,
n.conninfo
FROM repmgr.nodes n
LEFT JOIN repmgr.nodes un
ON un.node_id = n.upstream_node_id;
CREATE TABLE repmgr.voting_term (
term INT NOT NULL
);
CREATE UNIQUE INDEX voting_term_restrict
ON repmgr.voting_term ((TRUE));
CREATE RULE voting_term_delete AS
ON DELETE TO repmgr.voting_term
DO INSTEAD NOTHING;
/* ================= */
/* repmgrd functions */
/* ================= */
/* monitoring functions */
CREATE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_set_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_get_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_node_id'
LANGUAGE C STRICT;
/* failover functions */
CREATE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'notify_follow_primary'
LANGUAGE C STRICT;
CREATE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_new_primary'
LANGUAGE C STRICT;
CREATE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'reset_voting_status'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pidfile()
RETURNS TEXT
AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
LANGUAGE C STRICT;
CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
LANGUAGE C CALLED ON NULL INPUT;
CREATE FUNCTION repmgrd_is_running()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_running'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_pause(BOOL)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgrd_pause'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_is_paused()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
LANGUAGE C STRICT;
CREATE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_wal_receiver_pid'
LANGUAGE C STRICT;
/* views */
CREATE VIEW repmgr.replication_status AS
SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
n.type AS node_type, n.active, last_monitor_time,
CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
m.last_wal_standby_location,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
CASE WHEN n.type='standby' THEN
CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
ELSE NULL
END AS replication_time_lag,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
FROM repmgr.monitoring_history m
JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
WHERE (m.standby_node_id, m.last_monitor_time) IN (
SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
FROM repmgr.monitoring_history m1 GROUP BY 1
);
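The `voting_term` definition in the script above combines a unique index on a constant expression with a `DO INSTEAD NOTHING` delete rule, which together appear intended to keep the table at a single row. The sketch below reproduces the same pattern on a throwaway temporary table (`voting_term_demo` is a hypothetical name) so the behaviour can be observed without touching repmgr's own metadata.

```sql
-- Throwaway table mirroring the structure of repmgr.voting_term.
CREATE TEMPORARY TABLE voting_term_demo (
    term INT NOT NULL
);

-- Every row indexes to the same constant, so at most one row can exist.
CREATE UNIQUE INDEX voting_term_demo_restrict
    ON voting_term_demo ((TRUE));

-- Deletes are rewritten to nothing, so the row cannot be removed.
CREATE RULE voting_term_demo_delete AS
    ON DELETE TO voting_term_demo
    DO INSTEAD NOTHING;

INSERT INTO voting_term_demo (term) VALUES (1);

-- Expected to fail with a unique-constraint violation on the index above.
INSERT INTO voting_term_demo (term) VALUES (2);

-- Has no effect because of the rule.
DELETE FROM voting_term_demo;

-- The single row can still be updated in place.
UPDATE voting_term_demo SET term = term + 1;
SELECT term FROM voting_term_demo;
```

Run the statements one at a time in psql, since the second `INSERT` is meant to raise an error.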

192
repmgr--5.3.sql Normal file

@@ -0,0 +1,192 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
CREATE TABLE repmgr.nodes (
node_id INTEGER PRIMARY KEY,
upstream_node_id INTEGER NULL REFERENCES nodes (node_id) DEFERRABLE,
active BOOLEAN NOT NULL DEFAULT TRUE,
node_name TEXT NOT NULL,
type TEXT NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
location TEXT NOT NULL DEFAULT 'default',
priority INT NOT NULL DEFAULT 100,
conninfo TEXT NOT NULL,
repluser VARCHAR(63) NOT NULL,
slot_name TEXT NULL,
config_file TEXT NOT NULL
);
SELECT pg_catalog.pg_extension_config_dump('repmgr.nodes', '');
CREATE TABLE repmgr.events (
node_id INTEGER NOT NULL,
event TEXT NOT NULL,
successful BOOLEAN NOT NULL DEFAULT TRUE,
event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
details TEXT NULL
);
SELECT pg_catalog.pg_extension_config_dump('repmgr.events', '');
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location PG_LSN NOT NULL,
last_wal_standby_location PG_LSN,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
);
CREATE INDEX idx_monitoring_history_time
ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
SELECT pg_catalog.pg_extension_config_dump('repmgr.monitoring_history', '');
CREATE VIEW repmgr.show_nodes AS
SELECT n.node_id,
n.node_name,
n.active,
n.upstream_node_id,
un.node_name AS upstream_node_name,
n.type,
n.priority,
n.conninfo
FROM repmgr.nodes n
LEFT JOIN repmgr.nodes un
ON un.node_id = n.upstream_node_id;
CREATE TABLE repmgr.voting_term (
term INT NOT NULL
);
CREATE UNIQUE INDEX voting_term_restrict
ON repmgr.voting_term ((TRUE));
CREATE RULE voting_term_delete AS
ON DELETE TO repmgr.voting_term
DO INSTEAD NOTHING;
/* ================= */
/* repmgrd functions */
/* ================= */
/* monitoring functions */
CREATE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_set_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'repmgr_standby_set_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'repmgr_standby_get_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_upstream_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_set_upstream_node_id'
LANGUAGE C STRICT;
/* failover functions */
CREATE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_notify_follow_primary'
LANGUAGE C STRICT;
CREATE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_new_primary'
LANGUAGE C STRICT;
CREATE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_reset_voting_status'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pidfile()
RETURNS TEXT
AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
LANGUAGE C STRICT;
CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
LANGUAGE C CALLED ON NULL INPUT;
CREATE FUNCTION repmgrd_is_running()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_running'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_pause(BOOL)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgrd_pause'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_is_paused()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
LANGUAGE C STRICT;
CREATE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_wal_receiver_pid'
LANGUAGE C STRICT;
/* views */
CREATE VIEW repmgr.replication_status AS
SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
n.type AS node_type, n.active, last_monitor_time,
CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
m.last_wal_standby_location,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
CASE WHEN n.type='standby' THEN
CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
ELSE NULL
END AS replication_time_lag,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
FROM repmgr.monitoring_history m
JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
WHERE (m.standby_node_id, m.last_monitor_time) IN (
SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
FROM repmgr.monitoring_history m1 GROUP BY 1
);

245
repmgr--unpackaged--5.2.sql Normal file

@@ -0,0 +1,245 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
-- extract the current schema name
-- NOTE: this assumes there will be only one schema matching 'repmgr_%';
-- user is responsible for ensuring this is the case
CREATE TEMPORARY TABLE repmgr_old_schema (schema_name TEXT);
INSERT INTO repmgr_old_schema (schema_name)
SELECT nspname AS schema_name
FROM pg_catalog.pg_namespace
WHERE nspname LIKE 'repmgr_%'
LIMIT 1;
-- move old objects into new schema
DO $repmgr$
DECLARE
old_schema TEXT;
BEGIN
SELECT schema_name FROM repmgr_old_schema
INTO old_schema;
EXECUTE format('ALTER TABLE %I.repl_nodes SET SCHEMA repmgr', old_schema);
EXECUTE format('ALTER TABLE %I.repl_events SET SCHEMA repmgr', old_schema);
EXECUTE format('ALTER TABLE %I.repl_monitor SET SCHEMA repmgr', old_schema);
EXECUTE format('DROP VIEW IF EXISTS %I.repl_show_nodes', old_schema);
EXECUTE format('DROP VIEW IF EXISTS %I.repl_status', old_schema);
END$repmgr$;
-- convert "repmgr_$cluster.repl_nodes" to "repmgr.nodes"
CREATE TABLE repmgr.nodes (
node_id INTEGER PRIMARY KEY,
upstream_node_id INTEGER NULL REFERENCES repmgr.nodes (node_id) DEFERRABLE,
active BOOLEAN NOT NULL DEFAULT TRUE,
node_name TEXT NOT NULL,
type TEXT NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
location TEXT NOT NULL DEFAULT 'default',
priority INT NOT NULL DEFAULT 100,
conninfo TEXT NOT NULL,
repluser VARCHAR(63) NOT NULL,
slot_name TEXT NULL,
config_file TEXT NOT NULL
);
INSERT INTO repmgr.nodes
(node_id, upstream_node_id, active, node_name, type, location, priority, conninfo, repluser, slot_name, config_file)
SELECT id, upstream_node_id, active, name,
CASE WHEN type = 'master' THEN 'primary' ELSE type END,
'default', priority, conninfo, 'unknown', slot_name, 'unknown'
FROM repmgr.repl_nodes
ORDER BY id;
-- convert "repmgr_$cluster.repl_events" to "repmgr.events"
CREATE TABLE repmgr.events (
node_id INTEGER NOT NULL,
event TEXT NOT NULL,
successful BOOLEAN NOT NULL DEFAULT TRUE,
event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
details TEXT NULL
);
INSERT INTO repmgr.events
(node_id, event, successful, event_timestamp, details)
SELECT node_id, event, successful, event_timestamp, details
FROM repmgr.repl_events;
-- create new table "repmgr.voting_term"
CREATE TABLE repmgr.voting_term (
term INT NOT NULL
);
CREATE UNIQUE INDEX voting_term_restrict
ON repmgr.voting_term ((TRUE));
CREATE RULE voting_term_delete AS
ON DELETE TO repmgr.voting_term
DO INSTEAD NOTHING;
INSERT INTO repmgr.voting_term (term) VALUES (1);
-- convert "repmgr_$cluster.repl_monitor" to "monitoring_history"
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location PG_LSN NOT NULL,
last_wal_standby_location PG_LSN,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
);
INSERT INTO repmgr.monitoring_history
(primary_node_id, standby_node_id, last_monitor_time, last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
SELECT primary_node, standby_node, last_monitor_time, last_apply_time, last_wal_primary_location::pg_lsn, last_wal_standby_location::pg_lsn, replication_lag, apply_lag
FROM repmgr.repl_monitor;
CREATE INDEX idx_monitoring_history_time
ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
CREATE VIEW repmgr.show_nodes AS
SELECT n.node_id,
n.node_name,
n.active,
n.upstream_node_id,
un.node_name AS upstream_node_name,
n.type,
n.priority,
n.conninfo
FROM repmgr.nodes n
LEFT JOIN repmgr.nodes un
ON un.node_id = n.upstream_node_id;
/* ================= */
/* repmgrd functions */
/* ================= */
/* monitoring functions */
CREATE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_set_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_get_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_upstream_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_upstream_node_id'
LANGUAGE C STRICT;
/* failover functions */
CREATE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'notify_follow_primary'
LANGUAGE C STRICT;
CREATE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_new_primary'
LANGUAGE C STRICT;
CREATE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'reset_voting_status'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pidfile()
RETURNS TEXT
AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
LANGUAGE C STRICT;
CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
LANGUAGE C CALLED ON NULL INPUT;
CREATE FUNCTION repmgrd_is_running()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_running'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_pause(BOOL)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgrd_pause'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_is_paused()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
LANGUAGE C STRICT;
CREATE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_wal_receiver_pid'
LANGUAGE C STRICT;
/* views */
CREATE VIEW repmgr.replication_status AS
SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
n.type AS node_type, n.active, last_monitor_time,
CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
m.last_wal_standby_location,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
CASE WHEN n.type='standby' THEN
CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
ELSE NULL
END AS replication_time_lag,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
FROM repmgr.monitoring_history m
JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
WHERE (m.standby_node_id, m.last_monitor_time) IN (
SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
FROM repmgr.monitoring_history m1 GROUP BY 1
);
/* drop old tables */
DROP TABLE repmgr.repl_nodes;
DROP TABLE repmgr.repl_monitor;
DROP TABLE repmgr.repl_events;
-- remove temporary table
DROP TABLE repmgr_old_schema;
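The conversion script above fills `repluser` and `config_file` with the placeholder `'unknown'`, because the legacy `repl_nodes` table has no equivalent columns. A minimal sketch for listing nodes that still carry those placeholders after the conversion:

```sql
-- Nodes whose metadata still holds the placeholders written by the conversion.
SELECT node_id, node_name, type, repluser, config_file
  FROM repmgr.nodes
 WHERE repluser = 'unknown'
    OR config_file = 'unknown'
 ORDER BY node_id;
```

Any rows returned presumably correspond to nodes that have not yet been re-registered with the current repmgr version; this reading is an inference from the script, not something it states.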

245
repmgr--unpackaged--5.3.sql Normal file

@@ -0,0 +1,245 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
-- extract the current schema name
-- NOTE: this assumes there will be only one schema matching 'repmgr_%';
-- user is responsible for ensuring this is the case
CREATE TEMPORARY TABLE repmgr_old_schema (schema_name TEXT);
INSERT INTO repmgr_old_schema (schema_name)
SELECT nspname AS schema_name
FROM pg_catalog.pg_namespace
WHERE nspname LIKE 'repmgr_%'
LIMIT 1;
-- move old objects into new schema
DO $repmgr$
DECLARE
old_schema TEXT;
BEGIN
SELECT schema_name FROM repmgr_old_schema
INTO old_schema;
EXECUTE format('ALTER TABLE %I.repl_nodes SET SCHEMA repmgr', old_schema);
EXECUTE format('ALTER TABLE %I.repl_events SET SCHEMA repmgr', old_schema);
EXECUTE format('ALTER TABLE %I.repl_monitor SET SCHEMA repmgr', old_schema);
EXECUTE format('DROP VIEW IF EXISTS %I.repl_show_nodes', old_schema);
EXECUTE format('DROP VIEW IF EXISTS %I.repl_status', old_schema);
END$repmgr$;
-- convert "repmgr_$cluster.repl_nodes" to "repmgr.nodes"
CREATE TABLE repmgr.nodes (
node_id INTEGER PRIMARY KEY,
upstream_node_id INTEGER NULL REFERENCES repmgr.nodes (node_id) DEFERRABLE,
active BOOLEAN NOT NULL DEFAULT TRUE,
node_name TEXT NOT NULL,
type TEXT NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
location TEXT NOT NULL DEFAULT 'default',
priority INT NOT NULL DEFAULT 100,
conninfo TEXT NOT NULL,
repluser VARCHAR(63) NOT NULL,
slot_name TEXT NULL,
config_file TEXT NOT NULL
);
INSERT INTO repmgr.nodes
(node_id, upstream_node_id, active, node_name, type, location, priority, conninfo, repluser, slot_name, config_file)
SELECT id, upstream_node_id, active, name,
CASE WHEN type = 'master' THEN 'primary' ELSE type END,
'default', priority, conninfo, 'unknown', slot_name, 'unknown'
FROM repmgr.repl_nodes
ORDER BY id;
-- convert "repmgr_$cluster.repl_events" to "events"
CREATE TABLE repmgr.events (
node_id INTEGER NOT NULL,
event TEXT NOT NULL,
successful BOOLEAN NOT NULL DEFAULT TRUE,
event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
details TEXT NULL
);
INSERT INTO repmgr.events
(node_id, event, successful, event_timestamp, details)
SELECT node_id, event, successful, event_timestamp, details
FROM repmgr.repl_events;
-- create new table "repmgr.voting_term"
CREATE TABLE repmgr.voting_term (
term INT NOT NULL
);
CREATE UNIQUE INDEX voting_term_restrict
ON repmgr.voting_term ((TRUE));
CREATE RULE voting_term_delete AS
ON DELETE TO repmgr.voting_term
DO INSTEAD NOTHING;
INSERT INTO repmgr.voting_term (term) VALUES (1);
-- convert "repmgr_$cluster.repl_monitor" to "monitoring_history"
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location PG_LSN NOT NULL,
last_wal_standby_location PG_LSN,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
);
INSERT INTO repmgr.monitoring_history
(primary_node_id, standby_node_id, last_monitor_time, last_apply_time, last_wal_primary_location, last_wal_standby_location, replication_lag, apply_lag)
SELECT primary_node, standby_node, last_monitor_time, last_apply_time, last_wal_primary_location::pg_lsn, last_wal_standby_location::pg_lsn, replication_lag, apply_lag
FROM repmgr.repl_monitor;
CREATE INDEX idx_monitoring_history_time
ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
CREATE VIEW repmgr.show_nodes AS
SELECT n.node_id,
n.node_name,
n.active,
n.upstream_node_id,
un.node_name AS upstream_node_name,
n.type,
n.priority,
n.conninfo
FROM repmgr.nodes n
LEFT JOIN repmgr.nodes un
ON un.node_id = n.upstream_node_id;
/* ================= */
/* repmgrd functions */
/* ================= */
/* monitoring functions */
CREATE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_set_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'repmgr_standby_set_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'repmgr_standby_get_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_last_seen(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_set_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_last_seen()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_upstream_last_seen'
LANGUAGE C STRICT;
CREATE FUNCTION get_upstream_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_upstream_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION set_upstream_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_set_upstream_node_id'
LANGUAGE C STRICT;
/* failover functions */
CREATE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_notify_follow_primary'
LANGUAGE C STRICT;
CREATE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_new_primary'
LANGUAGE C STRICT;
CREATE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgr_reset_voting_status'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_repmgrd_pid'
LANGUAGE C STRICT;
CREATE FUNCTION get_repmgrd_pidfile()
RETURNS TEXT
AS 'MODULE_PATHNAME', 'get_repmgrd_pidfile'
LANGUAGE C STRICT;
CREATE FUNCTION set_repmgrd_pid(INT, TEXT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_repmgrd_pid'
LANGUAGE C CALLED ON NULL INPUT;
CREATE FUNCTION repmgrd_is_running()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_running'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_pause(BOOL)
RETURNS VOID
AS 'MODULE_PATHNAME', 'repmgrd_pause'
LANGUAGE C STRICT;
CREATE FUNCTION repmgrd_is_paused()
RETURNS BOOL
AS 'MODULE_PATHNAME', 'repmgrd_is_paused'
LANGUAGE C STRICT;
CREATE FUNCTION get_wal_receiver_pid()
RETURNS INT
AS 'MODULE_PATHNAME', 'repmgr_get_wal_receiver_pid'
LANGUAGE C STRICT;
/* views */
CREATE VIEW repmgr.replication_status AS
SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
n.type AS node_type, n.active, last_monitor_time,
CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
m.last_wal_standby_location,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
CASE WHEN n.type='standby' THEN
CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
ELSE NULL
END AS replication_time_lag,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
FROM repmgr.monitoring_history m
JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
WHERE (m.standby_node_id, m.last_monitor_time) IN (
SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
FROM repmgr.monitoring_history m1 GROUP BY 1
);
/* drop old tables */
DROP TABLE repmgr.repl_nodes;
DROP TABLE repmgr.repl_monitor;
DROP TABLE repmgr.repl_events;
-- remove temporary table
DROP TABLE repmgr_old_schema;

repmgr-action-cluster.c

@@ -3,7 +3,7 @@
*
* Implements cluster information actions for the repmgr command line utility
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -55,10 +55,8 @@ typedef enum
struct ColHeader headers_show[SHOW_HEADER_COUNT];
struct ColHeader headers_event[EVENT_HEADER_COUNT];
static int build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, int *name_length, ItemList *warnings, int *error_code);
static int build_cluster_crosscheck(t_node_status_cube ***cube_dest, int *name_length, ItemList *warnings, int *error_code);
static int build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, ItemList *warnings, int *error_code);
static int build_cluster_crosscheck(t_node_status_cube ***cube_dest, ItemList *warnings, int *error_code);
static void cube_set_node_status(t_node_status_cube **cube, int n, int node_id, int matrix_node_id, int connection_node_id, int connection_status);
/*
@@ -538,9 +536,6 @@ do_cluster_crosscheck(void)
{
int i = 0,
n = 0;
char c;
const char *node_header = "Name";
int name_length = strlen(node_header);
t_node_status_cube **cube;
@@ -548,7 +543,7 @@ do_cluster_crosscheck(void)
int error_code = SUCCESS;
ItemList warnings = {NULL, NULL};
n = build_cluster_crosscheck(&cube, &name_length, &warnings, &error_code);
n = build_cluster_crosscheck(&cube, &warnings, &error_code);
if (runtime_options.output_mode == OM_CSV)
{
@@ -582,24 +577,56 @@ do_cluster_crosscheck(void)
}
else
{
printf("%*s | Id ", name_length, node_header);
for (i = 0; i < n; i++)
printf("| %2d ", cube[i]->node_id);
printf("\n");
/* output header contains node name, node ID and one column for each node in the cluster */
struct ColHeader *headers_crosscheck = NULL;
int header_count = n + 2;
int header_id = 2;
headers_crosscheck = palloc0(sizeof(ColHeader) * header_count);
/* Initialize column headers */
strncpy(headers_crosscheck[0].title, _("Name"), MAXLEN);
strncpy(headers_crosscheck[1].title, _("ID"), MAXLEN);
for (i = 0; i < name_length; i++)
printf("-");
printf("-+----");
for (i = 0; i < n; i++)
printf("+----");
printf("\n");
{
maxlen_snprintf(headers_crosscheck[header_id].title, "%i", cube[i]->node_id);
header_id++;
}
/* Initialize column max values */
for (i = 0; i < header_count; i++)
{
headers_crosscheck[i].display = true;
headers_crosscheck[i].max_length = strlen(headers_crosscheck[i].title);
headers_crosscheck[i].cur_length = headers_crosscheck[i].max_length;
/* We can derive the maximum node ID length for the ID column from
* the generated matrix node ID headers
*/
if (i >= 2 && headers_crosscheck[i].max_length > headers_crosscheck[1].max_length)
headers_crosscheck[1].max_length = headers_crosscheck[i].max_length;
}
for (i = 0; i < n; i++)
{
if (strlen(cube[i]->node_name) > headers_crosscheck[0].max_length)
{
headers_crosscheck[0].max_length = strlen(cube[i]->node_name);
}
}
print_status_header(header_count, headers_crosscheck);
for (i = 0; i < n; i++)
{
int column_node_ix;
printf("%*s | %2d ", name_length,
printf(" %-*s | %-*i ",
headers_crosscheck[0].max_length,
cube[i]->node_name,
headers_crosscheck[1].max_length,
cube[i]->node_id);
for (column_node_ix = 0; column_node_ix < n; column_node_ix++)
@@ -607,6 +634,8 @@ do_cluster_crosscheck(void)
int max_node_status = -2;
int node_ix = 0;
char c;
/*
* The value of entry (i,j) is equal to the maximum value of all
* the (i,j,k). Indeed:
@@ -646,12 +675,14 @@ do_cluster_crosscheck(void)
exit(ERR_INTERNAL);
}
printf("| %c ", c);
printf("| %-*c ", headers_crosscheck[column_node_ix + 2].max_length, c);
}
printf("\n");
}
pfree(headers_crosscheck);
if (warnings.head != NULL && runtime_options.terse == false)
{
log_warning(_("following problems detected:"));
@@ -708,16 +739,13 @@ do_cluster_matrix()
j = 0,
n = 0;
const char *node_header = "Name";
int name_length = strlen(node_header);
t_node_matrix_rec **matrix_rec_list;
bool connection_error_found = false;
int error_code = SUCCESS;
ItemList warnings = {NULL, NULL};
n = build_cluster_matrix(&matrix_rec_list, &name_length, &warnings, &error_code);
n = build_cluster_matrix(&matrix_rec_list, &warnings, &error_code);
if (runtime_options.output_mode == OM_CSV)
{
@@ -740,27 +768,60 @@ do_cluster_matrix()
}
else
{
char c;
/* output header contains node name, node ID and one column for each node in the cluster */
struct ColHeader *headers_matrix = NULL;
printf("%*s | Id ", name_length, node_header);
for (i = 0; i < n; i++)
printf("| %2d ", matrix_rec_list[i]->node_id);
printf("\n");
int header_count = n + 2;
int header_id = 2;
for (i = 0; i < name_length; i++)
printf("-");
printf("-+----");
for (i = 0; i < n; i++)
printf("+----");
printf("\n");
headers_matrix = palloc0(sizeof(ColHeader) * header_count);
/* Initialize column headers */
strncpy(headers_matrix[0].title, _("Name"), MAXLEN);
strncpy(headers_matrix[1].title, _("ID"), MAXLEN);
for (i = 0; i < n; i++)
{
printf("%*s | %2d ", name_length,
maxlen_snprintf(headers_matrix[header_id].title, "%i", matrix_rec_list[i]->node_id);
header_id++;
}
/* Initialize column max values */
for (i = 0; i < header_count; i++)
{
headers_matrix[i].display = true;
headers_matrix[i].max_length = strlen(headers_matrix[i].title);
headers_matrix[i].cur_length = headers_matrix[i].max_length;
/* We can derive the maximum node ID length for the ID column from
* the generated matrix node ID headers
*/
if (i >= 2 && headers_matrix[i].max_length > headers_matrix[1].max_length)
headers_matrix[1].max_length = headers_matrix[i].max_length;
}
for (i = 0; i < n; i++)
{
if (strlen(matrix_rec_list[i]->node_name) > headers_matrix[0].max_length)
{
headers_matrix[0].max_length = strlen(matrix_rec_list[i]->node_name);
}
}
print_status_header(header_count, headers_matrix);
for (i = 0; i < n; i++)
{
printf(" %-*s | %-*i ",
headers_matrix[0].max_length,
matrix_rec_list[i]->node_name,
headers_matrix[1].max_length,
matrix_rec_list[i]->node_id);
for (j = 0; j < n; j++)
{
char c;
switch (matrix_rec_list[i]->node_status_list[j]->node_status)
{
case -2:
@@ -778,11 +839,13 @@ do_cluster_matrix()
exit(ERR_INTERNAL);
}
printf("| %c ", c);
printf("| %-*c ", headers_matrix[j + 2].max_length, c);
}
printf("\n");
}
pfree(headers_matrix);
if (warnings.head != NULL && runtime_options.terse == false)
{
log_warning(_("following problems detected:"));
@@ -838,7 +901,7 @@ matrix_set_node_status(t_node_matrix_rec **matrix_rec_list, int n, int node_id,
static int
build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, int *name_length, ItemList *warnings, int *error_code)
build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, ItemList *warnings, int *error_code)
{
PGconn *conn = NULL;
int i = 0,
@@ -896,7 +959,6 @@ build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, int *name_length, Ite
/* Initialise matrix structure for each node */
for (cell = nodes.head; cell; cell = cell->next)
{
int name_length_cur;
NodeInfoListCell *cell_j;
matrix_rec_list[i] = (t_node_matrix_rec *) pg_malloc0(sizeof(t_node_matrix_rec));
@@ -906,13 +968,6 @@ build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, int *name_length, Ite
cell->node_info->node_name,
sizeof(matrix_rec_list[i]->node_name));
/*
* Find the maximum length of a node name
*/
name_length_cur = strlen(matrix_rec_list[i]->node_name);
if (name_length_cur > *name_length)
*name_length = name_length_cur;
matrix_rec_list[i]->node_status_list = (t_node_status_rec **) pg_malloc0(sizeof(t_node_status_rec) * nodes.node_count);
j = 0;
@@ -1077,7 +1132,7 @@ build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, int *name_length, Ite
static int
build_cluster_crosscheck(t_node_status_cube ***dest_cube, int *name_length, ItemList *warnings, int *error_code)
build_cluster_crosscheck(t_node_status_cube ***dest_cube, ItemList *warnings, int *error_code)
{
PGconn *conn = NULL;
int h,
@@ -1126,20 +1181,12 @@ build_cluster_crosscheck(t_node_status_cube ***dest_cube, int *name_length, Item
for (cell = nodes.head; cell; cell = cell->next)
{
int name_length_cur = 0;
NodeInfoListCell *cell_i = NULL;
cube[h] = (t_node_status_cube *) pg_malloc(sizeof(t_node_status_cube));
cube[h]->node_id = cell->node_info->node_id;
strncpy(cube[h]->node_name, cell->node_info->node_name, sizeof(cube[h]->node_name));
/*
* Find the maximum length of a node name
*/
name_length_cur = strlen(cube[h]->node_name);
if (name_length_cur > *name_length)
*name_length = name_length_cur;
cube[h]->matrix_list_rec = (t_node_matrix_rec **) pg_malloc(sizeof(t_node_matrix_rec) * nodes.node_count);
i = 0;
@@ -1507,4 +1554,5 @@ do_cluster_help(void)
printf(_(" -k, --keep-history=VALUE retain indicated number of days of history (default: 0)\n"));
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
}
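
The hunks above replace the fixed-width `%2d` columns with widths derived from the data: each column is as wide as the longest of its title and its values, and rows are printed left-aligned with `%-*s` / `%-*i`. A minimal sketch of that idea, assuming hypothetical helpers `column_width()` and `format_row()` (neither exists in repmgr; the real code stores the widths in `ColHeader.max_length`):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: widest of the header title and all row values. */
static int
column_width(const char *title, const char *const *values, int n)
{
	int			width;
	int			i;

	assert(title != NULL && n >= 0);

	width = (int) strlen(title);

	for (i = 0; i < n; i++)
	{
		int			len = (int) strlen(values[i]);

		if (len > width)
			width = len;
	}

	return width;
}

/*
 * Hypothetical helper: format one left-aligned "name | id" row into buf,
 * using the same format string as the new printf() calls in the diff.
 */
static int
format_row(char *buf, size_t buflen,
		   int name_width, const char *name,
		   int id_width, int id)
{
	return snprintf(buf, buflen, " %-*s | %-*i ",
					name_width, name, id_width, id);
}
```

With node names `node1` and `standby-east`, `column_width("Name", ...)` yields 12, so every row pads the name column to 12 characters regardless of which node is being printed.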

repmgr-action-cluster.h

@@ -1,6 +1,6 @@
/*
* repmgr-action-cluster.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

repmgr-action-daemon.c

@@ -2,7 +2,7 @@
* repmgr-action-daemon.c
*
* Implements repmgrd actions for the repmgr command line utility
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -339,4 +339,5 @@ void do_daemon_help(void)
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
}

repmgr-action-daemon.h

@@ -1,6 +1,6 @@
/*
* repmgr-action-daemon.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

repmgr-action-node.c

@@ -3,7 +3,7 @@
*
* Implements actions available for any kind of node
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -35,6 +35,9 @@
static bool copy_file(const char *src_file, const char *dest_file);
static void format_archive_dir(PQExpBufferData *archive_dir);
static t_server_action parse_server_action(const char *action);
static const char *output_repmgrd_status(CheckStatus status);
static void exit_optformat_error(const char *error, int errcode);
static void _do_node_service_list_actions(t_server_action action);
static void _do_node_status_is_shutdown_cleanly(void);
@@ -43,14 +46,18 @@ static void _do_node_restore_config(void);
static void do_node_check_replication_connection(void);
static CheckStatus do_node_check_archive_ready(PGconn *conn, OutputMode mode, CheckStatusList *list_output);
static CheckStatus do_node_check_downstream(PGconn *conn, OutputMode mode, CheckStatusList *list_output);
static CheckStatus do_node_check_downstream(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output);
static CheckStatus do_node_check_upstream(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output);
static CheckStatus do_node_check_replication_lag(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output);
static CheckStatus do_node_check_role(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output);
static CheckStatus do_node_check_slots(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output);
static CheckStatus do_node_check_missing_slots(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output);
static CheckStatus do_node_check_data_directory(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output);
static CheckStatus do_node_check_repmgrd(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output);
static CheckStatus do_node_check_replication_config_owner(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output);
static CheckStatus do_node_check_db_connection(PGconn *conn, OutputMode mode);
/*
* NODE STATUS
*
@@ -81,7 +88,6 @@ do_node_status(void)
t_recovery_conf recovery_conf = T_RECOVERY_CONF_INITIALIZER;
char data_dir[MAXPGPATH] = "";
int server_version_num = UNKNOWN_SERVER_VERSION_NUM;
char server_version_str[MAXVERSIONSTR] = "";
/*
@@ -99,7 +105,7 @@ do_node_status(void)
conn = establish_db_connection(config_file_options.conninfo, true);
strncpy(data_dir, config_file_options.data_directory, MAXPGPATH);
server_version_num = get_server_version(conn, server_version_str);
(void)get_server_version(conn, server_version_str);
/* check node exists */
@@ -131,13 +137,22 @@ do_node_status(void)
if (runtime_options.verbose == true)
{
uint64 local_system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
uint64 local_system_identifier = get_system_identifier(config_file_options.data_directory);
local_system_identifier = get_system_identifier(config_file_options.data_directory);
key_value_list_set_format(&node_status,
"System identifier",
"%lu", local_system_identifier);
if (local_system_identifier == UNKNOWN_SYSTEM_IDENTIFIER)
{
key_value_list_set(&node_status,
"System identifier",
"unknown");
item_list_append_format(&warnings,
_("unable to retrieve system identifier from pg_control"));
}
else
{
key_value_list_set_format(&node_status,
"System identifier",
"%lu", local_system_identifier);
}
}
key_value_list_set(&node_status,
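
The hunk above stops printing the sentinel value as if it were a real system identifier and reports "unknown" instead. A minimal sketch of that branch, assuming the sentinel is 0 (the value of `UNKNOWN_SYSTEM_IDENTIFIER` is not shown in this diff) and using a hypothetical `format_system_identifier()` helper. The sketch uses `PRIu64` rather than `%lu`, since `long` is not 64 bits wide on every platform:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the sentinel is 0; the real definition is not part of this diff. */
#define SKETCH_UNKNOWN_SYSTEM_IDENTIFIER UINT64_C(0)

/*
 * Hypothetical helper: write a printable system identifier into buf.
 * Returns false (and writes "unknown") when the identifier could not be
 * read, so the caller can also record a warning.
 */
static bool
format_system_identifier(char *buf, size_t buflen, uint64_t sysid)
{
	assert(buf != NULL && buflen > 0);

	if (sysid == SKETCH_UNKNOWN_SYSTEM_IDENTIFIER)
	{
		snprintf(buf, buflen, "%s", "unknown");
		return false;
	}

	snprintf(buf, buflen, "%" PRIu64, sysid);
	return true;
}
```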
@@ -204,7 +219,16 @@ do_node_status(void)
if (enabled == false && recovery_type == RECTYPE_STANDBY)
{
appendPQExpBufferStr(&archiving_status, " (on standbys \"archive_mode\" must be set to \"always\" to be effective)");
if (PQserverVersion(conn) >= 90500)
{
appendPQExpBufferStr(&archiving_status,
" (on standbys \"archive_mode\" must be set to \"always\" to be effective)");
}
else
{
appendPQExpBufferStr(&archiving_status,
" (\"archive_mode\" has no effect on standbys)");
}
}
key_value_list_set(&node_status,
@@ -294,7 +318,7 @@ do_node_status(void)
continue;
}
if (is_downstream_node_attached(conn, node_cell->node_info->node_name) != NODE_ATTACHED)
if (is_downstream_node_attached(conn, node_cell->node_info->node_name, NULL) != NODE_ATTACHED)
{
missing_nodes_count++;
item_list_append_format(&missing_nodes,
@@ -321,13 +345,7 @@ do_node_status(void)
}
}
if (server_version_num < 90400)
{
key_value_list_set(&node_status,
"Replication slots",
"not available");
}
else if (node_info.max_replication_slots == 0)
if (node_info.max_replication_slots == 0)
{
key_value_list_set(&node_status,
"Replication slots",
@@ -632,9 +650,17 @@ _do_node_status_is_shutdown_cleanly(void)
break;
}
/* check what pg_controldata says */
/* check what pg_control says */
db_state = get_db_state(config_file_options.data_directory);
if (get_db_state(config_file_options.data_directory, &db_state) == false)
{
/*
* Unable to retrieve the database state from pg_control
*/
node_status = NODE_STATUS_UNKNOWN;
log_verbose(LOG_DEBUG, "unable to determine db state");
goto return_state;
}
log_verbose(LOG_DEBUG, "db state now: %s", describe_db_state(db_state));
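
In this hunk `get_db_state()` changes from returning the state directly to returning a success flag and writing the state through an out-parameter, presumably so that a failed read of pg_control cannot be mistaken for a valid state. A minimal sketch of that calling convention, with a hypothetical two-member enum and a hypothetical reader that takes a state description instead of a data directory:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical subset of states; the real DBState enum has more members. */
typedef enum
{
	SKETCH_DB_SHUTDOWNED,
	SKETCH_DB_IN_PRODUCTION
} SketchDBState;

/*
 * Hypothetical reader: maps a state description to an enum value.
 * Returns false and leaves *state untouched when the input is unusable.
 */
static bool
sketch_get_db_state(const char *description, SketchDBState *state)
{
	assert(state != NULL);

	if (description == NULL)
		return false;

	if (strcmp(description, "shut down") == 0)
		*state = SKETCH_DB_SHUTDOWNED;
	else if (strcmp(description, "in production") == 0)
		*state = SKETCH_DB_IN_PRODUCTION;
	else
		return false;

	return true;
}
```

A caller then tests the return value first and only inspects the state on success, which mirrors the `if (get_db_state(..., &db_state) == false)` checks added in this file.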
@@ -653,21 +679,23 @@ _do_node_status_is_shutdown_cleanly(void)
checkPoint = get_latest_checkpoint_location(config_file_options.data_directory);
/* unable to read pg_control, don't know what's happening */
if (checkPoint == InvalidXLogRecPtr)
{
/* unable to read pg_control, don't know what's happening */
node_status = NODE_STATUS_UNKNOWN;
}
/*
* if still "UNKNOWN" at this point, then the node must be cleanly shut
* down
*/
else if (node_status == NODE_STATUS_UNKNOWN)
{
/*
* if still "UNKNOWN" at this point, then the node must be cleanly shut
* down
*/
node_status = NODE_STATUS_DOWN;
}
return_state:
log_verbose(LOG_DEBUG, "node status determined as: %s",
print_node_status(node_status));
@@ -686,6 +714,26 @@ _do_node_status_is_shutdown_cleanly(void)
return;
}
static void
exit_optformat_error(const char *error, int errcode)
{
PQExpBufferData output;
Assert(runtime_options.output_mode == OM_OPTFORMAT);
initPQExpBuffer(&output);
appendPQExpBuffer(&output,
"--error=%s",
error);
printf("%s\n", output.data);
termPQExpBuffer(&output);
exit(errcode);
}
/*
* Configuration file required
*/
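
The new `exit_optformat_error()` emits a single `--error=CODE` line before exiting, so that a caller running `repmgr node check --optformat` remotely receives a parseable token rather than log output. A minimal sketch of both sides of that exchange; `format_optformat_error()` and `parse_optformat_error()` are hypothetical names, and the parsing side in particular is an assumption about how an invoker might consume the line:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the string built in exit_optformat_error():
 * writes "--error=<code>" into buf and returns snprintf()'s result.
 */
static int
format_optformat_error(char *buf, size_t buflen, const char *error)
{
	assert(buf != NULL && error != NULL);

	return snprintf(buf, buflen, "--error=%s", error);
}

/*
 * Hypothetical invoker side: return a pointer to the error code if the
 * line starts with "--error=", otherwise NULL.
 */
static const char *
parse_optformat_error(const char *line)
{
	static const char prefix[] = "--error=";
	size_t		prefix_len = sizeof(prefix) - 1;

	if (line == NULL || strncmp(line, prefix, prefix_len) != 0)
		return NULL;

	return line + prefix_len;
}
```

The two codes passed in this diff are `CONNINFO_PARSE` and `DB_CONNECTION`; any other line, such as `--repmgrd=OK`, would not match the prefix.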
@@ -702,6 +750,7 @@ do_node_check(void)
CheckStatusListCell *cell = NULL;
bool issue_detected = false;
bool exit_on_connection_error = true;
/* for internal use */
if (runtime_options.has_passfile == true)
@@ -711,12 +760,28 @@ do_node_check(void)
exit(return_code);
}
/* for use by "standby switchover" */
if (runtime_options.replication_connection == true)
{
do_node_check_replication_connection();
exit(SUCCESS);
}
if (runtime_options.db_connection == true)
{
exit_on_connection_error = false;
}
/*
* If --optformat was provided, we'll assume this is a remote invocation
* and instead of exiting with an error, we'll return an error string
* so the remote invoker will know what's happened.
*/
if (runtime_options.output_mode == OM_OPTFORMAT)
{
exit_on_connection_error = false;
}
if (config_file_options.conninfo[0] != '\0')
{
@@ -732,6 +797,12 @@ do_node_check(void)
if (parse_success == false)
{
if (runtime_options.output_mode == OM_OPTFORMAT)
{
exit_optformat_error("CONNINFO_PARSE",
ERR_BAD_CONFIG);
}
log_error(_("unable to parse conninfo string \"%s\" for local node"),
config_file_options.conninfo);
log_detail("%s", errmsg);
@@ -749,16 +820,36 @@ do_node_check(void)
config_file_options.conninfo,
"user",
runtime_options.superuser,
true);
exit_on_connection_error);
}
else
{
conn = establish_db_connection_by_params(&node_conninfo, true);
conn = establish_db_connection_by_params(&node_conninfo, exit_on_connection_error);
}
}
else
{
conn = establish_db_connection_by_params(&source_conninfo, true);
conn = establish_db_connection_by_params(&source_conninfo, exit_on_connection_error);
}
/*
* --db-connection option provided
*/
if (runtime_options.db_connection == true)
{
return_code = do_node_check_db_connection(conn, runtime_options.output_mode);
PQfinish(conn);
exit(return_code);
}
/*
* If we've reached here, and the connection is invalid, then --optformat was provided
*/
if (PQstatus(conn) != CONNECTION_OK)
{
exit_optformat_error("DB_CONNECTION",
ERR_DB_CONN);
}
if (get_node_record(conn, config_file_options.node_id, &node_info) != RECORD_FOUND)
@@ -797,6 +888,7 @@ do_node_check(void)
{
return_code = do_node_check_downstream(conn,
runtime_options.output_mode,
&node_info,
NULL);
PQfinish(conn);
exit(return_code);
@@ -852,6 +944,16 @@ do_node_check(void)
exit(return_code);
}
if (runtime_options.repmgrd == true)
{
return_code = do_node_check_repmgrd(conn,
runtime_options.output_mode,
&node_info,
NULL);
PQfinish(conn);
exit(return_code);
}
if (runtime_options.replication_config_owner == true)
{
return_code = do_node_check_replication_config_owner(conn,
@@ -888,7 +990,7 @@ do_node_check(void)
if (do_node_check_upstream(conn, runtime_options.output_mode, &node_info, &status_list) != CHECK_STATUS_OK)
issue_detected = true;
if (do_node_check_downstream(conn, runtime_options.output_mode, &status_list) != CHECK_STATUS_OK)
if (do_node_check_downstream(conn, runtime_options.output_mode, &node_info, &status_list) != CHECK_STATUS_OK)
issue_detected = true;
if (do_node_check_slots(conn, runtime_options.output_mode, &node_info, &status_list) != CHECK_STATUS_OK)
@@ -986,7 +1088,15 @@ do_node_check_replication_connection(void)
}
/* retrieve remote node record from local database */
local_conn = establish_db_connection(config_file_options.conninfo, true);
local_conn = establish_db_connection(config_file_options.conninfo, false);
if (PQstatus(local_conn) != CONNECTION_OK)
{
appendPQExpBufferStr(&output, "CONNECTION_ERROR");
printf("%s\n", output.data);
termPQExpBuffer(&output);
return;
}
record_status = get_node_record(local_conn, runtime_options.remote_node_id, &node_record);
PQfinish(local_conn);
@@ -1183,7 +1293,7 @@ do_node_check_archive_ready(PGconn *conn, OutputMode mode, CheckStatusList *list
static CheckStatus
do_node_check_downstream(PGconn *conn, OutputMode mode, CheckStatusList *list_output)
do_node_check_downstream(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output)
{
NodeInfoList downstream_nodes = T_NODE_INFO_LIST_INITIALIZER;
NodeInfoListCell *cell = NULL;
@@ -1217,7 +1327,7 @@ do_node_check_downstream(PGconn *conn, OutputMode mode, CheckStatusList *list_ou
continue;
}
if (is_downstream_node_attached(conn, cell->node_info->node_name) != NODE_ATTACHED)
if (is_downstream_node_attached(conn, cell->node_info->node_name, NULL) != NODE_ATTACHED)
{
missing_nodes_count++;
item_list_append_format(&missing_nodes,
@@ -1234,7 +1344,13 @@ do_node_check_downstream(PGconn *conn, OutputMode mode, CheckStatusList *list_ou
}
}
if (missing_nodes_count == 0)
if (node_info->type == WITNESS)
{
/* witness has no downstream nodes */
appendPQExpBufferStr(&details,
_("N/A - node is a witness"));
}
else if (missing_nodes_count == 0)
{
if (expected_nodes_count == 0)
appendPQExpBufferStr(&details,
@@ -1367,7 +1483,13 @@ do_node_check_upstream(PGconn *conn, OutputMode mode, t_node_info *node_info, Ch
initPQExpBuffer(&details);
if (get_node_record(conn, node_info->upstream_node_id, &upstream_node_info) != RECORD_FOUND)
if (node_info->type == WITNESS)
{
/* witness is not connecting to any upstream */
appendPQExpBufferStr(&details,
_("N/A - node is a witness"));
}
else if (get_node_record(conn, node_info->upstream_node_id, &upstream_node_info) != RECORD_FOUND)
{
if (get_recovery_type(conn) == RECTYPE_STANDBY)
{
@@ -1388,7 +1510,7 @@ do_node_check_upstream(PGconn *conn, OutputMode mode, t_node_info *node_info, Ch
upstream_conn = establish_db_connection(upstream_node_info.conninfo, true);
/* check our node is connected */
if (is_downstream_node_attached(upstream_conn, config_file_options.node_name) != NODE_ATTACHED)
if (is_downstream_node_attached(upstream_conn, config_file_options.node_name, NULL) != NODE_ATTACHED)
{
appendPQExpBuffer(&details,
_("node \"%s\" (ID: %i) is not attached to expected upstream node \"%s\" (ID: %i)"),
@@ -1417,6 +1539,7 @@ do_node_check_upstream(PGconn *conn, OutputMode mode, t_node_info *node_info, Ch
output_check_status(status),
details.data);
}
break;
case OM_TEXT:
if (list_output != NULL)
{
@@ -1746,12 +1869,7 @@ do_node_check_slots(PGconn *conn, OutputMode mode, t_node_info *node_info, Check
initPQExpBuffer(&details);
if (PQserverVersion(conn) < 90400)
{
appendPQExpBufferStr(&details,
_("replication slots not available for this PostgreSQL version"));
}
else if (node_info->total_replication_slots == 0)
if (node_info->total_replication_slots == 0)
{
appendPQExpBufferStr(&details,
_("node has no physical replication slots"));
@@ -1822,50 +1940,42 @@ do_node_check_missing_slots(PGconn *conn, OutputMode mode, t_node_info *node_inf
initPQExpBuffer(&details);
if (PQserverVersion(conn) < 90400)
get_downstream_nodes_with_missing_slot(conn,
config_file_options.node_id,
&missing_slots);
if (missing_slots.node_count == 0)
{
appendPQExpBufferStr(&details,
_("replication slots not available for this PostgreSQL version"));
_("node has no missing physical replication slots"));
}
else
{
get_downstream_nodes_with_missing_slot(conn,
config_file_options.node_id,
&missing_slots);
NodeInfoListCell *missing_slot_cell = NULL;
bool first_element = true;
if (missing_slots.node_count == 0)
status = CHECK_STATUS_CRITICAL;
appendPQExpBuffer(&details,
_("%i physical replication slots are missing"),
missing_slots.node_count);
if (missing_slots.node_count)
{
appendPQExpBufferStr(&details,
_("node has no missing physical replication slots"));
}
else
{
NodeInfoListCell *missing_slot_cell = NULL;
bool first_element = true;
appendPQExpBufferStr(&details, ": ");
status = CHECK_STATUS_CRITICAL;
appendPQExpBuffer(&details,
_("%i physical replication slots are missing"),
missing_slots.node_count);
if (missing_slots.node_count)
for (missing_slot_cell = missing_slots.head; missing_slot_cell; missing_slot_cell = missing_slot_cell->next)
{
appendPQExpBufferStr(&details, ": ");
for (missing_slot_cell = missing_slots.head; missing_slot_cell; missing_slot_cell = missing_slot_cell->next)
if (first_element == true)
{
if (first_element == true)
{
first_element = false;
}
else
{
appendPQExpBufferStr(&details, ", ");
}
appendPQExpBufferStr(&details, missing_slot_cell->node_info->slot_name);
first_element = false;
}
else
{
appendPQExpBufferStr(&details, ", ");
}
appendPQExpBufferStr(&details, missing_slot_cell->node_info->slot_name);
}
}
}
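
With the pre-9.4 branch removed, the missing-slots check reduces to two cases: no missing slots, or a count followed by a comma-separated list of slot names. A minimal sketch of that message construction, assuming a hypothetical `describe_missing_slots()` helper that works on a plain string array instead of a `NodeInfoList` (the slot names in the usage note are made up):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the detail message for the missing-slots check.
 * Output is truncated (but always NUL-terminated) if buf is too small.
 */
static void
describe_missing_slots(char *buf, size_t buflen,
					   const char *const *slots, int n)
{
	int			i;

	assert(buf != NULL && buflen > 0 && n >= 0);

	if (n == 0)
	{
		snprintf(buf, buflen, "node has no missing physical replication slots");
		return;
	}

	snprintf(buf, buflen, "%i physical replication slots are missing: ", n);

	for (i = 0; i < n; i++)
	{
		size_t		used = strlen(buf);

		/* the separator goes before every element except the first */
		snprintf(buf + used, buflen - used, "%s%s",
				 i > 0 ? ", " : "", slots[i]);
	}
}
```

For the two slot names `repmgr_slot_2` and `repmgr_slot_3` this produces `2 physical replication slots are missing: repmgr_slot_2, repmgr_slot_3`, the same shape the `first_element` loop in the diff builds.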
@@ -1927,7 +2037,6 @@ do_node_check_missing_slots(PGconn *conn, OutputMode mode, t_node_info *node_inf
return status;
}
CheckStatus
do_node_check_data_directory(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output)
{
@@ -1948,7 +2057,7 @@ do_node_check_data_directory(PGconn *conn, OutputMode mode, t_node_info *node_in
* Check actual data directory matches that in repmgr.conf; note this requires
* a superuser connection
*/
if (connection_has_pg_settings(conn) == true)
if (connection_has_pg_monitor_role(conn, "pg_read_all_settings") == true)
{
/* we expect to have a database connection */
if (get_pg_setting(conn, "data_directory", actual_data_directory) == false)
@@ -2062,6 +2171,53 @@ do_node_check_data_directory(PGconn *conn, OutputMode mode, t_node_info *node_in
return status;
}
CheckStatus
do_node_check_repmgrd(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output)
{
CheckStatus status = CHECK_STATUS_OK;
if (mode == OM_CSV && list_output == NULL)
{
log_error(_("--csv output not provided with --repmgrd option"));
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
status = get_repmgrd_status(conn);
switch (mode)
{
case OM_OPTFORMAT:
printf("--repmgrd=%s\n",
output_check_status(status));
break;
case OM_NAGIOS:
printf("REPMGRD %s: %s\n",
output_check_status(status),
output_repmgrd_status(status));
break;
case OM_CSV:
case OM_TEXT:
if (list_output != NULL)
{
check_status_list_set(list_output,
"repmgrd",
status,
output_repmgrd_status(status));
}
else
{
printf("%s (%s)\n",
output_check_status(status),
output_repmgrd_status(status));
}
default:
break;
}
return status;
}
/*
* This is not included in the general list output
*/
@@ -2097,6 +2253,72 @@ CheckStatus do_node_check_replication_config_owner(PGconn *conn, OutputMode mode
}
/*
* This is not included in the general list output
*/
static CheckStatus
do_node_check_db_connection(PGconn *conn, OutputMode mode)
{
CheckStatus status = CHECK_STATUS_OK;
PQExpBufferData details;
if (mode == OM_CSV)
{
log_error(_("--csv output not provided with --db-connection option"));
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
/* This check is for configuration diagnostics only */
if (mode == OM_NAGIOS)
{
log_error(_("--nagios output not provided with --db-connection option"));
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
initPQExpBuffer(&details);
if (PQstatus(conn) != CONNECTION_OK)
{
t_conninfo_param_list conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER;
int c;
status = CHECK_STATUS_CRITICAL;
initialize_conninfo_params(&conninfo, false);
conn_to_param_list(conn, &conninfo);
appendPQExpBufferStr(&details,
"connection parameters used:");
for (c = 0; c < conninfo.size && conninfo.keywords[c] != NULL; c++)
{
if (conninfo.values[c] != NULL && conninfo.values[c][0] != '\0')
{
appendPQExpBuffer(&details,
" %s=%s",
conninfo.keywords[c], conninfo.values[c]);
}
}
}
if (mode == OM_OPTFORMAT)
{
printf("--db-connection=%s\n",
output_check_status(status));
}
else if (mode == OM_TEXT)
{
printf("%s (%s)\n",
output_check_status(status),
details.data);
}
termPQExpBuffer(&details);
return status;
}
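Review note: when the connection is not usable, `do_node_check_db_connection()` lists every connection parameter that has a non-empty value. The sketch below shows the same filtering loop on plain C strings. `append_conninfo_details()` and its fixed-size buffer are hypothetical; the diff itself uses `PQExpBuffer` and repmgr's `t_conninfo_param_list`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append " keyword=value" to the NUL-terminated string in buf for every
 * parameter whose value is set and non-empty. Returns the number of
 * parameters appended; output is truncated if buf is too small.
 */
int
append_conninfo_details(char *buf, size_t buflen,
                        const char *const *keywords,
                        const char *const *values,
                        int size)
{
    size_t used;
    int    appended = 0;
    int    c;

    assert(buf != NULL && buflen > 0);
    used = strlen(buf);

    for (c = 0; c < size && keywords[c] != NULL; c++)
    {
        int n;

        /* skip parameters which were never set */
        if (values[c] == NULL || values[c][0] == '\0')
            continue;

        if (used >= buflen - 1)
            break;

        n = snprintf(buf + used, buflen - used, " %s=%s",
                     keywords[c], values[c]);
        if (n < 0)
            break;

        /* snprintf reports the untruncated length, so clamp to the buffer */
        used += ((size_t) n < buflen - used) ? (size_t) n : buflen - used - 1;
        appended++;
    }

    return appended;
}
```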
void
do_node_service(void)
{
@@ -2331,6 +2553,8 @@ do_node_rejoin(void)
DBState db_state;
PGPing status;
bool is_shutdown = true;
int server_version_num = UNKNOWN_SERVER_VERSION_NUM;
bool hide_standby_signal = false;
PQExpBufferData command;
PQExpBufferData command_output;
@@ -2361,7 +2585,11 @@ do_node_rejoin(void)
break;
}
db_state = get_db_state(config_file_options.data_directory);
if (get_db_state(config_file_options.data_directory, &db_state) == false)
{
log_error(_("unable to determine database state from pg_control"));
exit(ERR_BAD_CONFIG);
}
if (is_shutdown == false)
{
@@ -2371,6 +2599,21 @@ do_node_rejoin(void)
exit(ERR_REJOIN_FAIL);
}
/*
* Server version number required to determine whether pg_rewind will run
* crash recovery (Pg 13 and later).
*/
server_version_num = get_pg_version(config_file_options.data_directory, NULL);
if (server_version_num == UNKNOWN_SERVER_VERSION_NUM)
{
/* This is very unlikely to happen */
log_error(_("unable to determine database version"));
exit(ERR_BAD_CONFIG);
}
log_verbose(LOG_DEBUG, "server version number is: %i", server_version_num);
/* check if cleanly shut down */
if (db_state != DB_SHUTDOWNED && db_state != DB_SHUTDOWNED_IN_RECOVERY)
{
@@ -2378,15 +2621,41 @@ do_node_rejoin(void)
{
log_error(_("database is still shutting down"));
}
else if (server_version_num >= 130000 && runtime_options.force_rewind_used == true)
{
log_warning(_("database is not shut down cleanly"));
log_detail(_("--force-rewind provided, pg_rewind will automatically perform recovery"));
/*
* If pg_rewind is executed, the first change it will make
* is to start the server in single user mode, which will fail
* in the presence of "standby.signal", so we'll "hide" it
* (actually delete and recreate).
*/
hide_standby_signal = true;
}
else
{
/*
* If the database was not shut down cleanly, it *might* rejoin correctly
* after starting up and recovering, but better to ensure the database
* can recover before trying anything else.
*/
log_error(_("database is not shut down cleanly"));
if (runtime_options.force_rewind_used == true)
if (server_version_num >= 130000)
{
log_detail(_("pg_rewind will not be able to run"));
log_hint(_("provide --force-rewind to run recovery"));
}
log_hint(_("database should be restarted then shut down cleanly after crash recovery completes"));
else
{
if (runtime_options.force_rewind_used == true)
{
log_detail(_("pg_rewind will not be able to run"));
}
log_hint(_("database should be restarted then shut down cleanly after crash recovery completes"));
}
exit(ERR_REJOIN_FAIL);
}
}
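Review note: the unclean-shutdown handling above reduces to a small decision table. The sketch below restates it as a pure function so the combinations can be checked in isolation. `rejoin_decision` and `decide_rejoin()` are hypothetical names, and the sketch covers only the branches visible in this hunk; the "database is still shutting down" case is left out.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum
{
    REJOIN_PROCEED,             /* cleanly shut down, nothing special to do */
    REJOIN_PROCEED_HIDE_SIGNAL, /* unclean, but pg_rewind will run recovery */
    REJOIN_FAIL                 /* unclean and nothing will recover it */
} rejoin_decision;

/*
 * server_version_num uses PostgreSQL's numeric format, e.g. 130000 for 13.0.
 */
rejoin_decision
decide_rejoin(bool cleanly_shut_down, int server_version_num, bool force_rewind)
{
    assert(server_version_num > 0);

    if (cleanly_shut_down)
        return REJOIN_PROCEED;

    /*
     * From PostgreSQL 13, pg_rewind itself performs crash recovery, but
     * only if "standby.signal" is temporarily out of the way.
     */
    if (server_version_num >= 130000 && force_rewind)
        return REJOIN_PROCEED_HIDE_SIGNAL;

    return REJOIN_FAIL;
}
```

On PostgreSQL 12 and earlier, `--force-rewind` does not change the outcome for an unclean shutdown. That matches the hint in the diff that the database should be restarted and then shut down cleanly.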
@@ -2394,10 +2663,6 @@ do_node_rejoin(void)
/* check provided upstream connection */
upstream_conn = establish_db_connection_by_params(&source_conninfo, true);
/* sanity checks for 9.3 */
if (PQserverVersion(upstream_conn) < 90400)
check_93_config();
if (get_primary_node_record(upstream_conn, &primary_node_record) == false)
{
log_error(_("unable to retrieve primary node record"));
@@ -2406,6 +2671,13 @@ do_node_rejoin(void)
exit(ERR_BAD_CONFIG);
}
/*
* Emit a notice about the identity of the rejoin target
*/
log_notice(_("rejoin target is node \"%s\" (ID: %i)"),
primary_node_record.node_name,
primary_node_record.node_id);
/* connect to registered primary and check it's not in recovery */
primary_conn = establish_db_connection(primary_node_record.conninfo, false);
@@ -2458,7 +2730,7 @@ do_node_rejoin(void)
log_hint(_("check the local node is registered with the current primary \"%s\" (ID: %i)"),
primary_node_record.node_name,
primary_node_record.node_id);
PQfinish(upstream_conn);
PQfinish(primary_conn);
exit(ERR_BAD_CONFIG);
}
@@ -2482,7 +2754,7 @@ do_node_rejoin(void)
* sanity-check that it will actually be possible to stream from the new upstream
*/
{
bool can_follow;
bool can_rejoin;
TimeLineID tli = get_min_recovery_end_timeline(config_file_options.data_directory);
XLogRecPtr min_recovery_location = get_min_recovery_location(config_file_options.data_directory);
@@ -2496,13 +2768,13 @@ do_node_rejoin(void)
if (tli == 0)
tli = get_timeline(config_file_options.data_directory);
can_follow = check_node_can_attach(tli,
can_rejoin = check_node_can_attach(tli,
min_recovery_location,
primary_conn,
&primary_node_record,
true);
if (can_follow == false)
if (can_rejoin == false)
{
PQfinish(primary_conn);
exit(ERR_REJOIN_FAIL);
@@ -2570,9 +2842,9 @@ do_node_rejoin(void)
}
else
{
appendPQExpBuffer(&command,
"%s -D ",
make_pg_path("pg_rewind"));
make_pg_path(&command, "pg_rewind");
appendPQExpBufferStr(&command,
" -D ");
}
appendShellString(&command,
@@ -2594,6 +2866,31 @@ do_node_rejoin(void)
log_detail(_("pg_rewind command is \"%s\""),
command.data);
/*
* In Pg13 and later, pg_rewind will attempt to start up a server which
* was not cleanly shut down in single user mode. This will fail if
* "standby.signal" is present. We'll remove it and restore it after
* pg_rewind runs.
*/
if (hide_standby_signal == true)
{
char standby_signal_file_path[MAXPGPATH] = "";
log_notice(_("temporarily removing \"standby.signal\""));
log_detail(_("this is required so pg_rewind can fix the unclean shutdown"));
make_standby_signal_path(config_file_options.data_directory,
standby_signal_file_path);
if (unlink(standby_signal_file_path) < 0 && errno != ENOENT)
{
log_error(_("unable to remove \"standby.signal\" file in data directory \"%s\""),
standby_signal_file_path);
log_detail("%s", strerror(errno));
exit(ERR_REJOIN_FAIL);
}
}
initPQExpBuffer(&command_output);
ret = local_command(command.data,
@@ -2601,9 +2898,19 @@ do_node_rejoin(void)
termPQExpBuffer(&command);
if (hide_standby_signal == true)
{
/*
* Restore standby.signal if we previously removed it, regardless
* of whether the pg_rewind operation failed.
*/
log_notice(_("recreating \"standby.signal\""));
write_standby_signal(config_file_options.data_directory);
}
if (ret == false)
{
log_error(_("unable to execute pg_rewind"));
log_error(_("pg_rewind execution failed"));
log_detail("%s", command_output.data);
termPQExpBuffer(&command_output);
@@ -2671,7 +2978,7 @@ do_node_rejoin(void)
struct stat statbuf;
PQExpBufferData slotdir_ent_path;
if(strcmp(slotdir_ent->d_name, ".") == 0 || strcmp(slotdir_ent->d_name, "..") == 0)
if (strcmp(slotdir_ent->d_name, ".") == 0 || strcmp(slotdir_ent->d_name, "..") == 0)
continue;
initPQExpBuffer(&slotdir_ent_path);
@@ -2777,7 +3084,7 @@ do_node_rejoin(void)
config_file_options.node_rejoin_timeout);
}
else {
log_detail(_("no record for local node \"%s\" found in node \"%s\"'s \"pg_stat_replication\" table"),
log_detail(_("no active record for local node \"%s\" found in node \"%s\"'s \"pg_stat_replication\" table"),
config_file_options.node_name,
primary_node_record.node_name);
}
@@ -2789,7 +3096,7 @@ do_node_rejoin(void)
else
{
/* -W/--no-wait provided - check once */
NodeAttached node_attached = is_downstream_node_attached(primary_conn, config_file_options.node_name);
NodeAttached node_attached = is_downstream_node_attached(primary_conn, config_file_options.node_name, NULL);
if (node_attached == NODE_ATTACHED)
success = true;
}
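Review note: `is_downstream_node_attached()` gains a third argument which callers may pass as `NULL` when they do not need the replication state. The sketch below shows that optional out-parameter pattern. `classify_replication_state()` is hypothetical, and treating "streaming" as the only attached state is an assumption, because the real function body is not in this part of the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { DEMO_ATTACHED, DEMO_NOT_ATTACHED, DEMO_DETACHED } demo_attached;

/*
 * Classify a replication state string (NULL meaning "no record found").
 * If state_out is not NULL it receives the state that was seen, so the
 * caller can include it in a message; pass NULL if it is not needed.
 */
demo_attached
classify_replication_state(const char *state, const char **state_out)
{
    if (state_out != NULL)
        *state_out = state;

    if (state == NULL)
        return DEMO_DETACHED;

    /* assumption: only "streaming" counts as attached */
    if (strcmp(state, "streaming") == 0)
        return DEMO_ATTACHED;

    return DEMO_NOT_ATTACHED;
}
```

A caller which only needs a yes/no answer passes `NULL`, as the `--no-wait` branch above does. A caller which reports the state, such as `format_node_status()`, passes the address of a local pointer.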
@@ -3322,6 +3629,25 @@ copy_file(const char *src_file, const char *dest_file)
}
static const char *
output_repmgrd_status(CheckStatus status)
{
switch (status)
{
case CHECK_STATUS_OK:
return "repmgrd running";
case CHECK_STATUS_WARNING:
return "repmgrd running but paused";
case CHECK_STATUS_CRITICAL:
return "repmgrd not running";
case CHECK_STATUS_UNKNOWN:
return "repmgrd status unknown";
}
return "UNKNOWN";
}
void
do_node_help(void)
{
@@ -3359,11 +3685,12 @@ do_node_help(void)
printf(_(" Following options check an individual status:\n"));
printf(_(" --archive-ready number of WAL files ready for archiving\n"));
printf(_(" --downstream whether all downstream nodes are connected\n"));
printf(_(" --uptream whether the node is connected to its upstream\n"));
printf(_(" --upstream whether the node is connected to its upstream\n"));
printf(_(" --replication-lag replication lag in seconds (standbys only)\n"));
printf(_(" --role check node has expected role\n"));
printf(_(" --slots check for inactive replication slots\n"));
printf(_(" --missing-slots check for missing replication slots\n"));
printf(_(" --repmgrd check if repmgrd is running\n"));
printf(_(" --data-directory-config check repmgr's data directory configuration\n"));
puts("");
@@ -3377,7 +3704,7 @@ do_node_help(void)
printf(_(" --dry-run check that the prerequisites are met for rejoining the node\n" \
" (including usability of \"pg_rewind\" if requested)\n"));
printf(_(" --force-rewind[=VALUE] execute \"pg_rewind\" if necessary\n"));
printf(_(" (9.3 and 9.4 - provide full \"pg_rewind\" path)\n"));
printf(_(" (PostgreSQL 9.4 - provide full \"pg_rewind\" path)\n"));
printf(_(" --config-files comma-separated list of configuration files to retain\n" \
" after executing \"pg_rewind\"\n"));
@@ -3398,8 +3725,8 @@ do_node_help(void)
printf(_(" --list-actions show what command would be performed for each action\n"));
printf(_(" --checkpoint issue a CHECKPOINT before stopping or restarting the node\n"));
printf(_(" -S, --superuser=USERNAME superuser to use, if repmgr user is not superuser\n"));
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
}

@@ -1,6 +1,6 @@
/*
* repmgr-action-node.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

@@ -3,7 +3,7 @@
*
* Implements primary actions for the repmgr command line utility
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -454,9 +454,9 @@ do_primary_unregister(void)
/*
* This appears to be the cluster primary - cowardly refuse to
* delete the record
* delete the record, unless --force is supplied.
*/
if (primary_node_info.node_id == target_node_info_ptr->node_id)
if (primary_node_info.node_id == target_node_info_ptr->node_id && !runtime_options.force)
{
log_error(_("node \"%s\" (ID: %i) is the current primary node, unable to unregister"),
target_node_info_ptr->node_name,
@@ -575,4 +575,5 @@ do_primary_help(void)
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
}

@@ -1,6 +1,6 @@
/*
* repmgr-action-primary.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

@@ -2,7 +2,7 @@
* repmgr-action-service.c
*
* Implements repmgrd actions for the repmgr command line utility
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -543,4 +543,6 @@ void do_service_help(void)
puts("");
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
}

@@ -1,6 +1,6 @@
/*
* repmgr-action-service.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

File diff suppressed because it is too large

@@ -1,6 +1,6 @@
/*
* repmgr-action-standby.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

@@ -3,7 +3,7 @@
*
* Implements witness actions for the repmgr command line utility
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -560,7 +560,7 @@ void do_witness_help(void)
printf(_("WITNESS UNREGISTER\n"));
puts("");
printf(_(" \"witness register\" unregisters a witness node.\n"));
printf(_(" \"witness unregister\" unregisters a witness node.\n"));
puts("");
printf(_(" --dry-run check prerequisites but don't make any changes\n"));
printf(_(" -F, --force unregister when witness node not running\n"));
@@ -569,5 +569,5 @@ void do_witness_help(void)
puts("");
return;
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
}

@@ -1,6 +1,6 @@
/*
* repmgr-action-witness.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

@@ -1,6 +1,6 @@
/*
* repmgr-client-global.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -48,6 +48,7 @@ typedef struct
bool no_wait;
bool compact;
bool detail;
bool dump_config;
/* logging options */
char log_level[MAXLEN]; /* overrides setting in repmgr.conf */
@@ -83,11 +84,12 @@ typedef struct
bool fast_checkpoint;
bool rsync_only;
bool no_upstream_connection;
char recovery_min_apply_delay[MAXLEN];
char recovery_min_apply_delay[MAXLEN]; /* overrides setting in repmgr.conf */
char replication_user[MAXLEN];
char upstream_conninfo[MAXLEN];
bool without_barman;
bool replication_conf_only;
bool verify_backup;
/* "standby clone"/"standby follow" options */
int upstream_node_id;
@@ -118,8 +120,10 @@ typedef struct
bool missing_slots;
bool has_passfile;
bool replication_connection;
bool repmgrd;
bool data_directory_config;
bool replication_config_owner;
bool db_connection;
/* "node rejoin" options */
char config_files[MAXLEN];
@@ -140,7 +144,7 @@ typedef struct
/* following options for internal use */
char config_archive_dir[MAXPGPATH];
OutputMode output_mode;
OutputMode output_mode; /* set through provision of --csv, --nagios or --optformat */
bool disable_wal_receiver;
bool enable_wal_receiver;
} t_runtime_options;
@@ -149,7 +153,7 @@ typedef struct
/* configuration metadata */ \
false, false, false, false, false, \
/* general configuration options */ \
"", false, false, "", -1, false, false, false, \
"", false, false, "", -1, false, false, false, false, \
/* logging options */ \
"", false, false, false, false, \
/* output options */ \
@@ -162,7 +166,7 @@ typedef struct
UNKNOWN_NODE_ID, "", "", UNKNOWN_NODE_ID, \
/* "standby clone" options */ \
false, CONFIG_FILE_SAMEPATH, false, false, false, "", "", "", \
false, false, \
false, false, false, \
/* "standby clone"/"standby follow" options */ \
NO_UPSTREAM_NODE, \
/* "standby register" options */ \
@@ -172,7 +176,7 @@ typedef struct
/* "node status" options */ \
false, \
/* "node check" options */ \
false, false, false, false, false, false, false, false, false, false, false, \
false, false, false, false, false, false, false, false, false, false, false, false, false, \
/* "node rejoin" options */ \
"", \
/* "node service" options */ \
@@ -216,11 +220,20 @@ typedef enum
typedef enum
{
JOIN_UNKNOWN = -1,
JOIN_SUCCESS,
JOIN_COMMAND_FAIL,
JOIN_FAIL_NO_PING,
JOIN_FAIL_NO_REPLICATION
} standy_join_status;
typedef enum
{
REMOTE_ERROR_UNKNOWN = -1,
REMOTE_ERROR_NONE,
REMOTE_ERROR_DB_CONNECTION,
REMOTE_ERROR_CONNINFO_PARSE
} t_remote_error_type;
typedef struct ColHeader
{
@@ -232,21 +245,17 @@ typedef struct ColHeader
/* global configuration structures */
/* globally available configuration structures */
extern t_runtime_options runtime_options;
extern t_configuration_options config_file_options;
t_conninfo_param_list source_conninfo;
extern t_conninfo_param_list source_conninfo;
extern t_node_info target_node_info;
/* global variables */
extern bool config_file_required;
extern char pg_bindir[MAXLEN];
extern t_node_info target_node_info;
/* global functions */
extern int check_server_version(PGconn *conn, char *server_type, bool exit_on_error, char *server_version_string);
extern void check_93_config(void);
extern bool create_repmgr_extension(PGconn *conn);
extern int test_ssh_connection(char *host, char *remote_user);
@@ -257,7 +266,7 @@ extern int copy_remote_files(char *host, char *remote_user, char *remote_path,
extern void print_error_list(ItemList *error_list, int log_level);
extern char *make_pg_path(const char *file);
extern void make_pg_path(PQExpBufferData *buf, const char *file);
extern void get_superuser_connection(PGconn **conn, PGconn **superuser_conn, PGconn **privileged_conn);
@@ -276,6 +285,8 @@ extern void get_node_config_directory(char *config_dir_buf);
extern void get_node_data_directory(char *data_dir_buf);
extern void init_node_record(t_node_info *node_record);
extern bool can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *reason);
extern void make_standby_signal_path(const char *data_dir, char *buf);
extern bool write_standby_signal(const char *data_dir);
extern bool create_replication_slot(PGconn *conn, char *slot_name, t_node_info *upstream_node_record, PQExpBufferData *error_msg);
extern bool drop_replication_slot_if_exists(PGconn *conn, int node_id, char *slot_name);

@@ -1,7 +1,7 @@
/*
* repmgr-client.c - Command interpreter for the repmgr package
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This module is a command-line utility to easily setup a cluster of
* hot standby servers for an HA environment
@@ -51,6 +51,7 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <pwd.h>
#include <unistd.h>
#include <sys/stat.h>
#include <signal.h>
@@ -75,15 +76,13 @@
* ============================ */
t_runtime_options runtime_options = T_RUNTIME_OPTIONS_INITIALIZER;
t_configuration_options config_file_options = T_CONFIGURATION_OPTIONS_INITIALIZER;
/* conninfo params for the node we're operating on */
t_conninfo_param_list source_conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER;
bool config_file_required = true;
char pg_bindir[MAXLEN] = "";
char path_buf[MAXLEN] = "";
char pg_bindir[MAXPGPATH] = "";
/*
* if --node-id/--node-name provided, place that node's record here
@@ -124,7 +123,7 @@ main(int argc, char **argv)
/*
* Tell the logger we're a command-line program - this will ensure any
* output logged before the logger is initialized will be formatted
* correctly. Can be overriden with "--log-to-file".
* correctly. Can be overridden with "--log-to-file".
*/
logger_output_mode = OM_COMMAND_LINE;
@@ -279,6 +278,11 @@ main(int argc, char **argv)
runtime_options.detail = true;
break;
/* --dump-config */
case OPT_DUMP_CONFIG:
runtime_options.dump_config = true;
break;
/*----------------------------
* database connection options
*----------------------------
@@ -435,6 +439,15 @@ main(int argc, char **argv)
runtime_options.replication_conf_only = true;
break;
/* --recovery-min-apply-delay */
case OPT_RECOVERY_MIN_APPLY_DELAY:
strncpy(runtime_options.recovery_min_apply_delay, optarg, sizeof(runtime_options.recovery_min_apply_delay));
break;
/* --verify-backup */
case OPT_VERIFY_BACKUP:
runtime_options.verify_backup = true;
break;
/*---------------------------
* "standby register" options
@@ -537,10 +550,18 @@ main(int argc, char **argv)
runtime_options.data_directory_config = true;
break;
case OPT_REPMGRD:
runtime_options.repmgrd = true;
break;
case OPT_REPLICATION_CONFIG_OWNER:
runtime_options.replication_config_owner = true;
break;
case OPT_DB_CONNECTION:
runtime_options.db_connection = true;
break;
/*--------------------
* "node rejoin" options
*--------------------
@@ -701,9 +722,12 @@ main(int argc, char **argv)
if (strcmp(argv[optind - 1], "-?") == 0)
{
help_option = true;
break;
}
/* otherwise fall through to default */
else
{
option_error_found = true;
}
break;
default: /* invalid option */
option_error_found = true;
break;
@@ -1073,11 +1097,43 @@ main(int argc, char **argv)
load_config(runtime_options.config_file,
runtime_options.verbose,
runtime_options.terse,
&config_file_options,
argv[0]);
/*
* Handle options which must be executed without a repmgr command
*/
if (runtime_options.dump_config == true)
{
if (repmgr_command != NULL)
{
fprintf(stderr,
_("--dump-config cannot be used in combination with a repmgr command\n"));
exit(ERR_BAD_CONFIG);
}
dump_config();
exit(SUCCESS);
}
check_cli_parameters(action);
/*
* Command-line parameter --recovery-min-apply-delay overrides the equivalent
* setting in the config file. Note we'll need to parse it here to handle
* any formatting errors.
*/
if (*runtime_options.recovery_min_apply_delay != '\0')
{
parse_time_unit_parameter("--recovery-min-apply-delay",
runtime_options.recovery_min_apply_delay,
config_file_options.recovery_min_apply_delay,
&cli_errors);
config_file_options.recovery_min_apply_delay_provided = true;
}
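Review note: `--recovery-min-apply-delay` takes a value with an optional time unit, which is why it has to be parsed before it can override the configuration file. The sketch below shows one way such a value could be validated and converted to milliseconds. `parse_delay_ms()` is hypothetical and is not repmgr's `parse_time_unit_parameter()`. The accepted units (ms, s, min, h, d) and the default of milliseconds follow PostgreSQL's conventions for this setting.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<integer>[unit]" where unit is one of ms, s, min, h or d;
 * a bare number is taken as milliseconds. Returns false on malformed,
 * negative or out-of-range input, leaving *result_ms untouched.
 */
bool
parse_delay_ms(const char *value, long long *result_ms)
{
    char      *end = NULL;
    long long  number;
    long long  factor;

    assert(value != NULL && result_ms != NULL);

    errno = 0;
    number = strtoll(value, &end, 10);
    if (end == value || errno == ERANGE || number < 0)
        return false;

    /* allow spaces between the number and the unit */
    while (*end == ' ')
        end++;

    if (*end == '\0' || strcmp(end, "ms") == 0)
        factor = 1;
    else if (strcmp(end, "s") == 0)
        factor = 1000;
    else if (strcmp(end, "min") == 0)
        factor = 60 * 1000;
    else if (strcmp(end, "h") == 0)
        factor = 60 * 60 * 1000;
    else if (strcmp(end, "d") == 0)
        factor = 24LL * 60 * 60 * 1000;
    else
        return false;

    /* reject values which would overflow once scaled */
    if (number > LLONG_MAX / factor)
        return false;

    *result_ms = number * factor;
    return true;
}
```

Under these rules "5min" yields 300000 and "5 minutes" is rejected, so a formatting error on the command line can be reported before any action is taken.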
/*
* Sanity checks for command line parameters completed by now; any further
* errors will be runtime ones
@@ -1138,7 +1194,7 @@ main(int argc, char **argv)
}
/*
* Check for configuration file items which can be overriden by runtime
* Check for configuration file items which can be overridden by runtime
* options
* =====================================================================
*/
@@ -1196,7 +1252,7 @@ main(int argc, char **argv)
/*
* If --dry-run specified, ensure log_level is at least LOG_INFO, regardless
* of what's in the configuration file or -L/--log-level paremeter, otherwise
* of what's in the configuration file or -L/--log-level parameter, otherwise
* some output might not be displayed.
*/
if (runtime_options.dry_run == true)
@@ -1214,8 +1270,6 @@ main(int argc, char **argv)
logger_set_level(LOG_ERROR);
}
/*
* Node configuration information is not needed for all actions, with
* STANDBY CLONE being the main exception.
@@ -2290,7 +2344,7 @@ format_node_status(t_node_info *node_info, PQExpBufferData *node_status, PQExpBu
node_info->node_id);
}
/* mismatch between reported upstream and upstream in local node's metadata */
else if(node_info->upstream_node_id != remote_node_rec.upstream_node_id)
else if (node_info->upstream_node_id != remote_node_rec.upstream_node_id)
{
appendPQExpBufferStr(upstream, "! ");
@@ -2351,6 +2405,7 @@ format_node_status(t_node_info *node_info, PQExpBufferData *node_status, PQExpBu
* connected to the upstream
*/
NodeAttached attached_to_upstream = NODE_ATTACHED_UNKNOWN;
char *replication_state = NULL;
t_node_info upstream_node_rec = T_NODE_INFO_INITIALIZER;
RecordStatus upstream_node_rec_found = get_node_record(node_info->conn,
node_info->upstream_node_id,
@@ -2378,7 +2433,7 @@ format_node_status(t_node_info *node_info, PQExpBufferData *node_status, PQExpBu
}
else
{
attached_to_upstream = is_downstream_node_attached(upstream_conn, node_info->node_name);
attached_to_upstream = is_downstream_node_attached(upstream_conn, node_info->node_name, &replication_state);
}
PQfinish(upstream_conn);
@@ -2394,6 +2449,18 @@ format_node_status(t_node_info *node_info, PQExpBufferData *node_status, PQExpBu
upstream_node_rec.node_name,
upstream_node_rec.node_id);
}
if (attached_to_upstream == NODE_NOT_ATTACHED)
{
appendPQExpBufferStr(upstream, "? ");
item_list_append_format(warnings,
"node \"%s\" (ID: %i) attached to its upstream node \"%s\" (ID: %i) in state \"%s\"",
node_info->node_name,
node_info->node_id,
upstream_node_rec.node_name,
upstream_node_rec.node_id,
replication_state);
}
else if (attached_to_upstream == NODE_DETACHED)
{
appendPQExpBufferStr(upstream, "! ");
@@ -2662,6 +2729,8 @@ do_help(void)
printf(_(" -v, --verbose display additional log output (useful for debugging)\n"));
puts("");
printf(_("%s home page: <%s>\n"), "repmgr", REPMGR_URL);
}
@@ -2671,7 +2740,7 @@ do_help(void)
*
* Note:
* This is one of two places where superuser rights are required.
* We should also consider possible scenarious where a non-superuser
* We should also consider possible scenarios where a non-superuser
* has sufficient privileges to install the extension.
*/
@@ -2706,7 +2775,7 @@ create_repmgr_extension(PGconn *conn)
log_detail(_("version %s is installed but newer version %s is available"),
extversions.installed_version,
extversions.default_version);
log_hint(_("update the installed extension version by executing \"ALTER EXTENSION repmgr UPDATE\""));
log_hint(_("update the installed extension version by executing \"ALTER EXTENSION repmgr UPDATE\" in the repmgr database"));
return false;
case REPMGR_INSTALLED:
@@ -2872,7 +2941,7 @@ check_server_version(PGconn *conn, char *server_type, bool exit_on_error, char *
* PostgreSQL from a particular PostgreSQL release onwards (e.g. 4.4 with PostgreSQL
* 12 and later due to recovery.conf removal), set MAX_UNSUPPORTED_VERSION and
* MAX_UNSUPPORTED_VERSION_NUM in "repmgr.h" to define the first PostgreSQL
* version which can't be suppored.
* version which can't be supported.
*/
#ifdef MAX_UNSUPPORTED_VERSION_NUM
if (conn_server_version_num >= MAX_UNSUPPORTED_VERSION_NUM)
@@ -2903,30 +2972,6 @@ check_server_version(PGconn *conn, char *server_type, bool exit_on_error, char *
}
/*
* check_93_config()
*
* Disable options not compatible with PostgreSQL 9.3
*/
void
check_93_config(void)
{
if (config_file_options.recovery_min_apply_delay_provided == true)
{
config_file_options.recovery_min_apply_delay_provided = false;
log_warning(_("configuration file option \"recovery_min_apply_delay\" not compatible with PostgreSQL 9.3, ignoring"));
}
if (config_file_options.use_replication_slots == true)
{
config_file_options.use_replication_slots = false;
log_warning(_("configuration file option \"use_replication_slots\" not compatible with PostgreSQL 9.3, ignoring"));
log_hint(_("replication slots are available from PostgreSQL 9.4"));
}
}
int
test_ssh_connection(char *host, char *remote_user)
{
@@ -3039,7 +3084,6 @@ get_superuser_connection(PGconn **conn, PGconn **superuser_conn, PGconn **privil
}
standy_clone_mode
get_standby_clone_mode(void)
{
@@ -3054,12 +3098,11 @@ get_standby_clone_mode(void)
}
char *
make_pg_path(const char *file)
void
make_pg_path(PQExpBufferData *buf, const char *file)
{
maxlen_snprintf(path_buf, "%s%s", pg_bindir, file);
return path_buf;
appendPQExpBuffer(buf, "%s%s",
pg_bindir, file);
}
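Review note: the old `make_pg_path()` returned a pointer into the shared global `path_buf`, so a second call overwrote the result of the first. The new version appends to a buffer owned by the caller. The sketch below shows the same idea with a plain character array. `demo_bindir` and `demo_append_pg_path()` are hypothetical, and the directory is assumed to end with a slash because the diff joins the two parts with `"%s%s"`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for repmgr's pg_bindir global; the value is an example only */
static const char demo_bindir[] = "/usr/pgsql/bin/";

/*
 * Append "<bindir><file>" to the NUL-terminated string in buf.
 * Returns 0 on success, or -1 (leaving buf unchanged) if it does not fit.
 */
int
demo_append_pg_path(char *buf, size_t buflen, const char *file)
{
    size_t used;
    int    n;

    assert(buf != NULL && file != NULL);

    used = strlen(buf);
    if (used >= buflen)
        return -1;

    n = snprintf(buf + used, buflen - used, "%s%s", demo_bindir, file);
    if (n < 0 || (size_t) n >= buflen - used)
    {
        buf[used] = '\0';   /* undo the partial append */
        return -1;
    }

    return 0;
}
```

Because each caller supplies its own buffer, two paths can be built side by side without one overwriting the other. It also explains why the call sites in this diff now call `make_pg_path(&command, ...)` first and append the remaining arguments afterwards.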
@@ -3120,15 +3163,12 @@ copy_remote_files(char *host, char *remote_user, char *remote_path,
appendPQExpBufferStr(&rsync_flags,
" --exclude=recovery.conf --exclude=recovery.done");
if (server_version_num >= 90400)
{
/*
* Ideally we'd use PG_AUTOCONF_FILENAME from utils/guc.h, but
* that has too many dependencies for a mere client program.
*/
appendPQExpBuffer(&rsync_flags, " --exclude=%s.tmp",
PG_AUTOCONF_FILENAME);
}
/*
* Ideally we'd use PG_AUTOCONF_FILENAME from utils/guc.h, but
* that has too many dependencies for a mere client program.
*/
appendPQExpBuffer(&rsync_flags, " --exclude=%s.tmp",
PG_AUTOCONF_FILENAME);
/* Temporary files which we don't want, if they exist */
appendPQExpBuffer(&rsync_flags, " --exclude=%s*",
@@ -3139,16 +3179,16 @@ copy_remote_files(char *host, char *remote_user, char *remote_path,
if (server_version_num >= 100000)
{
appendPQExpBufferStr(&rsync_flags,
" --exclude=pg_wal/*");
" --exclude=pg_wal/* --exclude=log/*");
}
else
{
appendPQExpBufferStr(&rsync_flags,
" --exclude=pg_xlog/*");
" --exclude=pg_xlog/* --exclude=pg_log/*");
}
appendPQExpBufferStr(&rsync_flags,
" --exclude=pg_log/* --exclude=pg_stat_tmp/*");
" --exclude=pg_stat_tmp/*");
maxlen_snprintf(script, "rsync %s %s:%s/* %s",
rsync_flags.data, host_string, remote_path, local_path);
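Review note: after this change the log directory exclusion depends on the server version in the same way as the WAL directory exclusion, since PostgreSQL 10 renamed `pg_xlog` to `pg_wal` and changed the default log directory from `pg_log` to `log`. The sketch below isolates that selection. `demo_rsync_excludes()` is a hypothetical helper and covers only the three exclusions touched by this hunk.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the version-dependent rsync exclusion flags into buf.
 * Returns 0 on success, or -1 if buf is too small.
 */
int
demo_rsync_excludes(char *buf, size_t buflen, int server_version_num)
{
    const char *wal_and_log;
    int         n;

    assert(buf != NULL && buflen > 0);

    /* directory names changed in PostgreSQL 10 */
    if (server_version_num >= 100000)
        wal_and_log = " --exclude=pg_wal/* --exclude=log/*";
    else
        wal_and_log = " --exclude=pg_xlog/* --exclude=pg_log/*";

    /* pg_stat_tmp is excluded regardless of version */
    n = snprintf(buf, buflen, "%s --exclude=pg_stat_tmp/*", wal_and_log);

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```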
@@ -3275,9 +3315,10 @@ get_server_action(t_server_action action, char *script, char *data_dir)
{
initPQExpBuffer(&command);
make_pg_path(&command, "pg_ctl");
appendPQExpBuffer(&command,
"%s %s -w -D ",
make_pg_path("pg_ctl"),
" %s -w -D ",
config_file_options.pg_ctl_options);
appendShellString(&command,
@@ -3305,9 +3346,10 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else
{
initPQExpBuffer(&command);
make_pg_path(&command, "pg_ctl");
appendPQExpBuffer(&command,
"%s %s -D ",
make_pg_path("pg_ctl"),
" %s -D ",
config_file_options.pg_ctl_options);
appendShellString(&command,
@@ -3340,9 +3382,11 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else
{
initPQExpBuffer(&command);
make_pg_path(&command, "pg_ctl");
appendPQExpBuffer(&command,
"%s %s -w -D ",
make_pg_path("pg_ctl"),
" %s -w -D ",
config_file_options.pg_ctl_options);
appendShellString(&command,
@@ -3368,9 +3412,11 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else
{
initPQExpBuffer(&command);
make_pg_path(&command, "pg_ctl");
appendPQExpBuffer(&command,
"%s %s -w -D ",
make_pg_path("pg_ctl"),
" %s -w -D ",
config_file_options.pg_ctl_options);
appendShellString(&command,
@@ -3397,9 +3443,11 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else
{
initPQExpBuffer(&command);
make_pg_path(&command, "pg_ctl");
appendPQExpBuffer(&command,
"%s %s -w -D ",
make_pg_path("pg_ctl"),
" %s -w -D ",
config_file_options.pg_ctl_options);
appendShellString(&command,
@@ -3573,27 +3621,6 @@ can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *rea
{
bool can_use = true;
/* wal_log_hints not available in 9.3, so just determine if data checksums enabled */
if (PQserverVersion(conn) < 90400)
{
int data_checksum_version = get_data_checksum_version(data_directory);
if (data_checksum_version < 0)
{
appendPQExpBuffer(reason,
_("unable to determine data checksum version"));
can_use = false;
}
else if (data_checksum_version == 0)
{
appendPQExpBuffer(reason,
_("this cluster was initialised without data checksums"));
can_use = false;
}
return can_use;
}
/* "full_page_writes" must be on in any case */
if (guc_set(conn, "full_page_writes", "=", "off"))
{
@@ -3613,7 +3640,7 @@ can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *rea
{
int data_checksum_version = get_data_checksum_version(data_directory);
if (data_checksum_version < 0)
if (data_checksum_version == UNKNOWN_DATA_CHECKSUM_VERSION)
{
if (can_use == false)
appendPQExpBuffer(reason, "; ");
@@ -3638,8 +3665,66 @@ can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *rea
}
// provided connection should be for the normal repmgr user
// upstream_node_record may be NULL or initialised to default values
void
make_standby_signal_path(const char *data_dir, char *buf)
{
snprintf(buf, MAXPGPATH,
"%s/%s",
data_dir,
STANDBY_SIGNAL_FILE);
}
/*
* create standby.signal (PostgreSQL 12 and later)
*/
bool
write_standby_signal(const char *data_dir)
{
char standby_signal_file_path[MAXPGPATH] = "";
FILE *file;
mode_t um;
Assert(data_dir != NULL);
make_standby_signal_path(data_dir, standby_signal_file_path);
/* Set umask to 0077 so the file is created with mode 0600 */
um = umask((~(S_IRUSR | S_IWUSR)) & (S_IRWXG | S_IRWXO));
file = fopen(standby_signal_file_path, "w");
umask(um);
if (file == NULL)
{
log_error(_("unable to create %s file at \"%s\""),
STANDBY_SIGNAL_FILE,
standby_signal_file_path);
log_detail("%s", strerror(errno));
return false;
}
if (fputs("# created by repmgr\n", file) == EOF)
{
log_error(_("unable to write to %s file at \"%s\""),
STANDBY_SIGNAL_FILE,
standby_signal_file_path);
fclose(file);
return false;
}
fclose(file);
return true;
}
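The mask expression used in write_standby_signal() above can be checked in isolation. This is a minimal sketch, assuming the POSIX permission macros from <sys/stat.h>; the helper names owner_only_umask() and effective_mode() are hypothetical and are not part of repmgr.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: the same expression that write_standby_signal()
 * passes to umask(). It keeps only the group and other permission bits,
 * so those bits are cleared on any file created while the mask is active.
 */
static mode_t
owner_only_umask(void)
{
	return (mode_t) ((~(S_IRUSR | S_IWUSR)) & (S_IRWXG | S_IRWXO));
}

/*
 * Hypothetical helper: the mode a newly created file ends up with when
 * "requested" is the creation mode and "mask" is the process umask.
 * fopen() in "w" mode requests 0666.
 */
static mode_t
effective_mode(mode_t requested, mode_t mask)
{
	return (mode_t) (requested & ~mask);
}
```

Calling effective_mode(0666, owner_only_umask()) gives the permissions of the standby.signal file: the mask itself is 0077, and 0600 is the resulting file mode.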
/*
* NOTE:
* - the provided connection should be for the normal repmgr user
* - if upstream_node_record is not NULL, its "repluser" entry, if
* set, will be used as the fallback replication user
*/
bool
create_replication_slot(PGconn *conn, char *slot_name, t_node_info *upstream_node_record, PQExpBufferData *error_msg)
{
@@ -3986,8 +4071,10 @@ check_standby_join(PGconn *upstream_conn, t_node_info *upstream_node_record, t_n
for (; i < config_file_options.node_rejoin_timeout; i++)
{
char *node_state = NULL;
NodeAttached node_attached = is_downstream_node_attached(upstream_conn,
standby_node_record->node_name);
standby_node_record->node_name,
&node_state);
if (node_attached == NODE_ATTACHED)
{
log_verbose(LOG_INFO, _("node \"%s\" (ID: %i) has attached to its upstream node"),
@@ -4004,9 +4091,19 @@ check_standby_join(PGconn *upstream_conn, t_node_info *upstream_node_record, t_n
i + 1,
config_file_options.node_rejoin_timeout);
log_detail(_("checking for record in node \"%s\"'s \"pg_stat_replication\" table where \"application_name\" is \"%s\""),
upstream_node_record->node_name,
standby_node_record->node_name);
if (node_attached == NODE_NOT_ATTACHED)
{
log_detail(_("node \"%s\" (ID: %i) is currently attached to its upstream node in state \"%s\""),
standby_node_record->node_name,
standby_node_record->node_id,
node_state);
}
else
{
log_detail(_("checking for record in node \"%s\"'s \"pg_stat_replication\" table where \"application_name\" is \"%s\""),
upstream_node_record->node_name,
standby_node_record->node_name);
}
}
else
{
@@ -4026,7 +4123,7 @@ check_standby_join(PGconn *upstream_conn, t_node_info *upstream_node_record, t_n
/*
* Here we'll perform some timeline sanity checks to ensure the follow target
* can actually be followed.
* can actually be followed or rejoined.
*
* See also comment for check_node_can_follow() in repmgrd-physical.c .
*/
@@ -4066,10 +4163,31 @@ check_node_can_attach(TimeLineID local_tli, XLogRecPtr local_xlogpos, PGconn *fo
local_system_identifier = get_system_identifier(config_file_options.data_directory);
/*
* Check for thing that should never happen, but expect the unexpected anyway.
* Check for things that should never happen, but expect the unexpected anyway.
*/
if (follow_target_identification.system_identifier != local_system_identifier)
if (local_system_identifier == UNKNOWN_SYSTEM_IDENTIFIER)
{
/*
* We don't return immediately here so subsequent checks can be
* made, but indicate the node will not be able to rejoin.
*/
success = false;
if (runtime_options.dry_run == true)
{
log_warning(_("unable to retrieve system identifier from pg_control"));
}
else
{
log_error(_("unable to retrieve system identifier from pg_control, aborting"));
}
}
else if (follow_target_identification.system_identifier != local_system_identifier)
{
/*
* It's never going to be possible to rejoin a node from another cluster,
* so no need to bother with further checks.
*/
log_error(_("this node is not part of the %s target node's replication cluster"), action);
log_detail(_("this node's system identifier is %lu, %s target node's system identifier is %lu"),
local_system_identifier,
@@ -4078,8 +4196,7 @@ check_node_can_attach(TimeLineID local_tli, XLogRecPtr local_xlogpos, PGconn *fo
PQfinish(follow_target_repl_conn);
return false;
}
if (runtime_options.dry_run == true)
else if (runtime_options.dry_run == true)
{
log_info(_("local and %s target system identifiers match"), action);
log_detail(_("system identifier is %lu"), local_system_identifier);
@@ -4092,20 +4209,64 @@ check_node_can_attach(TimeLineID local_tli, XLogRecPtr local_xlogpos, PGconn *fo
action,
follow_target_identification.timeline);
/* upstream's timeline is lower than ours - impossible case */
/*
* The upstream's timeline is lower than ours - we cannot follow, and rejoin
* requires PostgreSQL 9.6 and later.
*/
if (follow_target_identification.timeline < local_tli)
{
log_error(_("this node's timeline is ahead of the %s target node's timeline"), action);
log_detail(_("this node's timeline is %i, %s target node's timeline is %i"),
local_tli,
action,
follow_target_identification.timeline);
PQfinish(follow_target_repl_conn);
return false;
/*
* "repmgr standby follow" is impossible in this case
*/
if (is_rejoin == false)
{
log_error(_("this node's timeline is ahead of the %s target node's timeline"), action);
log_detail(_("this node's timeline is %i, %s target node's timeline is %i"),
local_tli,
action,
follow_target_identification.timeline);
if (PQserverVersion(follow_target_conn) >= 90600)
{
log_hint(_("use \"repmgr node rejoin --force-rewind\" to reattach this node"));
}
PQfinish(follow_target_repl_conn);
return false;
}
/*
* pg_rewind can only rejoin to a lower timeline from PostgreSQL 9.6
*/
if (PQserverVersion(follow_target_conn) < 90600)
{
log_error(_("this node's timeline is ahead of the %s target node's timeline"), action);
log_detail(_("this node's timeline is %i, %s target node's timeline is %i"),
local_tli,
action,
follow_target_identification.timeline);
if (runtime_options.force_rewind_used == true)
{
log_hint(_("pg_rewind can only be used to rejoin to a node with a lower timeline from PostgreSQL 9.6"));
}
PQfinish(follow_target_repl_conn);
return false;
}
if (runtime_options.force_rewind_used == false)
{
log_notice(_("pg_rewind execution required for this node to attach to rejoin target node %i"),
follow_target_node_record->node_id);
log_hint(_("provide --force-rewind"));
PQfinish(follow_target_repl_conn);
return false;
}
}
/* timelines are the same - check relative positions */
if (follow_target_identification.timeline == local_tli)
else if (follow_target_identification.timeline == local_tli)
{
XLogRecPtr follow_target_xlogpos = get_node_current_lsn(follow_target_conn);
@@ -4126,12 +4287,26 @@ check_node_can_attach(TimeLineID local_tli, XLogRecPtr local_xlogpos, PGconn *fo
}
else
{
log_error(_("this node is ahead of the %s target"), action);
/*
* Unable to follow or join to a node we're ahead of, if we're on the
* same timeline. Also, pg_rewind does not detect this situation,
* as there is no definitive fork point.
*
* Note that Pg will still happily attach to the upstream in state "streaming"
* for a while but then detach with an endless stream of
* "record with incorrect prev-link" errors.
*/
log_error(_("this node is ahead of the %s target on the same timeline (%i)"), action, local_tli);
log_detail(_("local node lsn is %X/%X, %s target lsn is %X/%X"),
format_lsn(local_xlogpos),
action,
format_lsn(follow_target_xlogpos));
if (is_rejoin == true)
{
log_hint(_("the --force-rewind option is ineffective in this case"));
}
success = false;
}
}
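The branch of check_node_can_attach() that handles a target on a lower timeline can be summarised as a small decision function. This is a sketch under stated assumptions: the enum, its values and classify_lower_target_timeline() are hypothetical names introduced for illustration, and only the three early-return conditions visible in the hunk above are modelled.

```c
#include <stdbool.h>

/* Hypothetical outcome codes; repmgr itself logs a message and returns false */
typedef enum
{
	ATTACH_FOLLOW_IMPOSSIBLE,	/* "standby follow" cannot move to a lower timeline */
	ATTACH_REWIND_UNSUPPORTED,	/* server older than 9.6: pg_rewind cannot do this */
	ATTACH_NEEDS_FORCE_REWIND,	/* rejoin is possible, but --force-rewind was not given */
	ATTACH_CONTINUE				/* none of the early returns apply */
} attach_outcome;

/*
 * Hypothetical helper mirroring the order of checks applied when the
 * target node's timeline is lower than the local node's timeline.
 */
static attach_outcome
classify_lower_target_timeline(bool is_rejoin, int server_version_num, bool force_rewind_used)
{
	if (is_rejoin == false)
		return ATTACH_FOLLOW_IMPOSSIBLE;

	if (server_version_num < 90600)
		return ATTACH_REWIND_UNSUPPORTED;

	if (force_rewind_used == false)
		return ATTACH_NEEDS_FORCE_REWIND;

	return ATTACH_CONTINUE;
}
```

The order matters: the follow/rejoin distinction is tested first, so the hint about --force-rewind is only relevant once the operation is known to be a rejoin against a server of version 9.6 or later.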


@@ -1,6 +1,6 @@
/*
* repmgr-client.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -83,27 +83,33 @@
#define OPT_DOWNSTREAM 1030
#define OPT_UPSTREAM 1031
#define OPT_SLOTS 1032
#define OPT_CONFIG_ARCHIVE_DIR 1033
#define OPT_HAS_PASSFILE 1034
#define OPT_WAIT_START 1035
#define OPT_REPL_CONN 1036
#define OPT_REMOTE_NODE_ID 1037
#define OPT_REPLICATION_CONF_ONLY 1038
#define OPT_NO_WAIT 1039
#define OPT_MISSING_SLOTS 1040
#define OPT_REPMGRD_NO_PAUSE 1041
#define OPT_VERSION_NUMBER 1042
#define OPT_DATA_DIRECTORY_CONFIG 1043
#define OPT_COMPACT 1044
#define OPT_DISABLE_WAL_RECEIVER 1045
#define OPT_ENABLE_WAL_RECEIVER 1046
#define OPT_DETAIL 1047
#define OPT_REPMGRD_FORCE_UNPAUSE 1048
#define OPT_REPLICATION_CONFIG_OWNER 1049
#define OPT_HAS_PASSFILE 1033
#define OPT_WAIT_START 1034
#define OPT_REPL_CONN 1035
#define OPT_REMOTE_NODE_ID 1036
#define OPT_REPLICATION_CONF_ONLY 1037
#define OPT_NO_WAIT 1038
#define OPT_MISSING_SLOTS 1039
#define OPT_REPMGRD_NO_PAUSE 1040
#define OPT_VERSION_NUMBER 1041
#define OPT_DATA_DIRECTORY_CONFIG 1042
#define OPT_COMPACT 1043
#define OPT_DETAIL 1044
#define OPT_REPMGRD_FORCE_UNPAUSE 1045
#define OPT_REPLICATION_CONFIG_OWNER 1046
#define OPT_DB_CONNECTION 1047
#define OPT_VERIFY_BACKUP 1048
#define OPT_RECOVERY_MIN_APPLY_DELAY 1049
#define OPT_REPMGRD 1050
/* These options are for internal use only */
#define OPT_CONFIG_ARCHIVE_DIR 2001
#define OPT_DISABLE_WAL_RECEIVER 2002
#define OPT_ENABLE_WAL_RECEIVER 2003
#define OPT_DUMP_CONFIG 2004
/* deprecated since 4.0 */
#define OPT_CHECK_UPSTREAM_CONFIG 999
#define OPT_NODE 998
static struct option long_options[] =
@@ -122,6 +128,7 @@ static struct option long_options[] =
{"no-wait", no_argument, NULL, 'W'},
{"compact", no_argument, NULL, OPT_COMPACT},
{"detail", no_argument, NULL, OPT_DETAIL},
{"dump-config", no_argument, NULL, OPT_DUMP_CONFIG},
/* connection options */
{"dbname", required_argument, NULL, 'd'},
@@ -158,6 +165,8 @@ static struct option long_options[] =
{"upstream-node-id", required_argument, NULL, OPT_UPSTREAM_NODE_ID},
{"without-barman", no_argument, NULL, OPT_WITHOUT_BARMAN},
{"replication-conf-only", no_argument, NULL, OPT_REPLICATION_CONF_ONLY},
{"verify-backup", no_argument, NULL, OPT_VERIFY_BACKUP },
{"recovery-min-apply-delay", required_argument, NULL, OPT_RECOVERY_MIN_APPLY_DELAY },
/* deprecate this once Pg11 and earlier are unsupported */
{"recovery-conf-only", no_argument, NULL, OPT_REPLICATION_CONF_ONLY},
@@ -185,10 +194,12 @@ static struct option long_options[] =
{"role", no_argument, NULL, OPT_ROLE},
{"slots", no_argument, NULL, OPT_SLOTS},
{"missing-slots", no_argument, NULL, OPT_MISSING_SLOTS},
{"repmgrd", no_argument, NULL, OPT_REPMGRD},
{"has-passfile", no_argument, NULL, OPT_HAS_PASSFILE},
{"replication-connection", no_argument, NULL, OPT_REPL_CONN},
{"data-directory-config", no_argument, NULL, OPT_DATA_DIRECTORY_CONFIG},
{"replication-config-owner", no_argument, NULL, OPT_REPLICATION_CONFIG_OWNER},
{"db-connection", no_argument, NULL, OPT_DB_CONNECTION},
/* "node rejoin" options */
{"config-files", required_argument, NULL, OPT_CONFIG_FILES},
@@ -216,9 +227,6 @@ static struct option long_options[] =
{"check-upstream-config", no_argument, NULL, OPT_CHECK_UPSTREAM_CONFIG},
/* previously used by "standby switchover" */
{"remote-config-file", required_argument, NULL, 'C'},
/* replaced by --node-id */
{"node", required_argument, NULL, OPT_NODE},
{NULL, 0, NULL, 0}
};


@@ -1,7 +1,7 @@
/*
* repmgr.c - repmgr extension
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This is the actual extension code; see repmgr-client.c for the code which
* generates the repmgr binary
@@ -33,22 +33,14 @@
#include "storage/shmem.h"
#include "storage/spin.h"
#include "utils/builtins.h"
#if (PG_VERSION_NUM >= 90400)
#include "utils/pg_lsn.h"
#endif
#include "utils/timestamp.h"
#include "lib/stringinfo.h"
#include "access/xact.h"
#include "utils/snapmgr.h"
#if (PG_VERSION_NUM >= 90400)
#include "pgstat.h"
#else
#define PGSTAT_STAT_PERMANENT_DIRECTORY "pg_stat"
#endif
#include "voting.h"
@@ -96,59 +88,24 @@ void _PG_fini(void);
static void repmgr_shmem_startup(void);
Datum set_local_node_id(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(set_local_node_id);
Datum get_local_node_id(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(get_local_node_id);
Datum standby_set_last_updated(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(standby_set_last_updated);
Datum standby_get_last_updated(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(standby_get_last_updated);
Datum set_upstream_last_seen(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(set_upstream_last_seen);
Datum get_upstream_last_seen(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(get_upstream_last_seen);
Datum get_upstream_node_id(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(get_upstream_node_id);
Datum set_upstream_node_id(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(set_upstream_node_id);
Datum notify_follow_primary(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(notify_follow_primary);
Datum get_new_primary(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(get_new_primary);
Datum reset_voting_status(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(reset_voting_status);
Datum set_repmgrd_pid(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(repmgr_set_local_node_id);
PG_FUNCTION_INFO_V1(repmgr_get_local_node_id);
PG_FUNCTION_INFO_V1(repmgr_standby_set_last_updated);
PG_FUNCTION_INFO_V1(repmgr_standby_get_last_updated);
PG_FUNCTION_INFO_V1(repmgr_set_upstream_last_seen);
PG_FUNCTION_INFO_V1(repmgr_get_upstream_last_seen);
PG_FUNCTION_INFO_V1(repmgr_get_upstream_node_id);
PG_FUNCTION_INFO_V1(repmgr_set_upstream_node_id);
PG_FUNCTION_INFO_V1(repmgr_notify_follow_primary);
PG_FUNCTION_INFO_V1(repmgr_get_new_primary);
PG_FUNCTION_INFO_V1(repmgr_reset_voting_status);
PG_FUNCTION_INFO_V1(set_repmgrd_pid);
Datum get_repmgrd_pid(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(get_repmgrd_pid);
Datum get_repmgrd_pidfile(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(get_repmgrd_pidfile);
Datum repmgrd_is_running(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(repmgrd_is_running);
Datum repmgrd_pause(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(repmgrd_pause);
Datum repmgrd_is_paused(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(repmgrd_is_paused);
Datum get_wal_receiver_pid(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(get_wal_receiver_pid);
PG_FUNCTION_INFO_V1(repmgr_get_wal_receiver_pid);
/*
@@ -157,8 +114,6 @@ PG_FUNCTION_INFO_V1(get_wal_receiver_pid);
void
_PG_init(void)
{
elog(DEBUG1, "repmgr init");
if (!process_shared_preload_libraries_in_progress)
return;
@@ -175,7 +130,6 @@ _PG_init(void)
*/
prev_shmem_startup_hook = shmem_startup_hook;
shmem_startup_hook = repmgr_shmem_startup;
}
@@ -205,7 +159,7 @@ repmgr_shmem_startup(void)
shared_state = NULL;
/*
* Create or attach to the shared memory state, including hash table
* Create or attach to the shared memory state
*/
LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
@@ -244,7 +198,7 @@ repmgr_shmem_startup(void)
/* ==================== */
Datum
set_local_node_id(PG_FUNCTION_ARGS)
repmgr_set_local_node_id(PG_FUNCTION_ARGS)
{
int local_node_id = UNKNOWN_NODE_ID;
int stored_node_id = UNKNOWN_NODE_ID;
@@ -314,7 +268,7 @@ set_local_node_id(PG_FUNCTION_ARGS)
Datum
get_local_node_id(PG_FUNCTION_ARGS)
repmgr_get_local_node_id(PG_FUNCTION_ARGS)
{
int local_node_id = UNKNOWN_NODE_ID;
@@ -331,7 +285,7 @@ get_local_node_id(PG_FUNCTION_ARGS)
/* update and return last updated with current timestamp */
Datum
standby_set_last_updated(PG_FUNCTION_ARGS)
repmgr_standby_set_last_updated(PG_FUNCTION_ARGS)
{
TimestampTz last_updated = GetCurrentTimestamp();
@@ -348,7 +302,7 @@ standby_set_last_updated(PG_FUNCTION_ARGS)
/* get last updated timestamp */
Datum
standby_get_last_updated(PG_FUNCTION_ARGS)
repmgr_standby_get_last_updated(PG_FUNCTION_ARGS)
{
TimestampTz last_updated;
@@ -365,7 +319,7 @@ standby_get_last_updated(PG_FUNCTION_ARGS)
Datum
set_upstream_last_seen(PG_FUNCTION_ARGS)
repmgr_set_upstream_last_seen(PG_FUNCTION_ARGS)
{
int upstream_node_id = UNKNOWN_NODE_ID;
@@ -388,7 +342,7 @@ set_upstream_last_seen(PG_FUNCTION_ARGS)
Datum
get_upstream_last_seen(PG_FUNCTION_ARGS)
repmgr_get_upstream_last_seen(PG_FUNCTION_ARGS)
{
long secs;
int microsecs;
@@ -422,7 +376,7 @@ get_upstream_last_seen(PG_FUNCTION_ARGS)
Datum
get_upstream_node_id(PG_FUNCTION_ARGS)
repmgr_get_upstream_node_id(PG_FUNCTION_ARGS)
{
int upstream_node_id = UNKNOWN_NODE_ID;
@@ -437,7 +391,7 @@ get_upstream_node_id(PG_FUNCTION_ARGS)
}
Datum
set_upstream_node_id(PG_FUNCTION_ARGS)
repmgr_set_upstream_node_id(PG_FUNCTION_ARGS)
{
int upstream_node_id = UNKNOWN_NODE_ID;
int local_node_id = UNKNOWN_NODE_ID;
@@ -473,7 +427,7 @@ set_upstream_node_id(PG_FUNCTION_ARGS)
Datum
notify_follow_primary(PG_FUNCTION_ARGS)
repmgr_notify_follow_primary(PG_FUNCTION_ARGS)
{
int primary_node_id = UNKNOWN_NODE_ID;
@@ -516,7 +470,7 @@ notify_follow_primary(PG_FUNCTION_ARGS)
Datum
get_new_primary(PG_FUNCTION_ARGS)
repmgr_get_new_primary(PG_FUNCTION_ARGS)
{
int new_primary_node_id = UNKNOWN_NODE_ID;
@@ -538,7 +492,7 @@ get_new_primary(PG_FUNCTION_ARGS)
Datum
reset_voting_status(PG_FUNCTION_ARGS)
repmgr_reset_voting_status(PG_FUNCTION_ARGS)
{
if (!shared_state)
PG_RETURN_NULL();
@@ -746,7 +700,7 @@ repmgrd_is_paused(PG_FUNCTION_ARGS)
Datum
get_wal_receiver_pid(PG_FUNCTION_ARGS)
repmgr_get_wal_receiver_pid(PG_FUNCTION_ARGS)
{
int wal_receiver_pid;


@@ -69,7 +69,7 @@
#------------------------------------------------------------------------------
#replication_user='repmgr' # User to make replication connections with, if not set
# defaults to the user defined in "conninfo".
# defaults to the user defined in "conninfo".
#replication_type='physical' # Must be "physical" (the default).
@@ -181,6 +181,8 @@
#pg_ctl_options='' # Options to append to "pg_ctl"
#pg_basebackup_options='' # Options to append to "pg_basebackup"
# (Note: when cloning from Barman, repmgr will honour any
# --waldir/--xlogdir setting present in "pg_basebackup_options")
#rsync_options='' # Options to append to "rsync"
ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
@@ -209,7 +211,9 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# managing WAL archives (see: https://www.pgbarman.org )
#recovery_min_apply_delay= # If provided, "recovery_min_apply_delay" will be set to
# this value (PostgreSQL 9.4 and later).
# this value (PostgreSQL 9.4 and later). Value can be
# an integer representing milliseconds, or a string
# representing a period of time (e.g. '5 min').
#------------------------------------------------------------------------------
@@ -234,9 +238,10 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
#primary_follow_timeout=60 # The max length of time (in seconds) to wait
# for the new primary to become available
#standby_follow_timeout=15 # The max length of time (in seconds) to wait
#standby_follow_timeout=30 # The max length of time (in seconds) to wait
# for the standby to connect to the primary
#standby_follow_restart=false # Restart the standby instead of sending a SIGHUP
# (only for PostgreSQL 13 and later)
#------------------------------------------------------------------------------
# "standby switchover" settings
@@ -296,7 +301,8 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
#connection_check_type=ping # How to check availability of the upstream node; valid options:
# 'ping': use PQping() to check if the node is accepting connections
# 'connection': execute a throwaway query on the current connection
# 'connection': attempt to make a new connection to the node
# 'query': execute an SQL statement on the node via the existing connection
#reconnect_attempts=6 # Number of attempts which will be made to reconnect to an unreachable
# primary (or other upstream node)
#reconnect_interval=10 # Interval between attempts to reconnect to an unreachable
@@ -308,7 +314,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
#follow_command='' # command repmgrd executes when instructing a standby to follow a new primary;
# use something like:
#
# repmgr standby follow -f /etc/repmgr.conf -W --upstream-node-id=%n
# repmgr standby follow -f /etc/repmgr.conf --upstream-node-id=%n
#
#primary_notification_timeout=60 # Interval (in seconds) which repmgrd on a standby
# will wait for a notification from the new primary,
@@ -317,7 +323,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# for the local node to restart and become ready to accept connections after
# executing "follow_command" (defaults to the value set in "standby_reconnect_timeout")
#monitoring_history=no # Whether to write monitoring data to the "montoring_history" table
#monitoring_history=no # Whether to write monitoring data to the "monitoring_history" table
#monitor_interval_secs=2 # Interval (in seconds) at which to write monitoring data
#degraded_monitoring_timeout=-1 # Interval (in seconds) after which repmgrd will terminate if the
# server(s) being monitored are no longer available. -1 (default)
@@ -331,6 +337,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# "--no-pid-file" will force PID file creation to be skipped.
# Note: there is normally no need to set this, particularly if
# repmgr was installed from packages.
#repmgrd_exit_on_inactive_node=false # If "true", and the node record is marked as "inactive", abort repmgrd startup
#standby_disconnect_on_failover=false # If "true", in a failover situation wait for all standbys to
# disconnect their WAL receivers before electing a new primary
# (PostgreSQL 9.5 and later only; repmgr user must be a superuser for this)
@@ -339,6 +346,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# WAL receivers
#primary_visibility_consensus=false # If "true", only continue with failover if no standbys have seen
# the primary node recently. *Must* be the same on all nodes.
#always_promote=false # Always promote a node, even if repmgr metadata is outdated
#failover_validation_command='' # Script to execute for an external mechanism to validate the failover
# decision made by repmgrd. One or both of the following parameter placeholders
# should be provided, which will be replaced by repmgrd with the appropriate


@@ -1,8 +1,7 @@
# repmgr extension
comment = 'Replication manager for PostgreSQL'
default_version = '5.1'
default_version = '5.3'
module_pathname = '$libdir/repmgr'
relocatable = false
schema = repmgr


@@ -1,6 +1,6 @@
/*
* repmgr.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -74,16 +74,15 @@
#include "log.h"
#include "sysutils.h"
#define MIN_SUPPORTED_VERSION "9.3"
#define MIN_SUPPORTED_VERSION_NUM 90300
#define REPLICATION_TYPE_PHYSICAL 1
#define MIN_SUPPORTED_VERSION "9.4"
#define MIN_SUPPORTED_VERSION_NUM 90400
#define UNKNOWN_SERVER_VERSION_NUM -1
#define UNKNOWN_REPMGR_VERSION_NUM -1
#define UNKNOWN_TIMELINE_ID -1
#define UNKNOWN_SYSTEM_IDENTIFIER 0
#define UNKNOWN_DATA_CHECKSUM_VERSION -1
#define UNKNOWN_PID -1
#define UNKNOWN_REPLICATION_LAG -1
#define UNKNOWN_VALUE -1
@@ -97,40 +96,59 @@
#define ARCHIVE_STATUS_DIR_ERROR -1
#define NO_DEGRADED_MONITORING_ELAPSED -1
#define WALRECEIVER_DISABLE_TIMEOUT_VALUE 86400000 /* milliseconds */
/*
* various default values - ensure repmgr.conf.sample is update
* if any of these are changed
* Default command line option parameter values
*/
#define DEFAULT_LOCATION "default"
#define DEFAULT_PRIORITY 100
#define DEFAULT_RECONNECTION_ATTEMPTS 6 /* seconds */
#define DEFAULT_RECONNECTION_INTERVAL 10 /* seconds */
#define DEFAULT_MONITORING_INTERVAL 2 /* seconds */
#define DEFAULT_ASYNC_QUERY_TIMEOUT 60 /* seconds */
#define DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT 60 /* seconds */
#define DEFAULT_PRIMARY_FOLLOW_TIMEOUT 60 /* seconds */
#define DEFAULT_STANDBY_FOLLOW_TIMEOUT 30 /* seconds */
#define DEFAULT_ARCHIVE_READY_WARNING 16 /* WAL files */
#define DEFAULT_ARCHIVE_READY_CRITICAL 128 /* WAL files */
#define DEFAULT_REPLICATION_LAG_WARNING 300 /* seconds */
#define DEFAULT_REPLICATION_LAG_CRITICAL 600 /* seconds */
#define DEFAULT_WITNESS_SYNC_INTERVAL 15 /* seconds */
#define DEFAULT_WAIT_START 30 /* seconds */
/*
* Default configuration file parameter values - ensure repmgr.conf.sample
* is updated if any of these are changed
*/
#define DEFAULT_USE_REPLICATION_SLOTS false
#define DEFAULT_USE_PRIMARY_CONNINFO_PASSWORD false
#define DEFAULT_PROMOTE_CHECK_TIMEOUT 60 /* seconds */
#define DEFAULT_PROMOTE_CHECK_INTERVAL 1 /* seconds */
#define DEFAULT_PRIMARY_FOLLOW_TIMEOUT 60 /* seconds */
#define DEFAULT_STANDBY_FOLLOW_TIMEOUT 30 /* seconds */
#define DEFAULT_STANDBY_FOLLOW_RESTART false
#define DEFAULT_SHUTDOWN_CHECK_TIMEOUT 60 /* seconds */
#define DEFAULT_STANDBY_RECONNECT_TIMEOUT 60 /* seconds */
#define DEFAULT_NODE_REJOIN_TIMEOUT 60 /* seconds */
#define DEFAULT_ARCHIVE_READY_WARNING 16 /* WAL files */
#define DEFAULT_ARCHIVE_READY_CRITICAL 128 /* WAL files */
#define DEFAULT_REPLICATION_TYPE REPLICATION_TYPE_PHYSICAL
#define DEFAULT_REPLICATION_LAG_WARNING 300 /* seconds */
#define DEFAULT_REPLICATION_LAG_CRITICAL 600 /* seconds */
#define DEFAULT_WITNESS_SYNC_INTERVAL 15 /* seconds */
#define DEFAULT_WAL_RECEIVE_CHECK_TIMEOUT 30 /* seconds */
#define DEFAULT_LOCATION "default"
#define DEFAULT_PRIORITY 100
#define DEFAULT_MONITORING_INTERVAL 2 /* seconds */
#define DEFAULT_RECONNECTION_ATTEMPTS 6 /* attempts */
#define DEFAULT_RECONNECTION_INTERVAL 10 /* seconds */
#define DEFAULT_MONITORING_HISTORY false
#define DEFAULT_DEGRADED_MONITORING_TIMEOUT -1 /* seconds */
#define DEFAULT_ASYNC_QUERY_TIMEOUT 60 /* seconds */
#define DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT 60 /* seconds */
#define DEFAULT_REPMGRD_STANDBY_STARTUP_TIMEOUT -1 /* seconds */
#define DEFAULT_REPMGRD_EXIT_ON_INACTIVE_NODE false
#define DEFAULT_STANDBY_DISCONNECT_ON_FAILOVER false
#define DEFAULT_SIBLING_NODES_DISCONNECT_TIMEOUT 30 /* seconds */
#define DEFAULT_CONNECTION_CHECK_TYPE CHECK_PING
#define DEFAULT_PRIMARY_VISIBILITY_CONSENSUS false
#define DEFAULT_ALWAYS_PROMOTE false
#define DEFAULT_ELECTION_RERUN_INTERVAL 15 /* seconds */
#define DEFAULT_CHILD_NODES_CHECK_INTERVAL 5 /* seconds */
#define DEFAULT_CHILD_NODES_DISCONNECT_MIN_COUNT -1
#define DEFAULT_CHILD_NODES_CONNECTED_MIN_COUNT -1
#define DEFAULT_CHILD_NODES_DISCONNECT_TIMEOUT 30 /* seconds */
#define DEFAULT_CHILD_NODES_CONNECTED_INCLUDE_WITNESS false
#define DEFAULT_CHILD_NODES_DISCONNECT_TIMEOUT 30 /* seconds */
#define DEFAULT_SSH_OPTIONS "-q -o ConnectTimeout=10"
#define WALRECEIVER_DISABLE_TIMEOUT_VALUE 86400000 /* milliseconds */
#ifndef RECOVERY_COMMAND_FILE
#define RECOVERY_COMMAND_FILE "recovery.conf"
@@ -145,6 +163,6 @@
#define TABLESPACE_MAP "tablespace_map"
#endif
#define REPMGR_URL "https://repmgr.org/"
#endif /* _REPMGR_H_ */


@@ -1,5 +1,7 @@
#define REPMGR_VERSION_DATE ""
#define REPMGR_VERSION "5.1.0"
#define REPMGR_VERSION_NUM 50100
#define REPMGR_RELEASE_DATE "2020-04-13"
#define REPMGR_VERSION "5.3.1"
#define REPMGR_VERSION_NUM 50301
#define REPMGR_EXTENSION_VERSION "5.3"
#define REPMGR_EXTENSION_NUM 50300
#define REPMGR_RELEASE_DATE "2022-02-15"
#define PG_ACTUAL_VERSION_NUM

File diff suppressed because it is too large


@@ -1,6 +1,6 @@
/*
* repmgrd-physical.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -19,7 +19,7 @@
#ifndef _REPMGRD_PHYSICAL_H_
#define _REPMGRD_PHYSICAL_H_
void do_physical_node_check(void);
void do_physical_node_check(PGconn *conn);
void monitor_streaming_primary(void);
void monitor_streaming_standby(void);

repmgrd.c

@@ -1,7 +1,7 @@
/*
* repmgrd.c - Replication manager daemon
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -39,13 +39,9 @@ static bool daemonize = true;
static bool show_pid_file = false;
static bool no_pid_file = false;
t_configuration_options config_file_options = T_CONFIGURATION_OPTIONS_INITIALIZER;
t_node_info local_node_info = T_NODE_INFO_INITIALIZER;
PGconn *local_conn = NULL;
/* Collate command line errors here for friendlier reporting */
static ItemList cli_errors = {NULL, NULL};
@@ -135,7 +131,7 @@ main(int argc, char **argv)
memset(pid_file, 0, MAXPGPATH);
while ((c = getopt_long(argc, argv, "?Vf:L:vdp:m", long_options, &optindex)) != -1)
while ((c = getopt_long(argc, argv, "?Vf:L:vdp:sm", long_options, &optindex)) != -1)
{
switch (c)
{
@@ -253,10 +249,10 @@ main(int argc, char **argv)
/*
* Parse the configuration file, if provided (if no configuration file was
* provided, an attempt will be made to find one in one of the default
* locations). If no conifguration file is available, or it can't be parsed
* locations). If no configuration file is available, or it can't be parsed
* parse_config() will abort anyway, with an appropriate message.
*/
load_config(config_file, verbose, false, &config_file_options, argv[0]);
load_config(config_file, verbose, false, argv[0]);
/* Determine pid file location, unless --no-pid-file supplied */
@@ -311,7 +307,7 @@ main(int argc, char **argv)
}
/* Some configuration file items can be overriden by command line options */
/* Some configuration file items can be overridden by command line options */
/*
* Command-line parameter -L/--log-level overrides any setting in config
@@ -400,13 +396,14 @@ main(int argc, char **argv)
* extension is the latest available according to "pg_available_extensions" -
* - does our (major) version match that?
*/
log_verbose(LOG_DEBUG, "binary version: %i; extension version: %i",
REPMGR_VERSION_NUM, extversions.installed_version_num);
if ((REPMGR_VERSION_NUM/100) < (extversions.installed_version_num / 100))
log_verbose(LOG_DEBUG, "expected extension version: %i; extension version: %i",
REPMGR_EXTENSION_NUM, extversions.installed_version_num);
if ((REPMGR_EXTENSION_NUM/100) < (extversions.installed_version_num / 100))
{
log_error(_("this \"repmgr\" version is older than the installed \"repmgr\" extension version"));
log_detail(_("\"repmgr\" version %s is installed but extension is version %s"),
log_detail(_("\"repmgr\" version %s providing extension version %s is installed but extension is version %s"),
REPMGR_VERSION,
REPMGR_EXTENSION_VERSION,
extversions.installed_version);
log_hint(_("update the repmgr binaries to match the installed extension version"));
@@ -414,13 +411,14 @@ main(int argc, char **argv)
exit(ERR_BAD_CONFIG);
}
if ((REPMGR_VERSION_NUM/100) > (extversions.installed_version_num / 100))
if ((REPMGR_EXTENSION_NUM/100) > (extversions.installed_version_num / 100))
{
log_error(_("this \"repmgr\" version is newer than the installed \"repmgr\" extension version"));
log_detail(_("\"repmgr\" version %s is installed but extension is version %s"),
log_detail(_("\"repmgr\" version %s providing extension version %s is installed but extension is version %s"),
REPMGR_VERSION,
REPMGR_EXTENSION_VERSION,
extversions.installed_version);
log_hint(_("update the installed extension version by executing \"ALTER EXTENSION repmgr UPDATE\""));
log_hint(_("update the installed extension version by executing \"ALTER EXTENSION repmgr UPDATE\" in the repmgr database"));
close_connection(&local_conn);
exit(ERR_BAD_CONFIG);
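The two version checks above compare version numbers after integer division by 100, which drops the patch level. This is a minimal sketch of that comparison, assuming the five-digit numbering seen in the version header (for example 50301 for 5.3.1 and 50300 for extension 5.3); compare_extension_major() is a hypothetical helper, not a repmgr function.

```c
/*
 * Hypothetical helper: compare the extension version the binary expects
 * with the installed extension version, ignoring the last two digits.
 *
 * Returns -1 if the binary is older than the installed extension,
 *          1 if the binary is newer than the installed extension,
 *          0 if they match at major/minor level.
 */
static int
compare_extension_major(int expected_extension_num, int installed_extension_num)
{
	int expected = expected_extension_num / 100;	/* 50300 -> 503 */
	int installed = installed_extension_num / 100;	/* 50100 -> 501 */

	if (expected < installed)
		return -1;
	if (expected > installed)
		return 1;
	return 0;
}
```

In terms of the code above, a result of -1 corresponds to the "update the repmgr binaries" hint and a result of 1 to the "ALTER EXTENSION repmgr UPDATE" hint; values that differ only in the last two digits compare as equal.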
@@ -514,7 +512,7 @@ main(int argc, char **argv)
log_debug("node id is %i, upstream node id is %i",
local_node_info.node_id,
local_node_info.upstream_node_id);
do_physical_node_check();
do_physical_node_check(local_conn);
}
if (daemonize == true)
@@ -812,42 +810,92 @@ show_help(void)
bool
check_upstream_connection(PGconn **conn, const char *conninfo)
check_upstream_connection(PGconn **conn, const char *conninfo, PGconn **paired_conn)
{
/* Check the connection status twice in case it changes after reset */
bool twice = false;
if (config_file_options.connection_check_type == CHECK_PING)
return is_server_available(conninfo);
if (config_file_options.connection_check_type == CHECK_CONNECTION)
log_debug("connection check type is \"%s\"",
print_connection_check_type(config_file_options.connection_check_type));
/*
* For the check types which do not involve using the existing database
* connection, we'll perform the actual check, then as an additional
* safeguard verify that the connection is still valid (as it might have
* gone away during a brief outage between checks).
*/
if (config_file_options.connection_check_type != CHECK_QUERY)
{
bool success = true;
PGconn *test_conn = PQconnectdb(conninfo);
log_debug("check_upstream_connection(): attempting to connect to \"%s\"", conninfo);
if (PQstatus(test_conn) != CONNECTION_OK)
if (config_file_options.connection_check_type == CHECK_PING)
{
log_warning(_("unable to connect to \"%s\""), conninfo);
log_detail("\n%s", PQerrorMessage(test_conn));
success = false;
success = is_server_available(conninfo);
}
PQfinish(test_conn);
else if (config_file_options.connection_check_type == CHECK_CONNECTION)
{
/*
* This connection is thrown away, and we never execute a query on it.
*/
PGconn *test_conn = PQconnectdb(conninfo);
return success;
log_debug("check_upstream_connection(): attempting to connect to \"%s\"", conninfo);
if (PQstatus(test_conn) != CONNECTION_OK)
{
log_warning(_("unable to connect to \"%s\""), conninfo);
log_detail("\n%s", PQerrorMessage(test_conn));
success = false;
}
PQfinish(test_conn);
}
if (success == false)
return false;
if (PQstatus(*conn) == CONNECTION_OK)
return true;
/*
* Checks have succeeded, but the open connection to the primary has gone away,
* possibly due to a brief outage between monitoring intervals - attempt to
* reset it.
*/
log_notice(_("upstream is available but upstream connection has gone away, resetting"));
PQfinish(*conn);
*conn = establish_db_connection_quiet(conninfo);
if (PQstatus(*conn) == CONNECTION_OK)
{
if (paired_conn != NULL)
{
log_debug("resetting paired connection");
*paired_conn = *conn;
}
return true;
}
return false;
}
for (;;)
{
if (PQstatus(*conn) != CONNECTION_OK)
{
log_debug("check_upstream_connection(): connection not OK");
log_debug("check_upstream_connection(): upstream connection has gone away, resetting");
if (twice)
return false;
/* reconnect */
PQfinish(*conn);
*conn = PQconnectdb(conninfo);
*conn = establish_db_connection_quiet(conninfo);
if (paired_conn != NULL)
{
log_debug("resetting paired connection");
*paired_conn = *conn;
}
twice = true;
}
else
@@ -859,7 +907,7 @@ check_upstream_connection(PGconn **conn, const char *conninfo)
goto failed;
/* execute a simple query to verify connection availability */
if (PQsendQuery(*conn, "SELECT 1") == 0)
if (PQsendQuery(*conn, config_file_options.connection_check_query) == 0)
{
log_warning(_("unable to send query to upstream"));
log_detail("%s", PQerrorMessage(*conn));
@@ -877,8 +925,16 @@ check_upstream_connection(PGconn **conn, const char *conninfo)
return false;
/* reconnect */
log_debug("check_upstream_connection(): upstream connection not available, resetting");
PQfinish(*conn);
*conn = PQconnectdb(conninfo);
*conn = establish_db_connection_quiet(conninfo);
if (paired_conn != NULL)
{
log_debug("resetting paired connection");
*paired_conn = *conn;
}
twice = true;
}
}
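
Note on the new `paired_conn` argument: when the upstream connection is reset, any second pointer that aliased the old handle would otherwise be left pointing at a finished connection. The sketch below illustrates only that pointer handling, modelled on the non-`CHECK_QUERY` branch above. It deliberately avoids libpq: `FakeConn`, `fake_reconnect` and `verify_or_reset` are hypothetical stand-ins, not repmgr or libpq names.

```c
#include <stdbool.h>
#include <stddef.h>

/* stand-in for PGconn; "ok" plays the role of PQstatus() == CONNECTION_OK */
typedef struct FakeConn
{
	bool		ok;
} FakeConn;

static FakeConn fresh_conn = {true};

/* stand-in for establish_db_connection_quiet() */
static FakeConn *
fake_reconnect(void)
{
	return &fresh_conn;
}

/*
 * If the existing connection is still usable, keep it; otherwise replace it
 * and, when a paired pointer was supplied, repoint that as well.
 */
static bool
verify_or_reset(FakeConn **conn, FakeConn **paired_conn)
{
	if ((*conn)->ok)
		return true;

	/* the real code calls PQfinish() on the old handle at this point */
	*conn = fake_reconnect();

	if (!(*conn)->ok)
		return false;

	/* keep the alias pointing at the new handle rather than the dead one */
	if (paired_conn != NULL)
		*paired_conn = *conn;

	return true;
}
```

Callers with no aliasing pointer can pass `NULL` for the second argument, as the `paired_conn != NULL` guards in the diff suggest.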

@@ -1,6 +1,6 @@
/*
* repmgrd.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*/
@@ -17,13 +17,12 @@ extern volatile sig_atomic_t got_SIGHUP;
extern MonitoringState monitoring_state;
extern instr_time degraded_monitoring_start;
extern t_configuration_options config_file_options;
extern t_node_info local_node_info;
extern PGconn *local_conn;
extern bool startup_event_logged;
extern char pid_file[MAXPGPATH];
bool check_upstream_connection(PGconn **conn, const char *conninfo);
bool check_upstream_connection(PGconn **conn, const char *conninfo, PGconn **paired_conn);
void try_reconnect(PGconn **conn, t_node_info *node_info);
int calculate_elapsed(instr_time start_time);

@@ -1,7 +1,7 @@
/*
* strutil.c
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -369,7 +369,6 @@ check_status_list_free(CheckStatusList *list)
}
const char *
output_check_status(CheckStatus status)
{
@@ -386,12 +385,9 @@ output_check_status(CheckStatus status)
}
return "UNKNOWN";
}
/*
* Escape a string for use as a parameter in recovery.conf
* Caller must free returned value
@@ -433,7 +429,6 @@ escape_string(PGconn *conn, const char *string)
/*
* simple function to escape double quotes only
*/
void
escape_double_quotes(char *string, PQExpBufferData *out)
{
@@ -564,3 +559,10 @@ parse_follow_command(char *parsed_command, char *template, int node_id)
return;
}
const char *
format_bool(bool value)
{
return value == true ? "true" : "false";
}
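
For reference, a short sketch of how a helper like `format_bool()` might be used when rendering settings as text. The `format_bool` body mirrors the hunk above; `describe_flag` is a hypothetical caller invented for this example and does not appear in the source.

```c
#include <stdbool.h>
#include <stdio.h>

/* same logic as the helper added in the hunk above */
static const char *
format_bool(bool value)
{
	return value ? "true" : "false";
}

/* hypothetical caller: render a boolean setting as "name=true" or "name=false" */
static int
describe_flag(char *buf, size_t len, const char *name, bool value)
{
	return snprintf(buf, len, "%s=%s", name, format_bool(value));
}
```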

@@ -1,6 +1,6 @@
/*
* strutil.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -166,4 +166,6 @@ extern char *trim(char *s);
extern void
parse_follow_command(char *parsed_command, char *template, int node_id);
extern const char *format_bool(bool value);
#endif /* _STRUTIL_H_ */

@@ -3,7 +3,7 @@
*
* Functions which need to be executed on the local system.
*
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -57,27 +57,56 @@ _local_command(const char *command, PQExpBufferData *outputbuf, bool simple, int
char output[MAXLEN];
int retval = 0;
bool success;
char tmpfile_path[MAXPGPATH];
const char *tmpdir = getenv("TMPDIR");
int fd;
PQExpBufferData command_final;
log_verbose(LOG_DEBUG, "executing:\n %s", command);
if (!tmpdir)
tmpdir = "/tmp";
maxpath_snprintf(tmpfile_path, "%s/repmgr_command.XXXXXX",
tmpdir);
fd = mkstemp(tmpfile_path);
if (fd < 1)
{
log_error(_("unable to open temporary file"));
return false;
}
initPQExpBuffer(&command_final);
appendPQExpBufferStr(&command_final, command);
appendPQExpBuffer(&command_final, " 2>%s", tmpfile_path);
log_verbose(LOG_DEBUG, "executing:\n %s", command_final.data);
if (outputbuf == NULL)
{
retval = system(command);
retval = system(command_final.data);
termPQExpBuffer(&command_final);
if (return_value != NULL)
*return_value = WEXITSTATUS(retval);
close(fd);
return (retval == 0) ? true : false;
}
fp = popen(command, "r");
fp = popen(command_final.data, "r");
if (fp == NULL)
{
log_error(_("unable to execute local command:\n%s"), command);
log_error(_("unable to execute local command:\n%s"), command_final.data);
termPQExpBuffer(&command_final);
close(fd);
return false;
}
termPQExpBuffer(&command_final);
while (fgets(output, MAXLEN, fp) != NULL)
{
@@ -91,11 +120,32 @@ _local_command(const char *command, PQExpBufferData *outputbuf, bool simple, int
retval = pclose(fp);
/* */
/* 141 = SIGPIPE */
success = (WEXITSTATUS(retval) == 0 || WEXITSTATUS(retval) == 141) ? true : false;
log_verbose(LOG_DEBUG, "result of command was %i (%i)", WEXITSTATUS(retval), retval);
/*
* Append any captured STDERR output
*/
fp = fopen(tmpfile_path, "r");
/*
* Not critical if we can't open the file
*/
if (fp != NULL)
{
while (fgets(output, MAXLEN, fp) != NULL)
{
appendPQExpBufferStr(outputbuf, output);
}
fclose(fp);
}
unlink(tmpfile_path);
if (return_value != NULL)
*return_value = WEXITSTATUS(retval);
@@ -104,6 +154,7 @@ _local_command(const char *command, PQExpBufferData *outputbuf, bool simple, int
else
log_verbose(LOG_DEBUG, "local_command(): no output returned");
return success;
}
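
Note on the STDERR capture added to `_local_command()`: the command is run with ` 2>tmpfile` appended, STDOUT is read from the pipe, and the temporary file is then read back and appended to the same buffer. A self-contained sketch of that technique follows. It is an illustration only: `run_capture_both` is a hypothetical name, plain `char` buffers replace `PQExpBuffer`, and the sketch tests `mkstemp()` against -1, the failure value POSIX documents.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for _local_command(): runs "command" through
 * popen(), then copies STDOUT followed by any captured STDERR into "out".
 */
static bool
run_capture_both(const char *command, char *out, size_t outlen)
{
	char		tmpfile_path[1024];
	char		command_final[4096];
	char		line[1024];
	const char *tmpdir = getenv("TMPDIR");
	FILE	   *fp;
	int			fd;
	int			retval;

	if (outlen == 0)
		return false;
	out[0] = '\0';

	if (tmpdir == NULL)
		tmpdir = "/tmp";

	snprintf(tmpfile_path, sizeof(tmpfile_path),
			 "%s/sketch_command.XXXXXX", tmpdir);

	/* mkstemp() creates the file and returns -1 on failure */
	fd = mkstemp(tmpfile_path);
	if (fd == -1)
		return false;
	close(fd);

	/* redirect the command's STDERR into the temporary file */
	snprintf(command_final, sizeof(command_final),
			 "%s 2>%s", command, tmpfile_path);

	fp = popen(command_final, "r");
	if (fp == NULL)
	{
		unlink(tmpfile_path);
		return false;
	}

	/* STDOUT arrives through the pipe */
	while (fgets(line, sizeof(line), fp) != NULL)
		strncat(out, line, outlen - strlen(out) - 1);

	retval = pclose(fp);

	/* append captured STDERR; not critical if the file can't be opened */
	fp = fopen(tmpfile_path, "r");
	if (fp != NULL)
	{
		while (fgets(line, sizeof(line), fp) != NULL)
			strncat(out, line, outlen - strlen(out) - 1);
		fclose(fp);
	}

	unlink(tmpfile_path);

	return retval == 0;
}
```

The sketch removes the temporary file on every path after it has been created, so repeated calls do not leave files behind in `TMPDIR`.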
@@ -118,28 +169,13 @@ remote_command(const char *host, const char *user, const char *command, const ch
{
FILE *fp;
PQExpBufferData ssh_command;
PQExpBufferData ssh_host;
char output[MAXLEN] = "";
initPQExpBuffer(&ssh_host);
if (*user != '\0')
{
appendPQExpBuffer(&ssh_host, "%s@", user);
}
appendPQExpBufferStr(&ssh_host, host);
initPQExpBuffer(&ssh_command);
appendPQExpBuffer(&ssh_command,
"ssh -o Batchmode=yes %s %s %s",
ssh_options,
ssh_host.data,
command);
termPQExpBuffer(&ssh_host);
make_remote_command(host, user, command, ssh_options, &ssh_command);
log_debug("remote_command():\n %s", ssh_command.data);
@@ -187,6 +223,32 @@ remote_command(const char *host, const char *user, const char *command, const ch
}
void
make_remote_command(const char *host, const char *user, const char *command, const char *ssh_options, PQExpBufferData *ssh_command)
{
PQExpBufferData ssh_host;
initPQExpBuffer(&ssh_host);
if (*user != '\0')
{
appendPQExpBuffer(&ssh_host, "%s@", user);
}
appendPQExpBufferStr(&ssh_host, host);
appendPQExpBuffer(ssh_command,
"ssh -o Batchmode=yes %s %s %s",
ssh_options,
ssh_host.data,
command);
termPQExpBuffer(&ssh_host);
}
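
Note on the refactoring above: `make_remote_command()` isolates the construction of the `ssh` command line so that it can be built without being executed. The following sketch shows the same string layout using `snprintf` rather than `PQExpBuffer`, purely so that it compiles without libpq; `build_ssh_command` is a hypothetical name, and the host, user and option values in any call are illustrative.

```c
#include <stdio.h>

/*
 * Hypothetical snprintf-based equivalent of make_remote_command().
 * Returns the number of characters that would have been written,
 * as snprintf() does.
 */
static int
build_ssh_command(char *buf, size_t len,
				  const char *host, const char *user,
				  const char *command, const char *ssh_options)
{
	/* prefix "user@" only when a user name was supplied */
	const char *at = (*user != '\0') ? "@" : "";

	return snprintf(buf, len, "ssh -o Batchmode=yes %s %s%s%s %s",
					ssh_options, user, at, host, command);
}
```

As in the original, the command and options are interpolated verbatim, so any quoting needed by the remote shell remains the caller's responsibility.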
pid_t
disable_wal_receiver(PGconn *conn)
{
@@ -227,10 +289,19 @@ disable_wal_receiver(PGconn *conn)
if (wal_retrieve_retry_interval < WALRECEIVER_DISABLE_TIMEOUT_VALUE)
{
bool success;
log_notice(_("setting \"wal_retrieve_retry_interval\" to %i milliseconds"),
new_wal_retrieve_retry_interval);
alter_system_int(conn, "wal_retrieve_retry_interval", new_wal_retrieve_retry_interval);
pg_reload_conf(conn);
success = pg_reload_conf(conn);
if (success == false)
{
log_warning(_("unable to reload configuration"));
return UNKNOWN_PID;
}
}
/*
@@ -292,6 +363,12 @@ enable_wal_receiver(PGconn *conn, bool wait_startup)
/* make timeout configurable */
int i, timeout = 30;
if (PQstatus(conn) != CONNECTION_OK)
{
log_error(_("database connection not available"));
return UNKNOWN_PID;
}
if (is_superuser_connection(conn, NULL) == false)
{
log_error(_("superuser connection required"));
@@ -332,7 +409,13 @@ enable_wal_receiver(PGconn *conn, bool wait_startup)
return UNKNOWN_PID;
}
pg_reload_conf(conn);
success = pg_reload_conf(conn);
if (success == false)
{
log_warning(_("unable to reload configuration"));
return UNKNOWN_PID;
}
}
else
{

@@ -1,6 +1,6 @@
/*
* sysutils.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -24,6 +24,7 @@ extern bool local_command_return_value(const char *command, PQExpBufferData *out
extern bool local_command_simple(const char *command, PQExpBufferData *outputbuf);
extern bool remote_command(const char *host, const char *user, const char *command, const char *ssh_options, PQExpBufferData *outputbuf);
extern void make_remote_command(const char *host, const char *user, const char *command, const char *ssh_options, PQExpBufferData *ssh_command);
extern pid_t disable_wal_receiver(PGconn *conn);
extern pid_t enable_wal_receiver(PGconn *conn, bool wait_startup);

@@ -1,6 +1,6 @@
/*
* voting.h
* Copyright (c) 2ndQuadrant, 2010-2020
* Copyright (c) EnterpriseDB Corporation, 2010-2021
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by