1873 Commits

Author SHA1 Message Date
Ian Barwick
e7acb6809b standby clone: improve error logging
When executing "repmgr standby clone" in Barman mode, and --waldir
is set in pg_basebackup options, properly report an error if the
target WAL directory could not be created or is not empty.
2020-10-08 10:46:20 +09:00
Ian Barwick
4b524c52b6 standby clone: honour --waldir setting when cloning from Barman
By setting --waldir in "pg_basebackup_options", standbys cloned using
pg_basebackup would have their WAL directory set to the specified
location and symlinked from the data directory.

This commit causes repmgr to honour that setting even when cloning
from Barman.
2020-10-07 15:13:52 +09:00
Ian Barwick
b3b9281253 Parse pg_basebackup option --waldir/--xlogdir 2020-10-07 15:13:50 +09:00
Ian Barwick
679cfe0852 doc: update release notes
Note PostgreSQL 13 support as a general feature.
2020-10-06 14:17:16 +09:00
Ian Barwick
e10d9fd393 EXPERIMENTAL: synchronise try_primary_reconnect()'s reconnection loop
Per proposal in GitHub #662, this patch attempts to synchronise each
repmgrd's primary reconnection attempts to prevent potential race
conditions. This relies on each node's clock being correctly
synchronised.

Currently this change is experimental and is not enabled by default.
It can be enabled by setting the repmgr.conf parameter
"reconnect_loop_sync".
2020-10-06 13:35:49 +09:00
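
One way to picture the synchronisation idea is that each repmgrd waits until the next shared wall-clock boundary before its reconnection attempt. The following is a minimal sketch under that assumption only; it is not taken from the repmgr source, and the function name and the "interval" parameter are hypothetical.

```c
#include <time.h>

/*
 * Hypothetical helper (not repmgr's actual code): return how many seconds
 * to wait so that the next attempt starts on a wall-clock multiple of
 * "interval" seconds. Assumes "now" is non-negative and "interval" > 0.
 */
long sketch_seconds_until_next_slot(time_t now, long interval)
{
    long remainder = (long) ((long long) now % interval);

    /* Already on a boundary: no wait needed. */
    return (remainder == 0) ? 0 : interval - remainder;
}
```

With a 10-second interval, a node whose clock reads 101 would wait 9 seconds and a node whose clock reads 108 would wait 2 seconds, so both would act at 110. This only lines up if the clocks agree, which is the dependency the commit message notes.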
Ian Barwick
467d19bcd4 Use atoll() to parse system_identifier on 32bit systems
Addresses issue in GitHub #665.
2020-10-06 09:44:31 +09:00
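
The reason atoll() matters here can be sketched as follows. The system identifier is a 64-bit value, while "long" is only 32 bits wide on typical 32-bit platforms, so atol() cannot represent it; "long long" is guaranteed to be at least 64 bits wide. The helper name below is hypothetical and the snippet is an illustration, not the code from the commit.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert a decimal system identifier string to a
 * 64-bit integer. atoll() returns "long long" (at least 64 bits on every
 * conforming platform), whereas atol() returns "long", which is commonly
 * 32 bits on 32-bit systems.
 */
uint64_t sketch_parse_system_identifier(const char *value)
{
    return (uint64_t) atoll(value);
}
```

Note that atoll() does no error reporting; a stricter variant could use strtoull() and check its end pointer and errno.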
Ian Barwick
be244e2155 Fix typo
s/paremeter/parameter/
2020-10-05 17:06:52 +09:00
Ian Barwick
42283bf344 repmgrd: check local connection after promoting local node
In theory the local connection should not be affected by the node's
promotion. However we're handing over control to an external command
which is usually just "repmgr standby promote", but could potentially
be a user-defined script with unknowable side effects. So it's
better to be safe than sorry.
2020-10-05 16:50:41 +09:00
Ian Barwick
5b254a1be9 repmgrd: add parameter "failover_delay"
This parameter is not documented and intended for use during testing.
It should not be used in production.
2020-10-05 16:43:06 +09:00
Ian Barwick
9aaf7d79a2 standby switchover: in Pg13 and later, promotion overrides paused WAL replay
Preventing a switchover in this case no longer makes sense, so we apply
the checks to PostgreSQL 12 and earlier only.
2020-09-30 15:15:11 +09:00
Ian Barwick
e86c035242 standby promote: in Pg13 and later, promotion overrides paused WAL replay
Aborting in this case no longer makes sense, so we apply the checks
to PostgreSQL 12 and earlier only.
2020-09-30 15:15:07 +09:00
Ian Barwick
73d2088a85 standby follow: don't restart server (PostgreSQL 13 and later)
As of PostgreSQL 13, changes to the fundamental replication
configuration can be applied with a simple SIGHUP, no restart
required.

In case the old behaviour is desired, i.e. a full restart to apply
the configuration changes, the new configuration parameter
"standby_follow_restart" can be set. This parameter has no effect
in PostgreSQL 12 and earlier.
2020-09-29 17:53:51 +09:00
Ian Barwick
48f95f9a39 Fix typo in comment 2020-09-29 15:26:28 +09:00
Ian Barwick
ce229beff8 repmgrd: add configuration option "always_promote"
In certain corner cases, it's possible repmgrd may end up monitoring
a standby which was a former primary, but the node record has not
yet been updated.

Previously repmgrd would abort the promotion with a cryptic message
about being unable to find a node record for node_id -1 (the
default value for an unknown node id).

This commit adds a new configuration option "always_promote", which
determines whether repmgrd should promote the node in this case.
The default is "false", to effectively maintain the existing behaviour.

Logging output has also been improved to make it clearer what has
happened when this situation occurs.
2020-09-29 14:18:00 +09:00
Ian Barwick
16eeae700c repmgrd: minor log message tweak 2020-09-29 10:23:31 +09:00
Ian Barwick
70061c51aa Further improve handling of possible pg_control read errors
Builds on changes in commit 147f454, and ensures appropriate
action is taken if a value cannot be read from pg_control.
2020-09-28 13:59:34 +09:00
houzj.fnst
3ffeffbd8b Remove redundant condition
GitHub #655.
2020-09-28 13:08:18 +09:00
Ian Barwick
147f454d32 Minor sanity check for control file extraction functions
If the control file couldn't be parsed for whatever reason, return
the default value for the requested parameter.

It'd be better to have the caller pass in a pointer to the parameter
and have the function return bool so the caller doesn't assume the
control file was read successfully. This is important for handling
DBState, where no "value unknown" default is available.
2020-09-28 10:47:56 +09:00
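
The alternative interface described in this message, where the caller passes a pointer and the function returns bool, might look roughly like the sketch below. All type and function names are hypothetical stand-ins, not repmgr's or PostgreSQL's actual definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real control file types. */
typedef enum
{
    SKETCH_DB_SHUTDOWNED,
    SKETCH_DB_IN_PRODUCTION
} SketchDBState;

typedef struct
{
    bool          parsed_ok;   /* false if the control file could not be parsed */
    SketchDBState state;
} SketchControlFileInfo;

/*
 * Write the state to *state and return true only if the control file was
 * parsed. On failure *state is left untouched, so the caller cannot
 * mistake a default value for a successfully read one.
 */
bool sketch_get_db_state(const SketchControlFileInfo *info, SketchDBState *state)
{
    if (info == NULL || !info->parsed_ok)
        return false;

    *state = info->state;
    return true;
}
```

The point of the design is that a type such as DBState, which has no "value unknown" member, no longer needs one: failure is signalled through the return value instead.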
Ian Barwick
26b5664741 repmgr: enable "primary unregister --force" to unregister an active primary
The primary must have no registered standby nodes.

Also document usage when unregistering a primary node which is actually
running as a standby.
2020-09-23 15:12:19 +09:00
Ian Barwick
cb86180f4f doc: document include directives 2020-09-18 16:27:52 +09:00
Ian Barwick
1f3e098104 Add option "--dump-config"
This is initially intended for verifying the configuration parsing
mechanism and is currently undocumented.
2020-09-18 15:12:22 +09:00
Ian Barwick
4670515285 config: fix parsing of event_notifications list 2020-09-18 14:36:56 +09:00
Ian Barwick
bccc2673b6 doc: update compatibility matrix 2020-09-18 11:40:47 +09:00
Ian Barwick
158008c5c5 doc: update release notes 2020-09-18 11:29:07 +09:00
Ian Barwick
5f3d1cdeb6 doc: note removal of PostgreSQL 9.3 support 2020-09-17 16:05:16 +09:00
Ian Barwick
82515a9733 doc: document new parameters for "failover_validation_command" 2020-09-17 15:48:18 +09:00
Stanislav Paskalev
73e8373337 Add %v, %u and %t parameters to "failover_validation_command"
These indicate:
 - the number of visible nodes sharing the current upstream
 - the number of nodes on the current upstream
 - the total number of nodes in the entire repmgr cluster.

This allows the failover_validation_command to be used to perform
more thorough validations, including cross-referencing external
cluster management state (e.g. if managed by kubernetes).

GitHub #651.
2020-09-17 15:48:12 +09:00
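
To illustrate how such placeholders could be substituted into a command string, here is a minimal sketch. It follows the three meanings listed in the message above, but the function name, signature and truncation handling are assumptions for illustration and do not reflect repmgr's internal implementation.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical sketch: expand %v, %u and %t in "tmpl" into "out".
 *   %v -> visible nodes sharing the current upstream
 *   %u -> nodes on the current upstream
 *   %t -> total nodes in the cluster
 * Any other character is copied unchanged.
 * Returns 0 on success, -1 if "out" is too small.
 */
int sketch_expand_validation_command(const char *tmpl,
                                     int visible, int on_upstream, int total,
                                     char *out, size_t out_len)
{
    size_t used = 0;

    if (out_len == 0)
        return -1;
    out[0] = '\0';

    for (const char *p = tmpl; *p != '\0'; p++)
    {
        size_t avail = out_len - used;
        int    written;

        if (p[0] == '%' && p[1] == 'v')
        {
            written = snprintf(out + used, avail, "%d", visible);
            p++;                /* skip the placeholder letter */
        }
        else if (p[0] == '%' && p[1] == 'u')
        {
            written = snprintf(out + used, avail, "%d", on_upstream);
            p++;
        }
        else if (p[0] == '%' && p[1] == 't')
        {
            written = snprintf(out + used, avail, "%d", total);
            p++;
        }
        else
        {
            written = snprintf(out + used, avail, "%c", *p);
        }

        /* snprintf reports the length it needed; reject truncation. */
        if (written < 0 || (size_t) written >= avail)
            return -1;
        used += (size_t) written;
    }

    return 0;
}
```

A validation script receiving these values could, for example, refuse promotion when the visible count is not a majority of the total.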
Ian Barwick
f1bdb09512 doc: note existing pg_rewind corner-case bug 2020-09-15 14:21:14 +09:00
Ian Barwick
028e3ab48d doc: rearrange "repmgr node rejoin" reference for clarity
The <important> section looked like an actual subsection, so convert
that and the following example section into <refsect2> sections.
2020-09-15 13:42:18 +09:00
Ian Barwick
b5b7d635ad doc: fix "release-current" tag 2020-09-04 15:12:33 +09:00
Ian Barwick
146654bf0e Clarify code comment 2020-09-04 15:10:54 +09:00
Ian Barwick
4c3aed2573 rsync: exclude log/pg_log directory depending on PostgreSQL version
This is more for completeness as the data source here is Barman, which
shouldn't contain log files anyway.
2020-09-04 15:03:27 +09:00
Ian Barwick
3945314e65 Remove PostgreSQL 9.3 support
PostgreSQL 9.3 community support ended in November 2018.
2020-09-04 11:37:12 +09:00
Ian Barwick
f4938a4a42 standby clone: tweak --help output wording 2020-09-03 10:41:04 +09:00
Ian Barwick
9a836b3c04 Add option --verify-backup
This causes pg_verifybackup to be executed immediately after
pg_basebackup completes.

PostgreSQL 13 and later.
2020-09-03 10:40:32 +09:00
Ian Barwick
c8e52e486f Have make_pg_path() output to a PQexpBuffer
Calling functions are all using one anyway, so there's no point keeping
static buffers around.
2020-09-02 15:30:57 +09:00
Ian Barwick
1f7ac843fd Consolidate role availability checking code 2020-09-01 14:37:33 +09:00
Ian Barwick
8d57d7e001 is_downstream_node_attached(): avoid false negative
If the provided connection does not have sufficient permission to read
"pg_stat_replication.state", and there is an entry for the node in
"pg_stat_replication", assume it's connected. Finer-grained detection
requires additional user permissions, nothing we can do about that.
2020-09-01 14:28:40 +09:00
Ian Barwick
13e7c679cd Minor coding style fixes 2020-09-01 13:35:13 +09:00
Ian Barwick
466590af28 Fix comment 2020-09-01 13:23:46 +09:00
Ian Barwick
a88c80248c repmgrd: minor tweaks to witness node synchronisation
Explicitly roll back if any operation fails, and add debugging output
to track elapsed time between synchronisation intervals.
2020-09-01 09:58:14 +09:00
Ian Barwick
1131e3aad2 PostgreSQL 13: support "wal_keep_size"
Renamed from "wal_keep_segments" in core commit f5dff459.
2020-08-31 17:18:41 +09:00
Ian Barwick
0630d9644e Improve replication connection check
Previously the check verifying that a node has connected to its upstream
merely assumed the presence of a record in pg_stat_replication indicates
a successful replication connection. However the record may contain a
state other than "streaming", typically "startup" (which will occur when
a node has diverged from its upstream and will therefore never
transition to "streaming"), which needs to be taken into account when
considering the state of the replication connection to avoid false
positives.
2020-08-27 16:09:25 +09:00
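
The distinction drawn in this message, between a record merely existing in pg_stat_replication and that record being in the "streaming" state, can be sketched as a small classification function. The enum and function names are hypothetical and this is not repmgr's actual code.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type for the sketch. */
typedef enum
{
    SKETCH_NODE_NOT_ATTACHED,    /* no record in pg_stat_replication */
    SKETCH_NODE_ATTACHED,        /* record present, state is "streaming" */
    SKETCH_NODE_NOT_STREAMING    /* record present, other state, e.g. "startup" */
} SketchAttachStatus;

/*
 * Classify a downstream node from what was found in pg_stat_replication.
 * "state" must be a non-NULL string whenever "row_found" is true.
 */
SketchAttachStatus sketch_classify_replication_state(bool row_found, const char *state)
{
    if (!row_found)
        return SKETCH_NODE_NOT_ATTACHED;

    if (strcmp(state, "streaming") == 0)
        return SKETCH_NODE_ATTACHED;

    /* A record exists, but replication is not actually streaming. */
    return SKETCH_NODE_NOT_STREAMING;
}
```

Treating the third case separately is what avoids the false positive: a node stuck in "startup" has a record but is not replicating.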
Ian Barwick
c50a2d049c doc: note use of wildcards in .pgpass file 2020-08-19 10:32:34 +09:00
Ian Barwick
20100f5aaa docs: link to PostgreSQL roadmap 2020-08-06 09:50:23 +09:00
Ian Barwick
2d20c110bf doc: update "repmgr witness register" description
Add missing "Options" section.
2020-08-06 09:50:19 +09:00
Ian Barwick
aed0045c3a docs: reformat additional config file upgrade notes into a new section
It's easier to link to the information that way.
2020-08-06 09:50:16 +09:00
Martín Marqués
cd81046c26 doc: add two notes on section related to configuration files
Add notes to the documentation mentioning that after postgres or repmgr
upgrades (postgres major upgrades), there are some changes that need
to be taken care of.

Signed-off-by: Martín Marqués <martin.marques@2ndquadrant.com>
2020-08-06 09:50:11 +09:00
Ian Barwick
fb32284cdc remove superfluous debugging output 2020-08-06 09:05:55 +09:00
Ian Barwick
893044f1e9 Add missing pfree() calls 2020-07-16 11:08:20 +09:00