Commit Graph

111 Commits

Author SHA1 Message Date
Ian Barwick
a50f0e7cc0 doc: update release notes 2019-05-07 15:49:34 +09:00
Ian Barwick
d4b17635fe doc: update release notes 2019-05-07 15:25:54 +09:00
Ian Barwick
9ee2448583 standby switchover: ignore nodes which are unreachable and marked as inactive
Previously "repmgr standby switchover" would abort if any node was unreachable,
as that means it was unable to check if repmgrd is running.

However if the node has been marked as inactive in the repmgr metadata, it's
reasonable to assume the node is no longer part of the replication cluster
and does not need to be checked.
2019-04-29 14:35:58 +09:00
Ian Barwick
3b96b2afce "primary register": ensure --force works if another primary is registered but not running 2019-04-23 22:06:28 +09:00
Ian Barwick
216f274c15 doc: add note about when a PostgreSQL restart is required
Per query in GitHub #564.
2019-04-17 09:44:29 +09:00
Ian Barwick
7aaac343f8 standby clone: always ensure directory is created with correct permissions
In Barman mode, if there is an existing, populated data directory, and
the "--force" option is provided, the entire directory was being deleted,
and later recreated as part of the rsync process, but with the default
permissions.

Fix this by recreating the data directory with the correct permissions
after deleting it.
2019-04-09 11:02:06 +09:00
Ian Barwick
68470a9167 standby clone: improve --dry-run behaviour in barman mode
- emit additional informational output
- ensure that provision of --force does not result in an existing
  data directory being modified in any way
2019-04-08 15:13:13 +09:00
Ian Barwick
35320c27bd doc: update release notes 2019-04-08 11:28:13 +09:00
Ian Barwick
4e8b94c105 doc: update 4.3 release notes 2019-04-03 15:09:09 +09:00
Ian Barwick
b03f07ca8f doc: finalize 4.3 release notes 2019-04-02 14:42:25 +09:00
Ian Barwick
a6eacca6e4 standby register: fail if --upstream-node-id is the local node ID 2019-03-27 14:27:59 +09:00
Ian Barwick
29acd10f37 doc: update release notes 2019-03-22 15:42:12 +09:00
Ian Barwick
4251590833 doc: update "connection_check_type" descriptions 2019-03-15 14:07:13 +09:00
John Naylor
feb90ee50c Correct some doc typos 2019-03-15 14:07:05 +09:00
Ian Barwick
0a6486bb7f doc: expand "standby_disconnect_on_failover" documentation 2019-03-15 14:07:01 +09:00
Ian Barwick
4528eb1796 doc: expand "failover_validate_command" documentation 2019-03-15 14:06:37 +09:00
Ian Barwick
074d79b44f repmgrd: add option "connection_check_type"
This enable selection of the method repmgrd uses to check whether the upstream
node is available. Possible values are:

 - "ping" (default): uses PQping() to check server availability
 - "connection":  executes a query on the connection to check server
   availability (similar to repmgr3.x).
2019-03-06 13:23:53 +09:00
Ian Barwick
1f256d4d73 doc: upate release notes 2019-02-28 10:02:05 +09:00
Ian Barwick
39234afcbf standby clone: check upstream connections after data copy operation
With long-running copy operations, it's possible the connection(s) to
the primary/source server may go away for some reason, so recheck
their availability before attempting to reuse.
2019-02-26 14:37:51 +09:00
Ian Barwick
07097575b1 daemon status: add column "upstream last seen"
This displays the interval (in seconds) since the repmgrd instance on
each node last confirmed its upstream node is available.
2019-02-23 13:03:16 +09:00
Ian Barwick
71d151ca87 Don't check status of logical replication slots
We only want to check the status of physical replication slots
to determine whether a streaming replication standby has become
detached and there is therefore a risk of uncontrolled WAL buildup
on the local node.

It's not feasible to second-guess the state of logical replication
slots.
2019-02-23 10:09:43 +09:00
Ian Barwick
629c552348 primary unregister: ensure correct behaviour when executed on a witness
Fixes GitHub #548.
2019-02-15 19:49:17 +09:00
Ian Barwick
3a5a4388c7 cluster show: differentiate unreachable status
Differentiate between unreachable nodes and nodes which are running
but rejecting connections.
2019-02-15 16:01:55 +09:00
Ian Barwick
f1667a7e98 repmgrd: don't consider nodes where repmgrd is not running
If, for whatever reason, repmgrd is not running on a node, but that
node qualifies as promotion candidate, failover will not take place
as that node will never promote itself.

We therefore discount nodes where repmgrd is running as promotion
candidates, which will ensure one node is always promoted.

There is a slight risk here that the node(s) where repmgrd is not running
are further ahead, leading to a timeline fork. It might be possible
to mitigate that by having the "election" leader perform the promote
(or follow) operation.
2019-02-07 17:07:13 +09:00
Martín Marqués
b036870c83 doc: fix typo in the release notes for 4.3
GitHub #539

Signed-off-by: Martín Marqués <martin.marques@2ndquadrant.com>
2019-02-04 16:39:58 +09:00
Ian Barwick
2c9700586c repmgr: "witness register" - check connection is to primary node
Previously, if the witness server connection details were provided
to "repmgr witness register" rather than those of the primary server,
repmgr a) write the node record to the witness server rather than
the primary, and b) would loop indefinitely trying to copy the
node table to itself.

Addresses GitHub #538.
2019-02-04 14:45:32 +09:00
Ian Barwick
59ed86c01a "cluster show": fix formatting with multiple digit node IDs 2019-02-02 14:07:49 +09:00
Ian Barwick
48381a5b4e Use --compact option for abbreviated display output
--terse is meant for reducing log chatter.
2019-02-02 13:06:59 +09:00
Ian Barwick
a41e7bb726 doc: various minor updates 2019-02-01 17:24:32 +09:00
Ian Barwick
b9ba97a36d "standby switchover": check replication connection to upstream
Ensure repmgr checks the standby (promotion candidate) is currently
attached to the primary (demotion candidate).

Addresses issue reported in GitHub #519.
2019-02-01 15:28:06 +09:00
Ian Barwick
9273e7af73 "standby switchover": avoid potential race condition with WAL location check
Immediately after the demotion candidate (primary) has shut down, we can't
be absolutely sure that the walreceiver has flushed all WAL to disk, so
checking pg_last_wal_receive_lsn() at that point might not reflect
the actual last available WAL location.

To handle this, we'll loop for a while (timeout controlled by configuration
parameter "wal_receive_check_timeout") before finally deciding whether
the standby is still behind the shut-down primary.

Addresses issue raised in GitHub #518.
2019-02-01 12:06:22 +09:00
Ian Barwick
a5aa47c1dd daemon start/stop: add warning about missing configuration
repmgr will attempt to construct appropriate commands to start
and stop repmgrd, but usually it's preferable for them to be explicitly
defined, particularly if repmgr is installed from packages.
2019-01-29 14:08:00 +09:00
Ian Barwick
7654dd615b Finalize "daemon (start|stop)" commands
Implements GitHub #528.
2019-01-29 13:16:11 +09:00
Ian Barwick
061932d023 "node rejoin": verify status of rejoin target
This adapts the code previously added to "standby follow" to verify
whether the rejoin target can actually be rejoined.
2019-01-23 17:08:55 +09:00
Ian Barwick
58efb0f158 repmgrd: on a cascaded standby, don't fail over if "failover=manual"
Addresses GitHub #531.
2019-01-21 14:16:49 +09:00
Ian Barwick
8881b69c06 "standby switchover": check remote data directory configuration
The switchover will fail if the data_directory parameter in repmgr.conf
on the remote node (demotion candidate) is incorrectly configured.
We use the previously added "repmgr node check --data-directory-config
to verify this, and abort early if an issue is discovered.

Implements GitHub #523.
2019-01-16 16:03:49 +09:00
Ian Barwick
0b3a310802 Add --data-directory-config option to "repmgr node check"
Implements part of GitHub #523.
2019-01-16 16:03:44 +09:00
Ian Barwick
ba7ef9e643 doc: update PostgreSQL documentation links
"/static/" path element no longer required.
2019-01-15 12:45:33 +09:00
Ian Barwick
d4e993a240 Improve handling of connection URIs when executing remote commands
Previously, if connection URIs were in use and "repmgr standby switchover"
was executed, repmgr would pass the connection URI as-is to the demotion
candidate to execute "repmgr node rejoin". However the presence of
unescaped ampersands in the connection URI was causing the rejoin command
to be incorrectly executed.

Addresses GitHub #525.
2019-01-14 11:11:51 +09:00
Ian Barwick
b3c2831bd3 repmgr: add --dry-run option to "standby promote"
Implements GitHub #522.
2019-01-10 12:36:58 +09:00
Ian Barwick
c66c8ebc98 repmgr: add --terse mode to "cluster show"
This suppresses display of the usually lengthy "conninfo" column, mainly
useful for generating a compact table suitable for pasting into emails,
chats etc. without messy line breaks.

Implements GitHub #521.
2019-01-09 10:06:37 +09:00
Ian Barwick
b5b9aacc8a Add command line option "repmgr --version-number"
Outputs the raw version number.

Intended for use by scripts etc.
2019-01-08 10:08:23 +09:00
Ian Barwick
96895ba8a8 doc: update 4.2 release notes 2018-10-24 15:24:00 +09:00
Ian Barwick
6999dbb52a Doc: update HISTORY and 4.2 release notes 2018-10-17 11:47:28 +09:00
Ian Barwick
a40fd60cb5 repmgrd: fix parsing of -d/--daemonize option 2018-10-03 11:36:38 +09:00
Ian Barwick
7ab81e10de Log SSH errors when running "repmgr cluster (matrix|crosscheck)"
Previously repmgr would abort with an unhelpful message about being
unable to parse CSV output.

With this commit, it will continue running, and display a list of
inaccessible nodes as an addendum to the main output (unless --csv
or --terse options are specified).

Addresses GitHub #246.
2018-10-03 10:12:18 +09:00
Ian Barwick
11d25e2aef Add configuration parameter "repmgr_bindir"
This is to facilitate remote invocation of repmgr when the repmgr
binary is located somewhere other than the PostgreSQL binary directory, as it
cannot be assumed all package maintainers will install repmgr there.

This parameter is optional; if not set (the default), repmgr will fall back
to "pg_bindir" (if set).

Addresses GitHub #246.
2018-10-02 09:59:12 +09:00
Ian Barwick
688337dec3 repmgr: add "--node-id" option to "cluster cleanup"
Implements GitHub #493.
2018-09-25 15:56:40 +09:00
Ian Barwick
b660cb9fe4 doc: fix link in 4.1.1 release notes 2018-09-25 14:30:38 +09:00
Ian Barwick
5d8d9db21d doc: update 4.2 release notes 2018-09-25 14:28:28 +09:00