Commit Graph

156 Commits

Author SHA1 Message Date
Ian Barwick
26b5664741 repmgr: enable "primary unregister --force" to unregister an active primary
The primary must have no registered standby nodes.

Also document usage when unregistering a primary node which is actually
running as a standby.
2020-09-23 15:12:19 +09:00
Ian Barwick
158008c5c5 doc: update release notes 2020-09-18 11:29:07 +09:00
Ian Barwick
5f3d1cdeb6 doc: note removal of PostgreSQL 9.3 support 2020-09-17 16:05:16 +09:00
Ian Barwick
82515a9733 doc: document new parameters for "failover_validation_command" 2020-09-17 15:48:18 +09:00
Ian Barwick
9a836b3c04 Add option --verify-backup
This causes pg_verifybackup to be executed immediately after
pg_basebackup completes.

PostreSQL 13 and later.
2020-09-03 10:40:32 +09:00
Ian Barwick
9df1731fb8 doc: update release notes 2020-04-15 14:50:51 +09:00
Ian Barwick
c39289d570 doc: finalize release notes 2020-04-13 17:17:01 +09:00
Ian Barwick
662b603770 doc: update release notes 2020-04-13 17:16:55 +09:00
Ian Barwick
cd80f265ac standby clone: warn about missing pg_rewind prerequisites
These are not essential for cloning a standby, but useful to warn
as early as possible in case the user is intending to use pg_rewind.
2020-04-06 15:37:37 +09:00
Ian Barwick
447054a630 standby promote: in --dry-run mode, display promote command which will be used
For PostgreSQL 12 and later, explicitly note whether repmgr user has
execution permissions on the pg_promote() function.
2020-04-02 12:37:32 +09:00
Ian Barwick
d9cb38c7f0 node check: add --upstream option
We have a --downstream option to check for attached nodes, but it
would be useful to have a corresponding --upstream option too.

A following patch will adapt the behaviour of this option when executed
on the primary node.
2020-03-30 17:54:52 +09:00
Ian Barwick
e561ddc8d3 node check: accept -S/--superuser option
This is mainly useful for the --data-directory-config option, which
requires permission to read pg_settings to verify that the data
directory configured in "repmgr.conf" matches the data directory
actually in use.

If pg_settings read permission is not available, repmgr will fall
back to a simple check that the data directory configured in
"repmgr.conf" is a valid PostgreSQL directory. This is not entirely
foolproof, as it's possible PostgreSQL could be using a different
data directory.
2020-03-23 17:14:04 +09:00
Ian Barwick
0bc0a28378 standby promote: enable "service_promote_command" in PostgreSQL 12
This enables the promote command generated internally by repmgr to
be overridden if desired, in the same way as for PostgreSQL 11 and
earlier.
2020-03-06 13:21:30 +09:00
Ian Barwick
fb5ce720f3 standby promote: fall back to "pg_ctl promote" if necessary
From PostgreSQL 12, the SQL-level function "pg_promote()" can be used
to promote a PostgreSQL instance, however usage is restricted to
superusers and users to whom explicit execution permission for this
function has been granted.

Therefore, if execution permission is not available, fall back to
"pg_ctl promote".
2020-03-06 12:53:37 +09:00
Ian Barwick
6559258b53 doc: update release notes 2020-03-05 17:24:24 +09:00
Ian Barwick
10304a1a3b doc: update release notes 2020-03-04 17:21:31 +09:00
Ian Barwick
6f01c54620 repmgr: improve "standby switchover" completion checks
There were some corner cases where "repmgr standby switchover"
would erroneously report a successful switchover, even if the
demotion candidate had not reattached to the promotion candidate.

Also improve the logging in various places to make it clearer
what is happening on which node.
2020-02-04 15:38:10 +09:00
Ian Barwick
e2a362a171 "node rejoin": check for available replication slots on the rejoin target
"standby follow" did this already, but "node rejoin" didn't.
2020-02-03 17:03:20 +09:00
Ian Barwick
ab9c84c655 Report error code on follow/rejoin failure due to non-available slot
Previously, if "repmgr standby follow" or "repmgr node rejoin" failed
due to a replication slot not being available, no error code was
returned.
2020-02-03 15:03:31 +09:00
Ian Barwick
4d4ed3bcd6 Remove BDR 2.x support
The BDR 2.x support was conceptual only and was never used in
production. As BDR 2.x will be EOL'd shortly, there is no risk it will
be needed.
2020-01-16 09:52:42 +09:00
Ian Barwick
be494f0d5f standby clone: minimize requirement to check upstream data directory location
repmgr has always insisted on determining the upstream's data directory
location, which requires superuser permissions (or from PostgreSQL 10,
membership of the default role "pg_read_all_settings").

Knowledge of the data directory location was required to implement rsync
cloning (now deprecated), but with pg_basebackup the minimum permission
requirement is now only a normal user with access to the repmgr metadata
and a user with replication permissions. The ability to determine the
data directory location is only required if the user specifies the
--copy-external-config-files option, which needs to be able to determine
the data directory to work out which configuration files are located
outside it.

This patch makes it possible to clone a standby with minimum
permissions, with appropriate checks for available permissions if
--copy-external-config-files is provided.

Implements part of GitHub #536 and addresses issue raised in #586.
2019-10-23 10:46:40 +09:00
Ian Barwick
d7fd55be99 Update HISTORY 2019-10-18 16:47:28 +09:00
Ian Barwick
0dce03a5f8 standby clone: don't query upstream's data directory
In early repmgr versions, this used to be a requirement for cloning
via rsync, and/or as a fallback location if the user didn't supply
a data directory to clone into. However as rsync cloning has been
deprecated, and the data directory must be specified in repmgr.conf,
this is no longer required, and removing it simplifies user privilege
requirements.

Note that it is still possible to explicitly provide a target data
directory with -D/--pgdata, though this is primarily useful for
the niche use case where repmgr is used as a convenience tool to
clone a node which is not intended to become part of a repmgr
cluster.

This is part of the implementation of GitHub #536 for the minimizing
of user privilege requirements.
2019-10-16 13:21:29 +09:00
Ian Barwick
a0591afb1e doc: add repmgr 5.0 release date 2019-10-15 10:29:52 +09:00
Ian Barwick
2304584679 Fix handling of upstream node change check
repmgrd has a check to see if the upstream node has unexpectedly
changed, e.g. if the repmgrd service is paused and the PostgreSQL
instance has been pointed to another node.

However this check was relying on the node record on the local node
being up-to-date, which may not be the case immediately after a
failover, when the node is still replaying records updated prior
to the node's own record being updated. In this case it will
mistakenly assume the node is following the original primary
and attempt to restart monitoring, which will fail as the original
primary is no longer available.

To prevent this, we check against the node's record on the upstream
node.

Addresses issue noted in GitHub #587 and #588.
2019-10-14 12:28:04 +09:00
Ian Barwick
10f00b8822 repmgr: pass explicitly provided log level when executing repmgr remotely
This makes it possible to return log output when executing repmgr
remotely at a different level to the one defined in the remote
repmgr's repmgr.conf.

This is particularly useful when DEBUG output is required.
2019-09-17 15:38:43 +09:00
Ian Barwick
56aae22b6c doc: update release notes 2019-09-17 11:00:04 +09:00
Ian Barwick
931da14df1 Rename some "repmgr daemon ..." commands to "repmgr service ..."
"repmgr daemon" can be interpreted to mean the commands affect the local
daemon process only. Rename the commands which affect the entire cluster
to "repmgr service ...".

The "repmgr daemon ..." form of the affected commands is retained for backwards
 compatibility.
2019-08-28 14:58:11 +09:00
Ian Barwick
ffc7b7817b doc: update HISTORY
Note PostgreSQL support.
2019-08-22 15:42:29 +09:00
Ian Barwick
fb6352735a The next major release will be 5.0.
4.5 was a placeholder release number in case a major release was required
prior to the release of Pg12.
2019-08-22 15:15:56 +09:00
Ian Barwick
666c6f5140 "standby clone": improve error messages related to extension status
Previously repmgr would emit the "repmgr extension not found on source node"
which depending on context is somewhat misleading, as it may exist
but not be installed, or the user may be attempting to clone from the
wrong database.
2019-08-07 16:41:27 +09:00
Ian Barwick
38b373e6df "node check": check role membership when trying to read pg_settings
From PostgreSQL 10, a member of the default roles "pg_monitor" and/or
"pg_read_all_settings" can read pg_settings without requiring superuser
privileges.

Previously, a hint was being emitted about making the repmgr user a
member of one of those groups, but no check for membership was being
made, meaning the check could only be run by a superuser.
2019-08-07 14:26:48 +09:00
Ian Barwick
8d55cab25e Convert configuration file parsing to use flex
Previously, repmgr was using a very simple ad-hoc string-based parser,
which had various limitations and allowed configuration files to be
created in a way which could cause confusion and/or unexpected
behaviour.

For example, it accepted strings enclosed in single quotes, but treated
strings enclosed in double quotes literally. A node_name defined thusly:

    node_name="somenode"

would result in the literal value '"somenode"' being used, which could
lead to unobvious errors along the lines of:

    no record found for ""somenode""

The configuration file parser has been adapted from the one used by
PostgreSQL itself, so behaves more-or-less identically (though some
functions such as file inclusion are not supported in repmgr).

This makes configuration parsing more robust and consistent;
additionally, error reporting will be more precise.

Note this does mean that some repmgr.conf items previously accepted
as valid by repmgr will now be rejected; in particular this includes
strings containing spaces which are not enclosed in single quotes.
2019-08-01 10:17:20 +09:00
Ian Barwick
5bf9605286 Revert "Convert configuration file parsing to use flex"
This reverts commit c6ca183247.

Backing out this patch for now as the Debian build system doesn't
seem to like it, even though it builds just fine on Debian itself.
2019-07-18 10:19:18 +09:00
Ian Barwick
c6ca183247 Convert configuration file parsing to use flex
Previously, repmgr was using a very simple ad-hoc string-based parser,
which had various limitations and allowed configuration files to be
created in a way which could cause confusion and/or unexpected
behaviour.

For example, it accepted strings enclosed in single quotes, but treated
strings enclosed in double quotes literally. A node_name defined thusly:

    node_name="somenode"

would result in the literal value '"somenode"' being used, which could
lead to unobvious errors along the lines of:

    no record found for ""somenode""

The configuration file parser has been adapted from the one used by
PostgreSQL itself, so behaves more-or-less identically (though some
functions such as file inclusion are not supported in repmgr).

This makes configuration parsing more robust and consistent;
additionally, error reporting will be more precise.

Note this does mean that some repmgr.conf items previously accepted
as valid by repmgr will now be rejected; in particular this includes
strings containing spaces which are not enclosed in single quotes.
2019-07-03 12:18:01 +09:00
Ian Barwick
b125628f7b doc: update release notes
Finalize release date.
2019-06-26 15:57:42 +09:00
Ian Barwick
09979eaa91 note that "standby follow" requires a primary to be available
While it's technically possible to have a standby follow another
standby while the primary is not available, repmgr will not be able
to update its metadata, which will cause Confusion and Chaos.

Update the documentation to make this clear, and provide a more helpful
error message if this situation occurs. The operation previously
failed anyway, but with an unhelpful message about not being able to
find a node record.
2019-06-11 15:14:17 +09:00
Ian Barwick
7180e2bed7 Canonicalize the data directory path when parsing the configuration file
This ensures the provided path matches the path PostgreSQL reports as its
data directory.
2019-06-07 09:48:01 +09:00
Ian Barwick
f5d29f6591 doc: update release notes 2019-06-06 11:30:30 +09:00
Ian Barwick
d4df0055c9 repmgr: use --compact (not --terse) in "cluster events" to hide details column
This is consistent with usage elsewhere.

"--terse" is intended to reduce logging noise.
2019-05-30 14:19:37 +09:00
Ian Barwick
9085ca46a8 doc: update release notes 2019-05-28 15:38:19 +09:00
Ian Barwick
c560dfbbce cluster show: display timeline ID
This helps provide a better picture of the state of the cluster, i.e.
making it more obvious whether there's been a timeline divergence.

This also provides infrastructure for further improvements in cluster
status display and diagnosis.

Note this is only available in PostgreSQL 9.6 and later as it relies
on the SQL functions for interrogating pg_control, which can be executed
remotely. As PostgreSQL 9.5 will shortly be the only community-supported
version without these functions, it's not worth the effort of trying
to duplicate their functionality.
2019-05-27 09:39:19 +09:00
Ian Barwick
2bce1b371c doc: fold putative 4.3.1 release notes into 4.4 2019-05-23 09:03:18 +09:00
Ian Barwick
c9e85996f5 repmgr: prevent a standby being cloned from a witness server
Previously repmgr would happily clone from whatever server
it found at the provided source server address. We should
ensure that a standby can only be cloned from a node which
is part of the main replication cluster.

This check fetches a list of nodes from the source server,
connects to the first non-witness server it finds, and
compares the system identifiers of the source node and the
node it has connected to. If there is a mismatch, then the
source server is clearly not part of the main replication
cluster, and is most likely the witness server.
2019-05-22 16:52:25 +09:00
Ian Barwick
f03e012c99 cluster show/daemon status: report if node not attached to advertised upstream 2019-05-14 16:15:03 +09:00
Ian Barwick
fca033fb9d cluster show/daemon status: report upstream node mismatches
When showing node information, check if the node's copy of its
record shows a different upstream to the one expected according
to the node where the command is executed.

This helps visualise situations where the cluster is in an
unexpected state, and provide a better idea of the actual state.

For example, if a cluster has divided somehow and a set of nodes are
following a new primary, when running "cluster show" etc., repmgr
will now show the name of the primary those nodes are actually
following, rather than the now outdated node name recorded
on the other side of the split. A warning will also be issued
about the situation.
2019-05-14 13:11:31 +09:00
Ian Barwick
d8e4c54ea4 "standby switchover": add "--repmgrd-force-unpause"
Implements GitHub #559.
2019-05-10 16:04:07 +09:00
Ian Barwick
d43b40c5c6 doc: enable creation of PDF files 2019-05-10 10:50:49 +09:00
Ian Barwick
5e03627e6c doc: update release notes 2019-05-07 15:29:56 +09:00
Ian Barwick
8da355eb3f doc: update release notes 2019-05-02 14:00:07 +09:00