Commit Graph

318 Commits

Author SHA1 Message Date
Ian Barwick
7fdf2f1778 Update copyright notices to 2020 2020-01-13 14:06:20 +09:00
Ian Barwick
3f5d2f6ee9 standby follow: don't attempt to delete slot if new upstream is same as current
An attempt will be made to delete an existing replication slot on the
old upstream node (this is important during e.g. a switchover operation
or when attaching a cascaded standby to a new upstream). However if the
standby is currently attached to the follow target node anyway, the
replication slot should never be deleted.
2019-12-10 15:56:00 +09:00
Ian Barwick
4ed72eb901 Minor formatting fix 2019-11-20 15:13:01 +09:00
Ian Barwick
220ec7fc96 Minimize user permissions requirements for replication slots
Enable operations which create or drop replication slots to be carried
out with the minimum necessary user permissions, i.e. a user with the
REPLICATION attribute.

This can be the repmgr user, or a dedicated replication user.
In the latter case, if the dedicated replication user is only
permitted to make replication connections, the streaming
replication protocol is used to create/drop slots.

Implements part of GitHub #536.
2019-10-30 15:51:15 +09:00
Ian Barwick
1a9bcddccd standby clone: fix typo in log message 2019-10-28 14:08:48 +09:00
Ian Barwick
52f9cd3bae Rename "_do_create_recovery_conf()" to "_do_create_replication_conf()"
As of PostgreSQL 12, the functionality is no longer specific to the
recovery.conf file.
2019-10-24 15:12:39 +09:00
Ian Barwick
dc11330d58 Rename replication slot create/drop functions
Append "_sql" to the respective function names, as we'll later be
creating equivalent functions which use the replication protocol
so need a way to distinguish between them.
2019-10-23 13:43:09 +09:00
Ian Barwick
be494f0d5f standby clone: minimize requirement to check upstream data directory location
repmgr has always insisted on determining the upstream's data directory
location, which requires superuser permissions (or from PostgreSQL 10,
membership of the default role "pg_read_all_settings").

Knowledge of the data directory location was required to implement rsync
cloning (now deprecated), but with pg_basebackup the minimum permission
requirement is now only a normal user with access to the repmgr metadata
and a user with replication permissions. The ability to determine the
data directory location is only required if the user specifies the
--copy-external-config-files option, which needs to be able to determine
the data directory to work out which configuration files are located
outside it.

This patch makes it possible to clone a standby with minimum
permissions, with appropriate checks for available permissions if
--copy-external-config-files is provided.

Implements part of GitHub #536 and addresses issue raised in #586.
2019-10-23 10:46:40 +09:00
Ian Barwick
b74f965f54 standby clone: rename --recovery-conf-only to --replication-conf-only
A more generic option name to cover pre- and post-Pg12 replication
configuration methods.

--recovery-conf-only is retained as an alias for backwards
compatibility.
2019-10-18 14:44:57 +09:00
Ian Barwick
b9e360d5b8 Tweak "repmgr standby --help" output not to mention recovery.conf
Use the more generic "replication configuration" to cover Pg12
and later.
2019-10-18 14:09:54 +09:00
Ian Barwick
f45b9d7024 Call check_93_config() directly from check_upstream_config() 2019-10-16 16:57:53 +09:00
Ian Barwick
de67fa2441 standby clone: simplify data directory check
Now we no longer care about the upstream's data directory, and
normally expect to find the data directory in repmgr.conf, we
can just exit with an error in the corner case that no repmgr.conf
is provided and no data directory specified with -D/--pgdata.
2019-10-16 16:06:26 +09:00
Ian Barwick
5f6d970fd9 standby clone: update code comment
Follow-up to 0dce03a
2019-10-16 15:45:20 +09:00
Ian Barwick
0dce03a5f8 standby clone: don't query upstream's data directory
In early repmgr versions, this used to be a requirement for cloning
via rsync, and/or as a fallback location if the user didn't supply
a data directory to clone into. However as rsync cloning has been
deprecated, and the data directory must be specified in repmgr.conf,
this is no longer required, and removing it simplifies user privilege
requirements.

Note that it is still possible to explicitly provide a target data
directory with -D/--pgdata, though this is primarily useful for
the niche use case where repmgr is used as a convenience tool to
clone a node which is not intended to become part of a repmgr
cluster.

This is part of the implementation of GitHub #536 for the minimizing
of user privilege requirements.
2019-10-16 13:21:29 +09:00
Ian Barwick
b885337abc Minor code formatting fix 2019-10-07 10:48:35 +09:00
Ian Barwick
931da14df1 Rename some "repmgr daemon ..." commands to "repmgr service ..."
"repmgr daemon" can be interpreted to mean the commands affect the local
daemon process only. Rename the commands which affect the entire cluster
to "repmgr service ...".

The "repmgr daemon ..." form of the affected commands is retained for backwards
 compatibility.
2019-08-28 14:58:11 +09:00
Ian Barwick
f122c44a77 Fix debugging output 2019-08-21 15:52:08 +09:00
Ian Barwick
b5225a2662 Support --create-recovery-conf in Pg12 2019-08-19 10:40:38 +09:00
Ian Barwick
d5ed38f573 Make "standby follow" work in Pg12 2019-08-19 10:40:35 +09:00
Ian Barwick
2a37e28304 write standby.signal 2019-08-19 10:40:31 +09:00
Ian Barwick
9eb6ce52b4 Write replication configuration for Pg12 and later 2019-08-19 10:40:27 +09:00
Ian Barwick
9e072c6773 Update code comment with list of options for "standby clone" 2019-08-15 18:11:47 +09:00
Ian Barwick
666c6f5140 "standby clone": improve error messages related to extension status
Previously repmgr would emit the "repmgr extension not found on source node"
which depending on context is somewhat misleading, as it may exist
but not be installed, or the user may be attempting to clone from the
wrong database.
2019-08-07 16:41:27 +09:00
Ian Barwick
09979eaa91 note that "standby follow" requires a primary to be available
While it's technically possible to have a standby follow another
standby while the primary is not available, repmgr will not be able
to update its metadata, which will cause Confusion and Chaos.

Update the documentation to make this clear, and provide a more helpful
error message if this situation occurs. The operation previously
failed anyway, but with an unhelpful message about not being able to
find a node record.
2019-06-11 15:14:17 +09:00
Ian Barwick
341421f8e0 standby follow: remove some ineffective code
For some reason we were taking the trouble to extract an appliction_name
from the local node's conninfo, but this was being subsequently overwritten
with the node name (which is what we want anyway).
2019-06-06 12:12:33 +09:00
Ian Barwick
f5d29f6591 doc: update release notes 2019-06-06 11:30:30 +09:00
Ian Barwick
c0ea5ffa04 Ensure parsed value of --upstream-conninfo is written to recovery.conf
Previously it was being parsed (a step which ensures any "application_name"
set by the caller is changed to the node name), but the original string
was being copied to "primary_conninfo" anyway.
2019-06-06 11:30:24 +09:00
Ian Barwick
45e17223b9 Update variable/field names relating to pg_basebackup's -X option
Now the "xlog nomenclature" Pg versions are fading into the past,
rename things related to handling pg_basebackup's -X option
(was: --xlog-method, now: --wal-method) to start with "wal_"
rather than "xlog_".

This is a cosmetic change for code clarity.
2019-05-30 09:32:06 +09:00
Ian Barwick
c153e2fc02 standby clone: improve --dry-run output
Log positive check results as an additional confirmation that the
upstream configuration appears to be correct.
2019-05-28 00:54:39 +09:00
Ian Barwick
44a39760a1 standby clone: improve source node replication connection check
Previously, the check was attempting to make replication connections
to the source node, and if these were failing, inferring that
insufficient walsenders were available.

However it's quite likely that the connections are refused due to
insufficient user connection permissions. So before performing
the connection check, query the number of potentially available
walsenders on the source node and compare it with the number
required (either 1 or 2) - if insufficient, exit with error and
hint about increasing "max_wal_senders".

Once we've established sufficient walsenders are available, inability
to connect is most likely related to permissions issues on the source
node.
2019-05-28 00:11:53 +09:00
Ian Barwick
b959f771c1 Improve naming/usage of node record variables in "standby clone"
Make it clearer we're dealing with the upstream node record.

Also avoid "overloading" the upstream record when checking for an
existing record with the same node name; this was not technically
a problem but mildly confusing when reading the code.
2019-05-27 23:39:49 +09:00
Ian Barwick
c9e85996f5 repmgr: prevent a standby being cloned from a witness server
Previously repmgr would happily clone from whatever server
it found at the provided source server address. We should
ensure that a standby can only be cloned from a node which
is part of the main replication cluster.

This check fetches a list of nodes from the source server,
connects to the first non-witness server it finds, and
compares the system identifiers of the source node and the
node it has connected to. If there is a mismatch, then the
source server is clearly not part of the main replication
cluster, and is most likely the witness server.
2019-05-22 16:52:25 +09:00
Ian Barwick
dd78a16006 Change return type of is_downstream_node_attached() from bool to NodeAttached
This enables us to better determine whether a node is definitively
attached, definitively not attached, or if it was not possible to
determine the attached state.
2019-05-14 15:57:20 +09:00
Ian Barwick
d8e4c54ea4 "standby switchover": add "--repmgrd-force-unpause"
Implements GitHub #559.
2019-05-10 16:04:07 +09:00
Ian Barwick
4b37562444 Make it clearer that a witness node counts as a "sibling node"
It's not attached to the primary per-se, but needs to know what
the current primary is in order to correctly synchronise its
copy of the metadata.

Per GitHub #560.
2019-05-02 14:22:53 +09:00
Ian Barwick
fed09ecaae standby promote: have former siblings follow new primary 2019-05-02 12:04:49 +09:00
Ian Barwick
98d09f83b5 standby (promote|switchover): improve --dry-run functionality
Continue checks as far as possible.
2019-05-02 12:04:43 +09:00
Ian Barwick
7bbe938e19 Separate promotion candidate walsender/slot checks into discrete functions
For use by "standby promote" as well as "standby follow"
2019-05-02 12:04:40 +09:00
Ian Barwick
63c7f758c3 Remove unneeded server version number variables
No need to pass these around.
2019-05-02 12:04:33 +09:00
Ian Barwick
b9f07f6a91 standby promote: use variable name "local_conn" for the local connection handle
This is consistent with usage in other functions, and makes it easier to
differentiate between the local node connection and the primary connection.
2019-05-02 12:04:26 +09:00
Ian Barwick
e4615b4666 Refactor code for executing --siblings-follow
This will enable provision of "--siblings-follow" to "repmgr standby promote"
2019-05-02 12:04:15 +09:00
Ian Barwick
52905f1eb3 Standardize on "ID: %i" when logging node IDs
Previously there was a mix of "id:", "node id:", "node ID:" and "node_id:".
2019-04-30 17:07:33 +09:00
Ian Barwick
89a7261483 Always quote node names in log messages 2019-04-30 15:52:56 +09:00
Ian Barwick
e32acda8c0 standby switchover: ignore nodes which are unreachable and marked as inactive
Previously "repmgr standby switchover" would abort if any node was unreachable,
as that means it was unable to check if repmgrd is running.

However if the node has been marked as inactive in the repmgr metadata, it's
reasonable to assume the node is no longer part of the replication cluster
and does not need to be checked.
2019-04-29 14:35:49 +09:00
Ian Barwick
ad28cf95bd standby register: add upstream node ID in event details 2019-04-16 11:01:22 +09:00
Ian Barwick
dd454a8374 Miscellaneous string handling cleanup
This is mainly to prevent effectively spurious truncation warnings
in recent GCC versions.
2019-04-10 16:18:56 +09:00
Ian Barwick
77b9887d61 standby clone: improve --dry-run behaviour in barman mode
- emit additional informational output
- ensure that provision of --force does not result in an existing
  data directory being modified in any way
2019-04-08 15:12:22 +09:00
Ian Barwick
67e977592c standby switchover: list nodes which will remain attatched to the old primary
If --siblings-follow is not supplied, list all nodes which repmgr considers
to be siblings (this will include the witness server, if in use), and
which will remain attached to the old primary.
2019-04-02 10:46:59 +09:00
Ian Barwick
79613af8d0 Handle potential NULL return from string_skip_prefix() 2019-03-28 12:45:53 +09:00
Ian Barwick
e44c048ae2 Update code comment 2019-03-28 12:44:30 +09:00