repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-03-22 22:56:29 +00:00

Author	SHA1	Message	Date
Ian Barwick	a502b2cf96	Move function parse_repmgr_version() to a more appropriate location	2019-09-24 13:14:03 +09:00
Ian Barwick	fca033fb9d	cluster show/daemon status: report upstream node mismatches When showing node information, check if the node's copy of its record shows a different upstream to the one expected according to the node where the command is executed. This helps visualise situations where the cluster is in an unexpected state, and provide a better idea of the actual state. For example, if a cluster has divided somehow and a set of nodes are following a new primary, when running "cluster show" etc., repmgr will now show the name of the primary those nodes are actually following, rather than the now outdated node name recorded on the other side of the split. A warning will also be issued about the situation.	2019-05-14 13:11:31 +09:00
Ian Barwick	d8e4c54ea4	"standby switchover": add "--repmgrd-force-unpause" Implements GitHub #559.	2019-05-10 16:04:07 +09:00
Ian Barwick	9fe2fa2daf	daemon status: make output more like that of "cluster show" In particular make any issues with unexpected server state more obvious.	2019-04-25 14:45:41 +09:00
Ian Barwick	ba1f05ece9	Restrict "node_name" to maximum 63 characters In "recovery.conf", the configuration parameter "node_name" is used as the "application_name" value, which will be truncated by PostgreSQL to 63 characters (NAMEDATALEN - 1). repmgr sometimes needs to be able to extract the application name from pg_stat_replication to determine if a node is connected (e.g. when executing "repmgr standby register"), so the comparison will fail if "node_name" exceeds 63 characters.	2019-03-28 10:37:57 +09:00
Ian Barwick	1615353f48	repmgrd: optionally disconnect WAL receivers during failover This is intended to ensure that all nodes have a constant LSN while making the failover decision. This feature is experimental and needs to be explicitly enabled with the configuration file option "standby_disconnect_on_failover". Note enabling this option will result in a delay in the failover decision until the WAL receiver is disconnected on all nodes.	2019-03-06 15:53:57 +09:00
Ian Barwick	b1875a8d91	Split command execution functions into separate library These may need to be executed by repmgrd.	2019-02-27 14:41:17 +09:00
Ian Barwick	48381a5b4e	Use --compact option for abbreviated display output --terse is meant for reducing log chatter.	2019-02-02 13:06:59 +09:00
Ian Barwick	d7420d7274	daemon (start|stop): verify that repmgrd starts/stops. Note this may not always be possible for "daemon stop" if we are unable to determine the repmgrd PID.	2019-01-30 14:41:31 +09:00
Ian Barwick	32b81e7d49	"daemon start": initial implementation	2019-01-29 13:01:14 +09:00
Ian Barwick	061932d023	"node rejoin": verify status of rejoin target This adapts the code previously added to "standby follow" to verify whether the rejoin target can actually be rejoined.	2019-01-23 17:08:55 +09:00
Ian Barwick	3f5762e03a	Refactor upstream attachment check code Move it from the "standby follow" code to an independent function so it can be used in other contexts, e.g. "node rejoin".	2019-01-23 15:11:42 +09:00
Ian Barwick	7dce3ed234	Update copyright notices to 2019	2019-01-21 14:54:35 +09:00
Ian Barwick	0b3a310802	Add --data-directory-config option to "repmgr node check" Implements part of GitHub #523.	2019-01-16 16:03:44 +09:00
Ian Barwick	10be941298	Fix typo "node join" should be "node rejoin"	2019-01-14 15:39:13 +09:00
Ian Barwick	c66c8ebc98	repmgr: add --terse mode to "cluster show" This suppresses display of the usually lengthy "conninfo" column, mainly useful for generating a compact table suitable for pasting into emails, chats etc. without messy line breaks. Implements GitHub #521.	2019-01-09 10:06:37 +09:00
Ian Barwick	455a0bd93f	Use make_remote_repmgr_path() in place of make_repmgr_path() Also we can now simplify "cluster (matrix|crosscheck)" commands as beginning with v4.0, we know where the configuration file is, so can provide that when invoking repmgr remotely.	2018-10-02 09:59:18 +09:00
Ian Barwick	11d25e2aef	Add configuration parameter "repmgr_bindir" This is to facilitate remote invocation of repmgr when the repmgr binary is located somewhere other than the PostgreSQL binary directory, as it cannot be assumed all package maintainers will install repmgr there. This parameter is optional; if not set (the default), repmgr will fall back to "pg_bindir" (if set). Addresses GitHub #246.	2018-10-02 09:59:12 +09:00
Ian Barwick	2491b8ae52	Add functionality to "pause" repmgrd In some circumstances, e.g. while performing a switchover, it is essential that repmgrd does not take any kind of failover action, as this will put the cluster into an incorrect state. Previously it was necessary to stop repmgrd on all nodes (or at least those nodes which repmgrd would consider as promotion candidates), however this is a cumbersome and potentially risk-prone operation, particularly if the replication cluster contains more than a couple of servers. To prevent this issue from occurring, this patch introduces the ability to "pause" repmgrd on all nodes with a single command ("repmgr daemon pause") which notifies repmgrd not to take any failover action until the node is "unpaused" ("repmgr daemon unpause"). "repmgr daemon status" provides an overview of each node and whether repmgrd is running, and if so whether it is paused. "repmgr standby switchover" has been modified to automatically pause repmgrd while carrying out the switchover. See documentation for further details.	2018-09-27 16:42:10 +09:00
Ian Barwick	9681708b1a	repmgr: improve slot handling in "node rejoin" On the rejoined node, if a replication slot for the new upstream exists (which is typically the case after a failover), delete that slot. Also emit a warning about any inactive replication slots which may need to be cleaned up manually. GitHub #499.	2018-08-30 12:24:13 +09:00
Ian Barwick	56919ea499	repmgr: add -q/--quiet option This suppresses log output below log level ERROR. This is useful mainly when repmgr is being executed programmatically, e.g. in a cronjob, where it's only useful to receive output if something goes wrong. Note we advise against using this option when executing repmgr commands which operate on PostgreSQL nodes (standby follow, standby promote, standby switchover, node rejoin), particularly when executed by repmgrd, as the log output will provide valuable troubleshooting information. Implements suggestion in GitHub #468.	2018-07-13 12:09:41 +09:00
Ian Barwick	37311e15a3	repmgr: fix "standby register --wait-sync" when no timeout provided The default value for "wait_register_sync_seconds" was zero, which is treated as disabling --wait-sync altogether. Default value now set to -1, which is taken to mean no timeout value supplied.	2018-07-04 17:22:04 +09:00
Ian Barwick	080a29c33b	node check: add --missing-slots check This enables an explicit check for slots which should exist (according to the repmgr metadata) but which aren't present.	2018-06-22 17:21:40 +09:00
Ian Barwick	8320179f34	Add configuration file parameter "config_directory" This enables explicit provision of an external configuration file directory, which if set will be passed to "pg_ctl" as the -D parameter. Otherwise "pg_ctl" will default to using the data directory, which will cause some operations to fail if the configuration files are not present there. Note this is implemented primarily for feature completeness and for development/testing purposes. Users who have installed "repmgr" from a package should not rely on "pg_ctl" to stop/start/restart PostgreSQL, instead they should set the appropriate "service_..._command" for their operating system. For more details see: https://repmgr.org/docs/4.0/configuration-service-commands.html Note: in a future release, the presence of "config_directory" in repmgr.conf will be used to implicitly set "--copy-external-config-files=samepath" when cloning a standby; this is a behaviour change so will be implemented in the next major release (repmgr 4.1). Implements GitHub #424.	2018-04-25 11:58:24 +09:00
Ian Barwick	3b00dc912a	Catch various corner cases when restarting a PostgreSQL instance	2018-04-03 14:40:53 +09:00
Ian Barwick	a3f371b8c0	"node rejoin": actively check for node to rejoin cluster Previously repmgr was relying on whatever command was configured to start PostgreSQL to determine whether the node being rejoined had started correctly. However it's preferable to actively poll the upstream to confirm it has restarted and actually attached as a standby before confirming success of the "node rejoin" action. This can be overridden with the -W/--no-wait option. (Note that for consistency with other PostgreSQL utilities, the short form of the --wait option is now "-w"; this is currently only used in "repmgr standby follow".) Also update "repmgr node rejoin" documentation with a list of supported options, and add some useful index entries for "pg_rewind". Implements GitHub #415.	2018-04-03 10:34:44 +09:00
Ian Barwick	3ccf1cf182	Enable pg_rewind to be used with PostgreSQL 9.3/9.4 pg_rewind is not part of the core distribution for those, but we provided support in repmgr 3.3 so should extend it to repmgr 4. Note that there is no check in place whether the pg_rewind binary exists, so it's up to the user to ensure it's present. Addresses GitHub #413.	2018-04-02 20:54:29 +09:00
Ian Barwick	9c5e76401f	Fix "repmgr cluster crosscheck" output Addresses GitHub #398.	2018-03-27 16:44:04 +09:00
Ian Barwick	ee98a3a58e	"standby clone": add --recovery-conf-only option This will generate "recovery.conf" for an existing standby. Typical use-case is a standby cloned manually from an external data source (e.g. Barman), where "recovery.conf" needs to be created (and if required a replication slot). The --dry-run option will check the pre-requisites but not actually create "recovery.conf" or a replication slot. This requires that the upstream node is running, a replication connection can be made and if required a replication slot can be created. Implements GitHub #382.	2018-02-22 15:50:51 +09:00
Ian Barwick	927bf038a0	"standby switchover": check demotion candidate can make replication connection Check it's actually possible for the demotion candidate to attach to the promotion candidate before executing the switchover. As with other checks of this nature, there's a faint possibility the situation could change between the time the check is carried out and the demotion candidate is restarted to connect to the promotion candidate, but there's not a lot we can do about that. The main purpose is to be able to catch existing misconfigurations before anything gets changed. Implements GitHub #370.	2018-02-09 10:00:54 +09:00
Ian Barwick	b705127a34	"repmgr standby register": add --wait-start option Implements GitHub #356.	2018-01-04 14:56:08 +09:00
Ian Barwick	26a9e848fd	Update copyright notices to 2018	2018-01-02 10:19:46 +09:00
Ian Barwick	8c121da8a1	Add diagnostic option "repmgr node check --has-passfile" This checks if the active libpq version (9.6 and later) has the "passfile" option, and returns 0 if present, 1 if not.	2017-12-11 20:09:48 +09:00
Ian Barwick	30b11c08e6	Disable any configuration settings not compatible with PostgreSQL 9.3 And emit a warning while we're at it.	2017-09-18 13:12:38 +09:00
Ian Barwick	ea2693bc75	Move create_recovery_file() et al to repmgr-action-standby.c As they're only ever called from there.	2017-09-18 09:53:08 +09:00
Ian Barwick	b6b31b15b2	Implement "repmgr cluster cleanup"	2017-09-11 13:48:46 +09:00
Ian Barwick	a9f4a027a7	pgindent run	2017-09-11 11:14:13 +09:00
Ian Barwick	e4f7dc8234	Add copyright notices	2017-09-08 13:27:39 +09:00
Ian Barwick	edee80cc37	Rename option "node check --is-shutdown" to "--is-shutdown-cleanly" As that's what we really want to know. Also return "UNCLEAN_SHUTDOWN" if that's the case, rather than "RUNNING" which is confusing, even though it's a command for internal use.	2017-09-07 11:15:27 +09:00
Ian Barwick	47a4b49890	Add "repmgr standby follow --upstream-node-id" In an automatic failover situation, after a standby has been promoted there's a risk the original primary may become available again before "standby follow" is issued on another standby node, in which case "standby follow" will reconnect to the original primary. As the standby's repmgrd will have received a notification from the new primary, it will know the primary's ID and can therefore explicitly direct "standby follow" to follow that primary.	2017-09-04 09:11:59 +09:00
Ian Barwick	91941183bc	Use replication user, if set, when checking replication connections	2017-08-31 17:54:49 +09:00
Ian Barwick	0e0b221507	Add configuration file setting "use_primary_conninfo_password" If, for whatever reason, the upstream server password needs to be set in "primary_conninfo", enable it to be extracted from $PGPASSWORD.	2017-08-31 14:57:07 +09:00
Ian Barwick	da24d883e5	Remove option "--wal-keep-segments" This is a remnant of the early repmgr days when there were no alternative mechanisms for ensuring sufficient WAL remains available while cloning a standby. The purpose of this setting was to override a check for an (arbitrary) minimum setting for "wal_keep_segments". As there's no reliable way of determining a sensible value for this, and improvements in pg_basebackup mean WALs can be streamed (possibly using a replication slot) while the backup is in progress, there's no point in keeping this around. We will however still emit a warning about setting "wal_keep_segments" if the configuration doesn't appear to provide any other way of ensuring WAL is available during/after the cloning process and "wal_keep_segments" is not set.	2017-08-17 14:45:13 +09:00
Ian Barwick	b1ba476241	Rename "archiver" check etc. to "archive-ready" Gives a better indication of what's being checked.	2017-08-17 12:23:56 +09:00
Ian Barwick	4efc8fb9ce	Add placeholder functions for "repmgr $command --help" There are now too many options to sensibly fit into general --help output; we'll add separate output for each repmgr command, e.g. "repmgr node --help".	2017-08-16 13:24:14 +09:00
Ian Barwick	4c0d719cdb	Add replication slot check to "repmgr node check"	2017-08-16 11:17:02 +09:00
Ian Barwick	554673e83e	Add "repmgr node check --downstream"	2017-08-15 15:50:46 +09:00
Ian Barwick	10ef30096c	"node check": add server role check	2017-08-14 22:57:09 +09:00
Ian Barwick	fa7d60cd51	"node check": initial general output	2017-08-14 17:32:44 +09:00
Ian Barwick	8a50a72dc5	Additional "node status" output	2017-08-10 17:18:08 +09:00

79 Commits