repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-07-16 22:39:04 +00:00

Author	SHA1	Message	Date
Ian Barwick	59ed86c01a	"cluster show": fix formatting with multiple digit node IDs	2019-02-02 14:07:49 +09:00
Ian Barwick	48381a5b4e	Use --compact option for abbreviated display output --terse is meant for reducing log chatter.	2019-02-02 13:06:59 +09:00
Ian Barwick	b9ba97a36d	"standby switchover": check replication connection to upstream Ensure repmgr checks the standby (promotion candidate) is currently attached to the primary (demotion candidate). Addresses issue reported in GitHub #519.	2019-02-01 15:28:06 +09:00
Ian Barwick	9273e7af73	"standby switchover": avoid potential race condition with WAL location check Immediately after the demotion candidate (primary) has shut down, we can't be absolutely sure that the walreceiver has flushed all WAL to disk, so checking pg_last_wal_receive_lsn() at that point might not reflect the actual last available WAL location. To handle this, we'll loop for a while (timeout controlled by configuration parameter "wal_receive_check_timeout") before finally deciding whether the standby is still behind the shut-down primary. Addresses issue raised in GitHub #518.	2019-02-01 12:06:22 +09:00
Ian Barwick	7654dd615b	Finalize "daemon (start|stop)" commands Implements GitHub #528.	2019-01-29 13:16:11 +09:00
Ian Barwick	061932d023	"node rejoin": verify status of rejoin target This adapts the code previously added to "standby follow" to verify whether the rejoin target can actually be rejoined.	2019-01-23 17:08:55 +09:00
Ian Barwick	58efb0f158	repmgrd: on a cascaded standby, don't fail over if "failover=manual" Addresses GitHub #531.	2019-01-21 14:16:49 +09:00
Ian Barwick	8881b69c06	"standby switchover": check remote data directory configuration The switchover will fail if the data_directory parameter in repmgr.conf on the remote node (demotion candidate) is incorrectly configured. We use the previously added "repmgr node check --data-directory-config" to verify this, and abort early if an issue is discovered. Implements GitHub #523.	2019-01-16 16:03:49 +09:00
Ian Barwick	0b3a310802	Add --data-directory-config option to "repmgr node check" Implements part of GitHub #523.	2019-01-16 16:03:44 +09:00
Ian Barwick	d4e993a240	Improve handling of connection URIs when executing remote commands Previously, if connection URIs were in use and "repmgr standby switchover" was executed, repmgr would pass the connection URI as-is to the demotion candidate to execute "repmgr node rejoin". However the presence of unescaped ampersands in the connection URI was causing the rejoin command to be incorrectly executed. Addresses GitHub #525.	2019-01-14 11:11:51 +09:00
Ian Barwick	b3c2831bd3	repmgr: add --dry-run option to "standby promote" Implements GitHub #522.	2019-01-10 12:36:58 +09:00
Ian Barwick	c66c8ebc98	repmgr: add --terse mode to "cluster show" This suppresses display of the usually lengthy "conninfo" column, mainly useful for generating a compact table suitable for pasting into emails, chats etc. without messy line breaks. Implements GitHub #521.	2019-01-09 10:06:37 +09:00
Ian Barwick	b5b9aacc8a	Add command line option "repmgr --version-number" Outputs the raw version number. Intended for use by scripts etc.	2019-01-08 10:08:23 +09:00
Ian Barwick	40408a1734	repmgrd: check binary and extension major versions match repmgr requires that the same "major version" (e.g. 4.3) is present on all nodes, otherwise - particularly in the case of repmgrd - it's highly likely things won't work as expected. Implements part of GitHub #515.	2019-01-07 15:39:40 +09:00
Ian Barwick	a6a2be2239	Teach witness repmgrd to deal with the absence of a primary Previously it would refuse to start if the primary was not reachable, the thinking being that it's pointless trying to monitor an incomplete cluster. However, following an aborted failover situation, repmgrd will restart monitoring, and on the witness server this will lead to it aborting itself due to the continuing absence of a primary. To resolve this, witness repmgrd will now start monitoring in degraded mode if no primary is found, in the hope a primary will reappear at some point.	2018-11-29 12:15:41 +09:00
Ian Barwick	9f587efb74	doc: update HISTORY	2018-11-29 10:34:28 +09:00
Ian Barwick	c3bc5585d9	Add sanity check for extension version This should cover the cases where the "repmgr" extension was installed manually but not updated, or an upgrade was not fully completed.	2018-10-31 11:16:36 +09:00
Ian Barwick	96895ba8a8	doc: update 4.2 release notes	2018-10-24 15:24:00 +09:00
Ian Barwick	6999dbb52a	Doc: update HISTORY and 4.2 release notes	2018-10-17 11:47:28 +09:00
Ian Barwick	b346914d4d	repmgr: fix "Missing replication slots" label in "node check" Per report in GitHub #507.	2018-10-03 13:53:52 +09:00
Ian Barwick	a40fd60cb5	repmgrd: fix parsing of -d/--daemonize option	2018-10-03 11:36:38 +09:00
Ian Barwick	7ab81e10de	Log SSH errors when running "repmgr cluster (matrix|crosscheck)" Previously repmgr would abort with an unhelpful message about being unable to parse CSV output. With this commit, it will continue running, and display a list of inaccessible nodes as an addendum to the main output (unless --csv or --terse options are specified). Addresses GitHub #246.	2018-10-03 10:12:18 +09:00
Ian Barwick	11d25e2aef	Add configuration parameter "repmgr_bindir" This is to facilitate remote invocation of repmgr when the repmgr binary is located somewhere other than the PostgreSQL binary directory, as it cannot be assumed all package maintainers will install repmgr there. This parameter is optional; if not set (the default), repmgr will fall back to "pg_bindir" (if set). Addresses GitHub #246.	2018-10-02 09:59:12 +09:00
Ian Barwick	688337dec3	repmgr: add "--node-id" option to "cluster cleanup" Implements GitHub #493.	2018-09-25 15:56:40 +09:00
Ian Barwick	38e3aae053	repmgr: add parameter "shutdown_check_timeout" Previously, "repmgr standby switchover" used the configuration file parameters "reconnect_interval" and "reconnect_attempts" to define a timeout to determine whether the current primary (demotion candidate) has shut down. However, these parameters are intended for primary failure detection and are generally lower in value, while a controlled shutdown may take longer, resulting in the switchover being aborted as repmgr was not waiting long enough. To prevent this happening, parameter "shutdown_check_timeout" has been added. This complements the existing "standby_reconnect_timeout" parameter used by "repmgr standby switchover". Implements GitHub #504.	2018-09-25 11:34:06 +09:00
Ian Barwick	17e75f6b31	repmgrd: improve reconnection handling Previously, if the server being monitored was not available, repmgrd would always close the existing connection handle and open a new one. However, in some cases, e.g. a brief network outage, the existing connection handle is still good and does not need to be reopened. This could be particularly problematic if monitoring_history is on, as this risks leaving orphan sessions on the primary which (given a sufficiently unstable network) could lead to all available backends being occupied. Instead, during an outage we now use a new connection to verify the server is accessible; if the old connection is still available (e.g. following a short network interruption) we continue using that; if not (e.g. the server was restarted), we use the new one.	2018-08-30 15:46:08 +09:00
Ian Barwick	3b8586d82a	doc: update release notes	2018-08-30 13:05:17 +09:00
Ian Barwick	c1586e39b7	Log text of failed queries at log level ERROR Previously query texts were always logged at log level DEBUG, but that doesn't help much in a normal production environment when trying to identify the cause of issues. Also make various other minor improvements to query logging and handling of database errors. Implements GitHub #498.	2018-08-29 10:08:52 +09:00
Ian Barwick	7745844078	"standby switchover": improve replication connection check Previously repmgr would first check that a replication connection can be made from the demotion candidate to the promotion candidate, however it's preferable to sanity-check the number of available walsenders first, to provide a more useful error message.	2018-08-24 16:31:25 +09:00
Ian Barwick	e1e59e85d7	repmgr: add "cluster_cleanup" event GitHub #492.	2018-08-24 09:20:05 +09:00
Ian Barwick	f4df6696ba	doc: update release notes	2018-08-20 15:24:43 +09:00
Ian Barwick	c3949b2aea	"standby clone" - don't copy external config files in dry run mode Avoid copying files during a --dry-run as it may introduce unexpected changes on the target node. During an actual clone operation, any problems with copying files will be detected early and the operation aborted before the actual database cloning commences. GitHub #491.	2018-08-20 15:23:37 +09:00
Ian Barwick	6ba49de44e	"standby promote": improve log messages Make it clearer what repmgr is waiting for, and what to do if the promotion appears to fail.	2018-08-16 11:52:01 +09:00
Ian Barwick	34c4f4c3f8	repmgr: truncate version string if necessary Some distributions may add extra information to PG_VERSION after the actual version number (e.g. "10.4 (Debian 10.4-2.pgdg90+1)"), so copy the version number string up until the first space is found. GitHub #490.	2018-08-14 09:55:23 +09:00
Ian Barwick	78b969f208	repmgrd: report version number after logger initialisation This ensures the version number always makes it into the log destination. Implements GitHub #487.	2018-08-08 15:44:06 +09:00
Ian Barwick	33dedf4e96	repmgrd: always reopen log file after receiving SIGHUP For whatever reason, since at least repmgr 2.0 the log file was only ever reopened if a configuration file change took place. GitHub #485.	2018-08-02 10:54:31 +09:00
Ian Barwick	f3f002bea5	doc: add release date for 4.1.0	2018-07-31 11:00:38 +09:00
Ian Barwick	7ecfb333b9	doc: add note about switchover and exclusive backups Also rename server_not_in_exclusive_backup_mode() to avoid double negatives. GitHub #476.	2018-07-19 16:02:31 +09:00
Ian Barwick	673bde2b7f	repmgr: fix "primary_slot_name" when using "standby clone" with --recovery-conf-only Addresses GitHub #474.	2018-07-17 13:42:10 +09:00
Ian Barwick	69782cf703	repmgr: enable "witness unregister" to be run on any node Provide the ID of the witness node with --node-id=... Implements GitHub #472.	2018-07-13 17:37:59 +09:00
Ian Barwick	5acb3e6790	doc: update release notes	2018-07-13 15:35:34 +09:00
Ian Barwick	6dfcaa357e	doc: update release notes	2018-07-13 15:06:04 +09:00
Ian Barwick	56919ea499	repmgr: add -q/--quiet option This suppresses log output below log level ERROR. This is useful mainly when repmgr is being executed programmatically, e.g. in a cronjob, where it's only useful to receive output if something goes wrong. Note we advise against using this option when executing repmgr commands which operate on PostgreSQL nodes (standby follow, standby promote, standby switchover, node rejoin), particularly when executed by repmgrd, as the log output will provide valuable troubleshooting information. Implements suggestion in GitHub #468.	2018-07-13 12:09:41 +09:00
Ian Barwick	b3f64987cb	repmgr: add --csv output to "cluster event" Implements GitHub #471.	2018-07-13 11:19:42 +09:00
Ian Barwick	8b059bc9b0	Change default for "log_level" to INFO Default was previously NOTICE (as in repmgr 3.x) but documentation implied it was INFO, and many of the documentation examples assume it is. This produces some quite informative log output, without creating excessive log file volume. In particular it's useful to get a better idea of what repmgrd is actually doing. Also add documentation section for the log configuration parameters. GitHub #470, containing change suggested in GitHub #467.	2018-07-12 14:50:48 +09:00
Ian Barwick	ae60caacdd	repmgr: make "node check" and "node status" return ERR_NODE_STATUS when appropriate If any issue is detected (and "node check" is not being executed with a specific individual check), "ERR_NODE_STATUS" is returned.	2018-07-05 14:31:06 +09:00
Ian Barwick	92d0e6809b	repmgr: "cluster show" to return non-zero value if an issue encountered	2018-07-05 13:32:50 +09:00
Ian Barwick	4c7c681a14	repmgr: have "cluster show" exit with a non-zero value if issues detected If any issues are detected (e.g. node not reachable, unexpected node status etc.), "repmgr cluster show" returns exit code 25 ("ERR_NODE_STATUS"). Note that exit code 25 was introduced recently as "ERR_CLUSTER_CHECK", however it makes sense to use this to indicate issues detected by any command which can detect node issues. Addresses GitHub #456.	2018-07-05 11:03:48 +09:00
Ian Barwick	37311e15a3	repmgr: fix "standby register --wait-sync" when no timeout provided The default value for "wait_register_sync_seconds" was zero, which is treated as disabling --wait-sync altogether. Default value now set to -1, which is taken to mean no timeout value supplied.	2018-07-04 17:22:04 +09:00
Ian Barwick	a194cf56b3	repmgr: exit with an error if an unrecognised command line option is provided. This matches the behaviour of other PostgreSQL utilities such as psql, though repmgr will only abort once all command line options are parsed, so as many errors as possible are found and displayed. If a repmgr "command" (e.g. "repmgr primary ...") was provided, a hint about the relevant command help section (e.g. "repmgr primary --help") will be provided alongside the generic help command (i.e. "repmgr --help"). Addresses GitHub #464, with further improvements.	2018-07-04 11:02:50 +09:00
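
Commit 34c4f4c3f8 above describes truncating a distribution-decorated version string at the first space. The following is a minimal sketch of that idea only, not repmgr's actual code (repmgr itself is written in C); the function name truncate_version is hypothetical and chosen for illustration.

```python
def truncate_version(raw: str) -> str:
    """Return the text before the first space, or the whole string if it has none.

    Hypothetical helper illustrating the approach described in the commit
    message; it is not taken from the repmgr source.
    """
    head, _sep, _tail = raw.partition(" ")
    return head


if __name__ == "__main__":
    # Example string taken from the commit message.
    print(truncate_version("10.4 (Debian 10.4-2.pgdg90+1)"))  # prints 10.4
```

A plain version string with no suffix, such as "11.1", passes through unchanged, so the helper can be applied unconditionally.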

89 Commits