repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-07-16 14:29:05 +00:00

Author	SHA1	Message	Date
Ian Barwick	c2dded1d7b	Log text of failed queries at log level ERROR Previously query texts were always logged at log level DEBUG, but that doesn't help much in a normal production environment when trying to identify the cause of issues. Also make various other minor improvements to query logging and handling of database errors. Implements GitHub #498.	2018-08-29 10:09:51 +09:00
Ian Barwick	457dbbd267	"standby switchover": improve replication connection check Previously repmgr would first check that a replication connection can be made from the demotion candidate to the promotion candidate, however it's preferable to sanity-check the number of available walsenders first, to provide a more useful error message.	2018-08-24 16:31:46 +09:00
Ian Barwick	4cbba98193	repmgr: add "cluster_cleanup" event GitHub #492.	2018-08-20 16:48:08 +09:00
Ian Barwick	719dd93676	doc: update release notes	2018-08-20 12:33:11 +09:00
Ian Barwick	d4ad8ce20c	"standby clone" - don't copy external config files in dry run mode Avoid copying files during a --dry-run as it may introduce unexpected changes on the target node. During an actual clone operation, any problems with copying files will be detected early and the operation aborted before the actual database cloning commences. GitHub #491.	2018-08-16 14:03:39 +09:00
Ian Barwick	bacab8d31c	"standby promote": improve log messages Make it clearer what repmgr is waiting for, and what to do if the promotion appears to fail.	2018-08-16 11:52:18 +09:00
Ian Barwick	9ae9d31165	repmgr: truncate version string if necessary Some distributions may add extra information to PG_VERSION after the actual version number (e.g. "10.4 (Debian 10.4-2.pgdg90+1)"), so copy the version number string up until the first space is found. GitHub #490.	2018-08-14 09:56:54 +09:00
Ian Barwick	4c44c01380	doc: update release notes	2018-08-10 09:52:39 +09:00
Ian Barwick	5113ab0274	repmgrd: fix startup on witness node when local data is stale Previously, when running on a witness server, repmgrd didn't consider the local cache of the "repmgr.nodes" table might be outdated, e.g. as repmgrd wasn't running on the witness server during a failover, so could potentially end up monitoring a former primary now running as a standby. When running on a witness server, at startup repmgrd will now scan all nodes to determine the current primary, and refresh its local cache from there. This will also ensure it can start up even if the node currently registered as primary in the local cache is not available. Implements GitHub #488 and #489.	2018-08-09 16:42:20 +09:00
Ian Barwick	25f68bb283	repmgrd: report version number after logger initialisation This ensures the version number always makes it into the log destination. Implements GitHub #487.	2018-08-08 15:45:48 +09:00
Ian Barwick	3a789d53e0	repmgrd: always reopen log file after receiving SIGHUP For whatever reason, since at least repmgr 2.0 the log file was only ever reopened if a configuration file change took place. GitHub #485.	2018-08-02 10:51:18 +09:00
Ian Barwick	cb4f6f6e3f	doc: add release date for 4.1.0	2018-07-31 10:58:06 +09:00
Ian Barwick	7ecfb333b9	doc: add note about switchover and exclusive backups Also rename server_not_in_exclusive_backup_mode() to avoid double negatives. GitHub #476.	2018-07-19 16:02:31 +09:00
Ian Barwick	673bde2b7f	repmgr: fix "primary_slot_name" when using "standby clone" with --recovery-conf-only Addresses GitHub #474.	2018-07-17 13:42:10 +09:00
Ian Barwick	69782cf703	repmgr: enable "witness unregister" to be run on any node Provide the ID of the witness node with --node-id=... Implements GitHub #472.	2018-07-13 17:37:59 +09:00
Ian Barwick	5acb3e6790	doc: update release notes	2018-07-13 15:35:34 +09:00
Ian Barwick	6dfcaa357e	doc: update release notes	2018-07-13 15:06:04 +09:00
Ian Barwick	56919ea499	repmgr: add -q/--quiet option This suppresses log output below log level ERROR. This is useful mainly when repmgr is being executed programmatically, e.g. in a cronjob, where it's only useful to receive output if something goes wrong. Note we advise against using this option when executing repmgr commands which operate on PostgreSQL nodes (standby follow, standby promote, standby switchover, node rejoin), particularly when executed by repmgrd, as the log output will provide valuable troubleshooting information. Implements suggestion in GitHub #468.	2018-07-13 12:09:41 +09:00
Ian Barwick	b3f64987cb	repmgr: add --csv output to "cluster event" Implements GitHub #471.	2018-07-13 11:19:42 +09:00
Ian Barwick	8b059bc9b0	Change default for "log_level" to INFO Default was previously NOTICE (as in repmgr 3.x) but documentation implied it was INFO, and many of the documentation examples assume it is. This produces some quite informative log output, without creating excessive log file volume. In particular it's useful to get a better idea of what repmgrd is actually doing. Also add documentation section for the log configuration parameters. GitHub #470, containing change suggested in GitHub #467.	2018-07-12 14:50:48 +09:00
Ian Barwick	ae60caacdd	repmgr: make "node check" and "node status" return ERR_NODE_STATUS when appropriate If any issue is detected (and "node check" is not being executed with a specific individual check), "ERR_NODE_STATUS" is returned.	2018-07-05 14:31:06 +09:00
Ian Barwick	92d0e6809b	repmgr: "cluster show" to return non-zero value if an issue encountered	2018-07-05 13:32:50 +09:00
Ian Barwick	4c7c681a14	repmgr: have "cluster show" exit with a non-zero value if issues detected If any issues are detected (e.g. node not reachable, unexpected node status etc.), "repmgr cluster show" returns exit code 25 ("ERR_NODE_STATUS"). Note that exit code 25 was introduced recently as "ERR_CLUSTER_CHECK", however it makes sense to use this to indicate issues detected by any command which can detect node issues. Addresses GitHub #456.	2018-07-05 11:03:48 +09:00
Ian Barwick	37311e15a3	repmgr: fix "standby register --wait-sync" when no timeout provided The default value for "wait_register_sync_seconds" was zero, which is treated as disabling --wait-sync altogether. Default value now set to -1, which is taken to mean no timeout value supplied.	2018-07-04 17:22:04 +09:00
Ian Barwick	a194cf56b3	repmgr: exit with an error if an unrecognised command line option is provided. This matches the behaviour of other PostgreSQL utilities such as psql, though repmgr will only abort once all command line options are parsed, so as many errors as possible are found and displayed. If a repmgr "command" (e.g. "repmgr primary ...") was provided, a hint about the relevant command help section (e.g. "repmgr primary --help") will be provided alongside the generic help command (i.e. "repmgr --help"). Addresses GitHub #464, with further improvements.	2018-07-04 11:02:50 +09:00
Ian Barwick	802755fd60	repmgrd: daemonize process by default It's hard to imagine a use case where this isn't desirable, but in case, for whatever reason, the user does not wish to daemonize the process, the command line option "--daemonize=false" can be provided. Implements GitHub #458.	2018-06-29 22:01:49 +09:00
Ian Barwick	d00c0c67d0	repmgrd: document PID file options/configuration	2018-06-29 17:00:25 +09:00
Ian Barwick	080a29c33b	node check: add --missing-slots check This enables an explicit check for slots which should exist (according to the repmgr metadata) but which aren't present.	2018-06-22 17:21:40 +09:00
Ian Barwick	3b0cde2846	repmgr: cluster check commands - non-zero exit code if node(s) unavailable Return ERR_CLUSTER_CHECK if one or more nodes were not reachable. Implements GitHub #447.	2018-06-12 10:30:11 +09:00
Ian Barwick	00704913a6	doc: 4.0.6 release notes	2018-06-12 10:29:35 +09:00
Ian Barwick	e12fbb7b4d	doc: update release notes	2018-06-07 15:04:38 +09:00
Ian Barwick	2d43feb34b	doc: update HISTORY and add 4.0.5 release notes	2018-05-01 10:21:40 +09:00
Ian Barwick	5e4bdb5a1b	repmgrd: handle failover with two nodes in the primary location If two nodes were in the primary location, and at least one node in another location, the non-failed node in the primary location was not recognising itself as a promotion candidate. Addresses GitHub #407.	2018-04-02 20:51:27 +09:00
Ian Barwick	22c40ae62d	doc: update HISTORY and release notes	2018-03-30 09:41:48 +09:00
Ian Barwick	1558497ae4	repmgr: poll demoted primary after restart during switchover During a switchover operation, once the demoted primary has been restarted as a standby, repmgr attempts to reconnect to verify its status and drop any redundant replication slots. However it's possible the standby may still be in the startup phase, so poll for "standby_reconnect_timeout" seconds before giving up. Addresses GitHub #408.	2018-03-27 16:44:10 +09:00
Ian Barwick	9c5e76401f	Fix "repmgr cluster crosscheck" output Addresses GitHub #398.	2018-03-27 16:44:04 +09:00
Ian Barwick	0219f4c91f	Always set "connect_timeout" when pinging a PostgreSQL instance Insert "connect_timeout=2" into the connection parameters, if not explicitly set by the user. This will prevent excessive wait time for the host operating system to report a connection timeout.	2018-03-21 11:48:57 +09:00
Ian Barwick	85a4adc99c	Update HISTORY	2018-03-21 06:48:32 +09:00
Ian Barwick	d7702b3444	Correctly handle error message pointer when parsing strings. When parsing conninfo strings, ensure the error message pointer is actually returned to the caller. Not a critical issue, just meant the contents of the error message were not being displayed.	2018-03-10 14:29:12 +09:00
Ian Barwick	e8cdf72ecd	Add 4.0.4 release notes	2018-03-07 19:21:49 +09:00
Ian Barwick	9981ede1af	"standby clone": fix --superuser handling get_superuser_connection() was erroneously using the local node record to connect to as a superuser, which works when registering the primary but obviously not when cloning a standby. Addresses GitHub #380.	2018-03-02 16:43:19 +09:00
Ian Barwick	40ccae57a3	Update HISTORY	2018-03-02 11:05:30 +09:00
Ian Barwick	15625183c1	"standby clone": document --recovery-conf-only option	2018-02-23 11:19:21 +09:00
Ian Barwick	22b3a74fa0	repmgrd: improve detection of status change from primary to standby If repmgrd is running in degraded mode on a primary which has been stopped, then manually brought back online as a standby (e.g. by creating recovery.conf and starting the server), ensure it not only detects the change but automatically updates the node record so it can resume monitoring the node as a standby. Previously, repmgrd was looping waiting for the record to be updated (as is done transparently when executing "repmgr node rejoin") but if the record was not updated within the timeout period (e.g. by "repmgr standby register") it would fail to resume monitoring as a standby. It seems reasonable to have repmgrd automatically update the node record, as this will restore failover capability as quickly as possible. If this is not desired, then the onus is on the user to shut down repmgrd while making the desired changes.	2018-02-22 15:50:45 +09:00
Ian Barwick	98af51da03	"node rejoin": ensure --dry-run is honoured Addresses GitHub #383.	2018-02-20 15:31:03 +09:00
Ian Barwick	728a256a93	doc: update release notes	2018-02-16 12:15:35 +09:00
Ian Barwick	76a93af15c	"witness register": fix primary node check Addresses GitHub #377, based on report by user yonj1e in #373.	2018-02-08 16:41:04 +09:00
Ian Barwick	ee2df36a76	"standby switchover": additional sanity checks Check that sufficient walsenders will be available on the promotion candidate, and if replication slots are in use check if enough of those will be available. Note these checks can't guarantee that the walsenders/slots will be available at the appropriate points during the switchover process, but do ensure that existing configuration problems will be caught. Implements GitHub #371.	2018-02-08 15:19:24 +09:00
Ian Barwick	f9528efdb8	"standby clone": ensure "pg_subtrans" directory is created in Barman mode	2018-02-07 14:45:04 +09:00
Ian Barwick	e6aa831782	Update HISTORY and release notes	2018-02-07 14:43:43 +09:00
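
The version-string handling described in commit 9ae9d31165 (copy PG_VERSION up to the first space) can be illustrated with a minimal sketch. This is not repmgr's C implementation; the function name `truncate_version` is hypothetical and chosen only for illustration, and the sample input is the one quoted in the commit message.

```python
def truncate_version(raw: str) -> str:
    """Return the part of a PG_VERSION-style string before the first space.

    Hypothetical helper for illustration only; repmgr itself is written in C.
    """
    # partition() splits at the first space; if there is no space,
    # the whole string is returned unchanged as the first element.
    head, _sep, _rest = raw.partition(" ")
    return head


# Sample value taken from the commit message above.
print(truncate_version("10.4 (Debian 10.4-2.pgdg90+1)"))
```

A string with no distribution suffix, such as "10.4", passes through unchanged.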

64 Commits