repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-03-22 22:56:29 +00:00

Author	SHA1	Message	Date
Ian Barwick	e8ba213174	"standby register": add sanity check when --upstream-node-id not supplied If --upstream-node-id was not supplied to "repmgr standby register", repmgr defaults to the primary node as upstream node. If the local node is available, we now double-check that it's attached to the primary, in case the lack of --upstream-node-id was an accidental omission. This check is only made when the local node is available. This behaviour can be overridden with -F/--force (though it's hard to imagine a scenario where that would be useful). Addresses GitHub #395.	2018-04-05 17:38:55 +09:00
Ian Barwick	ec998bf9c5	doc: update HISTORY and release notes	2018-04-03 15:00:49 +09:00
Ian Barwick	e36b180de8	Ensure correct server version number used for replication stats query	2018-04-03 14:45:37 +09:00
Ian Barwick	a2068768ab	Execute a CHECKPOINT immediately after promoting the server This ensures "pg_control" is updated with the latest timeline, mainly to ensure that if "pg_rewind" is executed as part of a switchover that it sees the latest timeline. Per suggestion from GitHub user "superflav" in GitHub #378. See also: https://www.postgresql.org/message-id/flat/20150428180253.GU30322%40tamriel.snowman.net	2018-04-03 14:44:44 +09:00
Ian Barwick	bde9fea48c	Fix directory creation when cloning from Barman	2018-04-03 14:44:03 +09:00
Ian Barwick	3b00dc912a	Catch various corner cases when restarting a PostgreSQL instance	2018-04-03 14:40:53 +09:00
Ian Barwick	1e1b4b1a65	"standby register/follow": provide primary node details for event notifications For events generated by these commands, it may be useful to know details of the primary node. This makes the following additional parameters available to event notification scripts: - %p: node ID of the primary - %a: node name of the primary - %c: conninfo string for the primary Implements GitHub #375	2018-04-03 14:32:19 +09:00
Ian Barwick	cf64f9e95c	Always initialise t_conninfo_param_list structures	2018-04-03 14:31:24 +09:00
Ian Barwick	dfdebd6c08	Enable provision of "archive_cleanup_command" in recovery.conf If "archive_cleanup_command" is defined in "repmgr.conf", a corresponding entry will be made in the node's "recovery.conf" file after cloning a standby. Note that we recommend using PgBarman to manage WAL archives, but are providing this facility to help repmgr to be integrated in existing environments. Implements GitHub #416.	2018-04-03 14:10:21 +09:00
Ian Barwick	63a11f8926	"standby promote": make timeout values configurable This introduces the following new configuration file parameters, which were previously hard-coded values: - promote_check_timeout - promote_check_interval Implements GitHub #387.	2018-04-03 14:10:14 +09:00
Ian Barwick	ad24b04c35	Refactor pg_control parsing The "data_checksum_version" field is located towards the end of the ControlFileData struct, meaning its position varies between versions. Previously this wasn't a problem as it was only required for operations involving 9.5 and later, and its position within the control file has not changed between the current release and current HEAD. However, in order to support pg_rewind in 9.3 and 9.4, which both have changes in the control file format, we'll need version-specific parsing. This will also make it easier to deal with any future changes to the control file format.	2018-04-02 20:54:42 +09:00
Ian Barwick	3ccf1cf182	Enable pg_rewind to be used with PostgreSQL 9.3/9.4 pg_rewind is not part of the core distribution for those, but we provided support in repmgr 3.3 so should extend it to repmgr 4. Note that there is no check in place whether the pg_rewind binary exists, so it's up to the user to ensure it's present. Addresses GitHub #413.	2018-04-02 20:54:29 +09:00
Ian Barwick	239a548e9d	"standby switchover": force checkpoint if pg_rewind requested. Addresses issue described in GitHub #378. PostgreSQL itself doesn't issue a checkpoint after promotion to ensure the newly promoted server is available as quickly as possible, so we'll only execute an explicit CHECKPOINT when it's actually required, i.e. when pg_rewind will be executed. This is required as pg_rewind uses the timeline reported in the pg_control file to compare with the server to be rewound, and the pg_control timeline is only updated after the first checkpoint, so there is an interval where pg_rewind will erroneously assume both servers are on the same timeline and take no action.	2018-03-29 23:55:08 +09:00
Ian Barwick	231ef5563e	"standby switchover": update hint	2018-03-29 23:41:59 +09:00
Ian Barwick	7111483b65	repmgr: move demoted primary check to the final step during switchover This will give the demoted primary more time to start up as a standby, during which "standby follow" can be executed on sibling nodes, if specified.	2018-03-27 16:44:15 +09:00
Ian Barwick	1558497ae4	repmgr: poll demoted primary after restart during switchover During a switchover operation, once the demoted primary has been restarted as a standby, repmgr attempts to reconnect to verify its status and drop any redundant replication slots. However it's possible the standby may still be in the startup phase, so poll for "standby_reconnect_timeout" seconds before giving up. Addresses GitHub #408.	2018-03-27 16:44:10 +09:00
Ian Barwick	93deab3e96	Add error code ERR_FOLLOW_FAIL	2018-03-21 13:11:30 +09:00
Ian Barwick	d7702b3444	Correctly handle error message pointer when parsing strings. When parsing conninfo strings, ensure the error message pointer is actually returned to the caller. Not a critical issue, just meant the contents of the error message were not being displayed.	2018-03-10 14:29:12 +09:00
Ian Barwick	d2a5cc23cc	"standby clone": improve replication user selection Use the upstream node's replication user when checking the replication connection.	2018-03-02 16:43:23 +09:00
Ian Barwick	9981ede1af	"standby clone": fix --superuser handling get_superuser_connection() was erroneously using the local node record to connect to as a superuser, which works when registering the primary but obviously not when cloning a standby. Addresses GitHub #380.	2018-03-02 16:43:19 +09:00
Ian Barwick	3c2b8e5792	"standby clone": remove restriction on replication slots in Barman mode While it's preferable to avoid standby replication slots if Barman is in use, there's no technical reason to prevent this. Implements GitHub #379.	2018-03-02 11:05:25 +09:00
Ian Barwick	354231284e	repmgr: escape "restore_command" in generated recovery.conf	2018-03-02 11:05:21 +09:00
Ian Barwick	dbbfcb6a63	"standby clone": fix primary_conninfo when --upstream-conninfo provided	2018-03-02 11:05:15 +09:00
Ian Barwick	b6a1b75d22	"standby clone --recovery-conf-only": display generated file with --dry-run Refactor the original code which generates "recovery.conf" to place the output into a buffer, which can either be output as "recovery.conf" or copied to a buffer specified by the caller.	2018-02-23 11:18:45 +09:00
Ian Barwick	ee98a3a58e	"standby clone": add --recovery-conf-only option This will generate "recovery.conf" for an existing standby. Typical use-case is a standby cloned manually from an external data source (e.g. Barman), where "recovery.conf" needs to be created (and if required a replication slot). The --dry-run option will check the pre-requisites but not actually create "recovery.conf" or a replication slot. This requires that the upstream node is running, a replication connection can be made and if required a replication slot can be created. Implements GitHub #382.	2018-02-22 15:50:51 +09:00
Ian Barwick	64d85587de	repmgrd: check "repmgr" extension is installed before starting Implements GitHub #361.	2018-02-12 11:38:31 +09:00
Ian Barwick	927bf038a0	"standby switchover": check demotion candidate can make replication connection Check it's actually possible for the demotion candidate to attach to the promotion candidate before executing the switchover. As with other checks of this nature, there's a faint possibility the situation could change between the time the check is carried out and the demotion candidate is restarted to connect to the promotion candidate, but there's not a lot we can do about that. The main purpose is to be able to catch existing misconfigurations before anything gets changed. Implements GitHub #370.	2018-02-09 10:00:54 +09:00
Ian Barwick	ee2df36a76	"standby switchover": additional sanity checks Check that sufficient walsenders will be available on the promotion candidate, and if replication slots are in use check if enough of those will be available. Note these checks can't guarantee that the walsenders/slots will be available at the appropriate points during the switchover process, but do ensure that existing configuration problems will be caught. Implements GitHub #371.	2018-02-08 15:19:24 +09:00
Ian Barwick	571e6b2783	"standby clone": cowardly refuse to clone into an active data directory By checking the PID file in the same way pg_ctl does, we can be pretty much certain whether the target data directory contains an active PostgreSQL instance.	2018-02-08 10:19:05 +09:00
Ian Barwick	76cc11b786	Fix "standby clone" in Barman mode with --no-upstream-connection "--upstream-node-id", if provided, was not being passed through to the SQL query executed via the Barman server. Also modified the query to select the primary node if "--upstream-node-id" is not provided. Note: this is a very niche use case.	2018-02-07 16:34:01 +09:00
Ian Barwick	56710f4819	repmgr: simplify data directory checks when cloning Attempting to use the contents of pg_control to tell whether the directory is in use by PostgreSQL can result in false positives; we should use a check based on the pidfile. Also change the HINT to indicate a data directory can be overwritten if -F/--force is provided.	2018-02-07 14:45:37 +09:00
Ian Barwick	f9528efdb8	"standby clone": ensure "pg_subtrans" directory is created in Barman mode	2018-02-07 14:45:04 +09:00
Ian Barwick	9b56f157dc	Move parse_output_to_argv() to configfile.c So it can be used by parse_pg_basebackup_options(). Addresses GitHub #376.	2018-02-07 09:47:50 +09:00
Ian Barwick	57f1e939c5	"standby register": add event notification "standby_register_sync" Implements GitHub #374.	2018-02-05 15:20:19 +09:00
Ian Barwick	6c81e54f76	"standby follow": check for replication slot availability on target node	2018-02-02 17:18:43 +09:00
Ian Barwick	e23d28a22d	"standby follow": initial implementation of --dry-run option GitHub #363.	2018-02-01 14:16:49 +09:00
Ian Barwick	811d2a45bd	"standby switchover": improve log messages and add new exit code Previously, if an issue was encountered with the old primary, but user provided -F/--force to have repmgr promote the standby anyway, repmgr would exit with the log message "STANDBY SWITCHOVER is complete" and exit code 0 (SUCCESS). To better report this partial completion, repmgr will now emit the message "STANDBY SWITCHOVER has completed with issues" (and a HINT to check preceding log messages) and new exit code 22 (ERR_SWITCHOVER_INCOMPLETE).	2018-01-31 11:03:54 +09:00
Ian Barwick	92f4710ee2	Have do_standby_follow_internal() not abort on error Pass the error code back to the caller instead, mainly so "repmgr node rejoin" can better report errors.	2018-01-31 11:03:27 +09:00
Ian Barwick	044d8a1098	repmgr: improve switchover handling when "pg_ctl" used If logging output was not explicitly redirected with "-l" in the pg_ctl options, repmgr would hang waiting for pg_ctl output. Note that we recommend using the OS-level service commands where available.	2018-01-30 16:56:26 +09:00
Ian Barwick	b38f45120c	"repmgr standby register": improve error output when standby not running Add explicit HINT	2018-01-27 07:17:34 +09:00
Ian Barwick	8fd0c4ad83	repmgr: assume node is actually shutting down if pingable and that's the reported status	2018-01-12 21:53:37 +09:00
Ian Barwick	7ccae6c2b1	repmgr: automatically create slot name if missing It's possible that a node was registered with "use_replication_slots=false" but that was later changed to "use_replication_slots=true". If the node was not subsequently re-registered, the node record will contain an empty slot name, which will cause any slot creation operation during "standby follow" or "node rejoin" to fail. To prevent this happening, check for an empty slot name and automatically set before proceeding. Addresses GitHub #343.	2018-01-11 14:47:50 +09:00
Ian Barwick	61d46172b9	repmgr: catch possible corner case when checking node shutdown status It's conceivable that PQping is returning "no response" but the shutdown hasn't quite completed.	2018-01-10 15:09:21 +09:00
Ian Barwick	810471b2f2	repmgr: during switchover, correctly detect unclean shutdown status	2018-01-10 12:25:16 +09:00
Ian Barwick	5bd8cf958a	repmgr standby switchover: add "%p" event notification parameter This will contain the node ID of the former primary.	2018-01-10 12:25:12 +09:00
Ian Barwick	f1f5100007	repmgr standby switchover: add event details	2018-01-10 12:25:00 +09:00
Ian Barwick	1c8ad4d89b	Consolidate parsing of output from executing repmgr on a remote server This should also fix the issue reported in GitHub #349.	2018-01-09 16:24:13 +09:00
Ian Barwick	b705127a34	"repmgr standby register": add --wait-start option Implements GitHub #356.	2018-01-04 14:56:08 +09:00
Ian Barwick	26a9e848fd	Update copyright notices to 2018	2018-01-02 10:19:46 +09:00
Ian Barwick	295c18f6ff	repmgr: fix configuration file sanity check The check was being carried out regardless of whether --copy-external-config-files was specified, which means cloning will fail if no SSH connection is available. Addresses GitHub #342	2017-11-23 22:48:34 +09:00

396 Commits