repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-03-22 22:56:29 +00:00

Author	SHA1	Message	Date
Martín Marqués	8f13a66aaa	Check that there is no exclusive backup taking place while we perform a switchover. We've found that this can cause some issues with postgres control metadata (could be a postgres bug) so the best thing is not to switch over if there's a backup taking place. It's also a bad idea from an architectural point of view, as a switchover is supposed to be planned, so why perform it when we are taking backups? GitHub #476.	2018-07-19 16:02:21 +09:00
Ian Barwick	673bde2b7f	repmgr: fix "primary_slot_name" when using "standby clone" with --recovery-conf-only Addresses GitHub #474.	2018-07-17 13:42:10 +09:00
Martín Marqués	81de200561	Add information to the --help and docs of standby clone regarding the need to provide a conninfo line to the upstream from which we will be cloning.	2018-07-16 18:56:41 -03:00
Ian Barwick	29de052dd8	repmgr: clarify intent behind --wait-sync timeout processing	2018-07-05 10:09:04 +09:00
Ian Barwick	37311e15a3	repmgr: fix "standby register --wait-sync" when no timeout provided The default value for "wait_register_sync_seconds" was zero, which is treated as disabling --wait-sync altogether. Default value now set to -1, which is taken to mean no timeout value supplied.	2018-07-04 17:22:04 +09:00
Ian Barwick	fcf237fe31	node status: improve output and documentation In the default text output mode, list inactive slots. In CSV output mode, list inactive slots as additional information; add output line with number of missing slots and a list thereof. Also document --csv output mode.	2018-06-22 11:46:50 +09:00
Ian Barwick	c5ba72c2c5	standby switchover: fix behaviour if witness node is a sibling The witness node is not a streaming replication standby, so executing "repmgr standby follow" will fail. Instead, execute "repmgr witness register --force" to update the witness node record on the primary and its local copy of all node records. Addresses GitHub #453.	2018-06-21 16:48:58 +09:00
Ian Barwick	efc388065e	standby follow: check node has connected to new primary After restarting the standby, poll pg_stat_replication on the upstream until the standby connects, and exit with an error if it doesn't by the timeout defined in "standby_follow_timeout". Implements GitHub #444.	2018-06-07 15:04:45 +09:00
Ian Barwick	0108fb2e72	standby follow: add hint about using "node rejoin" If "repmgr standby follow" is executed on a node which isn't running, point out "repmgr node rejoin" should probably be used instead.	2018-06-07 15:04:30 +09:00
Ian Barwick	535fba43d3	standby clone: improve external configuration file copying If --copy-external-config-files was provided, check that we can copy the files before cloning the standby, and abort if an error is encountered. This will give the user the opportunity to fix any issues before running the entire (and potentially lengthy) clone. Previously errors were logged but no action taken, and the final message indicated the clone operation was successful. Addresses GitHub #443.	2018-06-07 15:04:01 +09:00
Ian Barwick	7613b1769c	standby clone: --recovery-conf-only expects the standby to be registered Note this in the documentation, and add a HINT about registering it if the standby record is not available. Related to GitHub #438.	2018-05-31 09:42:53 +09:00
Ian Barwick	276239422b	standby clone: don't assume existence of "user" in upstream conninfo Usually a separate user (typically "repmgr") is set up specifically to manage the repmgr metadata, however there's no compelling requirement to do this, and it's possible the database owner (usually: "postgres") will be used, in which case it's possible the username will be left out of the conninfo string. Addresses GitHub #437.	2018-05-24 15:52:51 +09:00
Ian Barwick	6c518f1403	"standby clone": log actual connection string used to connect to upstream Useful for diagnostic purposes.	2018-05-10 12:03:13 +09:00
Ian Barwick	8320179f34	Add configuration file parameter "config_directory" This enables explicit provision of an external configuration file directory, which if set will be passed to "pg_ctl" as the -D parameter. Otherwise "pg_ctl" will default to using the data directory, which will cause some operations to fail if the configuration files are not present there. Note this is implemented primarily for feature completeness and for development/testing purposes. Users who have installed "repmgr" from a package should not rely on "pg_ctl" to stop/start/restart PostgreSQL, instead they should set the appropriate "service_..._command" for their operating system. For more details see: https://repmgr.org/docs/4.0/configuration-service-commands.html Note: in a future release, the presence of "config_directory" in repmgr.conf will be used to implicitly set "--copy-external-config-files=samepath" when cloning a standby; this is a behaviour change so will be implemented in the next major release (repmgr 4.1). Implements GitHub #424.	2018-04-25 11:58:24 +09:00
Ian Barwick	cda952f1e4	Add "dbname=replication" to all replication connection strings Previously repmgr was attempting to make replication connections with "dbname" set to the repmgr database name. While this works if e.g. the repmgr user also has replication permissions, it will fail if a dedicated replication user is specified, who only has permission to access the virtual "replication" database. Change this to use "dbname=replication" if the replication connection user is different to the normal repmgr database user. (We could just always set it to "replication", but that might break existing installations e.g. where a .pgpass file is in use and there's no "replication" entry for the normal repmgr database user). Addresses GitHub #421.	2018-04-12 16:11:16 +09:00
Ian Barwick	62c29aab32	Don't issue a CHECKPOINT after promoting a standby. Issuing a CHECKPOINT immediately after promoting a standby may impact performance. Commit `239a548e9d` ensures one is only issued when required, i.e. during a switchover when pg_rewind will be executed. This reverts commit `a2068768ab`.	2018-04-09 14:35:54 +09:00
Ian Barwick	e8ba213174	"standby register": add sanity check when --upstream-node-id not supplied If --upstream-node-id was not supplied to "repmgr standby register", repmgr defaults to the primary node as upstream node. If the local node is available, we now double-check that it's attached to the primary, in case the lack of --upstream-node-id was an accidental omission. This check is only made when the local node is available. This behaviour can be overridden with -F/--force (though it's hard to imagine a scenario where that would be useful). Addresses GitHub #395.	2018-04-05 17:38:55 +09:00
Ian Barwick	ec998bf9c5	doc: update HISTORY and release notes	2018-04-03 15:00:49 +09:00
Ian Barwick	e36b180de8	Ensure correct server version number used for replication stats query	2018-04-03 14:45:37 +09:00
Ian Barwick	a2068768ab	Execute a CHECKPOINT immediately after promoting the server This ensures "pg_control" is updated with the latest timeline, mainly to ensure that if "pg_rewind" is executed as part of a switchover that it sees the latest timeline. Per suggestion from GitHub user "superflav" in GitHub #378. See also: https://www.postgresql.org/message-id/flat/20150428180253.GU30322%40tamriel.snowman.net	2018-04-03 14:44:44 +09:00
Ian Barwick	bde9fea48c	Fix directory creation when cloning from Barman	2018-04-03 14:44:03 +09:00
Ian Barwick	3b00dc912a	Catch various corner cases when restarting a PostgreSQL instance	2018-04-03 14:40:53 +09:00
Ian Barwick	1e1b4b1a65	"standby register/follow": provide primary node details for event notifications For events generated by these commands, it may be useful to know details of the primary node. This makes the following additional parameters available to event notification scripts: - %p: node ID of the primary - %a: node name of the primary - %c: conninfo string for the primary Implements GitHub #375	2018-04-03 14:32:19 +09:00
Ian Barwick	cf64f9e95c	Always initialise t_conninfo_param_list structures	2018-04-03 14:31:24 +09:00
Ian Barwick	dfdebd6c08	Enable provision of "archive_cleanup_command" in recovery.conf If "archive_cleanup_command" is defined in "repmgr.conf", a corresponding entry will be made in the node's "recovery.conf" file after cloning a standby. Note that we recommend using PgBarman to manage WAL archives, but are providing this facility to help repmgr to be integrated in existing environments. Implements GitHub #416.	2018-04-03 14:10:21 +09:00
Ian Barwick	63a11f8926	"standby promote": make timeout values configurable This introduces the following new configuration file parameters, which were previously hard-coded values: - promote_check_timeout - promote_check_interval Implements GitHub #387.	2018-04-03 14:10:14 +09:00
Ian Barwick	ad24b04c35	Refactor pg_control parsing The "data_checksum_version" field is located towards the end of the ControlFileData struct, meaning its position varies between versions. Previously this wasn't a problem as it was only required for operations involving 9.5 and later, and its position within the control file has not changed between the current release and current HEAD. However, in order to support pg_rewind in 9.3 and 9.4, which both have changes in the control file format, we'll need version-specific parsing. This will also make it easier to deal with any future changes to the control file format.	2018-04-02 20:54:42 +09:00
Ian Barwick	3ccf1cf182	Enable pg_rewind to be used with PostgreSQL 9.3/9.4 pg_rewind is not part of the core distribution for those, but we provided support in repmgr 3.3 so should extend it to repmgr 4. Note that there is no check in place whether the pg_rewind binary exists, so it's up to the user to ensure it's present. Addresses GitHub #413.	2018-04-02 20:54:29 +09:00
Ian Barwick	239a548e9d	"standby switchover": force checkpoint if pg_rewind requested. Addresses issue described in GitHub #378. PostgreSQL itself doesn't issue a checkpoint after promotion to ensure the newly promoted server is available as quickly as possible, so we'll only execute an explicit CHECKPOINT when it's actually required, i.e. when pg_rewind will be executed. This is required as pg_rewind uses the timeline reported in the pg_control file to compare with the server to be rewound, and the pg_control timeline is only updated after the first checkpoint, so there is an interval where pg_rewind will erroneously assume both servers are on the same timeline and take no action.	2018-03-29 23:55:08 +09:00
Ian Barwick	231ef5563e	"standby switchover": update hint	2018-03-29 23:41:59 +09:00
Ian Barwick	7111483b65	repmgr: move demoted primary check to the final step during switchover This will give the demoted primary more time to start up as a standby, during which "standby follow" can be executed on sibling nodes, if specified.	2018-03-27 16:44:15 +09:00
Ian Barwick	1558497ae4	repmgr: poll demoted primary after restart during switchover During a switchover operation, once the demoted primary has been restarted as a standby, repmgr attempts to reconnect to verify its status and drop any redundant replication slots. However it's possible the standby may still be in the startup phase, so poll for "standby_reconnect_timeout" seconds before giving up. Addresses GitHub #408.	2018-03-27 16:44:10 +09:00
Ian Barwick	93deab3e96	Add error code ERR_FOLLOW_FAIL	2018-03-21 13:11:30 +09:00
Ian Barwick	d7702b3444	Correctly handle error message pointer when parsing strings. When parsing conninfo strings, ensure the error message pointer is actually returned to the caller. Not a critical issue, just meant the contents of the error message were not being displayed.	2018-03-10 14:29:12 +09:00
Ian Barwick	d2a5cc23cc	"standby clone": improve replication user selection Use the upstream node's replication user when checking the replication connection.	2018-03-02 16:43:23 +09:00
Ian Barwick	9981ede1af	"standby clone": fix --superuser handling get_superuser_connection() was erroneously using the local node record to connect to as a superuser, which works when registering the primary but obviously not when cloning a standby. Addresses GitHub #380.	2018-03-02 16:43:19 +09:00
Ian Barwick	3c2b8e5792	"standby clone": remove restriction on replication slots in Barman mode While it's preferable to avoid standby replication slots if Barman is in use, there's no technical reason to prevent this. Implements GitHub #379.	2018-03-02 11:05:25 +09:00
Ian Barwick	354231284e	repmgr: escape "restore_command" in generated recovery.conf	2018-03-02 11:05:21 +09:00
Ian Barwick	dbbfcb6a63	"standby clone": fix primary_conninfo when --upstream-conninfo provided	2018-03-02 11:05:15 +09:00
Ian Barwick	b6a1b75d22	"standby clone --recovery-conf-only": display generated file with --dry-run Refactor the original code which generates "recovery.conf" to place the output into a buffer, which can either be output as "recovery.conf" or copied to a buffer specified by the caller.	2018-02-23 11:18:45 +09:00
Ian Barwick	ee98a3a58e	"standby clone": add --recovery-conf-only option This will generate "recovery.conf" for an existing standby. Typical use-case is a standby cloned manually from an external data source (e.g. Barman), where "recovery.conf" needs to be created (and if required a replication slot). The --dry-run option will check the pre-requisites but not actually create "recovery.conf" or a replication slot. This requires that the upstream node is running, a replication connection can be made and if required a replication slot can be created. Implements GitHub #382.	2018-02-22 15:50:51 +09:00
Ian Barwick	64d85587de	repmgrd: check "repmgr" extension is installed before starting Implements GitHub #361.	2018-02-12 11:38:31 +09:00
Ian Barwick	927bf038a0	"standby switchover": check demotion candidate can make replication connection Check it's actually possible for the demotion candidate to attach to the promotion candidate before executing the switchover. As with other checks of this nature, there's a faint possibility the situation could change between the time the check is carried out and the demotion candidate is restarted to connect to the promotion candidate, but there's not a lot we can do about that. The main purpose is to be able to catch existing misconfigurations before anything gets changed. Implements GitHub #370.	2018-02-09 10:00:54 +09:00
Ian Barwick	ee2df36a76	"standby switchover": additional sanity checks Check that sufficient walsenders will be available on the promotion candidate, and if replication slots are in use check if enough of those will be available. Note these checks can't guarantee that the walsenders/slots will be available at the appropriate points during the switchover process, but do ensure that existing configuration problems will be caught. Implements GitHub #371.	2018-02-08 15:19:24 +09:00
Ian Barwick	571e6b2783	"standby clone": cowardly refuse to clone into an active data directory By checking the PID file in the same way pg_ctl does, we can be pretty much certain whether the target data directory contains an active PostgreSQL instance.	2018-02-08 10:19:05 +09:00
Ian Barwick	76cc11b786	Fix "standby clone" in Barman mode with --no-upstream-connection "--upstream-node-id", if provided, was not being passed through to the SQL query executed via the Barman server. Also modified the query to select the primary node if "--upstream-node-id" is not provided. Note: this is a very niche use case.	2018-02-07 16:34:01 +09:00
Ian Barwick	56710f4819	repmgr: simplify data directory checks when cloning Attempting to use the contents of pg_control to tell whether the directory is in use by PostgreSQL can result in false positives; we should use a check based on the pidfile. Also change the HINT to indicate a data directory can be overwritten if -F/--force is provided.	2018-02-07 14:45:37 +09:00
Ian Barwick	f9528efdb8	"standby clone": ensure "pg_subtrans" directory is created in Barman mode	2018-02-07 14:45:04 +09:00
Ian Barwick	9b56f157dc	Move parse_output_to_argv() to configfile.c So it can be used by parse_pg_basebackup_options(). Addresses GitHub #376.	2018-02-07 09:47:50 +09:00
Ian Barwick	57f1e939c5	"standby register": add event notification "standby_register_sync" Implements GitHub #374.	2018-02-05 15:20:19 +09:00
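
The rule described in commit cda952f1e4 (use "dbname=replication" only when the replication user differs from the normal repmgr database user) can be sketched as below. This is a minimal Python illustration of the rule as stated in the commit message, not repmgr's actual C code; the function name `replication_dbname` and its parameters are hypothetical.

```python
def replication_dbname(repmgr_user: str, replication_user: str, repmgr_dbname: str) -> str:
    """Choose the dbname for a replication connection (hypothetical helper)."""
    # A dedicated replication user may only have permission to access the
    # virtual "replication" database, so use that name in this case.
    if replication_user != repmgr_user:
        return "replication"
    # Same user for both roles: keep the repmgr database name so existing
    # installations (e.g. a .pgpass file with no "replication" entry)
    # continue to work.
    return repmgr_dbname
```

For example, `replication_dbname("repmgr", "repl_user", "repmgr")` selects "replication", while passing the same user for both roles keeps the repmgr database name.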

412 Commits