repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-03-23 15:16:29 +00:00

Author	SHA1	Message	Date
Ian Barwick	9273e7af73	"standby switchover": avoid potential race condition with WAL location check Immediately after the demotion candidate (primary) has shut down, we can't be absolutely sure that the walreceiver has flushed all WAL to disk, so checking pg_last_wal_receive_lsn() at that point might not reflect the actual last available WAL location. To handle this, we'll loop for a while (timeout controlled by configuration parameter "wal_receive_check_timeout") before finally deciding whether the standby is still behind the shut-down primary. Addresses issue raised in GitHub #518.	2019-02-01 12:06:22 +09:00
Ian Barwick	32b81e7d49	"daemon start": initial implementation	2019-01-29 13:01:14 +09:00
Ian Barwick	7dce3ed234	Update copyright notices to 2019	2019-01-21 14:54:35 +09:00
Ian Barwick	11d25e2aef	Add configuration parameter "repmgr_bindir" This is to facilitate remote invocation of repmgr when the repmgr binary is located somewhere other than the PostgreSQL binary directory, as it cannot be assumed all package maintainers will install repmgr there. This parameter is optional; if not set (the default), repmgr will fall back to "pg_bindir" (if set). Addresses GitHub #246.	2018-10-02 09:59:12 +09:00
Ian Barwick	38e3aae053	repmgr: add parameter "shutdown_check_timeout" Previously, "repmgr standby switchover" used the configuration file parameters "reconnect_interval" and "reconnect_attempts" to define a timeout to determine whether the current primary (demotion candidate) has shut down. However, these parameters are intended for primary failure detection and are generally lower in value, while a controlled shutdown may take longer, resulting in the switchover being aborted as repmgr was not waiting long enough. To prevent this happening, parameter "shutdown_check_timeout" has been added. This complements the existing "standby_reconnect_timeout" parameter used by "repmgr standby switchover". Implements GitHub #504.	2018-09-25 11:34:06 +09:00
Ian Barwick	44a224ad92	repmgrd: fix configuration file reloading Don't allow "promote_command" or "follow_command" to be empty. GitHub #486.	2018-08-02 16:35:26 +09:00
Ian Barwick	a194cf56b3	repmgr: exit with an error if an unrecognised command line option is provided. This matches the behaviour of other PostgreSQL utilities such as psql, though repmgr will only abort once all command line options are parsed, so as many errors as possible are found and displayed. If a repmgr "command" (e.g. "repmgr primary ...") was provided, a hint about the relevant command help section (e.g. "repmgr primary --help") will be provided alongside the generic help command (i.e. "repmgr --help"). Addresses GitHub #464, with further improvements.	2018-07-04 11:02:50 +09:00
Ian Barwick	802755fd60	repmgrd: daemonize process by default It's hard to imagine a use case where this isn't desirable, but in case, for whatever reason, the user does not wish to daemonize the process, the command line option "--daemonize=false" can be provided. Implements GitHub #458.	2018-06-29 22:01:49 +09:00
Ian Barwick	8d636690bd	repmgrd: create pid file by default Traditionally repmgrd will only write a pidfile if explicitly requested with -p/--pid-file. However it's normally desirable to have a pidfile, and it's preferable to have one used by default to prevent accidentally starting a second repmgrd instance. Following changes made: - add configuration file parameter "repmgrd_pid_file" (initially overridden by -p/--pid-file for backwards compatibility, though eventually we'll want to drop -p/--pid-file altogether) - add command line option --no-pid-file - if neither "repmgrd_pid_file" nor -p/--pid-file is set, create the pid file in a temporary directory Implements GitHub #457.	2018-06-29 14:36:24 +09:00
Ian Barwick	b2081dca52	De-overload configuration file parameter "standby_reconnect_timeout" Currently the (very generic sounding) "standby_reconnect_timeout" configuration file parameter is used in several different contexts and it would be useful to have more granular control over the different timeouts it's used to configure. This patch introduces "node_rejoin_timeout", used in place of "standby_reconnect_timeout" (which wasn't documented) when "repmgr node rejoin" is executed, to determine how long to wait for the node to rejoin the replication cluster. Additionally "repmgrd_standby_startup_timeout" is introduced as a timeout for failover situations, when repmgrd executes "repmgr standby follow" to follow a new primary, and waits for the standby to restart and become available for connections. "standby_reconnect_timeout" is now only relevant for "repmgr standby switchover". Implements GitHub #454.	2018-06-28 18:00:55 +09:00
Ian Barwick	efc388065e	standby follow: check node has connected to new primary After restarting the standby, poll pg_stat_replication on the upstream until the standby connects, and exit with an error if it doesn't by the timeout defined in "standby_follow_timeout". Implements GitHub #444.	2018-06-07 15:04:45 +09:00
Ian Barwick	8320179f34	Add configuration file parameter "config_directory" This enables explicit provision of an external configuration file directory, which if set will be passed to "pg_ctl" as the -D parameter. Otherwise "pg_ctl" will default to using the data directory, which will cause some operations to fail if the configuration files are not present there. Note this is implemented primarily for feature completeness and for development/testing purposes. Users who have installed "repmgr" from a package should not rely on "pg_ctl" to stop/start/restart PostgreSQL, instead they should set the appropriate "service_..._command" for their operating system. For more details see: https://repmgr.org/docs/4.0/configuration-service-commands.html Note: in a future release, the presence of "config_directory" in repmgr.conf will be used to implicitly set "--copy-external-config-files=samepath" when cloning a standby; this is a behaviour change so will be implemented in the next major release (repmgr 4.1). Implements GitHub #424.	2018-04-25 11:58:24 +09:00
Ian Barwick	dfdebd6c08	Enable provision of "archive_cleanup_command" in recovery.conf If "archive_cleanup_command" is defined in "repmgr.conf", a corresponding entry will be made in the node's "recovery.conf" file after cloning a standby. Note that we recommend using PgBarman to manage WAL archives, but are providing this facility to help repmgr to be integrated in existing environments. Implements GitHub #416.	2018-04-03 14:10:21 +09:00
Ian Barwick	63a11f8926	"standby promote": make timeout values configurable This introduces following new configuration file parameters, which were previously hard-coded values: - promote_check_timeout - promote_check_interval Implements GitHub #387.	2018-04-03 14:10:14 +09:00
Ian Barwick	55441f2729	repmgrd: add configuration file parameter "standby_reconnect_timeout" This is used for determining a timeout when reconnecting to the standby after executing the "follow_command". This will normally not need to be set explicitly, but may be useful in cases where the standby's startup phase can last longer than usual.	2018-03-02 11:04:56 +09:00
Ian Barwick	9b56f157dc	Move parse_output_to_argv() to configfile.c So it can be used by parse_pg_basebackup_options(). Addresses GitHub #376.	2018-02-07 09:47:50 +09:00
Ian Barwick	26a9e848fd	Update copyright notices to 2018	2018-01-02 10:19:46 +09:00
Ian Barwick	a6cc4d80f0	Add "witness register" functionality	2017-11-15 13:47:45 +09:00
Ian Barwick	eb14bb58c6	Add configuration file "passfile" This will enable a custom .pgpass to be included in "primary_conninfo" (provided it's supported by the libpq version on the standby).	2017-11-14 19:30:25 +09:00
Ian Barwick	30b11c08e6	Disable any configuration settings not compatible with PostgreSQL 9.3 And emit a warning while we're at it.	2017-09-18 13:12:38 +09:00
Ian Barwick	55f203a2fc	Add "-o ConnectTimeout=10" as default in "ssh_options"	2017-09-13 13:23:16 +09:00
Ian Barwick	a9f4a027a7	pgindent run	2017-09-11 11:14:13 +09:00
Ian Barwick	e4f7dc8234	Add copyright notices	2017-09-08 13:27:39 +09:00
Ian Barwick	0e0b221507	Add configuration file setting "use_primary_conninfo_password" If, for whatever reason, the upstream server password needs to be set in "primary_conninfo", enable it to be extracted from $PGPASSWORD.	2017-08-31 14:57:07 +09:00
Ian Barwick	c1ed248fb1	Handle "event_notifications" when reloading configuration	2017-08-25 23:07:07 +09:00
Ian Barwick	5208655a35	Parse "recovery_min_apply_delay" from recovery.conf	2017-08-25 21:47:14 +09:00
Ian Barwick	5ee1eb6bf7	Convert --recovery-min-apply-delay to configuration file option That way it only needs to be set once, and won't get lost during follow operations etc.	2017-08-25 21:25:15 +09:00
Ian Barwick	6259463007	repmgrd: various fixes for "manual" failover mode	2017-08-23 10:56:55 +09:00
Ian Barwick	b1ba476241	Rename "archiver" check etc. to "archive-ready" Gives a better indication of what's being checked.	2017-08-17 12:23:56 +09:00
Ian Barwick	f972aec198	Parse recovery.conf file This will be useful for various kinds of diagnostics.	2017-08-10 23:58:16 +09:00
Ian Barwick	1d99a07b43	Store configuration file in repmgr.nodes table When executing repmgr on remote nodes, we otherwise end up jumping through hoops as we can't make assumptions about where the configuration file is located, but really need to be able to provide it. From a support point of view it will also make life easier as it will be easy to specify exactly which file to provide.	2017-08-10 08:03:24 +09:00
Ian Barwick	f2cf46bba3	Check replication lag before attempting switchover	2017-08-08 10:16:47 +09:00
Ian Barwick	2499b42ef8	switchover: check for pending archive files on the demotion candidate If the current primary (demotion candidate) still has any files to archive, it will delay the shutdown until all files are archived. If there is a substantial number of files, and/or the archive command executes slowly, this will probably lead to an unwelcome delay in the switchover process.	2017-08-08 00:37:20 +09:00
Ian Barwick	112ca6321a	Initial switchover implementation The repmgr3 implementation required the promotion candidate (standby) to directly work with the demotion candidate's data directory, directly execute server control commands etc. Here we delegated a lot more of that work to the repmgr on the demotion candidate, which reduces the amount of back-and-forth over SSH and generally makes things cleaner and smoother. In particular the repmgr on the demotion candidate will carry out a thorough check that the node is shut down and report the last checkpoint LSN to the promotion candidate; this can then be used to determine whether pg_rewind needs to be executed on the demoted primary before reintegrating it back into the cluster (todo). Also implement "--dry-run" for this action, which will sanity-check the nodes as far as possible without executing the switchover. Additionally some of the new repmgr node commands (or command options) introduced for this can be also executed by the user to obtain additional information about the status of each node.	2017-08-03 16:38:37 +09:00
Ian Barwick	c67aa15581	Make "pgdata" a mandatory configuration file setting There are some circumstances, e.g. during switchover operations, where repmgr may need to operate on a data directory while the server isn't running, in which case there's no way to retrieve that information.	2017-08-02 23:04:24 +09:00
Ian Barwick	83cda89362	Get data directory for server commands if needed Also add configuration file option "pgdata" for hard-coding the node's data directory - if the "repmgr" DB user isn't a superuser or doesn't have permission to extract the data directory, we'll need another way of finding out.	2017-08-02 13:16:16 +09:00
Ian Barwick	7cf3b9b618	repmgrd: improve logging of BDR monitoring Also always log information about event_notification command	2017-07-27 21:12:41 +09:00
Ian Barwick	4cf66c33db	repmgrd: more fixes to BDR recovery handling	2017-07-27 16:33:41 +09:00
Ian Barwick	eff26b496c	repmgrd: updates for BDR monitoring	2017-07-27 09:49:53 +09:00
Ian Barwick	56b2e9bb84	Rename/add configuration file options In previous versions of repmgr, some options had ambiguous meanings, and/or were used for slightly different purposes. This way we end up with a couple more options (most of which probably won't need adjusting) but greater clarity and flexibility. Removed: master_reponse_timeout: renamed to "async_query_timeout", as this was its main usage retry_promote_interval_secs: replaced by "primary_notification_timeout" Added: async_query_timeout: timeout (in seconds) when executing asynchronous queries primary_notification_timeout: number of seconds to wait for notification from the new primary after a failover primary_follow_timeout: number of seconds to wait for the new primary to become available when executing "repmgr standby follow"	2017-07-25 11:13:32 +09:00
Ian Barwick	38730033d4	Miscellaneous code cleanup	2017-07-20 09:11:38 +09:00
Ian Barwick	ec00202a31	Add configure option --with-bdr-only Builds repmgr with only BDR functionality; other code is disabled at critical points.	2017-07-16 17:18:34 +09:00
Ian Barwick	a29bc3e0fa	Rename config.[ch] to configfile.[ch]	2017-07-16 09:41:26 +09:00

43 Commits
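
Several of the commits above (for example 9273e7af73 and efc388065e) describe the same pattern: poll a condition repeatedly and give up once a configurable timeout expires. The following is a minimal Python sketch of that pattern, not repmgr's actual C implementation. The function name `wait_for_wal_receive`, its parameters, and the representation of LSNs as plain integers are all assumptions made for illustration; only the idea of a timeout such as "wal_receive_check_timeout" comes from the commit message.

```python
import time
from typing import Callable


def wait_for_wal_receive(
    get_receive_lsn: Callable[[], int],
    target_lsn: int,
    timeout_s: float,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the received WAL location reaches target_lsn or the timeout expires.

    get_receive_lsn is a hypothetical callback standing in for a query such as
    pg_last_wal_receive_lsn(); LSNs are modelled as integers for simplicity.
    Returns True if the standby caught up, False if it is still behind.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Check first, so a standby that is already caught up returns immediately.
        if get_receive_lsn() >= target_lsn:
            return True
        # Give up once the timeout (hypothetical stand-in for
        # "wal_receive_check_timeout") has elapsed.
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Checking the condition before checking the deadline means at least one probe is always made, even with a timeout of zero.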