repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-03-23 07:06:30 +00:00

Author	SHA1	Message	Date
Ian Barwick	81d01bf0e8	Canonicalize the data directory path when parsing the configuration file This ensures the provided path matches the path PostgreSQL reports as its data directory.	2019-06-07 09:53:44 +09:00
Ian Barwick	d893ce227b	repmgrd: optionally exclude/include witness server from child node checks	2019-06-03 16:04:54 +09:00
Ian Barwick	20d710e34c	doc: update filename referenced in code comment	2019-06-03 15:30:02 +09:00
Ian Barwick	45e17223b9	Update variable/field names relating to pg_basebackup's -X option Now the "xlog nomenclature" Pg versions are fading into the past, rename things related to handling pg_basebackup's -X option (was: --xlog-method, now: --wal-method) to start with "wal_" rather than "xlog_". This is a cosmetic change for code clarity.	2019-05-30 09:32:06 +09:00
Ian Barwick	5a90513878	repmgrd: monitor standbys attached to primary This functionality enables repmgrd (when running on the primary) to monitor connected child nodes. It will log connections and disconnections and generate events. Additionally, repmgrd can execute a custom script if the number of connected child nodes falls below a configurable threshold. This script can be used e.g. to "fence" the primary following a failover situation where a new primary has been promoted and all standbys are now child nodes of that primary.	2019-04-22 16:18:52 +09:00
Ian Barwick	80f66e87c9	Improve string handling during configuration file reload	2019-04-16 11:20:41 +09:00
Ian Barwick	1a344d488a	Use sizeof() consistently	2019-04-11 23:07:58 +09:00
Ian Barwick	dd454a8374	Miscellaneous string handling cleanup This is mainly to prevent effectively spurious truncation warnings in recent GCC versions.	2019-04-10 16:18:56 +09:00
Ian Barwick	ba1f05ece9	Restrict "node_name" to maximum 63 characters In "recovery.conf", the configuration parameter "node_name" is used as the "application_name" value, which will be truncated by PostgreSQL to 63 characters (NAMEDATALEN - 1). repmgr sometimes needs to be able to extract the application name from pg_stat_replication to determine if a node is connected (e.g. when executing "repmgr standby register"), so the comparison will fail if "node_name" exceeds 63 characters.	2019-03-28 10:37:57 +09:00
Ian Barwick	7d0caefaee	Fix logging related to "connection_check_type" Also log the selected type at repmgrd startup.	2019-03-20 11:58:18 +09:00
Ian Barwick	c2206b007a	repmgrd: optionally check upstream availability through connection attempts	2019-03-14 15:44:53 +09:00
Ian Barwick	fc397f25f6	repmgrd: enable election rerun If "failover_validation_command" is set, and the command returns an error, rerun the election. There is a pause between reruns to avoid "churn"; the length of this pause is controlled by the configuration parameter "election_rerun_interval".	2019-03-12 17:12:19 +09:00
Ian Barwick	b9cdcd55e7	doc: update list of reloadable repmgrd configuration options	2019-03-11 16:18:10 +09:00
Ian Barwick	663c2e75b4	Make "failover_validation_command" reloadable	2019-03-08 09:27:19 +09:00
Ian Barwick	db0d71c6a7	Initial implementation of "failover_validation_command"	2019-03-08 08:49:15 +09:00
Ian Barwick	6f4f56dd8c	Make recently added configuration options reloadable	2019-03-07 10:58:25 +09:00
Ian Barwick	33fefd9f52	Add configuration option "primary_visibility_consensus" This determines whether repmgrd should continue with a failover if one or more nodes report they can still see the primary.	2019-03-07 10:41:42 +09:00
Ian Barwick	a3f90d2bba	Add configuration option "sibling_nodes_disconnect_timeout" This controls the maximum length of time in seconds that repmgrd will wait for other standbys to disconnect their WAL receivers in a failover situation. This setting is only used when "standby_disconnect_on_failover" is set to "true".	2019-03-06 15:56:21 +09:00
Ian Barwick	1615353f48	repmgrd: optionally disconnect WAL receivers during failover This is intended to ensure that all nodes have a constant LSN while making the failover decision. This feature is experimental and needs to be explicitly enabled with the configuration file option "standby_disconnect_on_failover". Note enabling this option will result in a delay in the failover decision until the WAL receiver is disconnected on all nodes.	2019-03-06 15:53:57 +09:00
Ian Barwick	dd04ebb809	repmgrd: handle reconnect to restarted server when using "connection" checks	2019-03-06 14:54:05 +09:00
Ian Barwick	63f7ad546e	repmgrd: add option "connection_check_type" This enables selection of the method repmgrd uses to check whether the upstream node is available. Possible values are: - "ping" (default): uses PQping() to check server availability - "connection": executes a query on the connection to check server availability (similar to repmgr3.x).	2019-03-06 12:09:54 +09:00
Ian Barwick	9273e7af73	"standby switchover": avoid potential race condition with WAL location check Immediately after the demotion candidate (primary) has shut down, we can't be absolutely sure that the walreceiver has flushed all WAL to disk, so checking pg_last_wal_receive_lsn() at that point might not reflect the actual last available WAL location. To handle this, we'll loop for a while (timeout controlled by configuration parameter "wal_receive_check_timeout") before finally deciding whether the standby is still behind the shut-down primary. Addresses issue raised in GitHub #518.	2019-02-01 12:06:22 +09:00
Ian Barwick	70e4243a1d	Clean up calls to repmgr_atoi() In some places we were still providing "false" from the original implementation, which was intended to indicate whether a negative value was allowed. This has not been a problem, as it merely means we have been providing "0", which is the same thing; however we can finer-tune some of the calls (e.g. node ID must be or greater).	2019-01-30 11:43:43 +09:00
Ian Barwick	32b81e7d49	"daemon start": initial implementation	2019-01-29 13:01:14 +09:00
Ian Barwick	a48d408e4e	Consistently log strerror output as DETAIL	2019-01-29 12:10:55 +09:00
Ian Barwick	7dce3ed234	Update copyright notices to 2019	2019-01-21 14:54:35 +09:00
Ian Barwick	ba7ef9e643	doc: update PostgreSQL documentation links "/static/" path element no longer required.	2019-01-15 12:45:33 +09:00
Ian Barwick	3b10750a7f	doc: fix missing quotation marks Patch from Cédric Villemain	2018-11-12 10:22:07 +09:00
Ian Barwick	ab6c3d9b6e	Handle NULL strings when parsing boolean arguments	2018-10-17 11:47:32 +09:00
Ian Barwick	3e38759c02	use appendPQExpBufferStr/-Char() consistently	2018-10-04 08:42:42 +09:00
Ian Barwick	11d25e2aef	Add configuration parameter "repmgr_bindir" This is to facilitate remote invocation of repmgr when the repmgr binary is located somewhere other than the PostgreSQL binary directory, as it cannot be assumed all package maintainers will install repmgr there. This parameter is optional; if not set (the default), repmgr will fall back to "pg_bindir" (if set). Addresses GitHub #246.	2018-10-02 09:59:12 +09:00
Ian Barwick	401f903456	repmgrd: document parameters which can be reloaded via SIGHUP Also add a new subsection with details on reloading repmgrd configuration.	2018-09-27 10:44:23 +09:00
Ian Barwick	38e3aae053	repmgr: add parameter "shutdown_check_timeout" Previously, "repmgr standby switchover" used the configuration file parameters "reconnect_interval" and "reconnect_attempts" to define a timeout to determine whether the current primary (demotion candidate) has shut down. However, these parameters are intended for primary failure detection and are generally lower in value, while a controlled shutdown may take longer, resulting in the switchover being aborted as repmgr was not waiting long enough. To prevent this happening, parameter "shutdown_check_timeout" has been added. This complements the existing "standby_reconnect_timeout" parameter used by "repmgr standby switchover". Implements GitHub #504.	2018-09-25 11:34:06 +09:00
Ian Barwick	44a224ad92	repmgrd: fix configuration file reloading Don't allow "promote_command" or "follow_command" to be empty. GitHub #486.	2018-08-02 16:35:26 +09:00
Ian Barwick	cb46fb6410	repmgrd: when reloading configuration, log any errors encountered	2018-07-16 16:46:39 +09:00
Ian Barwick	a194cf56b3	repmgr: exit with an error if an unrecognised command line option is provided. This matches the behaviour of other PostgreSQL utilities such as psql, though repmgr will only abort once all command line options are parsed, so as many errors as possible are found and displayed. If a repmgr "command" (e.g. "repmgr primary ...") was provided, a hint about the relevant command help section (e.g. "repmgr primary --help") will be provided alongside the generic help command (i.e. "repmgr --help"). Addresses GitHub #464, with further improvements.	2018-07-04 11:02:50 +09:00
Ian Barwick	802755fd60	repmgrd: daemonize process by default It's hard to imagine a use case where this isn't desirable, but in case, for whatever reason, the user does not wish to daemonize the process, the command line option "--daemonize=false" can be provided. Implements GitHub #458.	2018-06-29 22:01:49 +09:00
Ian Barwick	8d636690bd	repmgrd: create pid file by default Traditionally repmgrd will only write a pidfile if explicitly requested with -p/--pid-file. However it's normally desirable to have a pidfile, and it's preferable to have one used by default to prevent accidentally starting a second repmgrd instance. Following changes made: - add configuration file parameter "repmgrd_pid_file" (initially overridden by -p/--pid-file for backwards compatibility, though eventually we'll want to drop -p/--pid-file altogether) - add command line option --no-pid-file - if neither "repmgrd_pid_file" nor -p/--pid-file is set, create the pid file in a temporary directory Implements GitHub #457.	2018-06-29 14:36:24 +09:00
Ian Barwick	b2081dca52	De-overload configuration file parameter "standby_reconnect_timeout" Currently the (very generic sounding) "standby_reconnect_timeout" configuration file parameter is used in several different contexts and it would be useful to have more granular control over the different timeouts it's used to configure. This patch introduces "node_rejoin_timeout", used in place of "standby_reconnect_timeout" (which wasn't documented) when "repmgr node rejoin" is executed, to determine how long to wait for the node to rejoin the replication cluster. Additionally "repmgrd_standby_startup_timeout" is introduced as a timeout for failover situations, when repmgrd executes "repmgr standby follow" to follow a new primary, and waits for the standby to restart and become available for connections. "standby_reconnect_timeout" is now only relevant for "repmgr standby switchover". Implements GitHub #454.	2018-06-28 18:00:55 +09:00
Ian Barwick	d60bd232f0	Enable "recovery_min_apply_delay" to be zero. Addresses GitHub #448.	2018-06-14 11:11:33 +09:00
Ian Barwick	efc388065e	standby follow: check node has connected to new primary After restarting the standby, poll pg_stat_replication on the upstream until the standby connects, and exit with an error if it doesn't by the timeout defined in "standby_follow_timeout". Implements GitHub #444.	2018-06-07 15:04:45 +09:00
Ian Barwick	635bdccb2c	Fix parsing of "archive_ready_critical" configuration file parameter. Per report in GitHub #426.	2018-04-28 07:00:56 +09:00
Ian Barwick	8320179f34	Add configuration file parameter "config_directory" This enables explicit provision of an external configuration file directory, which if set will be passed to "pg_ctl" as the -D parameter. Otherwise "pg_ctl" will default to using the data directory, which will cause some operations to fail if the configuration files are not present there. Note this is implemented primarily for feature completeness and for development/testing purposes. Users who have installed "repmgr" from a package should not rely on "pg_ctl" to stop/start/restart PostgreSQL, instead they should set the appropriate "service_..._command" for their operating system. For more details see: https://repmgr.org/docs/4.0/configuration-service-commands.html Note: in a future release, the presence of "config_directory" in repmgr.conf will be used to implicitly set "--copy-external-config-files=samepath" when cloning a standby; this is a behaviour change so will be implemented in the next major release (repmgr 4.1). Implements GitHub #424.	2018-04-25 11:58:24 +09:00
Ian Barwick	dfdebd6c08	Enable provision of "archive_cleanup_command" in recovery.conf If "archive_cleanup_command" is defined in "repmgr.conf", a corresponding entry will be made in the node's "recovery.conf" file after cloning a standby. Note that we recommend using PgBarman to manage WAL archives, but are providing this facility to help repmgr to be integrated in existing environments. Implements GitHub #416.	2018-04-03 14:10:21 +09:00
Ian Barwick	63a11f8926	"standby promote": make timeout values configurable This introduces following new configuration file parameters, which were previously hard-coded values: - promote_check_timeout - promote_check_interval Implements GitHub #387.	2018-04-03 14:10:14 +09:00
Ian Barwick	e1413fa8ea	Fix minimum accepted value for "degraded_monitoring_timeout" Should be -1, the default. Addresses GitHub #411.	2018-03-29 21:15:03 +09:00
Ian Barwick	55441f2729	repmgrd: add configuration file parameter "standby_reconnect_timeout" This is used for determining a timeout when reconnecting to the standby after executing the "follow_command". This will normally not need to be set explicitly, but may be useful in cases where the standby's startup phase can last longer than usual.	2018-03-02 11:04:56 +09:00
Ian Barwick	f5f02ae0ee	Replace remaining instances of strcpy() with strncpy() Also use strncmp() to match.	2018-02-15 13:31:55 +09:00
Ian Barwick	9b56f157dc	Move parse_output_to_argv() to configfile.c So it can be used by parse_pg_basebackup_options(). Addresses GitHub #376.	2018-02-07 09:47:50 +09:00
Ian Barwick	26a9e848fd	Update copyright notices to 2018	2018-01-02 10:19:46 +09:00

93 Commits