repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-03-23 07:06:30 +00:00

Author	SHA1	Message	Date
Ian Barwick	fcd111ac4c	Improve logging output during failover process	2017-08-24 22:44:03 +09:00
Ian Barwick	eee8d65259	Update view "replication_status"	2017-08-24 15:05:13 +09:00
Ian Barwick	a659132ea4	repmgrd: write monitoring statistics	2017-08-24 11:49:44 +09:00
Ian Barwick	a0bad5fdc0	General code cleanup	2017-08-16 23:09:02 +09:00
Ian Barwick	8ff545f9ae	Add --help output for "repmgr cluster"	2017-08-16 16:33:07 +09:00
Ian Barwick	4c0d719cdb	Add replication slot check to "repmgr node check"	2017-08-16 11:17:02 +09:00
Ian Barwick	554673e83e	Add "repmgr node check --downstream"	2017-08-15 15:50:46 +09:00
Ian Barwick	10ef30096c	"node check": add server role check	2017-08-14 22:57:09 +09:00
Ian Barwick	3b2158edbf	Initialise variables, where appropriate	2017-08-14 15:11:42 +09:00
Ian Barwick	eabd56f3be	"standby follow": check node system identifiers match	2017-08-14 11:45:08 +09:00
Ian Barwick	0f31756733	General code cleanup	2017-08-14 10:04:53 +09:00
Ian Barwick	b95b3e50e3	Return system identification information with appropriate data types	2017-08-14 08:50:54 +09:00
Ian Barwick	50b82f785e	Add function to execute "IDENTIFY_SYSTEM"	2017-08-11 22:01:02 +09:00
Ian Barwick	7ca68b7cc8	Standardize "primary_conninfo" generation. Previously repmgr would write all the default libpq parameters into "primary_conninfo" on "standby clone", but not for "standby follow", which is inconsistent. For repmgr4 we'll determine that the upstream node's conninfo must be canonical and contain all required connection parameters, even if these are available as defaults or environment variables in the local environment, as those are transient and may not be available in all environments/situations. recovery.conf's "primary_conninfo" will be generated using the upstream's conninfo parameters, except for those specific to the downstream node. These are "application_name", which will always be set to the "node_name" of the downstream node, and "passfile" and "servicefile", which must of course reference files on the downstream node and so will be extracted from the downstream node's conninfo, if set.	2017-08-10 12:37:50 +09:00
Ian Barwick	5fb86771b1	Use stored node configuration file path when executing remote commands. Makes life much easier.	2017-08-10 09:12:07 +09:00
Ian Barwick	1d99a07b43	Store configuration file in repmgr.nodes table. When executing repmgr on remote nodes, we otherwise end up jumping through hoops as we can't make assumptions about where the configuration file is located, but really need to be able to provide it. From a support point of view it will also make life easier as it will be easy to specify exactly which file to provide.	2017-08-10 08:03:24 +09:00
Ian Barwick	f2cf46bba3	Check replication lag before attempting switchover	2017-08-08 10:16:47 +09:00
Ian Barwick	2499b42ef8	switchover: check for pending archive files on the demotion candidate. If the current primary (demotion candidate) still has any files to archive, it will delay the shutdown until all files are archived. If there is a substantial number of files, and/or the archive command executes slowly, this will probably lead to an unwelcome delay in the switchover process.	2017-08-08 00:37:20 +09:00
Ian Barwick	972f8394ff	Fix slot deletion after switchover	2017-08-04 13:16:46 +09:00
Ian Barwick	82639b6903	Refactor slot name handling. Better to work with the slot name in a node record, rather than creating a global variable.	2017-08-04 11:56:11 +09:00
Ian Barwick	5948cf6cda	repmgr standby switchover: add sanity check for pg_rewind useability. pg_rewind will only be executed on a demoted primary if explicitly requested, to prevent transactions on the primary, which were never replicated, from being automatically overwritten. If --force-rewind is provided, we'll need to check pg_rewind is actually useable before we need to use it.	2017-08-04 00:45:55 +09:00
Ian Barwick	112ca6321a	Initial switchover implementation. The repmgr3 implementation required the promotion candidate (standby) to directly work with the demotion candidate's data directory, directly execute server control commands etc. Here we delegated a lot more of that work to the repmgr on the demotion candidate, which reduces the amount of back-and-forth over SSH and generally makes things cleaner and smoother. In particular the repmgr on the demotion candidate will carry out a thorough check that the node is shut down and report the last checkpoint LSN to the promotion candidate; this can then be used to determine whether pg_rewind needs to be executed on the demoted primary before reintegrating it back into the cluster (todo). Also implement "--dry-run" for this action, which will sanity-check the nodes as far as possible without executing the switchover. Additionally some of the new repmgr node commands (or command options) introduced for this can be also executed by the user to obtain additional information about the status of each node.	2017-08-03 16:38:37 +09:00
Ian Barwick	aa528dfdfb	Consolidate generation of various server control commands. This is needed for better switchover control, so we can instruct the remote repmgr to issue the appropriate server command rather than trying to work out what it should be from the local node.	2017-08-02 12:01:20 +09:00
Ian Barwick	e5d50bbfd5	Separate configuration file queries into a discrete function. Simplifies main application code and makes it easier to reuse the queries.	2017-08-02 00:04:20 +09:00
Ian Barwick	f023b9c90c	Add "repmgr node archive-config"	2017-08-01 17:38:54 +09:00
Ian Barwick	3683d096f1	Avoid using PG_VERSION_NUM in frontend code Debian.	2017-08-01 10:43:42 +09:00
Ian Barwick	8a5665a421	repmgr node status: add information about current LSN locations for streaming standbys	2017-08-01 10:34:12 +09:00
Ian Barwick	7cf3b9b618	repmgrd: improve logging of BDR monitoring. Also always log information about event_notification command.	2017-07-27 21:12:41 +09:00
Ian Barwick	fed6fba4ef	repmgrd: more fixes for BDR node recovery	2017-07-27 14:13:39 +09:00
Ian Barwick	dc24d62009	repmgrd: improve BDR recovery handling	2017-07-27 11:53:55 +09:00
Ian Barwick	8a2e4db1bc	Add "repmgr node status". Outputs an overview of a node's status, and emits warnings if any issues are detected.	2017-07-25 00:39:04 +09:00
Ian Barwick	93c35618a2	Use bdr.bdr_is_active_in_db() when checking for BDR presence	2017-07-24 19:09:09 +09:00
Ian Barwick	e9cdf1c870	Add note	2017-07-20 23:57:28 +09:00
Ian Barwick	1a45287e76	Misc updates and fixes	2017-07-20 21:15:55 +09:00
Ian Barwick	b99443b0c8	Improvements to `repmgr cluster show`. Add documentation; show recovery status in --csv mode.	2017-07-20 10:25:13 +09:00
Ian Barwick	49ac9cf9ca	Add "repmgr cluster show"	2017-07-19 17:36:21 +09:00
Ian Barwick	a7b7d86ecc	repmgrd: handle manual failover mode correctly	2017-07-19 14:01:01 +09:00
Ian Barwick	23e6440dfd	repmgrd: initiate primary monitoring when local node is promoted manually	2017-07-19 11:15:38 +09:00
Ian Barwick	6e270b2faf	repmgrd: catch cases where more than one node has initiated voting. The node(s) with higher ID will "yield", leaving the decision making up to the node with the lower ID. This happens very rarely, usually when the random delay is close enough on two or more nodes that vote initiation is simultaneous.	2017-07-18 17:04:24 +09:00
Ian Barwick	2c8dd49831	repmgrd: additional check to ensure only one node handles failover. It's possible the "failover" is completed by one repmgrd before the other has a chance to react, in which case the am_bdr_failover_handler() check will not apply. Instead check if the node record has already been set to "inactive".	2017-07-17 16:47:42 +09:00
Ian Barwick	a56bb41891	Remove redundant fields from node record struct	2017-07-17 14:11:14 +09:00
Ian Barwick	ec554e5694	Improve connection handling. Set "connect_timeout" and "fallback_application_name" if not present.	2017-07-17 11:10:37 +09:00
Ian Barwick	951c7dbd07	repmgrd: in BDR mode, have each repmgrd monitor each node. This will cover both the case when an entire node including repmgrd goes down, and when one PostgreSQL instance goes down but repmgrd is still up (in which case only one of the repmgrds will handle the failover).	2017-07-14 15:01:18 +09:00
Ian Barwick	e3b3fb65f0	repmgrd: restrict BDR monitoring to two node setup. It's not safe to have more than two nodes with this kind of "failover", so we don't need to select alternative nodes by priority.	2017-07-14 12:56:11 +09:00
Ian Barwick	d653888c65	Support pre-10 WAL functions	2017-07-14 10:40:11 +09:00
Ian Barwick	dfcf85a62f	repmgrd: further BDR sanity checks	2017-07-14 10:27:28 +09:00
Ian Barwick	0320f409aa	Detect BDR capability via presence of extension	2017-07-13 14:13:46 +09:00
Ian Barwick	7eadbf6b17	Various improvements to "repmgr bdr register/unregister"	2017-07-12 22:38:03 +09:00
Ian Barwick	0a1addfdc0	When registering a BDR node, sync repmgr.nodes from another node. If a BDR node is added via bdr_group_join(), repmgr.nodes will start off empty, so we'll need to sync it ourselves before adding it to the repmgr replication set.	2017-07-12 10:11:25 +09:00
Ian Barwick	1cccb1dd5a	Add "repmgr bdr unregister"	2017-07-12 10:11:21 +09:00
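
The "primary_conninfo" rule described in commit 7ca68b7cc8 can be illustrated with a short sketch. This is not repmgr's code (repmgr is written in C); the function and constant names below are hypothetical, and conninfo strings are assumed to be already parsed into dictionaries. Dropping the upstream's own "passfile"/"servicefile" when the downstream sets none is also an assumption, inferred from the statement that these must reference files on the downstream node.

```python
# Illustrative sketch only: names are hypothetical, not repmgr's API.

# Parameters that must refer to files on the downstream node.
DOWNSTREAM_ONLY = ("passfile", "servicefile")


def quote_value(value):
    """Quote a value for a libpq keyword/value connection string.

    Empty values and values containing whitespace, single quotes or
    backslashes are wrapped in single quotes, with ' and \\ escaped
    by a backslash.
    """
    needs_quotes = value == "" or any(c.isspace() or c in "'\\" for c in value)
    if not needs_quotes:
        return value
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_primary_conninfo(upstream, downstream, node_name):
    """Build a primary_conninfo string from parsed conninfo dictionaries.

    Start from the upstream's parameters, then apply the downstream-specific
    exceptions: application_name is always the downstream node_name, and
    passfile/servicefile come from the downstream conninfo, if set.
    """
    params = {
        key: value
        for key, value in upstream.items()
        if key != "application_name" and key not in DOWNSTREAM_ONLY
    }
    params["application_name"] = node_name
    for key in DOWNSTREAM_ONLY:
        if key in downstream:
            params[key] = downstream[key]
    return " ".join(f"{key}={quote_value(value)}" for key, value in params.items())


if __name__ == "__main__":
    upstream = {"host": "node1", "user": "repmgr", "dbname": "repmgr"}
    downstream = {"host": "node2", "passfile": "/var/lib/pgsql/.pgpass"}
    print(build_primary_conninfo(upstream, downstream, "node2"))
```

Keeping the downstream-specific keys in one tuple makes the exceptions explicit; everything else is taken from the upstream's conninfo, in line with the commit's requirement that it be canonical.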


111 Commits