repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-06-01 11:49:06 +00:00

Author	SHA1	Message	Date
Ian Barwick	82639b6903	Refactor slot name handling Better to work with the slot name in a node record, rather than creating a global variable.	2017-08-04 11:56:11 +09:00
Ian Barwick	2c682b31c2	Attempt to delete replication slot on old primary after switchover	2017-08-04 11:55:54 +09:00
Ian Barwick	c34f5c1ed1	Initial switchover code	2017-08-04 09:39:30 +09:00
Ian Barwick	5948cf6cda	repmgr standby switchover: add sanity check for pg_rewind usability pg_rewind will only be executed on a demoted primary if explicitly requested, to prevent transactions on the primary, which were never replicated, from being automatically overwritten. If --force-rewind is provided, we'll need to check pg_rewind is actually useable before we need to use it.	2017-08-04 00:45:55 +09:00
Ian Barwick	0815accdef	Formatting fix	2017-08-03 23:58:25 +09:00
Ian Barwick	7d77fd4072	Log successful switchover event	2017-08-03 17:02:30 +09:00
Ian Barwick	112ca6321a	Initial switchover implementation The repmgr3 implementation required the promotion candidate (standby) to directly work with the demotion candidate's data directory, directly execute server control commands etc. Here we delegate a lot more of that work to the repmgr on the demotion candidate, which reduces the amount of back-and-forth over SSH and generally makes things cleaner and smoother. In particular the repmgr on the demotion candidate will carry out a thorough check that the node is shut down and report the last checkpoint LSN to the promotion candidate; this can then be used to determine whether pg_rewind needs to be executed on the demoted primary before reintegrating it back into the cluster (todo). Also implement "--dry-run" for this action, which will sanity-check the nodes as far as possible without executing the switchover. Additionally some of the new repmgr node commands (or command options) introduced for this can also be executed by the user to obtain additional information about the status of each node.	2017-08-03 16:38:37 +09:00
Ian Barwick	c67aa15581	Make "pgdata" a mandatory configuration file setting There are some circumstances, e.g. during switchover operations, where repmgr may need to operate on a data directory while the server isn't running, in which case there's no way to retrieve that information.	2017-08-02 23:04:24 +09:00
Ian Barwick	83cda89362	Get data directory for server commands if needed Also add configuration file option "pgdata" for hard-coding the node's data directory - if the "repmgr" DB user isn't a superuser or doesn't have permission to extract the data directory, we'll need another way of finding out.	2017-08-02 13:16:16 +09:00
Ian Barwick	791640e3b4	repmgrd: never execute "service_promote_command" directly	2017-08-02 12:09:25 +09:00
Ian Barwick	aa528dfdfb	Consolidate generation of various server control commands This is needed for better switchover control, so we can instruct the remote repmgr to issue the appropriate server command rather than trying to work out what it should be from the local node.	2017-08-02 12:01:20 +09:00
Ian Barwick	5b7b276ada	Make log levels case-insensitive	2017-08-02 09:46:53 +09:00
Ian Barwick	e5d50bbfd5	Separate configuration file queries into a discrete function Simplifies main application code and makes it easier to reuse the queries.	2017-08-02 00:04:20 +09:00
Ian Barwick	a1ad62d04e	Add "repmgr node restore-config"	2017-08-01 22:13:32 +09:00
Ian Barwick	f023b9c90c	Add "repmgr node archive-config"	2017-08-01 17:38:54 +09:00
Ian Barwick	3683d096f1	Avoid using PG_VERSION_NUM in frontend code Debian.	2017-08-01 10:43:42 +09:00
Ian Barwick	8a5665a421	repmgr node status: add information about current LSN locations for streaming standbys	2017-08-01 10:34:12 +09:00
Ian Barwick	d00cb63179	repmgrd: prevent segfault if no configfile provided	2017-07-31 12:54:23 +09:00
Ian Barwick	fbe74cbee4	Rename repmgr{d}4 binaries to repmgr{d} This was useful during initial development but now no longer required.	2017-07-31 10:37:15 +09:00
Ian Barwick	8d7d83347a	repmgrd: add log line to indicate node recovery detected	2017-07-31 09:58:13 +09:00
Ian Barwick	3582a80e48	Rename package from repmgr4 to repmgr	2017-07-28 12:21:55 +09:00
Ian Barwick	dd73039d02	Update BDR documentation	2017-07-27 21:44:10 +09:00
Ian Barwick	7cf3b9b618	repmgrd: improve logging of BDR monitoring Also always log information about event_notification command	2017-07-27 21:12:41 +09:00
Ian Barwick	0037d58dae	Update README	2017-07-27 18:12:29 +09:00
Ian Barwick	5606434a97	Initial BDR failover documentation	2017-07-27 18:11:49 +09:00
Ian Barwick	42ecf5de74	Add TODO for repmgr cluster show	2017-07-27 18:11:13 +09:00
Ian Barwick	4c2ba42000	Update sample configuration file	2017-07-27 18:10:56 +09:00
Ian Barwick	4cf66c33db	repmgrd: more fixes to BDR recovery handling	2017-07-27 16:33:41 +09:00
Ian Barwick	b4a655d074	Update README	2017-07-27 16:33:23 +09:00
Ian Barwick	fed6fba4ef	repmgrd: more fixes for BDR node recovery	2017-07-27 14:13:39 +09:00
Ian Barwick	dc24d62009	repmgrd: improve BDR recovery handling	2017-07-27 11:53:55 +09:00
Ian Barwick	d8a1799215	Update -?/--help output	2017-07-27 10:08:32 +09:00
Ian Barwick	eff26b496c	repmgrd: updates for BDR monitoring	2017-07-27 09:49:53 +09:00
Ian Barwick	a9b0c16b3c	Add "cluster matrix" and "cluster crosscheck" actions	2017-07-26 11:24:33 +09:00
Ian Barwick	c3083a0ba0	repmgr node status: add "raw" data columns too	2017-07-25 12:06:42 +09:00
Ian Barwick	2a08317984	repmgr node status: optional CSV output	2017-07-25 11:26:09 +09:00
Ian Barwick	56b2e9bb84	Rename/add configuration file options In previous versions of repmgr, some options had ambiguous meanings, and/or were used for slightly different purposes. This way we end up with a couple more options (most of which probably won't need adjusting) but greater clarity and flexibility. Removed: master_response_timeout: renamed to "async_query_timeout", as this was its main usage; retry_promote_interval_secs: replaced by "primary_notification_timeout". Added: async_query_timeout: timeout (in seconds) when executing asynchronous queries; primary_notification_timeout: number of seconds to wait for notification from the new primary after a failover; primary_follow_timeout: number of seconds to wait for the new primary to become available when executing "repmgr standby follow"	2017-07-25 11:13:32 +09:00
Ian Barwick	cbe19d5868	repmgr node status: collate output into list To make output in different formats (e.g. CSV) easier.	2017-07-25 09:27:21 +09:00
Ian Barwick	a793e951b6	Remove unused function PQExpBuffers used to generate SQL, no need to worry about maximum query length and more flexible for generating dynamic queries.	2017-07-25 08:22:21 +09:00
Ian Barwick	8a2e4db1bc	Add "repmgr node status" Outputs an overview of a node's status, and emits warnings if any issues detected.	2017-07-25 00:39:04 +09:00
Ian Barwick	93c35618a2	Use bdr.bdr_is_active_in_db() when checking for BDR presence	2017-07-24 19:09:09 +09:00
Ian Barwick	d3c2a0f505	repmgrd: record bdr_recovery event on the node which was up Attempting to write on the recovered node may result in an error if it hadn't already started up.	2017-07-24 18:56:18 +09:00
Ian Barwick	8f2dde3bde	repmgrd: log BDR node recovery on the running node, not the recovered node The recovered node might still be starting up.	2017-07-24 12:50:51 +09:00
Ian Barwick	e9cdf1c870	Add note	2017-07-20 23:57:28 +09:00
Ian Barwick	1a45287e76	Misc updates and fixes	2017-07-20 21:15:55 +09:00
Ian Barwick	b99443b0c8	Improvements to `repmgr cluster show` Add documentation; show recovery status in --csv mode.	2017-07-20 10:25:13 +09:00
Ian Barwick	a5c5d9fa40	Show BDR status in "repmgr cluster show" output	2017-07-20 09:23:24 +09:00
Ian Barwick	38730033d4	Miscellaneous code cleanup	2017-07-20 09:11:38 +09:00
Ian Barwick	8dcfbfc313	Improve "repmgr cluster show" display Rather than simply emit "FAILED" for an unreachable node, indicate whether its state matches that expected by repmgr. E.g. following output: ID \| Name \| Role \| Status \| Upstream \| Connection string ----+-------+---------+----------------------+----------+---------------------------------------------------- 1 \| node1 \| primary \| * running \| \| host=localhost dbname=repmgr user=repmgr port=5501 2 \| node2 \| standby \| ? unreachable \| node1 \| host=localhost dbname=repmgr user=repmgr port=5502 3 \| node3 \| standby \| ! running as primary \| node1 \| host=localhost dbname=repmgr user=repmgr port=5503 is for a cluster where "node2" has been manually stopped, and "node3" manually promoted.	2017-07-19 23:16:16 +09:00
Ian Barwick	076934558d	Allow "CLUSTER EVENTS" as synonym for "CLUSTER EVENT"	2017-07-19 22:08:22 +09:00


258 Commits