repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-03-23 15:16:29 +00:00

Author	SHA1	Message	Date
Ian Barwick	d26989bd12	node status: improve output and documentation In the default text output mode, list inactive slots. In CSV output mode, list inactive slots as additional information; add output line with number of missing slots and a list thereof. Also document --csv output mode.	2018-06-22 15:46:44 +09:00
Ian Barwick	a549941d4f	repmgr: don't count witness node as a standby when running "node status" Addresses GitHub #451.	2018-06-21 14:27:47 +09:00
Ian Barwick	afdaf9be66	_create_event(): log event and node ID for debugging	2018-06-11 15:20:01 +09:00
Ian Barwick	3dba8336e9	standby clone: don't assume existence of "user" in upstream conninfo Usually a separate user (typically "repmgr") is set up specifically to manage the repmgr metadata, however there's no compelling requirement to do this, and it's possible the database owner (usually: "postgres") will be used, in which case it's possible the username will be left out of the conninfo string. Addresses GitHub #437.	2018-05-24 15:51:41 +09:00
Ian Barwick	97d0cee259	"config_file" is MAXPGPATH, not MAXLEN The two values are the same anyway, so change is more for consistency.	2018-05-22 17:19:55 +09:00
Ian Barwick	a880b6ce16	Include "arpa/inet.h" in dbutils.c Needed for htonl() on FreeBSD.	2018-05-10 11:25:52 +09:00
Ian Barwick	b553a70ad5	repmgrd: always close the connection if the pointer is not NULL	2018-04-25 14:08:17 +09:00
Ian Barwick	242fa287b4	repmgrd: catch corner case in standby connection handle check If repmgrd marks the local node as unavailable, and it was actually restarting but a failover event occurred before the next local node check, failover will continue with the stale connection handle. Add a final local node check just before starting the failover process, so repmgrd can reconnect if it wasn't able to before.	2018-04-24 21:55:36 +09:00
Ian Barwick	afa942fef6	repmgrd: prevent standby connection handle from going stale If monitoring history is not in use, there's no activity on the standby's connection handle, so if e.g. the standby is restarted, PQstatus() never returns CONNECTION_BAD and repmgrd never notices the connection is stale. Therefore execute a throw-away statement at "monitor_interval_secs".	2018-04-23 23:51:03 +09:00
Ian Barwick	09e597dcdd	Fix superuser password handling When establishing a superuser connection, the connection parameters were being copied from the existing (non-superuser) connection, which in some circumstances can lead to that user's password being included in the copied parameter list. The password parameter, if set, will now always be removed, which will cause libpq to retrieve the correct one from the .pgpass file. Addresses GitHub #400.	2018-04-12 12:50:17 +09:00
Ian Barwick	6a1797cadd	Enable pg_rewind to be used with PostgreSQL 9.3/9.4 pg_rewind is not part of the core distribution for those, but we provided support in repmgr 3.3 so should extend it to repmgr 4. Note that there is no check in place whether the pg_rewind binary exists, so it's up to the user to ensure it's present. Addresses GitHub #413.	2018-04-02 20:55:04 +09:00
Ian Barwick	94d26dbe9f	Always set "connect_timeout" when pinging a PostgreSQL instance Insert "connect_timeout=2" into the connection parameters, if not explicitly set by the user. This will prevent excessive wait time for the host operating system to report a connection timeout.	2018-04-02 09:31:42 +09:00
Ian Barwick	28c7737dc0	Log pg_control access errors as WARNINGs rather than DEBUG This will make it easier to diagnose issues, possibly with an incorrect "data_directory" setting in "repmgr.conf".	2018-03-30 11:24:44 +09:00
Ian Barwick	37e53108a2	Consolidate connection closure calls	2018-03-27 08:52:23 +09:00
Ian Barwick	a15e5c9d52	Tidy up queries in dbutils.c - standardize formatting - prefix various internal function calls with "pg_catalog.", to mitigate possible risks from CVE-2018-1058	2018-03-23 10:33:28 +08:00
Ian Barwick	c4f6abe951	Update HISTORY	2018-03-21 06:51:56 +09:00
Martín Marqués	e454fb77d3	While reviewing `7cb6e5af8d` before merging I noticed that besides the result cleanup added, there was still a missing spot inside the if condition. Adding the PQclear that was missing.	2018-03-21 06:51:50 +09:00
Andrzej Nowicki	b76e5852d3	One more memory leak fixed	2018-03-21 06:51:43 +09:00
Andrzej Nowicki	0674364ffd	Clear node list to avoid memory leak, fixes #402	2018-03-21 06:51:37 +09:00
Ian Barwick	b2eb9b8525	Correctly handle error message pointer when parsing strings. When parsing conninfo strings, ensure the error message pointer is actually returned to the caller. Not a critical issue, just meant the contents of the error message were not being displayed.	2018-03-10 14:28:10 +09:00
Ian Barwick	6fbbe2a97a	"standby clone": fix --superuser handling get_superuser_connection() was erroneously using the local node record to connect to as a superuser, which works when registering the primary but obviously not when cloning a standby. Addresses GitHub #380.	2018-03-02 14:49:17 +09:00
Ian Barwick	518866eba5	"node status": improve replication slot warnings Addresses GitHub #385	2018-02-23 11:06:47 +09:00
Ian Barwick	425839d764	Fix typo in function name	2018-02-22 15:48:41 +09:00
Ian Barwick	3a764f678a	"standby clone": add --recovery-conf-only option This will generate "recovery.conf" for an existing standby. Typical use-case is a standby cloned manually from an external data source (e.g. Barman), where "recovery.conf" needs to be created (and if required a replication slot). The --dry-run option will check the pre-requisites but not actually create "recovery.conf" or a replication slot. This requires that the upstream node is running, a replication connection can be made and if required a replication slot can be created. Implements GitHub #382.	2018-02-22 15:47:19 +09:00
Ian Barwick	829cf5cca4	repmgrd: improve detection of status change from primary to standby If repmgrd is running in degraded mode on a primary which has been stopped, then manually brought back online as a standby (e.g. by creating recovery.conf and starting the server), ensure it not only detects the change but automatically updates the node record so it can resume monitoring the node as a standby. Previously, repmgrd was looping waiting for the record to be updated (as is done transparently when executing "repmgr node rejoin") but if the record was not updated within the timeout period (e.g. by "repmgr standby register") it would fail to resume monitoring as a standby. It seems reasonable to have repmgrd automatically update the node record, as this will restore failover capability as quickly as possible. If this is not desired, then the onus is on the user to shut down repmgrd while making the desired changes.	2018-02-22 11:35:47 +09:00
Ian Barwick	b47448d0e5	Replace remaining instances of strcpy() with strncpy() Also use strncmp() to match.	2018-02-15 13:17:06 +09:00
Ian Barwick	c9eb1bfcc0	Always initialise t_conninfo_param_list structures	2018-02-13 10:48:18 +09:00
Ian Barwick	eb7dca2919	"node status": add warning about missing replication slots Implements GitHub #364.	2018-02-12 10:53:31 +09:00
Ian Barwick	fbbe7afd61	doc: update HISTORY and release notes	2018-02-09 11:42:16 +09:00
Ian Barwick	d3e1937808	"standby switchover": additional sanity checks Check that sufficient walsenders will be available on the promotion candidate, and if replication slots are in use check if enough of those will be available. Note these checks can't guarantee that the walsenders/slots will be available at the appropriate points during the switchover process, but do ensure that existing configuration problems will be caught. Implements GitHub #371.	2018-02-08 15:23:10 +09:00
Ian Barwick	64035ef701	"standby register/follow": provide primary node details for event notifications For events generated by these commands, it may be useful to know details of the primary node. This makes the following additional parameters available to event notification scripts: - %p: node ID of the primary - %a: node name of the primary - %c: conninfo string for the primary Implements GitHub #375	2018-02-06 09:36:46 +09:00
Ian Barwick	f96cc3b906	"cluster show": improve handling of database errors In particular, if running "repmgr cluster show" against a database without the repmgr metadata, showing the error (rather than just "no records found" etc.) will provide some clues about the problem.	2018-02-05 10:15:48 +09:00
Ian Barwick	50894b6124	"standby follow": check for replication slot availability on target node	2018-02-02 15:01:23 +09:00
Ian Barwick	3d6437c8f8	repmgr: assume node is actually shutting down if pingable and that's the reported status	2018-01-16 11:17:06 +09:00
Ian Barwick	54b5c8ad94	repmgrd: log execution error in "repmgrd_get_local_node_id()" That shouldn't happen, but if it does it will make it easier to identify the issue.	2018-01-16 11:14:04 +09:00
Ian Barwick	ae7963dc64	repmgr: automatically create slot name if missing It's possible that a node was registered with "use_replication_slots=false" but that was later changed to "use_replication_slots=true". If the node was not subsequently re-registered, the node record will contain an empty slot name, which will cause any slot creation operation during "standby follow" or "node rejoin" to fail. To prevent this happening, check for an empty slot name and automatically set before proceeding. Addresses GitHub #343.	2018-01-11 11:13:41 +09:00
Ian Barwick	faffb2a6e7	repmgr: catch possible corner case when checking node shutdown status It's conceivable that PQping is returning "no response" but the shutdown hasn't quite completed.	2018-01-10 14:56:00 +09:00
Ian Barwick	5d57044118	repmgr: during switchover, correctly detect unclean shutdown status	2018-01-10 12:21:04 +09:00
Ian Barwick	07a88c78a5	repmgr standby switchover: add "%p" event notification parameter This will contain the node ID of the former primary.	2018-01-10 11:01:00 +09:00
Ian Barwick	aee12dc2c7	"repmgr bdr register": create missing connection replication set if needed Previously the assumption was that the "repmgr" replication set would be set up when the nodes are created, however no checks were implemented and this was not well-documented. Addresses GitHub #347.	2018-01-04 17:12:52 +09:00
Ian Barwick	c5c86e1ada	"repmgr bdr register": improve node name check We'll use "bdr.bdr_get_local_node_name()" to check the local BDR node name and the repmgr one match.	2018-01-04 16:07:06 +09:00
Ian Barwick	3d2530d6f9	Fix query in is_active_bdr_node() Boolean column was not being checked correctly. Also add detail output in "repmgr node role --check", where the function is called.	2018-01-04 10:48:31 +09:00
Ian Barwick	b26e400199	"repmgr cluster event": move query to dbutils.c	2018-01-04 10:06:54 +09:00
Ian Barwick	1521657965	Update copyright notices to 2018	2018-01-02 10:20:09 +09:00
Ian Barwick	54a10a0c3f	Add diagnostic option "repmgr node check --has-passfile" This checks if the active libpq version (9.6 and later) has the "passfile" option, and returns 0 if present, 1 if not.	2017-12-05 12:53:04 +09:00
Ian Barwick	270da1294c	repmgr: initialise "voting_term" in "repmgr primary register" This previously happened in the extension SQL code, which could potentially cause replay problems if installing on a BDR cluster. As this table is only required for streaming replication failover, move the initialisation to "repmgr primary register". Addresses GitHub #344.	2017-11-28 12:26:33 +09:00
Ian Barwick	f8a0b051c8	repmgr: fix return code output for repmgr node check --action=... Addresses GitHub #340	2017-11-23 10:35:41 +09:00
Martín Marqués	3e4a5e6ff5	Fix missing FQN for the nodes table. This bug was not detected before because most users work with the repmgr user. For that reason, the repmgr schema is already in the search_path by default. Add the repmgr schema to the nodes table in the LEFT JOIN used for cluster show (and in other places) Signed-off-by: Martín Marqués <martin.marques@2ndquadrant.com>	2017-11-23 10:35:38 +09:00
Ian Barwick	67e27f9ecd	Remove unneeded functions	2017-11-20 15:26:32 +09:00
Ian Barwick	3f872cde0c	"repmgr node ...": fixes for 9.3 Mainly to account for the lack of replication slots.	2017-11-16 11:26:39 +09:00

181 Commits