repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-07-16 22:39:04 +00:00

Author	SHA1	Message	Date
Ian Barwick	7fda2a1bcf	doc: fix typo in repmgr.conf.sample	2018-10-08 09:37:41 +09:00
Ian Barwick	d26141b8ab	Fix LWLockRelease() call in unset_bdr_failover_handler()	2018-10-08 09:37:31 +09:00
Ian Barwick	4a6b5fe913	Update control file checks for PostgreSQL 11	2018-09-27 14:08:39 +09:00
Ian Barwick	a71e644255	repmgrd: document parameters which can be reloaded via SIGHUP Also add a new subsection with details on reloading repmgrd configuration.	2018-09-27 10:44:34 +09:00
Ian Barwick	8646fd6004	doc: fix link in 4.1.1 release notes	2018-09-25 14:30:57 +09:00
Ian Barwick	3e1bb1a523	doc: minor fixes to "repmgr.conf.sample"	2018-09-25 10:54:54 +09:00
Ian Barwick	f5e58fc062	doc: update "repmgr node rejoin" documentation Clarify various points related to --force-rewind and pg_rewind usage.	2018-09-14 14:09:33 +09:00
Ian Barwick	6b95a96f3a	repmgr: improve "cluster show" output Only output full contents of connection error messages in --verbose mode, otherwise it can spew a lot of text onto the screen.	2018-09-12 14:17:39 +09:00
Ian Barwick	bd146ae9ac	repmgrd: update local node id in shared memory after local node restart Also ensure local node restarts are handled more elegantly, so we're not surprised by a stale connection handle. GitHub #502.	2018-09-12 14:17:35 +09:00
Ian Barwick	c7f8e48d12	Bump version 4.1.2	2018-09-07 13:08:55 +09:00
Ian Barwick	322190516c	doc: update link	2018-09-05 15:41:32 +09:00
Ian Barwick	31a49ff781	doc: update version v4.1.1	2018-09-04 12:33:44 +09:00
Ian Barwick	a6f99b58dd	doc: update 4.1.1 release notes	2018-09-04 12:33:10 +09:00
Ian Barwick	09b041433e	doc: update 4.1.1 release notes	2018-09-04 09:46:59 +09:00
Ian Barwick	058c8168e1	repmgrd: fix syntax	2018-08-30 15:54:31 +09:00
Ian Barwick	0468e47ef3	repmgrd: improve reconnection handling Previously, if the server being monitored was not available, repmgrd would always close the existing connection handle and open a new one. However, in some cases, e.g. a brief network outage, the existing connection handle is still good and does not need to be reopened. This could be particularly problematic if monitoring_history is on, as this risks leaving orphan sessions on the primary which (given a sufficiently unstable network) could lead to all available backends being occupied. Instead, during an outage we now use a new connection to verify the server is accessible; if the old connection is still available (e.g. following a short network interruption) we continue using that; if not (e.g. the server was restarted), we use the new one.	2018-08-30 15:47:49 +09:00
Ian Barwick	216326f316	doc: update release notes	2018-08-30 13:09:41 +09:00
Ian Barwick	3fb20ce774	repmgr: improve slot handling in "node rejoin" On the rejoined node, if a replication slot for the new upstream exists (which is typically the case after a failover), delete that slot. Also emit a warning about any inactive replication slots which may need to be cleaned up manually. GitHub #499.	2018-08-30 11:57:44 +09:00
Ian Barwick	e468ca859e	repmgrd: improve monitoring statistics logging Add more granular logging to help diagnose issues, and also keep track of when the last monitoring statistics update was set and emit that as DETAIL every time we emit a log status update.	2018-08-29 14:48:30 +09:00
Ian Barwick	623c84c022	Add additional query error logging It's unlikely we'll get an error in these cases, but you never know. Also, with queries which return a list of node records, it's necessary to call _populate_node_records() even if the query fails, so a properly initialised, albeit empty list is returned to the caller.	2018-08-29 10:27:42 +09:00
Ian Barwick	c2dded1d7b	Log text of failed queries at log level ERROR Previously query texts were always logged at log level DEBUG, but that doesn't help much in a normal production environment when trying to identify the cause of issues. Also make various other minor improvements to query logging and handling of database errors. Implements GitHub #498.	2018-08-29 10:09:51 +09:00
Ian Barwick	457dbbd267	"standby switchover": improve replication connection check Previously repmgr would first check that a replication connection can be made from the demotion candidate to the promotion candidate, however it's preferable to sanity-check the number of available walsenders first, to provide a more useful error message.	2018-08-24 16:31:46 +09:00
Ian Barwick	5485c06bc1	doc: fix internal link	2018-08-24 09:43:18 +09:00
Cédric Villemain	00ae42eb07	Fix grep to find conninfo It used to use \t*, but [[:space:]] should be better as it matches more kinds of whitespace (the current one being broken in my case on RH7)	2018-08-24 09:20:51 +09:00
Ian Barwick	33525491ae	doc: update package signing key link	2018-08-23 12:33:48 +09:00
Ian Barwick	8c84f7a214	doc: update source requirement links Per report from Daymel Bonne.	2018-08-23 10:56:49 +09:00
Ian Barwick	efe4bed88e	doc: improve event notification documentation - add undocumented events (per report from Daymel Bonne) - split up list into sections for better overview - where feasible, add cross-links	2018-08-23 10:22:05 +09:00
Ian Barwick	9ba8dcbac3	doc: clarify statement about BDR HA support	2018-08-23 09:36:58 +09:00
Ian Barwick	a8996a5bfa	doc: clarify when "standby follow" can be used. The unqualified wording previously implied that any running server could be rejoined with "standby follow", which is not the case with a "split brain" primary.	2018-08-21 13:53:21 +09:00
Ian Barwick	4cbba98193	repmgr: add "cluster_cleanup" event GitHub #492.	2018-08-20 16:48:08 +09:00
Ian Barwick	23e6b85de3	doc: document sources of old package versions	2018-08-20 14:16:48 +09:00
Ian Barwick	d5ecb09f22	doc: add information about snapshot packages	2018-08-20 13:03:04 +09:00
Ian Barwick	719dd93676	doc: update release notes	2018-08-20 12:33:11 +09:00
Ian Barwick	5747f1d446	repmgrd: improve cascaded standby failover handling In particular, improve handling of the case where the standby follow command fails due to the primary not being available. GitHub #480.	2018-08-16 17:14:05 +09:00
Ian Barwick	9313b43cb1	repmgrd: fix PQExpBuffer handling in upstream failover handler Was sometimes leading to blank log lines.	2018-08-16 16:14:14 +09:00
Ian Barwick	5aeb1b0589	repmgrd: don't imply primary is in recovery if it's not available	2018-08-16 15:31:25 +09:00
Ian Barwick	6c93388848	repmgrd: fix "repmgrd_upstream_reconnect" event notification Upstream node is not always the primary node. Per report in GitHub #480.	2018-08-16 14:57:11 +09:00
Ian Barwick	d4ad8ce20c	"standby clone" - don't copy external config files in dry run mode Avoid copying files during a --dry-run as it may introduce unexpected changes on the target node. During an actual clone operation, any problems with copying files will be detected early and the operation aborted before the actual database cloning commences. GitHub #491.	2018-08-16 14:03:39 +09:00
Ian Barwick	bacab8d31c	"standby promote": improve log messages Make it clearer what repmgr is waiting for, and what to do if the promotion appears to fail.	2018-08-16 11:52:18 +09:00
Ian Barwick	14856e3a4d	repmgrd: ensure primary connection handle is refreshed after reconnect In some circumstances, if monitoring history was in use, repmgrd was attempting to fetch the primary's current LSN on a stale connection handle.	2018-08-15 16:57:21 +09:00
Ian Barwick	ca9242badb	repmgr: fix handling of slot creation error when cloning If cloning from a node other than the intended upstream, and replication slots are in use, once the cloning process is complete, repmgr will attempt to connect to the intended upstream to create the replication slot. Previously it would abort with a connection error, but as this issue is not fatal to the cloning process itself, and in some situations may be intentional, it's better to log a warning and continue. We should probably collate this (and any similar items needing attention after the cloning operation) into a list output at the end, otherwise the warning may get overlooked.	2018-08-15 15:11:13 +09:00
Ian Barwick	ff0929e882	doc: update FAQ Explain why some values in recovery.conf are surrounded by pairs of single quotes.	2018-08-15 14:48:23 +09:00
Abhijit Menon-Sen	8cd1811edb	Fix upstream node name in warning This log_warning is supposed to reproduce the error in the block above, but used the current node's name instead of the intended upstream node.	2018-08-14 10:10:50 +09:00
Ian Barwick	bf15c0d40f	doc: improve "repmgr cluster cleanup" documentation	2018-08-14 10:09:18 +09:00
Ian Barwick	9ae9d31165	repmgr: truncate version string if necessary Some distributions may add extra information to PG_VERSION after the actual version number (e.g. "10.4 (Debian 10.4-2.pgdg90+1)"), so copy the version number string up until the first space is found. GitHub #490.	2018-08-14 09:56:54 +09:00
Ian Barwick	d5064bdc02	doc: clarify repmgrd FAQ item "priority" must be 0 or greater.	2018-08-10 10:53:08 +09:00
Ian Barwick	9d0524a008	doc: update FAQ Add note about why repmgrd refuses to start up if the upstream is not running.	2018-08-10 10:47:23 +09:00
Ian Barwick	5398fd2d22	doc: better explain where pg_bindir won't be applied Basically any setting which can contain a user-defined script must have the full path set, even if it's repmgr being executed. We could potentially apply some heuristics to detect if the first item in the setting is "repmgr" (or more precisely repmgrd's program name), but this will require some careful thought and testing that it works as intended.	2018-08-10 10:29:06 +09:00
Ian Barwick	4c44c01380	doc: update release notes	2018-08-10 09:52:39 +09:00
Ian Barwick	5113ab0274	repmgrd: fix startup on witness node when local data is stale Previously, when running on a witness server, repmgrd didn't consider the local cache of the "repmgr.nodes" table might be outdated, e.g. as repmgrd wasn't running on the witness server during a failover, so could potentially end up monitoring a former primary now running as a standby. When running on a witness server, at startup repmgrd will now scan all nodes to determine the current primary, and refresh its local cache from there. This will also ensure it can start up even if the node currently registered as primary in the local cache is not available. Implements GitHub #488 and #489.	2018-08-09 16:42:20 +09:00


965 Commits