1249 Commits

Author SHA1 Message Date
Ian Barwick
a7f3f899ff doc: update repmgrd example output 2019-03-20 12:10:31 +09:00
Ian Barwick
3ec43eda36 doc: remove references to "primary_visibility_consensus"
Feature remains experimental.
v4.3.0rc2
2019-03-18 17:43:16 +09:00
Ian Barwick
ce8e1cccc4 Remove outdated comment
This was only relevant for repmgr3 and earlier; in repmgr4 the schema
is hard-coded.
2019-03-18 15:19:25 +09:00
Ian Barwick
70bfa4c8e1 Clarify calls to check_primary_status()
Use a constant rather than a magic number to indicate non-provision
of elapsed degraded monitoring time.
2019-03-18 14:21:41 +09:00
Ian Barwick
f0d5ad503d doc: clarify "cluster show" error codes 2019-03-18 10:50:05 +09:00
John Naylor
b9ee57ee0f Fix assorted Makefile bugs
1. The target additional-maintainer-clean was misspelled as
maintainer-additional-clean.

2. Add missing clean targets, in particular sysutils.o, config.h,
repmgr_version.h, and Makefile.global. While at it, use a wildcard
for obj files.

3. Don't delete configure.

4. Remove generated file doc/version.sgml from the repo.

5. Have maintainer-clean recurse to the doc directory.
v4.3.0rc1
2019-03-15 16:30:27 +09:00
Ian Barwick
d5d6ed4be7 Bump version
4.3rc1
2019-03-15 14:41:41 +09:00
Ian Barwick
f4655074ae doc: miscellaneous cleanup 2019-03-15 14:39:55 +09:00
Ian Barwick
67d26ab7e2 doc: tweak wording in event notification documentation 2019-03-15 14:08:18 +09:00
Ian Barwick
70a7b45a03 doc: add explanation of the configuration file format 2019-03-15 14:07:19 +09:00
Ian Barwick
4251590833 doc: update "connection_check_type" descriptions 2019-03-15 14:07:13 +09:00
Ian Barwick
9347d34ce0 repmgrd: optionally check upstream availability through connection attempts 2019-03-15 14:07:08 +09:00
John Naylor
feb90ee50c Correct some doc typos 2019-03-15 14:07:05 +09:00
Ian Barwick
0a6486bb7f doc: expand "standby_disconnect_on_failover" documentation 2019-03-15 14:07:01 +09:00
Ian Barwick
39443bbcee Count witness and zero-priority nodes in visibility check 2019-03-15 14:06:58 +09:00
Ian Barwick
fc636b1bd2 Ensure witness node sets last upstream seen time 2019-03-15 14:06:55 +09:00
Ian Barwick
048bad1c88 doc: fix option name typo 2019-03-15 14:06:51 +09:00
Ian Barwick
4528eb1796 doc: expand "failover_validate_command" documentation 2019-03-15 14:06:37 +09:00
Ian Barwick
169c9ccd32 repmgrd: improve logging output when executing "failover_validate_command" 2019-03-15 14:06:34 +09:00
Ian Barwick
5f92fbddf2 doc: various updates 2019-03-15 14:06:30 +09:00
Ian Barwick
617e466f72 doc: merge repmgrd witness server description into failover section 2019-03-13 16:19:41 +09:00
Ian Barwick
435fac297b doc: merge repmgrd split network handling description into failover section 2019-03-13 16:19:37 +09:00
Ian Barwick
4bc12b4c94 doc: merge repmgrd monitoring description into operating section 2019-03-13 16:19:33 +09:00
Ian Barwick
91234994e2 doc: merge repmgrd degraded monitoring description into operation section 2019-03-13 16:19:30 +09:00
Ian Barwick
ee9da30f20 doc: merge repmgrd notes into operation documentation 2019-03-13 16:19:27 +09:00
Ian Barwick
2e67bc1341 doc: merge repmgrd pause documentation into overview 2019-03-13 16:19:24 +09:00
Ian Barwick
18ab5cab4e doc: initial repmgrd doc refactoring 2019-03-13 16:19:20 +09:00
Ian Barwick
60bb4e9fc8 doc: update repmgrd configuration documentation 2019-03-13 16:19:17 +09:00
Ian Barwick
52bee6b98d repmgrd: various minor logging improvements 2019-03-13 16:19:13 +09:00
Ian Barwick
ecb1f379f5 repmgrd: remove global variable
Make the "sibling_nodes" local, and pass by reference where relevant.
2019-03-13 16:19:10 +09:00
Ian Barwick
e1cd2c22d4 repmgrd: enable election rerun
If "failover_validation_command" is set, and the command returns an error,
rerun the election.

There is a pause between reruns to avoid "churn"; the length of this pause
is controlled by the configuration parameter "election_rerun_interval".
2019-03-13 16:19:03 +09:00
Ian Barwick
1dea6b76d9 Remove redundant struct allocation 2019-03-13 16:19:00 +09:00
Ian Barwick
702f90fc9d doc: update list of reloadable repmgrd configuration options 2019-03-13 16:18:56 +09:00
Ian Barwick
c4d1eec6f3 doc: document "failover_validation_command" 2019-03-13 16:18:53 +09:00
Ian Barwick
b241c606c0 doc: expand repmgrd configuration section 2019-03-13 16:18:50 +09:00
Ian Barwick
45c896d716 Execute "failover_validation_command" when only one standby exists 2019-03-08 15:29:17 +09:00
Ian Barwick
514595ea10 Make "failover_validation_command" reloadable 2019-03-08 15:29:12 +09:00
Ian Barwick
531194fa27 Initial implementation of "failover_validation_command" 2019-03-08 15:29:06 +09:00
Ian Barwick
2aa67c992c Make recently added configuration options reloadable 2019-03-08 15:28:59 +09:00
Ian Barwick
37892afcfc Add configuration option "primary_visibility_consensus"
This determines whether repmgrd should continue with a failover if
one or more nodes report they can still see the standby.
2019-03-08 15:28:53 +09:00
Ian Barwick
e4e5e35552 Add configuration option "sibling_nodes_disconnect_timeout"
This controls the maximum length of time in seconds that repmgrd will
wait for other standbys to disconnect their WAL receivers in a failover
situation.

This setting is only used when "standby_disconnect_on_failover" is set to "true".
2019-03-08 15:28:48 +09:00
Ian Barwick
b320c1f0ae Reset "wal_retrieve_retry_interval" for all nodes 2019-03-08 15:28:42 +09:00
Ian Barwick
280654bed6 repmgrd: don't wait for WAL receiver to reconnect during failover
If the WAL receiver has been temporarily disabled, we don't want to
wait for it to start up as it may not be able to at that point; we do
however need to reset "wal_retrieve_retry_interval".
2019-03-08 15:28:27 +09:00
Ian Barwick
ae675059c0 Improve logging/sanity checking for "node control" options 2019-03-08 15:28:22 +09:00
Ian Barwick
454ebabe89 Improve logging when disabling/enabling WAL receiver
Also check action is being run on node which is in recovery.
2019-03-08 15:28:17 +09:00
Ian Barwick
d1d6ef8d12 Check for WAL receiver start up 2019-03-08 15:28:11 +09:00
Ian Barwick
5d6eab74f6 Log warning if "standby_disconnect_on_failover" used on pre-9.5
"standby_disconnect_on_failover" requires availability of "wal_retrieve_retry_interval",
which is available from PostgreSQL 9.5.

9.4 will fall out of community support this year, so it doesn't seem
productive at this point to do anything more than put the onus on the user
to read the documentation and heed any warning messages in the logs.
2019-03-08 15:28:01 +09:00
Ian Barwick
59b7453bbf repmgrd: optionally disconnect WAL receivers during failover
This is intended to ensure that all nodes have a constant LSN while
making the failover decision.

This feature is experimental and needs to be explicitly enabled with the
configuration file option "standby_disconnect_on_failover".

Note enabling this option will result in a delay in the failover decision
until the WAL receiver is disconnected on all nodes.
2019-03-08 15:27:54 +09:00
Ian Barwick
bde8c7e29c repmgrd: handle reconnect to restarted server when using "connection" checks 2019-03-08 15:27:49 +09:00
Ian Barwick
bc6584a90d *_transaction() functions: log error message text as DETAIL
Per behaviour elsewhere.
2019-03-06 13:23:57 +09:00