Ian Barwick
b4f6043abc
Update .gitignore
...
Ignore artefacts from failed patch application.
2019-03-20 12:11:57 +09:00
Ian Barwick
a7f3f899ff
doc: update repmgrd example output
2019-03-20 12:10:31 +09:00
Ian Barwick
3ec43eda36
doc: remove references to "primary_visibility_consensus"
...
Feature remains experimental.
v4.3.0rc2
2019-03-18 17:43:16 +09:00
Ian Barwick
ce8e1cccc4
Remove outdated comment
...
This was only relevant for repmgr3 and earlier; in repmgr4 the schema
is hard-coded.
2019-03-18 15:19:25 +09:00
Ian Barwick
70bfa4c8e1
Clarify calls to check_primary_status()
...
Use a constant rather than a magic number to indicate non-provision
of elapsed degraded monitoring time.
2019-03-18 14:21:41 +09:00
Ian Barwick
f0d5ad503d
doc: clarify "cluster show" error codes
2019-03-18 10:50:05 +09:00
John Naylor
b9ee57ee0f
Fix assorted Makefile bugs
...
1. The target additional-maintainer-clean was misspelled as
maintainer-additional-clean.
2. Add missing clean targets, in particular sysutils.o, config.h,
repmgr_version.h, and Makefile.global. While at it, use a wildcard
for obj files.
3. Don't delete configure.
4. Remove generated file doc/version.sgml from the repo.
5. Have maintainer-clean recurse to the doc directory.
v4.3.0rc1
2019-03-15 16:30:27 +09:00
Ian Barwick
d5d6ed4be7
Bump version
...
4.3rc1
2019-03-15 14:41:41 +09:00
Ian Barwick
f4655074ae
doc: miscellaneous cleanup
2019-03-15 14:39:55 +09:00
Ian Barwick
67d26ab7e2
doc: tweak wording in event notification documentation
2019-03-15 14:08:18 +09:00
Ian Barwick
70a7b45a03
doc: add explanation of the configuration file format
2019-03-15 14:07:19 +09:00
Ian Barwick
4251590833
doc: update "connection_check_type" descriptions
2019-03-15 14:07:13 +09:00
Ian Barwick
9347d34ce0
repmgrd: optionally check upstream availability through connection attempts
2019-03-15 14:07:08 +09:00
John Naylor
feb90ee50c
Correct some doc typos
2019-03-15 14:07:05 +09:00
Ian Barwick
0a6486bb7f
doc: expand "standby_disconnect_on_failover" documentation
2019-03-15 14:07:01 +09:00
Ian Barwick
39443bbcee
Count witness and zero-priority nodes in visibility check
2019-03-15 14:06:58 +09:00
Ian Barwick
fc636b1bd2
Ensure witness node sets last upstream seen time
2019-03-15 14:06:55 +09:00
Ian Barwick
048bad1c88
doc: fix option name typo
2019-03-15 14:06:51 +09:00
Ian Barwick
4528eb1796
doc: expand "failover_validation_command" documentation
2019-03-15 14:06:37 +09:00
Ian Barwick
169c9ccd32
repmgrd: improve logging output when executing "failover_validation_command"
2019-03-15 14:06:34 +09:00
Ian Barwick
5f92fbddf2
doc: various updates
2019-03-15 14:06:30 +09:00
Ian Barwick
617e466f72
doc: merge repmgrd witness server description into failover section
2019-03-13 16:19:41 +09:00
Ian Barwick
435fac297b
doc: merge repmgrd split network handling description into failover section
2019-03-13 16:19:37 +09:00
Ian Barwick
4bc12b4c94
doc: merge repmgrd monitoring description into operating section
2019-03-13 16:19:33 +09:00
Ian Barwick
91234994e2
doc: merge repmgrd degraded monitoring description into operation section
2019-03-13 16:19:30 +09:00
Ian Barwick
ee9da30f20
doc: merge repmgrd notes into operation documentation
2019-03-13 16:19:27 +09:00
Ian Barwick
2e67bc1341
doc: merge repmgrd pause documentation into overview
2019-03-13 16:19:24 +09:00
Ian Barwick
18ab5cab4e
doc: initial repmgrd doc refactoring
2019-03-13 16:19:20 +09:00
Ian Barwick
60bb4e9fc8
doc: update repmgrd configuration documentation
2019-03-13 16:19:17 +09:00
Ian Barwick
52bee6b98d
repmgrd: various minor logging improvements
2019-03-13 16:19:13 +09:00
Ian Barwick
ecb1f379f5
repmgrd: remove global variable
...
Make the "sibling_nodes" local, and pass by reference where relevant.
2019-03-13 16:19:10 +09:00
Ian Barwick
e1cd2c22d4
repmgrd: enable election rerun
...
If "failover_validation_command" is set, and the command returns an error,
rerun the election.
There is a pause between reruns to avoid "churn"; the length of this pause
is controlled by the configuration parameter "election_rerun_interval".
2019-03-13 16:19:03 +09:00
Ian Barwick
1dea6b76d9
Remove redundant struct allocation
2019-03-13 16:19:00 +09:00
Ian Barwick
702f90fc9d
doc: update list of reloadable repmgrd configuration options
2019-03-13 16:18:56 +09:00
Ian Barwick
c4d1eec6f3
doc: document "failover_validation_command"
2019-03-13 16:18:53 +09:00
Ian Barwick
b241c606c0
doc: expand repmgrd configuration section
2019-03-13 16:18:50 +09:00
Ian Barwick
45c896d716
Execute "failover_validation_command" when only one standby exists
2019-03-08 15:29:17 +09:00
Ian Barwick
514595ea10
Make "failover_validation_command" reloadable
2019-03-08 15:29:12 +09:00
Ian Barwick
531194fa27
Initial implementation of "failover_validation_command"
2019-03-08 15:29:06 +09:00
Ian Barwick
2aa67c992c
Make recently added configuration options reloadable
2019-03-08 15:28:59 +09:00
Ian Barwick
37892afcfc
Add configuration option "primary_visibility_consensus"
...
This determines whether repmgrd should continue with a failover if
one or more nodes report they can still see the primary.
2019-03-08 15:28:53 +09:00
Ian Barwick
e4e5e35552
Add configuration option "sibling_nodes_disconnect_timeout"
...
This controls the maximum length of time in seconds that repmgrd will
wait for other standbys to disconnect their WAL receivers in a failover
situation.
This setting is only used when "standby_disconnect_on_failover" is set to "true".
2019-03-08 15:28:48 +09:00
Ian Barwick
b320c1f0ae
Reset "wal_retrieve_retry_interval" for all nodes
2019-03-08 15:28:42 +09:00
Ian Barwick
280654bed6
repmgrd: don't wait for WAL receiver to reconnect during failover
...
If the WAL receiver has been temporarily disabled, we don't want to
wait for it to start up as it may not be able to at that point; we do
however need to reset "wal_retrieve_retry_interval".
2019-03-08 15:28:27 +09:00
Ian Barwick
ae675059c0
Improve logging/sanity checking for "node control" options
2019-03-08 15:28:22 +09:00
Ian Barwick
454ebabe89
Improve logging when disabling/enabling WAL receiver
...
Also check action is being run on node which is in recovery.
2019-03-08 15:28:17 +09:00
Ian Barwick
d1d6ef8d12
Check for WAL receiver start up
2019-03-08 15:28:11 +09:00
Ian Barwick
5d6eab74f6
Log warning if "standby_disconnect_on_failover" used on pre-9.5
...
"standby_disconnect_on_failover" requires availability of "wal_retrieve_retry_interval",
which is available from PostgreSQL 9.5.
9.4 will fall out of community support this year, so it doesn't seem
productive at this point to do anything more than put the onus on the user
to read the documentation and heed any warning messages in the logs.
2019-03-08 15:28:01 +09:00
Ian Barwick
59b7453bbf
repmgrd: optionally disconnect WAL receivers during failover
...
This is intended to ensure that all nodes have a constant LSN while
making the failover decision.
This feature is experimental and needs to be explicitly enabled with the
configuration file option "standby_disconnect_on_failover".
Note enabling this option will result in a delay in the failover decision
until the WAL receiver is disconnected on all nodes.
2019-03-08 15:27:54 +09:00
Ian Barwick
bde8c7e29c
repmgrd: handle reconnect to restarted server when using "connection" checks
2019-03-08 15:27:49 +09:00