Ian Barwick
6f0f338968
standby follow: set replication user when connecting to local node
2019-03-21 16:43:39 +09:00
Ian Barwick
bd26eb3025
standby switchover: don't attempt to pause repmgrd on unreachable nodes
2019-03-21 13:48:59 +09:00
Ian Barwick
9b089b7401
doc: add note about compiling against Pg11 and later with the --with-llvm option
2019-03-21 10:30:00 +09:00
Ian Barwick
314a1e8f4f
use a constant to denote unknown replication lag
2019-03-20 17:26:04 +09:00
Ian Barwick
7204a0faf4
doc: consolidate witness server documentation
2019-03-20 16:31:52 +09:00
Ian Barwick
5e775cef16
doc: various improvements to repmgrd documentation
2019-03-20 16:10:03 +09:00
Ian Barwick
7d0caefaee
Fix logging related to "connection_check_type"
...
Also log the selected type at repmgrd startup.
2019-03-20 11:58:18 +09:00
Ian Barwick
7434cc0b8e
repmgrd: improve witness node monitoring
...
Mainly fix a couple of places where "standby" was hard-coded into a log
message which can apply either to a witness or a standby.
2019-03-20 11:47:36 +09:00
Ian Barwick
b84d98fe81
Explicitly log PQping() failures
2019-03-20 11:47:32 +09:00
Ian Barwick
46efe57cd0
Improve database connection failure logging
...
Log the output of PQerrorMessage() in a couple of places where it was missing.
Additionally, always log the output of PQerrorMessage() starting with a blank
line, otherwise the first line looks like it was emitted by repmgr, and
it's harder to scan the error message.
Before:
[2019-03-20 11:24:15] [DETAIL] could not connect to server: Connection refused
Is the server running on host "localhost" (::1) and accepting
TCP/IP connections on port 5501?
could not connect to server: Connection refused
Is the server running on host "localhost" (127.0.0.1) and accepting
TCP/IP connections on port 5501?
After:
[2019-03-20 11:27:21] [DETAIL]
could not connect to server: Connection refused
Is the server running on host "localhost" (::1) and accepting
TCP/IP connections on port 5501?
could not connect to server: Connection refused
Is the server running on host "localhost" (127.0.0.1) and accepting
TCP/IP connections on port 5501?
2019-03-20 11:47:28 +09:00
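The formatting change described in this commit can be sketched as below. This is a minimal illustration, not repmgr's actual code: `format_conn_error_detail` is a hypothetical helper, and `errmsg` merely stands in for the string libpq would return for a failed connection.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of repmgr): builds the text that follows
 * a "[DETAIL]" log prefix so that the libpq error message starts on its
 * own line. "errmsg" stands in for the libpq error string, which normally
 * already ends with a newline.
 */
static int
format_conn_error_detail(char *buf, size_t buflen, const char *errmsg)
{
	/* The leading "\n" pushes the first line of the message below the log prefix */
	return snprintf(buf, buflen, "\n%s", errmsg);
}
```

With this layout every line of a multi-line libpq message starts in the same column, which is what makes the "After" output easier to scan.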
Ian Barwick
426759ca8e
check_primary_status(): handle case where recovery type is unknown
2019-03-18 16:16:54 +09:00
Ian Barwick
39df55c39c
Check node recovery type before attempting to write an event record
...
In some corner cases (e.g. immediately after a switchover) where
the current primary has not yet been determined, the provided connection
might not be writeable. This prevents error messages such as
"cannot execute INSERT in a read-only transaction" generating unnecessary
noise in the logs.
2019-03-18 15:26:16 +09:00
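The guard described in this commit might look roughly like the following. It is a sketch under stated assumptions: the enum and function names are hypothetical stand-ins, and the "write" is simulated with a `printf()` rather than a real INSERT.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a node's recovery state */
typedef enum
{
	SKETCH_RECTYPE_UNKNOWN = -1,
	SKETCH_RECTYPE_PRIMARY,
	SKETCH_RECTYPE_STANDBY
} SketchRecoveryType;

/*
 * Hypothetical helper: records an event only if the node behind the
 * connection is known to be a primary. Returns true if the event was
 * (notionally) written, false if the write was skipped.
 */
static bool
write_event_if_writeable(SketchRecoveryType rectype, const char *event)
{
	if (rectype != SKETCH_RECTYPE_PRIMARY)
	{
		/*
		 * Skip quietly rather than provoke
		 * "cannot execute INSERT in a read-only transaction"
		 */
		return false;
	}

	/* A real implementation would execute the INSERT here */
	printf("event recorded: %s\n", event);
	return true;
}
```

Treating an unknown recovery type the same as a standby is the conservative choice here: the event is dropped, but no spurious error reaches the logs.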
Ian Barwick
f54ff85cfa
Remove outdated comment
...
This was only relevant for repmgr3 and earlier; in repmgr4 the schema
is hard-coded.
2019-03-18 15:19:11 +09:00
Ian Barwick
8ab51c2ae3
Refactor check_primary_status()
...
Reduce nested if/else branching, and improve documentation.
2019-03-18 15:01:21 +09:00
Ian Barwick
43f28f4097
Clarify calls to check_primary_status()
...
Use a constant rather than a magic number to indicate non-provision
of elapsed degraded monitoring time.
2019-03-18 14:21:34 +09:00
Ian Barwick
0940185f49
doc: clarify "cluster show" error codes
2019-03-18 10:49:38 +09:00
John Naylor
4f9fc56871
Fix assorted Makefile bugs
...
1. The target additional-maintainer-clean was misspelled as
maintainer-additional-clean.
2. Add missing clean targets, in particular sysutils.o, config.h,
repmgr_version.h, and Makefile.global. While at it, use a wildcard
for obj files.
3. Don't delete configure.
4. Remove generated file doc/version.sgml from the repo.
5. Have maintainer-clean recurse to the doc directory.
2019-03-15 16:29:31 +09:00
Ian Barwick
fbdf9617fa
doc: update repmgrd example output
2019-03-15 15:43:11 +09:00
Ian Barwick
dfb92df05f
doc: miscellaneous cleanup
2019-03-15 14:39:37 +09:00
Ian Barwick
9dd87dd5ce
doc: add explanation of the configuration file format
2019-03-15 14:02:42 +09:00
Ian Barwick
a2df69512a
doc: update "connection_check_type" descriptions
2019-03-14 15:44:59 +09:00
Ian Barwick
c2206b007a
repmgrd: optionally check upstream availability through connection attempts
2019-03-14 15:44:53 +09:00
John Naylor
e06d3de444
Correct some doc typos
2019-03-14 11:58:31 +08:00
Ian Barwick
9d056b2f72
doc: expand "standby_disconnect_on_failover" documentation
2019-03-14 12:08:13 +09:00
Ian Barwick
19bf4d7434
Count witness and zero-priority nodes in visibility check
2019-03-14 11:17:51 +09:00
Ian Barwick
56d9f5b856
Ensure witness node sets last upstream seen time
2019-03-14 10:53:47 +09:00
Ian Barwick
c1d6753081
doc: fix option name typo
2019-03-14 09:32:06 +09:00
Ian Barwick
2b59b4894a
doc: expand "failover_validation_command" documentation
2019-03-13 21:10:03 +09:00
Ian Barwick
c3c58df7b9
repmgrd: improve logging output when executing "failover_validation_command"
2019-03-13 21:07:26 +09:00
Ian Barwick
0e2f3e563a
doc: various updates
2019-03-13 16:55:32 +09:00
Ian Barwick
8c4421d110
doc: merge repmgrd witness server description into failover section
2019-03-13 16:12:17 +09:00
Ian Barwick
69cb3f1e82
doc: merge repmgrd split network handling description into failover section
2019-03-13 16:12:14 +09:00
Ian Barwick
960acfeb3c
doc: merge repmgrd monitoring description into operation section
2019-03-13 16:12:11 +09:00
Ian Barwick
a8d50a5b98
doc: merge repmgrd degraded monitoring description into operation section
2019-03-13 16:12:06 +09:00
Ian Barwick
11e5993bf5
doc: merge repmgrd notes into operation documentation
2019-03-13 16:12:03 +09:00
Ian Barwick
09861a5604
doc: merge repmgrd pause documentation into overview
2019-03-13 16:11:59 +09:00
Ian Barwick
89bba77d4d
doc: initial repmgrd doc refactoring
2019-03-13 16:11:55 +09:00
Ian Barwick
dd6ece326f
doc: update repmgrd configuration documentation
2019-03-13 13:34:08 +09:00
Ian Barwick
573d027db6
repmgrd: various minor logging improvements
2019-03-13 11:27:17 +09:00
Ian Barwick
1afb41647b
repmgrd: remove global variable
...
Make the "sibling_nodes" local, and pass by reference where relevant.
2019-03-12 17:12:23 +09:00
Ian Barwick
fc397f25f6
repmgrd: enable election rerun
...
If "failover_validation_command" is set, and the command returns an error,
rerun the election.
There is a pause between reruns to avoid "churn"; the length of this pause
is controlled by the configuration parameter "election_rerun_interval".
2019-03-12 17:12:19 +09:00
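The rerun-with-pause behaviour described in this commit can be sketched as a simple loop. Everything named here is hypothetical: the callback stands in for running "failover_validation_command", `rerun_interval_secs` plays the role of "election_rerun_interval", and the `max_attempts` cap is an assumption added only to keep the sketch bounded.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical callback type: returns true if the validation command succeeded */
typedef bool (*validate_fn) (void *ctx);

/*
 * Hypothetical helper: repeat the election until validation succeeds or
 * max_attempts is exhausted. Returns the 1-based attempt number that
 * succeeded, or -1 if every attempt failed.
 */
static int
run_election_with_validation(validate_fn validate, void *ctx,
							 unsigned int rerun_interval_secs,
							 int max_attempts)
{
	int			attempt;

	for (attempt = 1; attempt <= max_attempts; attempt++)
	{
		if (validate(ctx))
			return attempt;

		/* Pause between reruns to avoid "churn"; no pause after the last try */
		if (attempt < max_attempts)
			sleep(rerun_interval_secs);
	}

	return -1;
}

/* Example validator: fails until the counter pointed to by ctx reaches zero */
static bool
succeed_after_countdown(void *ctx)
{
	int		   *remaining_failures = ctx;

	if (*remaining_failures > 0)
	{
		(*remaining_failures)--;
		return false;
	}

	return true;
}
```

For instance, with a counter initialised to 2 and a cap of 5 attempts, `run_election_with_validation(succeed_after_countdown, &counter, 0, 5)` succeeds on the third attempt.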
Ian Barwick
99923f5ffc
Remove redundant struct allocation
2019-03-11 19:06:07 +09:00
Ian Barwick
b9cdcd55e7
doc: update list of reloadable repmgrd configuration options
2019-03-11 16:18:10 +09:00
Ian Barwick
db87ff46fd
doc: document "failover_validation_command"
2019-03-11 15:02:33 +09:00
Ian Barwick
2a8f8d8400
doc: expand repmgrd configuration section
2019-03-11 14:50:33 +09:00
Ian Barwick
4ef706c2ca
Execute "failover_validation_command" when only one standby exists
2019-03-08 12:19:37 +09:00
Ian Barwick
663c2e75b4
Make "failover_validation_command" reloadable
2019-03-08 09:27:19 +09:00
Ian Barwick
db0d71c6a7
Initial implementation of "failover_validation_command"
2019-03-08 08:49:15 +09:00
Ian Barwick
6f4f56dd8c
Make recently added configuration options reloadable
2019-03-07 10:58:25 +09:00
Ian Barwick
33fefd9f52
Add configuration option "primary_visibility_consensus"
...
This determines whether repmgrd should continue with a failover if
one or more nodes report they can still see the primary.
2019-03-07 10:41:42 +09:00