repmgrd: prevent endless loops in failover with manual node

The LSN reported by the shared memory function defaults to "0/0"
(InvalidXLogRecPtr), which indicates that the repmgrd on that node
has not been able to update it yet. However, during failover several
places in the code treated this value as an error, which caused
an endless loop waiting for updates that would never come.

To get around this without changing function definitions, we can
store an explicit message in the shared memory location field so the
caller can tell whether the other node hasn't yet updated the field,
or has encountered a situation which means it should not be considered
as a promotion candidate (which in most cases will be because
`failover` is set to `manual`).
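
A minimal sketch of how a caller might tell these cases apart. It is
not the code from this commit: the marker string `NOT_A_CANDIDATE`, the
enum `NodeLsnStatus` and the function `classify_reported_lsn()` are
hypothetical names chosen for illustration, and the string actually
stored by repmgrd may differ.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical marker string; the value repmgrd really stores may differ. */
#define NOT_A_CANDIDATE "NOT_A_CANDIDATE"

/* Hypothetical result codes for the value read from shared memory. */
typedef enum
{
    NODE_LSN_PENDING,    /* "0/0": the other repmgrd has not updated it yet */
    NODE_LSN_EXCLUDED,   /* explicit marker: node is not a promotion candidate */
    NODE_LSN_VALID,      /* a usable LSN was reported */
    NODE_LSN_MALFORMED   /* anything else */
} NodeLsnStatus;

/*
 * Classify the text read from the shared memory location field.
 * On NODE_LSN_VALID, *lsn_out receives the LSN as a 64-bit value.
 */
static NodeLsnStatus
classify_reported_lsn(const char *reported, unsigned long long *lsn_out)
{
    unsigned int hi, lo;
    char         trailing;

    /* Check the explicit marker first, so it is never parsed as an LSN. */
    if (strcmp(reported, NOT_A_CANDIDATE) == 0)
        return NODE_LSN_EXCLUDED;

    /* An LSN is two hex numbers separated by '/', with nothing after. */
    if (sscanf(reported, "%X/%X%c", &hi, &lo, &trailing) != 2)
        return NODE_LSN_MALFORMED;

    /* "0/0" is the default: keep waiting rather than treat it as an error. */
    if (hi == 0 && lo == 0)
        return NODE_LSN_PENDING;

    *lsn_out = ((unsigned long long) hi << 32) | lo;
    return NODE_LSN_VALID;
}
```

With a check like this, a polling loop would retry only on
NODE_LSN_PENDING and drop the node from the candidate list on
NODE_LSN_EXCLUDED, so it cannot wait forever on a node with
`failover` set to `manual`.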

Resolves GitHub #222.
Ian Barwick
2016-08-08 14:20:28 +09:00
parent 73280a426b
commit cb78802027
2 changed files with 73 additions and 19 deletions

@@ -8,6 +8,11 @@
-l/--local-port (Ian)
improve "repmgr-auto" Debian package (Gianni)
3.1.5 2016-08-
repmgrd: in a failover situation, prevent endless looping when
attempting to establish the status of a node with
`failover=manual` (Ian)
3.1.4 2016-07-12
repmgr: new configuration option for setting "restore_command"
in the recovery.conf file generated by repmgr (Martín)