From 4528eb1796d7a2dc70dbcbbb5f54aa1d466df97d Mon Sep 17 00:00:00 2001
From: Ian Barwick
Date: Wed, 13 Mar 2019 21:10:03 +0900
Subject: [PATCH] doc: expand "failover_validate_command" documentation

---
 doc/appendix-release-notes.sgml     | 11 +++++-
 doc/repmgrd-automatic-failover.sgml | 55 ++++++++++++++++++++++++++++-
 doc/repmgrd-configuration.sgml      |  7 ++--
 3 files changed, 69 insertions(+), 4 deletions(-)

diff --git a/doc/appendix-release-notes.sgml b/doc/appendix-release-notes.sgml
index 6f1e1b64..e0cb1cc6 100644
--- a/doc/appendix-release-notes.sgml
+++ b/doc/appendix-release-notes.sgml
@@ -32,7 +32,7 @@ REPMGRD_OPTS="--daemonize=false"
- For further details, see repmgrd daemon configuration on Debian/Ubuntu.
+ For further details, see repmgrd configuration on Debian/Ubuntu.
@@ -160,6 +160,7 @@ REPMGRD_OPTS="--daemonize=false"
 the absence of a running repmgrd.
+
 Add option to enable selection of the method
@@ -171,6 +172,14 @@ REPMGRD_OPTS="--daemonize=false"
 by executing an SQL statement on the node via the existing connection).
+
+
+
+ New configuration option failover_validation_command
+ to allow an external mechanism to validate the failover decision made by repmgrd.
+
+
+
diff --git a/doc/repmgrd-automatic-failover.sgml b/doc/repmgrd-automatic-failover.sgml
index 8d893b06..8851d402 100644
--- a/doc/repmgrd-automatic-failover.sgml
+++ b/doc/repmgrd-automatic-failover.sgml
@@ -110,8 +110,61 @@
+
+
+ repmgrd
+ failover validation
+
-
+
+ failover validation
+
+
+ Failover validation
+
+ From repmgr 4.3, &repmgr; makes it possible to provide a script
+ to repmgrd which, in a failover situation,
+ will be executed by the promotion candidate (the node which has been selected
+ to be the new primary) to confirm whether the node should actually be promoted.
+
+
+ To use this, set failover_validation_command in repmgr.conf
+ to a script executable by the postgres system user, e.g.:
+
+ failover_validation_command=/path/to/script.sh %n %a
+
+
+ The %n parameter will be replaced with the node ID, and the
+ %a parameter will be replaced by the node name when the script is executed.
+
+
+ This script must return an exit code of 0 to indicate the node should promote itself.
+ Any other value will result in the promotion being aborted and the election rerun.
+ There is a pause of election_rerun_interval seconds before the election is rerun.
+
+
+ Sample repmgrd log file output during which the failover validation
+ script rejects the proposed promotion candidate:
+
+[2019-03-13 21:01:30] [INFO] visible nodes: 2; total nodes: 2; no nodes have seen the primary within the last 4 seconds
+[2019-03-13 21:01:30] [NOTICE] promotion candidate is "node2" (ID: 2)
+[2019-03-13 21:01:30] [NOTICE] executing "failover_validation_command"
+[2019-03-13 21:01:30] [DETAIL] /usr/local/bin/failover-validation.sh 2
+[2019-03-13 21:01:30] [INFO] output returned by failover validation command:
+Node ID: 2
+
+[2019-03-13 21:01:30] [NOTICE] failover validation command returned a non-zero value: "1"
+[2019-03-13 21:01:30] [NOTICE] promotion candidate election will be rerun
+[2019-03-13 21:01:30] [INFO] 1 followers to notify
+[2019-03-13 21:01:30] [NOTICE] notifying node "node3" (node ID: 3) to rerun promotion candidate selection
+INFO: node 3 received notification to rerun promotion candidate election
+[2019-03-13 21:01:30] [NOTICE] rerunning election after 15 seconds ("election_rerun_interval")
+
+
+
+
+
+
 repmgrd cascading replication
diff --git a/doc/repmgrd-configuration.sgml b/doc/repmgrd-configuration.sgml
index da692e80..aa40606d 100644
--- a/doc/repmgrd-configuration.sgml
+++ b/doc/repmgrd-configuration.sgml
@@ -138,7 +138,7 @@
- See also repmgr.conf.sample for an annotated sample configuration file..
+ See also repmgr.conf.sample for an annotated sample configuration file.
@@ -341,7 +341,7 @@
 One or both of the following parameter placeholders should be provided, which will be replaced by repmgrd with the appropriate
- value:
+ value:
 %n: node ID
@@ -351,6 +351,9 @@
+
+ See also: Failover validation.
+
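
For illustration only, the following is a minimal sketch of what a failover validation script of the kind documented in this patch might look like. It is not part of repmgr. The `validate_promotion` function, the `FAILOVER_BLOCK_FILE` variable and the marker file path are hypothetical names chosen for this example. The only behaviour taken from the documentation above is that the script receives the node ID and node name as arguments and must exit with 0 to approve the promotion.

```shell
#!/bin/sh
# Hypothetical failover validation script (illustrative sketch only).
# Assumption: it is invoked with the node ID (%n) as the first argument and
# the node name (%a) as the second, as described in the documentation.

# validate_promotion NODE_ID NODE_NAME
# Returns 0 to approve the promotion, 1 to reject it.
validate_promotion() {
    node_id="$1"
    node_name="$2"

    # Assumption: an operator can veto promotion by creating this marker
    # file. Both the variable and the default path are made up here.
    block_file="${FAILOVER_BLOCK_FILE:-/etc/repmgr/no-promote}"

    # Anything printed here is captured as the command's output.
    echo "Node ID: ${node_id}"
    echo "Node name: ${node_name}"

    if [ -e "$block_file" ]; then
        echo "promotion rejected: ${block_file} exists"
        return 1
    fi
    return 0
}

# Run the check only when arguments are supplied, so that the file can also
# be sourced (for example in a test) without triggering a decision.
if [ "$#" -ge 1 ]; then
    validate_promotion "$1" "${2:-}"
    exit $?
fi
```

With this sketch, creating the marker file on a node makes the script return 1, so the promotion would be aborted and the election rerun. Removing the file lets the next election proceed.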