repmgrd: don't start if node is inactive and failover=automatic

With failover=automatic, it is reasonable to expect repmgrd to consider this node as a promotion candidate; however, this will not happen if the node is marked inactive. This often happens when a failed primary is recloned as a standby but not re-registered. If repmgrd were allowed to run in that state, it would give the incorrect impression that failover capability is available. Addresses GitHub #153.
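
The startup rule described in this message can be read as a small decision table. The sketch below is illustrative only and is not the code from this commit: the `failover_mode` and `start_decision` enums and the `check_node_active()` helper are hypothetical names introduced here, and the real repmgrd logs and terminates inline rather than returning a value.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for repmgrd's failover settings (assumption). */
typedef enum
{
    MANUAL_FAILOVER,
    AUTOMATIC_FAILOVER
} failover_mode;

/* Hypothetical result type: what the daemon should do at startup. */
typedef enum
{
    START_OK,            /* node record is active: start normally */
    START_WARN_PASSIVE,  /* inactive + manual: warn, monitor passively */
    START_REFUSE         /* inactive + automatic (or unknown mode): abort */
} start_decision;

/*
 * Decide whether the daemon may start, given the node record's "active"
 * flag and the configured failover mode.
 */
static start_decision
check_node_active(bool active, failover_mode mode)
{
    if (active)
        return START_OK;

    switch (mode)
    {
        case MANUAL_FAILOVER:
            /* Monitoring is still useful even without failover capability. */
            return START_WARN_PASSIVE;
        case AUTOMATIC_FAILOVER:
        default:
            /* An inactive node can never be promoted, so refuse to start. */
            return START_REFUSE;
    }
}
```

Treating an unrecognised mode the same as the automatic case keeps the sketch fail-safe: the daemon would rather refuse to start than suggest failover protection that does not exist.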
2026-07-16 14:29:05 +00:00 · 2016-09-28 10:50:38 +09:00
parent bbb2e2f017
commit b7f20ee1f7
2 changed files with 46 additions and 0 deletions
@@ -151,6 +151,9 @@ General
  In `repmgr.conf`, set its priority to a value of 0 or less.
  Additionally, if `failover` is set to `manual`, the node will never
  be considered as a promotion candidate.
 - Does `repmgrd` support delayed standbys?
  `repmgrd` can monitor delayed standbys - those set up with
@@ -169,3 +172,11 @@ General
  Configure your system's `logrotate` service to do this; see example
  in README.md
 - I've recloned a failed master as a standby, but `repmgrd` refuses to start?
  Check that you registered the standby after recloning. If unregistered, the
  standby cannot be considered as a promotion candidate even if `failover` is
  set to `automatic`, which is probably not what you want. `repmgrd` will start
  if `failover` is set to `manual`, so the node's replication status can still
  be monitored if desired.
@@ -311,6 +311,41 @@ main(int argc, char **argv)
 	log_debug("node id is %i, upstream is %i\n", node_info.node_id, node_info.upstream_node_id);
    /*
     * Check if node record is active - if not, and `failover=automatic`, the node
     * won't be considered as a promotion candidate; this often happens when
     * a failed primary is recloned and the node was not re-registered, giving
     * the impression failover capability is there when it's not. In this case
     * abort with an error and a hint about registering.
     *
     * If `failover=manual`, repmgrd can continue to passively monitor the node, but
     * we should nevertheless issue a warning and the same hint.
     */
    if (node_info.active == false)
    {
        const char *hint = "Check that 'repmgr (master|standby) register' was executed for this node";
        switch (local_options.failover)
        {
            case AUTOMATIC_FAILOVER:
                log_err(_("This node is marked as inactive and cannot be used for failover\n"));
                log_hint(_("%s\n"), hint);
                terminate(ERR_BAD_CONFIG);
            case MANUAL_FAILOVER:
                log_warning(_("This node is marked as inactive and will be passively monitored only\n"));
                log_hint(_("%s\n"), hint);
                break;
            default:
                /* This should never happen */
                log_err(_("Unknown failover mode %i\n"), local_options.failover);
                terminate(ERR_BAD_CONFIG);
        }
    }
 	/*
 	 * MAIN LOOP This loop cycles at startup and once per failover and
 	 * Requisites: