Previously repmgr would happily clone from whatever server
it found at the provided source server address. We should
ensure that a standby can only be cloned from a node which
is part of the main replication cluster.
This check fetches a list of nodes from the source server,
connects to the first non-witness server it finds, and
compares the system identifiers of the source node and the
node it has connected to. If there is a mismatch, then the
source server is clearly not part of the main replication
cluster, and is most likely the witness server.
As the witness server does not, by definition, ever have an entry in pg_stat_replication,
we need to check its "attached" status by connecting to the witness server itself
and querying the reported upstream node ID (which should be set by the witness
server repmgrd). If this matches the current primary node ID, we count it as attached.
This enables us to better determine whether a node is definitively
attached, definitively not attached, or if it was not possible to
determine the attached state.
When showing node information, check if the node's copy of its
record shows a different upstream to the one expected according
to the node where the command is executed.
This helps visualise situations where the cluster is in an
unexpected state, and provide a better idea of the actual state.
For example, if a cluster has divided somehow and a set of nodes are
following a new primary, when running "cluster show" etc., repmgr
will now show the name of the primary those nodes are actually
following, rather than the now outdated node name recorded
on the other side of the split. A warning will also be issued
about the situation.
This brings the repmgr documentation build system in line with that
used by the main PostgreSQL project, and removed the restriction
that documentation must be built against PostgreSQL 9.6 or earlier.
Main formatting changes are:
- convert empty-element tags (mainly <xref/>)
- put <indexterm> sections in the correct location
- correct usage of various entities.
It's not attached to the primary per-se, but needs to know what
the current primary is in order to correctly synchronise its
copy of the metadata.
Per GitHub #560.
Previously "repmgr standby switchover" would abort if any node was unreachable,
as that means it was unable to check if repmgrd is running.
However if the node has been marked as inactive in the repmgr metadata, it's
reasonable to assume the node is no longer part of the replication cluster
and does not need to be checked.