<chapter id="repmgrd-network-split" xreflabel="Handling network splits with repmgrd">
  <indexterm>
    <primary>repmgrd</primary>
    <secondary>network splits</secondary>
  </indexterm>

  <title>Handling network splits with repmgrd</title>

  <para>
    A common pattern for replication cluster setups is to spread servers over
    more than one datacentre. This can provide benefits such as geographically
    distributed read replicas and disaster recovery (DR) capability. However,
    this also means there is a risk of disconnection at the network level between
    datacentre locations, which would result in a split-brain scenario if
    servers in a secondary datacentre were no longer able to see the primary
    in the main datacentre and promoted a standby from among themselves.
  </para>

  <para>
    &repmgr; enables provision of a "<xref linkend="witness-server">" to
    artificially create a quorum of servers in a particular location, ensuring
    that nodes in another location will not elect a new primary if they
    are unable to see the majority of nodes. However, this approach does not
    scale well, particularly with more complex replication setups, e.g.
    where the majority of nodes are located outside of the primary datacentre.
    It also means the <literal>witness</literal> node needs to be managed as an
    extra PostgreSQL instance outside of the main replication cluster, which
    adds administrative and programming complexity.
  </para>

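  <para>
    As an illustrative sketch of the witness approach (assuming the witness
    node's own <filename>repmgr.conf</filename> is at
    <filename>/etc/repmgr.conf</filename> and the current primary is reachable
    as host <literal>node1</literal>), the witness would be registered against
    the primary with:
    <programlisting>
    repmgr -f /etc/repmgr.conf witness register -h node1</programlisting>
  </para>
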
  <para>
    <literal>repmgr4</literal> introduces the concept of <literal>location</literal>:
    each node is associated with an arbitrary location string (default is
    <literal>default</literal>); this is set in <filename>repmgr.conf</filename>, e.g.:
    <programlisting>
    node_id=1
    node_name=node1
    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/data'
    location='dc1'</programlisting>
  </para>

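  <para>
    A node in a second datacentre would then simply set a different location
    string; as a hypothetical counterpart to the example above (host name and
    paths will vary with the actual environment):
    <programlisting>
    node_id=2
    node_name=node2
    conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/data'
    location='dc2'</programlisting>
  </para>
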
  <para>
    In a failover situation, <application>repmgrd</application> will check whether any servers in the
    same location as the current primary node are visible. If not, <application>repmgrd</application>
    will assume a network interruption and will not promote any node in any
    other location (it will, however, enter <xref linkend="repmgrd-degraded-monitoring"> mode until
    a primary becomes visible).
  </para>

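  <para>
    Each node's configured location can be checked with
    <command>repmgr cluster show</command>, whose output includes each node's
    location, e.g. (assuming <filename>repmgr.conf</filename> is at
    <filename>/etc/repmgr.conf</filename>):
    <programlisting>
    repmgr -f /etc/repmgr.conf cluster show</programlisting>
  </para>
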
</chapter>