mirror of
https://github.com/EnterpriseDB/repmgr.git
synced 2026-03-25 16:16:29 +00:00
doc: merge repmgrd split network handling description into failover section
@@ -54,7 +54,6 @@
 <!ENTITY repmgrd-automatic-failover SYSTEM "repmgrd-automatic-failover.sgml">
 <!ENTITY repmgrd-configuration SYSTEM "repmgrd-configuration.sgml">
 <!ENTITY repmgrd-operation SYSTEM "repmgrd-operation.sgml">
-<!ENTITY repmgrd-network-split SYSTEM "repmgrd-network-split.sgml">
 <!ENTITY repmgrd-witness-server SYSTEM "repmgrd-witness-server.sgml">
 <!ENTITY repmgrd-bdr SYSTEM "repmgrd-bdr.sgml">
 
@@ -84,7 +84,6 @@
 &repmgrd-automatic-failover;
 &repmgrd-configuration;
 &repmgrd-operation;
-&repmgrd-network-split;
 &repmgrd-witness-server;
 &repmgrd-bdr;
 </part>
@@ -43,5 +43,57 @@
  </para>

 </sect1>
+
+ <sect1 id="repmgrd-network-split" xreflabel="Handling network splits with repmgrd">
+  <indexterm>
+    <primary>repmgrd</primary>
+    <secondary>network splits</secondary>
+  </indexterm>
+
+  <indexterm>
+    <primary>network splits</primary>
+  </indexterm>
+
+  <title>Handling network splits with repmgrd</title>
+  <para>
+   A common pattern for replication cluster setups is to spread servers over
+   more than one datacentre. This can provide benefits such as geographically-
+   distributed read replicas and DR (disaster recovery capability). However
+   this also means there is a risk of disconnection at network level between
+   datacentre locations, which would result in a split-brain scenario if
+   servers in a secondary data centre were no longer able to see the primary
+   in the main data centre and promoted a standby among themselves.
+  </para>
+  <para>
+   &repmgr; enables provision of "<xref linkend="witness-server">" to
+   artificially create a quorum of servers in a particular location, ensuring
+   that nodes in another location will not elect a new primary if they
+   are unable to see the majority of nodes. However this approach does not
+   scale well, particularly with more complex replication setups, e.g.
+   where the majority of nodes are located outside of the primary datacentre.
+   It also means the <literal>witness</literal> node needs to be managed as an
+   extra PostgreSQL instance outside of the main replication cluster, which
+   adds administrative and programming complexity.
+  </para>
+  <para>
+   <literal>repmgr4</literal> introduces the concept of <literal>location</literal>:
+   each node is associated with an arbitrary location string (default is
+   <literal>default</literal>); this is set in <filename>repmgr.conf</filename>, e.g.:
+   <programlisting>
+    node_id=1
+    node_name=node1
+    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
+    data_directory='/var/lib/postgresql/data'
+    location='dc1'</programlisting>
+  </para>
+  <para>
+   In a failover situation, <application>repmgrd</application> will check if any servers in the
+   same location as the current primary node are visible. If not, <application>repmgrd</application>
+   will assume a network interruption and not promote any node in any
+   other location (it will however enter <link linkend="repmgrd-degraded-monitoring">degraded monitoring</link>
+   mode until a primary becomes visible).
+  </para>
+
+ </sect1>

 </chapter>
@@ -1,48 +0,0 @@
-<chapter id="repmgrd-network-split" xreflabel="Handling network splits with repmgrd">
- <indexterm>
-   <primary>repmgrd</primary>
-   <secondary>network splits</secondary>
- </indexterm>
-
- <title>Handling network splits with repmgrd</title>
- <para>
-  A common pattern for replication cluster setups is to spread servers over
-  more than one datacentre. This can provide benefits such as geographically-
-  distributed read replicas and DR (disaster recovery capability). However
-  this also means there is a risk of disconnection at network level between
-  datacentre locations, which would result in a split-brain scenario if
-  servers in a secondary data centre were no longer able to see the primary
-  in the main data centre and promoted a standby among themselves.
- </para>
- <para>
-  &repmgr; enables provision of "<xref linkend="witness-server">" to
-  artificially create a quorum of servers in a particular location, ensuring
-  that nodes in another location will not elect a new primary if they
-  are unable to see the majority of nodes. However this approach does not
-  scale well, particularly with more complex replication setups, e.g.
-  where the majority of nodes are located outside of the primary datacentre.
-  It also means the <literal>witness</literal> node needs to be managed as an
-  extra PostgreSQL instance outside of the main replication cluster, which
-  adds administrative and programming complexity.
- </para>
- <para>
-  <literal>repmgr4</literal> introduces the concept of <literal>location</literal>:
-  each node is associated with an arbitrary location string (default is
-  <literal>default</literal>); this is set in <filename>repmgr.conf</filename>, e.g.:
-  <programlisting>
-   node_id=1
-   node_name=node1
-   conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
-   data_directory='/var/lib/postgresql/data'
-   location='dc1'</programlisting>
- </para>
- <para>
-  In a failover situation, <application>repmgrd</application> will check if any servers in the
-  same location as the current primary node are visible. If not, <application>repmgrd</application>
-  will assume a network interruption and not promote any node in any
-  other location (it will however enter <link linkend="repmgrd-degraded-monitoring">degraded monitoring</link>
-  mode until a primary becomes visible).
- </para>
-
-</chapter>
-
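The moved section's final paragraph describes a simple decision rule: a standby only considers promotion if some node in the failed primary's location is still visible, otherwise it assumes a network split. As a rough illustration only (repmgrd itself is written in C; the `Node` type and `failover_action` helper below are hypothetical), the rule can be sketched as:

```python
# Sketch of the location-based promotion rule described in the docs.
# Not repmgrd's actual implementation; names here are illustrative.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    location: str
    visible: bool  # reachable from this standby's point of view

def failover_action(primary: Node, nodes: list) -> str:
    """Decide what a standby should do when the primary is unreachable.

    If no node in the primary's location is visible, treat the outage
    as a network interruption and keep monitoring in degraded mode.
    """
    primary_location_visible = any(
        n.visible
        for n in nodes
        if n.location == primary.location and n.name != primary.name
    )
    if primary_location_visible:
        return "promote-candidate"   # genuine primary failure
    return "degraded-monitoring"     # assume network split

# Example: the dc1 primary fails, but another dc1 node is still
# visible, so this is treated as a real failure rather than a split.
cluster = [
    Node("node1", "dc1", visible=False),  # failed primary
    Node("node2", "dc1", visible=True),
    Node("node3", "dc2", visible=True),
]
print(failover_action(cluster[0], cluster))  # -> promote-candidate
```

If `node2` were also unreachable, the same call would return `degraded-monitoring`, matching the behaviour the paragraph describes for nodes in other locations.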