mirror of
https://github.com/EnterpriseDB/repmgr.git
synced 2026-03-23 07:06:30 +00:00
242 lines
8.5 KiB
Plaintext
242 lines
8.5 KiB
Plaintext
<chapter id="overview" xreflabel="Overview">
|
|
<title>repmgr overview</title>
|
|
|
|
<para>
|
|
This chapter provides a high-level overview of &repmgr;'s components and
|
|
functionality.
|
|
</para>
|
|
<sect1 id="repmgr-concepts" xreflabel="Concepts">
|
|
|
|
<title>Concepts</title>
|
|
|
|
<indexterm>
|
|
<primary>concepts</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
This guide assumes that you are familiar with PostgreSQL administration and
|
|
streaming replication concepts. For further details on streaming
|
|
replication, see the PostgreSQL documentation section on <ulink
|
|
url="https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION">
|
|
streaming replication</ulink>.
|
|
</para>
|
|
<para>
|
|
The following terms are used throughout the &repmgr; documentation.
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>replication cluster</term>
|
|
<listitem>
|
|
<simpara>
|
|
In the &repmgr; documentation, "replication cluster" refers to the network
|
|
of PostgreSQL servers connected by streaming replication.
|
|
</simpara>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>node</term>
|
|
<listitem>
|
|
<simpara>
|
|
A node is a single PostgreSQL server within a replication cluster.
|
|
</simpara>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>upstream node</term>
|
|
<listitem>
|
|
<simpara>
|
|
The node a standby server connects to, in order to receive streaming replication.
|
|
This is either the primary server, or in the case of cascading replication, another
|
|
standby.
|
|
</simpara>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>failover</term>
|
|
<listitem>
|
|
<simpara>
|
|
This is the action which occurs if a primary server fails and a suitable standby
|
|
is promoted as the new primary. The &repmgrd; daemon supports automatic failover
|
|
to minimise downtime.
|
|
</simpara>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>switchover</term>
|
|
<listitem>
|
|
<simpara>
|
|
In certain circumstances, such as hardware or operating system maintenance,
|
|
it's necessary to take a primary server offline; in this case a controlled
|
|
switchover is necessary, whereby a suitable standby is promoted and the
|
|
existing primary removed from the replication cluster in a controlled manner.
|
|
The &repmgr; command line client provides this functionality.
|
|
</simpara>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>fencing</term>
|
|
<listitem>
|
|
<simpara>
|
|
In a failover situation, following the promotion of a new standby, it's
|
|
essential that the previous primary does not unexpectedly come back on
|
|
line, which would result in a split-brain situation. To prevent this,
|
|
the failed primary should be isolated from applications, i.e. "fenced off".
|
|
</simpara>
|
|
</listitem>
|
|
</varlistentry>
|
|
<varlistentry id="witness-server">
|
|
<term>witness server</term>
|
|
<listitem>
|
|
<para>
|
|
&repmgr; provides functionality to set up a so-called "witness server" to
|
|
assist in determining a new primary server in a failover situation with more
|
|
than one standby. The witness server itself is not part of the replication
|
|
cluster, although it does contain a copy of the repmgr metadata schema.
|
|
</para>
|
|
<para>
|
|
The purpose of a witness server is to provide a "casting vote" where servers
|
|
in the replication cluster are split over more than one location. In the event
|
|
of a loss of connectivity between locations, the presence or absence of
|
|
the witness server will decide whether a server at that location is promoted
|
|
to primary; this is to prevent a "split-brain" situation where an isolated
|
|
location interprets a network outage as a failure of the (remote) primary and
|
|
promotes a (local) standby.
|
|
</para>
|
|
<para>
|
|
A witness server only needs to be created if &repmgrd;
|
|
is in use.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</para>
|
|
</sect1>
|
|
<sect1 id="repmgr-components" xreflabel="Components">
|
|
<title>Components</title>
|
|
<para>
|
|
&repmgr; is a suite of open-source tools to manage replication and failover
|
|
within a cluster of PostgreSQL servers. It supports and enhances PostgreSQL's
|
|
built-in streaming replication, which provides a single read/write primary server
|
|
and one or more read-only standbys containing near-real time copies of the primary
|
|
server's database. It provides two main tools:
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>repmgr</term>
|
|
<listitem>
|
|
<para>
|
|
A command-line tool used to perform administrative tasks such as:
|
|
<itemizedlist>
|
|
<listitem>
|
|
<simpara>setting up standby servers</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>promoting a standby server to primary</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>switching over primary and standby servers</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>displaying the status of servers in the replication cluster</simpara>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>repmgrd</term>
|
|
<listitem>
|
|
<para>
|
|
A daemon which actively monitors servers in a replication cluster
|
|
and performs the following tasks:
|
|
<itemizedlist>
|
|
<listitem>
|
|
<simpara>monitoring and recording replication performance</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>performing failover by detecting failure of the primary and
|
|
promoting the most suitable standby server
|
|
</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>provide notifications about events in the cluster to a user-defined
|
|
script which can perform tasks such as sending alerts by email</simpara>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="repmgr-user-metadata" xreflabel="Repmgr user and metadata">
|
|
<title>Repmgr user and metadata</title>
|
|
<para>
|
|
In order to effectively manage a replication cluster, &repmgr; needs to store
|
|
information about the servers in the cluster in a dedicated database schema.
|
|
This schema is automatically created by the &repmgr; extension, which is installed
|
|
during the first step in initializing a &repmgr;-administered cluster
|
|
(<command><link linkend="repmgr-primary-register">repmgr primary register</link></command>)
|
|
and contains the following objects:
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>Tables</term>
|
|
<listitem>
|
|
<para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<simpara><literal>repmgr.events</literal>: records events of interest</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara><literal>repmgr.nodes</literal>: connection and status information for each server in the
|
|
replication cluster</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara><literal>repmgr.monitoring_history</literal>: historical standby monitoring information
|
|
written by &repmgrd;</simpara>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
<varlistentry>
|
|
<term>Views</term>
|
|
<listitem>
|
|
<para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<simpara>repmgr.show_nodes: based on the table <literal>repmgr.nodes</literal>, additionally showing the
|
|
name of the server's upstream node</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>repmgr.replication_status: when &repmgrd;'s monitoring is enabled, shows
|
|
current monitoring status for each standby.</simpara>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</para>
|
|
|
|
<para>
|
|
The &repmgr; metadata schema can be stored in an existing database or in its own
|
|
dedicated database. Note that the &repmgr; metadata schema cannot reside on a database
|
|
server which is not part of the replication cluster managed by &repmgr;.
|
|
</para>
|
|
<para>
|
|
A database user must be available for &repmgr; to access this database and perform
|
|
necessary changes. This user does not need to be a superuser, however some operations
|
|
such as initial installation of the &repmgr; extension will require a superuser
|
|
connection (this can be specified where required with the command line option
|
|
<literal>--superuser</literal>).
|
|
</para>
|
|
</sect1>
|
|
|
|
</chapter>
|