mirror of
https://github.com/EnterpriseDB/repmgr.git
synced 2026-03-23 15:16:29 +00:00
Compare commits
94 Commits
| SHA1 |
|---|
| 2615cffecc |
| 1f838f99c2 |
| d3f119005b |
| db6d4d8820 |
| 7a8a50e229 |
| e188044593 |
| 636f4b03c6 |
| bf96b383a3 |
| 3a2e40f381 |
| c608bb28ee |
| ca9c2e1143 |
| 3a6d6b8899 |
| 4091cb7f18 |
| 870b0a53b6 |
| 6184cc57be |
| e1254b6773 |
| 1c9121c2d8 |
| 6da03a6157 |
| 9bb6befa25 |
| a8e5c68d03 |
| b83e18c503 |
| d4b845d213 |
| 75aad9a85e |
| e115825cd6 |
| 6cf5ab2e53 |
| f8119d20ea |
| 0caddf2d2c |
| a4abbc6f0c |
| d7e489ea0a |
| 2bcacff3b3 |
| 45eb0ea5d3 |
| c3bd02b83d |
| 8e7d110a22 |
| 43874d5576 |
| 87ff9d09ba |
| c429b0b186 |
| 03b88178c1 |
| 5f33f4286f |
| 932f84910b |
| 1ef7f1368d |
| 640abed18f |
| ef6b24551a |
| 42847e44d2 |
| dd7cfce3d3 |
| 30fd111cba |
| 65e63b062e |
| 053f672caa |
| f6d02b85d8 |
| 6ebf3a7319 |
| 7345ddcf00 |
| eb0af7ca23 |
| ae47e5f413 |
| 46100a9549 |
| 9bd95cabdf |
| f1584469bf |
| a7f46d24de |
| 462d446477 |
| 23a72f489c |
| f3f56b0cd6 |
| 00146b7fbd |
| faf72a2514 |
| 7010b636e0 |
| 00deff9069 |
| 5240a5723a |
| 45e29c5b28 |
| 5def293ed6 |
| ff7b4d3f02 |
| a54478a045 |
| 7ad9a2c28a |
| 3deb6784e7 |
| ba275bb0c2 |
| 9735bb63a1 |
| 1e5792f8df |
| a01fefa7d0 |
| 34eaf94b2b |
| 68e3a9d7ab |
| 2ad4f68700 |
| 00aa0c8c87 |
| e8025c7c9f |
| 6a17360b4c |
| 9e5e843a4f |
| 734ae1825e |
| 41fe58764e |
| 58a5249b7e |
| 90c0bd4638 |
| 359e81a6d6 |
| 0037e66034 |
| 07d220cb00 |
| 4dfeffe087 |
| 18544c82ca |
| 0f86bdcd05 |
| 7d33c1e411 |
| fec65bde3d |
| 4863ea98bc |
29
CONTRIBUTING.md
Normal file
@@ -0,0 +1,29 @@
|
|||||||
|
License and Contributions
|
||||||
|
=========================
|
||||||
|
|
||||||
|
`repmgr` is licensed under the GPL v3. All of its code and documentation is
|
||||||
|
Copyright 2010-2015, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
|
||||||
|
details.
|
||||||
|
|
||||||
|
The development of repmgr has primarily been sponsored by 2ndQuadrant customers.
|
||||||
|
|
||||||
|
Additional work has been sponsored by the 4CaaST project for cloud computing,
|
||||||
|
which has received funding from the European Union's Seventh Framework Programme
|
||||||
|
(FP7/2007-2013) under grant agreement 258862.
|
||||||
|
|
||||||
|
Contributions to `repmgr` are welcome, and will be listed in the file `CREDITS`.
|
||||||
|
2ndQuadrant Limited requires that any contributions provide a copyright
|
||||||
|
assignment and a disclaimer of any work-for-hire ownership claims from the
|
||||||
|
employer of the developer. This lets us make sure that all of the repmgr
|
||||||
|
distribution remains free code. Please contact info@2ndQuadrant.com for a
|
||||||
|
copy of the relevant Copyright Assignment Form.
|
||||||
|
|
||||||
|
Code style
|
||||||
|
----------
|
||||||
|
|
||||||
|
Code in repmgr is formatted to a consistent style using the following command:
|
||||||
|
|
||||||
|
astyle --style=ansi --indent=tab --suffix=none *.c *.h
|
||||||
|
|
||||||
|
Contributors should reformat their code similarly before submitting code to
|
||||||
|
the project, in order to minimize merge conflicts with other work.
|
||||||
@@ -203,6 +203,12 @@ repmgr will also ask for the superuser password on the witness database so
|
|||||||
it can reconnect when needed (the command line option --initdb-no-pwprompt
|
it can reconnect when needed (the command line option --initdb-no-pwprompt
|
||||||
will set up a password-less superuser).
|
will set up a password-less superuser).
|
||||||
|
|
||||||
|
By default the witness server will listen on port 5499; this value can be
|
||||||
|
overridden by explicitly providing the port number in the conninfo string
|
||||||
|
in repmgr.conf. (Note that it is also possible to specify the port number
|
||||||
|
with the -l/--local-port option, however this option is now deprecated and
|
||||||
|
will be overridden by a port setting in the conninfo string).
|
||||||
|
|
||||||
Start the repmgrd daemons
|
Start the repmgrd daemons
|
||||||
-------------------------
|
-------------------------
|
||||||
|
|
||||||
|
|||||||
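The port precedence described in the hunk above (5499 by default, overridden by a `port=` keyword in the conninfo string) can be sketched in shell. This is only an illustration: `conninfo_port` is a hypothetical helper, not part of repmgr, and it assumes a simple space-separated keyword/value conninfo string with no quoted values. The host, user and database names are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (not part of repmgr): print the port named in a
# keyword/value conninfo string, or the witness default of 5499 if the
# string contains no port= keyword. Quoted values are not handled.
conninfo_port() {
    port=5499
    # Rely on word splitting to walk the space-separated keyword=value pairs.
    for kv in $1; do
        case "$kv" in
            port=*) port=${kv#port=} ;;
        esac
    done
    echo "$port"
}

# A conninfo string with an explicit port, then one without.
conninfo_port "host=witness1 user=repmgr_usr dbname=repmgr_db port=5500"
conninfo_port "host=witness1 user=repmgr_usr dbname=repmgr_db"
```

The first call prints the explicit port and the second falls back to the default, which mirrors the behaviour the text describes for a witness server whose conninfo string does or does not name a port.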
19
FAQ.md
@@ -90,6 +90,23 @@ General
|
|||||||
|
|
||||||
This option is only available when using the `--rsync-only` option.
|
This option is only available when using the `--rsync-only` option.
|
||||||
|
|
||||||
|
- How can I make the witness server use a particular port?
|
||||||
|
|
||||||
|
By default the witness server is configured to use port 5499; this
|
||||||
|
is intended to support running the witness server as a separate
|
||||||
|
instance on a normal node server, rather than on its own dedicated server.
|
||||||
|
|
||||||
|
To specify a port for the witness server, supply the port number to
|
||||||
|
repmgr with the `-l/--local-port` command line option.
|
||||||
|
|
||||||
|
- Do I need to include `shared_preload_libraries = 'repmgr_funcs'`
|
||||||
|
in `postgresql.conf` if I'm not using `repmgrd`?
|
||||||
|
|
||||||
|
No, the `repmgr_funcs` library is only needed when running `repmgrd`.
|
||||||
|
If you later decide to run `repmgrd`, you just need to add
|
||||||
|
`shared_preload_libraries = 'repmgr_funcs'` and restart PostgreSQL.
|
||||||
|
|
||||||
|
|
||||||
`repmgrd`
|
`repmgrd`
|
||||||
---------
|
---------
|
||||||
|
|
||||||
@@ -102,7 +119,7 @@ General
|
|||||||
|
|
||||||
- How can I prevent a node from ever being promoted to master?
|
- How can I prevent a node from ever being promoted to master?
|
||||||
|
|
||||||
In `rempgr.conf`, set its priority to a value of 0 or less.
|
In `repmgr.conf`, set its priority to a value of 0 or less.
|
||||||
|
|
||||||
- Does `repmgrd` support delayed standbys?
|
- Does `repmgrd` support delayed standbys?
|
||||||
|
|
||||||
|
|||||||
30
HISTORY
@@ -1,4 +1,27 @@
|
|||||||
3.0
|
3.0.2 2015-09-
|
||||||
|
Improve handling of --help/--version options; and improve help output (Ian)
|
||||||
|
Improve handling of situation where logfile can't be opened (Ian)
|
||||||
|
Always pass -D/--pgdata option to pg_basebackup (Ian)
|
||||||
|
Bugfix: standby clone --force does not empty pg_xlog (Gianni)
|
||||||
|
Bugfix: autofailover with reconnect_attempts > 1 (Gianni)
|
||||||
|
Bugfix: ignore comments after values (soxwellfb)
|
||||||
|
Bugfix: handle string values in 'node' parameter correctly (Gregory Duchatelet)
|
||||||
|
Allow repmgr to be compiled with a newer libpq (Marco)
|
||||||
|
Bugfix: call update_node_record_set_upstream() for STANDBY FOLLOW (Tomas)
|
||||||
|
Update `repmgr --help` output (per Github report from renard)
|
||||||
|
Update tablespace remapping in --rsync-only mode for 9.5 and later (Ian)
|
||||||
|
Deprecate `-l/--local-port` option - the port can be extracted
|
||||||
|
from the conninfo string in repmgr.conf (Ian)
|
||||||
|
Add STANDBY UNREGISTER (Vik Fearing)
|
||||||
|
|
||||||
|
3.0.1 2015-04-16
|
||||||
|
Prevent repmgrd from looping infinitely if node was not registered (Ian)
|
||||||
|
When promoting a standby, have repmgr (not repmgrd) handle metadata updates (Ian)
|
||||||
|
Re-use replication slot if it already exists (Ian)
|
||||||
|
Prevent a test SSH connection being made when not needed (Ian)
|
||||||
|
Correct monitoring table column names (Ian)
|
||||||
|
|
||||||
|
3.0 2015-03-27
|
||||||
Require PostgreSQL 9.3 or later (Ian)
|
Require PostgreSQL 9.3 or later (Ian)
|
||||||
Use `pg_basebackup` by default (instead of `rsync`) to clone standby servers (Ian)
|
Use `pg_basebackup` by default (instead of `rsync`) to clone standby servers (Ian)
|
||||||
Use `pg_ctl promote` to promote a standby to primary
|
Use `pg_ctl promote` to promote a standby to primary
|
||||||
@@ -11,6 +34,11 @@
|
|||||||
General usability and logging message improvements (Ian)
|
General usability and logging message improvements (Ian)
|
||||||
Code consolidation and cleanup (Ian)
|
Code consolidation and cleanup (Ian)
|
||||||
|
|
||||||
|
2.0.3 2015-04-16
|
||||||
|
Add -S/--superuser option for witness database creation (Ian)
|
||||||
|
Add -c/--fast-checkpoint option for cloning (Christoph)
|
||||||
|
Add option "--initdb-no-pwprompt" (Ian)
|
||||||
|
|
||||||
2.0.2 2015-02-17
|
2.0.2 2015-02-17
|
||||||
Add "--checksum" in rsync when using "--force" (Jaime)
|
Add "--checksum" in rsync when using "--force" (Jaime)
|
||||||
Use createdb/createuser instead of psql (Jaime)
|
Use createdb/createuser instead of psql (Jaime)
|
||||||
|
|||||||
96
PACKAGES.md
@@ -4,10 +4,10 @@ Packaging
|
|||||||
Notes on RedHat Linux, Fedora, and CentOS Builds
|
Notes on RedHat Linux, Fedora, and CentOS Builds
|
||||||
------------------------------------------------
|
------------------------------------------------
|
||||||
|
|
||||||
The RPM packages of PostgreSQL put ``pg_config`` into the ``postgresql-devel``
|
The RPM packages of PostgreSQL put `pg_config` into the `postgresql-devel`
|
||||||
package, not the main server one. And if you have a RPM install of PostgreSQL
|
package, not the main server one. And if you have a RPM install of PostgreSQL
|
||||||
9.0, the entire PostgreSQL binary directory will not be in your PATH by default
|
9.0, the entire PostgreSQL binary directory will not be in your PATH by default
|
||||||
either. Individual utilities are made available via the ``alternatives``
|
either. Individual utilities are made available via the `alternatives`
|
||||||
mechanism, but not all commands will be wrapped that way. The files installed
|
mechanism, but not all commands will be wrapped that way. The files installed
|
||||||
by repmgr will certainly not be in the default PATH for the postgres user
|
by repmgr will certainly not be in the default PATH for the postgres user
|
||||||
on such a system. They will instead be in /usr/pgsql-9.0/bin/ on this
|
on such a system. They will instead be in /usr/pgsql-9.0/bin/ on this
|
||||||
@@ -15,57 +15,61 @@ type of system.
|
|||||||
|
|
||||||
When building repmgr against a RPM packaged build, you may discover that some
|
When building repmgr against a RPM packaged build, you may discover that some
|
||||||
development packages are needed as well. The following build errors can
|
development packages are needed as well. The following build errors can
|
||||||
occur::
|
occur:
|
||||||
|
|
||||||
/usr/bin/ld: cannot find -lxslt
|
/usr/bin/ld: cannot find -lxslt
|
||||||
/usr/bin/ld: cannot find -lpam
|
/usr/bin/ld: cannot find -lpam
|
||||||
|
|
||||||
Install the following packages to correct those::
|
Install the following packages to correct those:
|
||||||
|
|
||||||
yum install libxslt-devel
|
|
||||||
yum install pam-devel
|
yum install libxslt-devel
|
||||||
|
yum install pam-devel
|
||||||
|
|
||||||
If building repmgr as a regular user, then doing the install into the system
|
If building repmgr as a regular user, then doing the install into the system
|
||||||
directories using sudo, the syntax is hard. ``pg_config`` won't be in root's
|
directories using sudo, the syntax is hard. `pg_config` won't be in root's
|
||||||
path either. The following recipe should work::
|
path either. The following recipe should work:
|
||||||
|
|
||||||
|
sudo PATH="/usr/pgsql-9.0/bin:$PATH" make USE_PGXS=1 install
|
||||||
|
|
||||||
sudo PATH="/usr/pgsql-9.0/bin:$PATH" make USE_PGXS=1 install
|
|
||||||
|
|
||||||
Issues with 32 and 64 bit RPMs
|
Issues with 32 and 64 bit RPMs
|
||||||
------------------------------
|
------------------------------
|
||||||
|
|
||||||
If when building, you receive a series of errors of this form::
|
If when building, you receive a series of errors of this form:
|
||||||
|
|
||||||
/usr/bin/ld: skipping incompatible /usr/pgsql-9.0/lib/libpq.so when searching for -lpq
|
/usr/bin/ld: skipping incompatible /usr/pgsql-9.0/lib/libpq.so when searching for -lpq
|
||||||
|
|
||||||
This is likely because you have both the 32 and 64 bit versions of the
|
This is likely because you have both the 32 and 64 bit versions of the
|
||||||
``postgresql90-devel`` package installed. You can check that like this::
|
`postgresql90-devel` package installed. You can check that like this:
|
||||||
|
|
||||||
rpm -qa --queryformat '%{NAME}\t%{ARCH}\n' | grep postgresql90-devel
|
rpm -qa --queryformat '%{NAME}\t%{ARCH}\n' | grep postgresql90-devel
|
||||||
|
|
||||||
And if two packages appear, one for i386 and one for x86_64, that's not supposed
|
And if two packages appear, one for i386 and one for x86_64, that's not supposed
|
||||||
to be allowed.
|
to be allowed.
|
||||||
|
|
||||||
This can happen when using the PGDG repo to install that package;
|
This can happen when using the PGDG repo to install that package;
|
||||||
here is an example session demonstrating the problem case appearing::
|
here is an example session demonstrating the problem case appearing:
|
||||||
|
|
||||||
# yum install postgresql-devel
|
|
||||||
..
|
|
||||||
Setting up Install Process
|
|
||||||
Resolving Dependencies
|
|
||||||
--> Running transaction check
|
|
||||||
---> Package postgresql90-devel.i386 0:9.0.2-2PGDG.rhel5 set to be updated
|
|
||||||
---> Package postgresql90-devel.x86_64 0:9.0.2-2PGDG.rhel5 set to be updated
|
|
||||||
--> Finished Dependency Resolution
|
|
||||||
|
|
||||||
Dependencies Resolved
|
# yum install postgresql-devel
|
||||||
|
..
|
||||||
|
Setting up Install Process
|
||||||
|
Resolving Dependencies
|
||||||
|
--> Running transaction check
|
||||||
|
---> Package postgresql90-devel.i386 0:9.0.2-2PGDG.rhel5 set to be updated
|
||||||
|
---> Package postgresql90-devel.x86_64 0:9.0.2-2PGDG.rhel5 set to be updated
|
||||||
|
--> Finished Dependency Resolution
|
||||||
|
|
||||||
|
Dependencies Resolved
|
||||||
|
|
||||||
|
=========================================================================
|
||||||
|
Package Arch Version Repository Size
|
||||||
|
=========================================================================
|
||||||
|
Installing:
|
||||||
|
postgresql90-devel i386 9.0.2-2PGDG.rhel5 pgdg90 1.5 M
|
||||||
|
postgresql90-devel x86_64 9.0.2-2PGDG.rhel5 pgdg90 1.6 M
|
||||||
|
|
||||||
=========================================================================
|
|
||||||
Package Arch Version Repository Size
|
|
||||||
=========================================================================
|
|
||||||
Installing:
|
|
||||||
postgresql90-devel i386 9.0.2-2PGDG.rhel5 pgdg90 1.5 M
|
|
||||||
postgresql90-devel x86_64 9.0.2-2PGDG.rhel5 pgdg90 1.6 M
|
|
||||||
|
|
||||||
Note how both the i386 and x86_64 platform architectures are selected for
|
Note how both the i386 and x86_64 platform architectures are selected for
|
||||||
installation. Your main PostgreSQL package will only be compatible with one of
|
installation. Your main PostgreSQL package will only be compatible with one of
|
||||||
@@ -73,14 +77,14 @@ those, and if the repmgr build finds the wrong postgresql90-devel these
|
|||||||
"skipping incompatible" messages appear.
|
"skipping incompatible" messages appear.
|
||||||
|
|
||||||
In this case, you can temporarily remove both packages, then just install the
|
In this case, you can temporarily remove both packages, then just install the
|
||||||
correct one for your architecture. Example::
|
correct one for your architecture. Example:
|
||||||
|
|
||||||
rpm -e postgresql90-devel --allmatches
|
rpm -e postgresql90-devel --allmatches
|
||||||
yum install postgresql90-devel-9.0.2-2PGDG.rhel5.x86_64
|
yum install postgresql90-devel-9.0.2-2PGDG.rhel5.x86_64
|
||||||
|
|
||||||
Instead just deleting the package from the wrong platform might not leave behind
|
Instead just deleting the package from the wrong platform might not leave behind
|
||||||
the correct files, due to the way in which these accidentally happen to interact.
|
the correct files, due to the way in which these accidentally happen to interact.
|
||||||
If you already tried to build repmgr before doing this, you'll need to do::
|
If you already tried to build repmgr before doing this, you'll need to do:
|
||||||
|
|
||||||
make USE_PGXS=1 clean
|
make USE_PGXS=1 clean
|
||||||
|
|
||||||
@@ -89,19 +93,19 @@ to get rid of leftover files from the wrong architecture.
|
|||||||
Notes on Ubuntu, Debian or other Debian-based Builds
|
Notes on Ubuntu, Debian or other Debian-based Builds
|
||||||
----------------------------------------------------
|
----------------------------------------------------
|
||||||
|
|
||||||
The Debian packages of PostgreSQL put ``pg_config`` into the development package
|
The Debian packages of PostgreSQL put `pg_config` into the development package
|
||||||
called ``postgresql-server-dev-$version``.
|
called `postgresql-server-dev-$version`.
|
||||||
|
|
||||||
When building repmgr against a Debian packages build, you may discover that some
|
When building repmgr against a Debian packages build, you may discover that some
|
||||||
development packages are needed as well. You will need the following development
|
development packages are needed as well. You will need the following development
|
||||||
packages installed::
|
packages installed:
|
||||||
|
|
||||||
sudo apt-get install libxslt-dev libxml2-dev libpam-dev libedit-dev
|
sudo apt-get install libxslt-dev libxml2-dev libpam-dev libedit-dev
|
||||||
|
|
||||||
If your using Debian packages for PostgreSQL and are building repmgr with the
|
If you're using Debian packages for PostgreSQL and are building repmgr with the
|
||||||
USE_PGXS option you also need to install the corresponding development package::
|
USE_PGXS option you also need to install the corresponding development package:
|
||||||
|
|
||||||
sudo apt-get install postgresql-server-dev-9.0
|
sudo apt-get install postgresql-server-dev-9.0
|
||||||
|
|
||||||
If you build and install repmgr manually it will not be on the system path. The
|
If you build and install repmgr manually it will not be on the system path. The
|
||||||
binaries will be installed in /usr/lib/postgresql/$version/bin/ which is not on
|
binaries will be installed in /usr/lib/postgresql/$version/bin/ which is not on
|
||||||
@@ -110,14 +114,14 @@ multiple installed versions of PostgreSQL on the same system through a wrapper
|
|||||||
called pg_wrapper and repmgr is not (yet) known to this wrapper.
|
called pg_wrapper and repmgr is not (yet) known to this wrapper.
|
||||||
|
|
||||||
You can solve this in many different ways, the most Debian like is to make an
|
You can solve this in many different ways, the most Debian like is to make an
|
||||||
alternate for repmgr and repmgrd::
|
alternate for repmgr and repmgrd:
|
||||||
|
|
||||||
sudo update-alternatives --install /usr/bin/repmgr repmgr /usr/lib/postgresql/9.0/bin/repmgr 10
|
sudo update-alternatives --install /usr/bin/repmgr repmgr /usr/lib/postgresql/9.0/bin/repmgr 10
|
||||||
sudo update-alternatives --install /usr/bin/repmgrd repmgrd /usr/lib/postgresql/9.0/bin/repmgrd 10
|
sudo update-alternatives --install /usr/bin/repmgrd repmgrd /usr/lib/postgresql/9.0/bin/repmgrd 10
|
||||||
|
|
||||||
You can also make a deb package of repmgr using::
|
You can also make a deb package of repmgr using:
|
||||||
|
|
||||||
make USE_PGXS=1 deb
|
make USE_PGXS=1 deb
|
||||||
|
|
||||||
This will build a Debian package one level up from where you build, normally the
|
This will build a Debian package one level up from where you build, normally the
|
||||||
same directory that you have your repmgr/ directory in.
|
same directory that you have your repmgr/ directory in.
|
||||||
|
|||||||
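The 32/64 bit check described in PACKAGES.md (listing package names with their architectures and looking for two entries) could be automated roughly as follows. This is a sketch under stated assumptions: `count_arches` is a hypothetical helper, and it expects tab-separated `name<TAB>arch` lines on standard input in the shape produced by the `rpm -qa --queryformat` command shown in the document.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>arch" lines on stdin and print how
# many distinct architectures are listed for the package named in $1.
count_arches() {
    awk -F '\t' -v pkg="$1" '
        $1 == pkg { seen[$2] = 1 }
        END { n = 0; for (a in seen) n++; print n }
    '
}

# Against a live RPM database (requires rpm), the input would come from:
#   rpm -qa --queryformat '%{NAME}\t%{ARCH}\n' | count_arches postgresql90-devel
# Here the problem case from the document is simulated with printf.
printf 'postgresql90-devel\ti386\npostgresql90-devel\tx86_64\n' |
    count_arches postgresql90-devel
```

A result greater than 1 corresponds to the situation the document warns about, where both the i386 and x86_64 development packages are installed.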
@@ -21,7 +21,8 @@ Master setup
|
|||||||
CREATE DATABASE repmgr_db OWNER repmgr_usr;
|
CREATE DATABASE repmgr_db OWNER repmgr_usr;
|
||||||
```
|
```
|
||||||
|
|
||||||
- configure `postgresql.conf` for replication (see above)
|
- configure `postgresql.conf` for replication (see README.md for sample
|
||||||
|
settings)
|
||||||
|
|
||||||
- update `pg_hba.conf`, e.g.:
|
- update `pg_hba.conf`, e.g.:
|
||||||
|
|
||||||
@@ -71,7 +72,10 @@ Standby setup
|
|||||||
[2015-03-03 18:18:23] [NOTICE] HINT: You can now start your postgresql server
|
[2015-03-03 18:18:23] [NOTICE] HINT: You can now start your postgresql server
|
||||||
[2015-03-03 18:18:23] [NOTICE] for example : pg_ctl -D /path/to/standby/data start
|
[2015-03-03 18:18:23] [NOTICE] for example : pg_ctl -D /path/to/standby/data start
|
||||||
|
|
||||||
Note that at this point it does not matter if the `repmgr.conf` file is not found.
|
Note that the `repmgr.conf` file is not required when cloning a standby.
|
||||||
|
However we recommend providing a valid `repmgr.conf` if you wish to use
|
||||||
|
replication slots, or want `repmgr` to log the clone event to the
|
||||||
|
`repl_events` table.
|
||||||
|
|
||||||
This will clone the PostgreSQL database files from the master, including its
|
This will clone the PostgreSQL database files from the master, including its
|
||||||
`postgresql.conf` and `pg_hba.conf` files, and additionally automatically create
|
`postgresql.conf` and `pg_hba.conf` files, and additionally automatically create
|
||||||
@@ -107,8 +111,8 @@ This concludes the basic `repmgr` setup of master and standby. The records
|
|||||||
created in the `repl_nodes` table should look something like this:
|
created in the `repl_nodes` table should look something like this:
|
||||||
|
|
||||||
repmgr_db=# SELECT * from repmgr_test.repl_nodes;
|
repmgr_db=# SELECT * from repmgr_test.repl_nodes;
|
||||||
id | type | upstream_node_id | cluster | name | conninfo | slot_name | priority | active
|
id | type | upstream_node_id | cluster | name | conninfo | slot_name | priority | active
|
||||||
----+---------+------------------+---------+-------+-------------------------------------------------+-----------+----------+--------
|
----+---------+------------------+---------+-------+----------------------------------------------------+-----------+----------+--------
|
||||||
1 | primary | | test | node1 | host=localhost user=repmgr_usr dbname=repmgr_db | | 0 | t
|
1 | primary | | test | node1 | host=repmgr_node1 user=repmgr_usr dbname=repmgr_db | | 0 | t
|
||||||
2 | standby | 1 | test | node2 | host=localhost user=repmgr_usr dbname=repmgr_db | | 0 | t
|
2 | standby | 1 | test | node2 | host=repmgr_node2 user=repmgr_usr dbname=repmgr_db | | 0 | t
|
||||||
(2 rows)
|
(2 rows)
|
||||||
|
|||||||
45
README.md
@@ -7,7 +7,7 @@ hot-standby capabilities with tools to set up standby servers, monitor
|
|||||||
replication, and perform administrative tasks such as failover or manual
|
replication, and perform administrative tasks such as failover or manual
|
||||||
switchover operations.
|
switchover operations.
|
||||||
|
|
||||||
This document covers `repmgr 3`, which supports PostgreSQL 9.4 and 9.3.
|
This document covers `repmgr 3`, which supports PostgreSQL 9.3 and later.
|
||||||
This version can use `pg_basebackup` to clone standby servers, supports
|
This version can use `pg_basebackup` to clone standby servers, supports
|
||||||
replication slots and cascading replication, doesn't require a restart
|
replication slots and cascading replication, doesn't require a restart
|
||||||
after promotion, and has many usability improvements.
|
after promotion, and has many usability improvements.
|
||||||
@@ -53,7 +53,7 @@ on any UNIX-like system which PostgreSQL itself supports.
|
|||||||
|
|
||||||
All nodes must be running the same major version of PostgreSQL, and we
|
All nodes must be running the same major version of PostgreSQL, and we
|
||||||
recommend that they also run the same minor version. This version of
|
recommend that they also run the same minor version. This version of
|
||||||
`repmgr` (v3) supports PostgreSQL 9.3 and 9.4.
|
`repmgr` (v3) supports PostgreSQL 9.3 and later.
|
||||||
|
|
||||||
Earlier versions of `repmgr` needed password-less SSH access between
|
Earlier versions of `repmgr` needed password-less SSH access between
|
||||||
nodes in order to clone standby servers using `rsync`. `repmgr 3` can
|
nodes in order to clone standby servers using `rsync`. `repmgr 3` can
|
||||||
@@ -98,8 +98,8 @@ for details.
|
|||||||
|
|
||||||
### PostgreSQL configuration
|
### PostgreSQL configuration
|
||||||
|
|
||||||
The primary server needs to be configured for replication with the
|
The primary server needs to be configured for replication with settings
|
||||||
following settings in `postgresql.conf`:
|
like the following in `postgresql.conf`:
|
||||||
|
|
||||||
# Allow read-only queries on standby servers. The number of WAL
|
# Allow read-only queries on standby servers. The number of WAL
|
||||||
# senders should be larger than the number of standby servers.
|
# senders should be larger than the number of standby servers.
|
||||||
@@ -121,13 +121,18 @@ following settings in `postgresql.conf`:
|
|||||||
archive_mode = on
|
archive_mode = on
|
||||||
archive_command = 'cd .'
|
archive_command = 'cd .'
|
||||||
|
|
||||||
# You can also set additional replication parameters here, such as
|
# If you plan to use repmgrd, ensure that shared_preload_libraries
|
||||||
# hot_standby_feedback or synchronous_standby_names.
|
# is configured to load 'repmgr_funcs'
|
||||||
|
|
||||||
|
shared_preload_libraries = 'repmgr_funcs'
|
||||||
|
|
||||||
PostgreSQL 9.4 makes it possible to use replication slots, which means
|
PostgreSQL 9.4 makes it possible to use replication slots, which means
|
||||||
the value of wal_keep_segments need no longer be set. With 9.3, `repmgr`
|
the value of `wal_keep_segments` need no longer be set. See section
|
||||||
expects it to be set to at least 5000 (= 80GB of WAL) by default, though
|
"Replication slots" below for more details.
|
||||||
this can be overriden with the `-w N` argument.
|
|
||||||
|
With PostgreSQL 9.3, `repmgr` expects `wal_keep_segments` to be set to
|
||||||
|
at least 5000 (= 80GB of WAL) by default, though this can be overridden
|
||||||
|
with the `-w N` argument.
|
||||||
|
|
||||||
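The "5000 (= 80GB of WAL)" figure follows from the WAL segment size. A minimal sketch of the arithmetic, assuming the stock PostgreSQL segment size of 16MB (a build-time setting, so it may differ on a custom build); `wal_keep_mb` is a hypothetical helper name used only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: disk space in MB retained by a given
# wal_keep_segments value. $1 = number of segments, $2 = segment size
# in MB (assumed to be 16 unless stated otherwise).
wal_keep_mb() {
    echo $(( $1 * ${2:-16} ))
}

# 5000 segments at 16MB each is 80000MB, which the README rounds to 80GB.
wal_keep_mb 5000
```

The same helper can be used to estimate the disk space a smaller value passed with `-w N` would retain.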
A dedicated PostgreSQL superuser account and a database in which to
|
A dedicated PostgreSQL superuser account and a database in which to
|
||||||
store monitoring and replication data are required. Create them by
|
store monitoring and replication data are required. Create them by
|
||||||
@@ -223,7 +228,7 @@ The node can then be restarted.
|
|||||||
The node will then need to be re-registered with `repmgr`; again
|
The node will then need to be re-registered with `repmgr`; again
|
||||||
the `--force` option is required to update the existing record:
|
the `--force` option is required to update the existing record:
|
||||||
|
|
||||||
repmgr -f /etc/repmgr/repmgr.conf
|
repmgr -f /etc/repmgr/repmgr.conf \
|
||||||
--force \
|
--force \
|
||||||
standby register
|
standby register
|
||||||
|
|
||||||
@@ -345,6 +350,7 @@ Following event types currently exist:
|
|||||||
|
|
||||||
master_register
|
master_register
|
||||||
standby_register
|
standby_register
|
||||||
|
standby_unregister
|
||||||
standby_clone
|
standby_clone
|
||||||
standby_promote
|
standby_promote
|
||||||
witness_create
|
witness_create
|
||||||
@@ -398,6 +404,18 @@ stored in the `repl_nodes` table.
|
|||||||
Note that `repmgr` will fail with an error if this option is specified when
|
Note that `repmgr` will fail with an error if this option is specified when
|
||||||
working with PostgreSQL 9.3.
|
working with PostgreSQL 9.3.
|
||||||
|
|
||||||
|
Be aware that when initially cloning a standby, you will need to ensure
|
||||||
|
that all required WAL files remain available while the cloning is taking
|
||||||
|
place. If using the default `pg_basebackup` method, we recommend setting
|
||||||
|
`pg_basebackup`'s `--xlog-method` parameter to `stream` like this:
|
||||||
|
|
||||||
|
pg_basebackup_options='--xlog-method=stream'
|
||||||
|
|
||||||
|
See the `pg_basebackup` documentation [*] for details. Otherwise you'll need
|
||||||
|
to set `wal_keep_segments` to an appropriately high value.
|
||||||
|
|
||||||
|
[*] http://www.postgresql.org/docs/current/static/app-pgbasebackup.html
|
||||||
|
|
||||||
Further reading:
|
Further reading:
|
||||||
* http://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS
|
* http://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS
|
||||||
* http://blog.2ndquadrant.com/postgresql-9-4-slots/
|
* http://blog.2ndquadrant.com/postgresql-9-4-slots/
|
||||||
@@ -435,12 +453,19 @@ its port if is different from the default one.
|
|||||||
Registers a master in a cluster. This command needs to be executed before any
|
Registers a master in a cluster. This command needs to be executed before any
|
||||||
standby nodes are registered.
|
standby nodes are registered.
|
||||||
|
|
||||||
|
`primary register` can be used as an alias for `master register`.
|
||||||
|
|
||||||
* `standby register`
|
* `standby register`
|
||||||
|
|
||||||
Registers a standby with `repmgr`. This command needs to be executed to enable
|
Registers a standby with `repmgr`. This command needs to be executed to enable
|
||||||
promote/follow operations and to allow `repmgrd` to work with the node.
|
promote/follow operations and to allow `repmgrd` to work with the node.
|
||||||
An existing standby can be registered using this command.
|
An existing standby can be registered using this command.
|
||||||
|
|
||||||
|
* `standby unregister`
|
||||||
|
|
||||||
|
Unregisters a standby with `repmgr`. This command does not affect the actual
|
||||||
|
replication.
|
||||||
|
|
||||||
* `standby clone [node to be cloned]`
|
* `standby clone [node to be cloned]`
|
||||||
|
|
||||||
Clones a new standby node from the data directory of the master (or
|
Clones a new standby node from the data directory of the master (or
|
||||||
|
|||||||
@@ -1,89 +1,114 @@
|
|||||||
#!/bin/bash
|
#!/bin/sh
|
||||||
#
|
#
|
||||||
# repmgrd Start up the repmgrd daemon
|
# chkconfig: - 75 16
|
||||||
# repmrgd (replication manager daemon)
|
# description: Enable repmgrd replication management and monitoring daemon for PostgreSQL
|
||||||
#
|
# processname: repmgrd
|
||||||
# chkconfig: - 75 16
|
# pidfile="/var/run/${NAME}.pid"
|
||||||
# description: repmgrd is the repliation manager daemon \
|
|
||||||
# The repmgrd replication management and monitoring daemon for PostgreSQL.
|
|
||||||
|
|
||||||
### BEGIN INIT INFO
|
|
||||||
# Provides: repmgrd
|
|
||||||
# Required-Start: $local_fs $remote_fs $network $syslog postgresql
|
|
||||||
# Required-Stop: $local_fs $remote_fs $network $syslog postgresql
|
|
||||||
# Should-Start: $syslog postgresql-9.3
|
|
||||||
# Should-Stop: $syslog postgresql-9.3
|
|
||||||
# Short-Description: start and stop repmrgd
|
|
||||||
# Description: Enable repmgrd replication management and monitoring daemon for PostgreSQL
|
|
||||||
# this is used to monitor a postgresql cluster.
|
|
||||||
### END INIT INFO
|
|
||||||
|
|
||||||
# Source function library.
|
# Source function library.
|
||||||
. /etc/init.d/functions
|
INITD=/etc/rc.d/init.d
|
||||||
|
. $INITD/functions
|
||||||
|
|
||||||
# Source networking configuration.
|
# Get function listing for cross-distribution logic.
|
||||||
|
TYPESET=`typeset -f|grep "declare"`
|
||||||
|
|
||||||
|
# Get network config.
|
||||||
. /etc/sysconfig/network
|
. /etc/sysconfig/network
|
||||||
|
|
||||||
prog=repmgrd
|
DESC="PostgreSQL replication management and monitoring daemon"
|
||||||
REPMGRD_ENABLED=yes
|
NAME=repmgrd
|
||||||
|
|
||||||
|
REPMGRD_ENABLED=no
|
||||||
REPMGRD_OPTS=
|
REPMGRD_OPTS=
|
||||||
REPMGRD_USER=postgres
|
REPMGRD_USER=postgres
|
||||||
DAEMONIZE="-d"
|
REPMGRD_BIN=/usr/pgsql-9.3/bin/repmgrd
|
||||||
|
REPMGRD_PIDFILE=/var/run/repmgrd.pid
|
||||||
|
REPMGRD_LOCK=/var/lock/subsys/${NAME}
|
||||||
|
REPMGRD_LOG=/var/lib/pgsql/9.3/data/pg_log/repmgrd.log
|
||||||
|
|
||||||
# pull in sysconfig settings
|
# Read configuration variable file if it is present
|
||||||
[ -f /etc/sysconfig/repmgrd ] && . /etc/sysconfig/repmgrd
|
[ -r /etc/sysconfig/$NAME ] && . /etc/sysconfig/$NAME
|
||||||
|
|
||||||
LOCKFILE=/var/lock/subsys/$prog
|
# For SELinux we need to use 'runuser' not 'su'
|
||||||
RETVAL=0
|
if [ -x /sbin/runuser ]
|
||||||
|
then
|
||||||
|
SU=runuser
|
||||||
|
else
|
||||||
|
SU=su
|
||||||
|
fi
|
||||||
|
|
||||||
|
test -x $REPMGRD_BIN || exit 0
|
||||||
|
|
||||||
case "$REPMGRD_ENABLED" in
|
case "$REPMGRD_ENABLED" in
|
||||||
[Yy]*)
|
[Yy]*)
|
||||||
#nothing to do here
|
break
|
||||||
;;
|
;;
|
||||||
*)
|
*)
|
||||||
exit 2
|
exit 0
|
||||||
;;
|
;;
|
||||||
esac
|
esac
|
||||||
|
|
||||||
|
|
||||||
if [ -z "$REPMGRD_OPTS" ]
|
if [ -z "${REPMGRD_OPTS}" ]
|
||||||
then
|
then
|
||||||
echo "Not starting $prog, REPMGRD_OPTS not set in /etc/sysconfig/$prog"
|
echo "Not starting ${NAME}, REPMGRD_OPTS not set in /etc/sysconfig/${NAME}"
|
||||||
exit 2
|
exit 0
|
||||||
fi
|
fi
|
||||||
|
|
||||||
start() {
|
start()
|
||||||
[ "$EUID" != "0" ] && exit 4
|
{
|
||||||
[ "$NETWORKING" = "no" ] && exit 1
|
REPMGRD_START=$"Starting ${NAME} service: "
|
||||||
|
|
||||||
# Start daemons.
|
# Make sure startup-time log file is valid
|
||||||
echo -n $"Starting $prog: "
|
if [ ! -e "${REPMGRD_LOG}" -a ! -h "${REPMGRD_LOG}" ]
|
||||||
daemon --user $REPMGRD_USER $prog $DAEMONIZE $REPMGRD_OPTS
|
then
|
||||||
RETVAL=$?
|
touch "${REPMGRD_LOG}" || exit 1
|
||||||
|
chown ${REPMGRD_USER}:postgres "${REPMGRD_LOG}"
|
||||||
|
chmod go-rwx "${REPMGRD_LOG}"
|
||||||
|
[ -x /sbin/restorecon ] && /sbin/restorecon "${REPMGRD_LOG}"
|
||||||
|
fi
|
||||||
|
|
||||||
|
echo -n "${REPMGRD_START}"
|
||||||
|
$SU -l $REPMGRD_USER -c "${REPMGRD_BIN} ${REPMGRD_OPTS} -p ${REPMGRD_PIDFILE} &" >> "${REPMGRD_LOG}" 2>&1 < /dev/null
|
||||||
|
sleep 2
|
||||||
|
pid=`head -n 1 "${REPMGRD_PIDFILE}" 2>/dev/null`
|
||||||
|
if [ "x${pid}" != "x" ]
|
||||||
|
then
|
||||||
|
success "${REPMGRD_START}"
|
||||||
|
touch "${REPMGRD_LOCK}"
|
||||||
|
echo $pid > "${REPMGRD_PIDFILE}"
|
||||||
echo
|
echo
|
||||||
[ $RETVAL -eq 0 ] && touch $LOCKFILE
|
else
|
||||||
return $RETVAL
|
failure "${REPMGRD_START}"
|
||||||
|
echo
|
||||||
|
script_result=1
|
||||||
|
fi
|
||||||
}
|
}
|
||||||
|
|
||||||
stop() {
|
stop()
|
||||||
[ "$EUID" != "0" ] && exit 4
|
{
|
||||||
echo -n $"Shutting down $prog: "
|
echo -n $"Stopping ${NAME} service: "
|
||||||
killproc $prog
|
if [ -e "${REPMGRD_LOCK}" ]
|
||||||
RETVAL=$?
|
then
|
||||||
echo
|
killproc ${NAME}
|
||||||
[ $RETVAL -eq 0 ] && rm -f $LOCKFILE
|
ret=$?
|
||||||
return $RETVAL
|
if [ $ret -eq 0 ]
|
||||||
}
|
then
|
||||||
status() {
|
echo_success
|
||||||
if [ -f "$LOCKFILE" ]; then
|
rm -f "${REPMGRD_PIDFILE}"
|
||||||
echo "$prog is running"
|
rm -f "${REPMGRD_LOCK}"
|
||||||
else
|
else
|
||||||
RETVAL=3
|
echo_failure
|
||||||
echo "$prog is stopped"
|
script_result=1
|
||||||
fi
|
fi
|
||||||
return $RETVAL
|
else
|
||||||
|
# not running; per LSB standards this is "ok"
|
||||||
|
echo_success
|
||||||
|
fi
|
||||||
|
echo
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
# See how we were called.
|
# See how we were called.
|
||||||
case "$1" in
|
case "$1" in
|
||||||
start)
|
start)
|
||||||
@@ -93,22 +118,16 @@ case "$1" in
|
|||||||
stop
|
stop
|
||||||
;;
|
;;
|
||||||
status)
|
status)
|
||||||
status $prog
|
status -p $REPMGRD_PIDFILE $NAME
|
||||||
|
script_result=$?
|
||||||
;;
|
;;
|
||||||
restart|force-reload)
|
restart)
|
||||||
stop
|
stop
|
||||||
start
|
start
|
||||||
;;
|
|
||||||
try-restart|condrestart)
|
|
||||||
if status $prog > /dev/null; then
|
|
||||||
stop
|
|
||||||
start
|
|
||||||
fi
|
|
||||||
;;
|
|
||||||
reload)
|
|
||||||
exit 3
|
|
||||||
;;
|
;;
|
||||||
*)
|
*)
|
||||||
echo $"Usage: $0 {start|stop|status|restart|try-restart|force-reload}"
|
echo $"Usage: $0 {start|stop|status|restart}"
|
||||||
exit 2
|
exit 2
|
||||||
esac
|
esac
|
||||||
|
|
||||||
|
exit $script_result
|
||||||
|
|||||||
@@ -1,4 +1,21 @@
|
|||||||
#default sysconfig file for repmrgd
|
# default settings for repmgrd. This file is source by /bin/sh from
|
||||||
#custom overrides can be placed here
|
# /etc/init.d/repmgrd
|
||||||
|
|
||||||
REPMGRD_OPTS="-f /etc/repmgr/repmgr.conf"
|
# disable repmgrd by default so it won't get started upon installation
|
||||||
|
# valid values: yes/no
|
||||||
|
REPMGRD_ENABLED=no
|
||||||
|
|
||||||
|
# Options for repmgrd (required)
|
||||||
|
#REPMGRD_OPTS="--verbose -d -f /var/lib/pgsql/repmgr/repmgr.conf"
|
||||||
|
|
||||||
|
# User to run repmgrd as
|
||||||
|
#REPMGRD_USER=postgres
|
||||||
|
|
||||||
|
# repmgrd binary
|
||||||
|
#REPMGRD_BIN=/usr/bin/repmgrd
|
||||||
|
|
||||||
|
# pid file
|
||||||
|
#REPMGRD_PIDFILE=/var/lib/pgsql/repmgr/repmgrd.pid
|
||||||
|
|
||||||
|
# log file
|
||||||
|
#REPMGRD_LOG=/var/lib/pgsql/repmgr/repmgrd.log
|
||||||
|
|||||||
49
SSH-RSYNC.md
@@ -1,35 +1,36 @@
|
|||||||
Set up trusted copy between postgres accounts
|
Set up trusted copy between postgres accounts
|
||||||
---------------------------------------------
|
---------------------------------------------
|
||||||
|
|
||||||
If you need to use rsync to clone standby servers, the postgres account
|
If you need to use `rsync` to clone standby servers, the `postgres` account
|
||||||
on your master and standby servers must be each able to access the other
|
on your primary and standby servers must be each able to access the other
|
||||||
using SSH without a password.
|
using SSH without a password.
|
||||||
|
|
||||||
First generate a ssh key, using an empty passphrase, and copy the resulting
|
First generate an ssh key, using an empty passphrase, and copy the resulting
|
||||||
keys and a maching authorization file to a privledged user on the other system::
|
keys and a matching authorization file to a privileged user account on the other
|
||||||
|
system:
|
||||||
|
|
||||||
[postgres@node1]$ ssh-keygen -t rsa
|
[postgres@node1]$ ssh-keygen -t rsa
|
||||||
Generating public/private rsa key pair.
|
Generating public/private rsa key pair.
|
||||||
Enter file in which to save the key (/var/lib/pgsql/.ssh/id_rsa):
|
Enter file in which to save the key (/var/lib/pgsql/.ssh/id_rsa):
|
||||||
Enter passphrase (empty for no passphrase):
|
Enter passphrase (empty for no passphrase):
|
||||||
Enter same passphrase again:
|
Enter same passphrase again:
|
||||||
Your identification has been saved in /var/lib/pgsql/.ssh/id_rsa.
|
Your identification has been saved in /var/lib/pgsql/.ssh/id_rsa.
|
||||||
Your public key has been saved in /var/lib/pgsql/.ssh/id_rsa.pub.
|
Your public key has been saved in /var/lib/pgsql/.ssh/id_rsa.pub.
|
||||||
The key fingerprint is:
|
The key fingerprint is:
|
||||||
aa:bb:cc:dd:ee:ff:aa:11:22:33:44:55:66:77:88:99 postgres@db1.domain.com
|
aa:bb:cc:dd:ee:ff:aa:11:22:33:44:55:66:77:88:99 postgres@db1.domain.com
|
||||||
[postgres@node1]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
|
[postgres@node1]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
|
||||||
[postgres@node1]$ chmod go-rwx ~/.ssh/*
|
[postgres@node1]$ chmod go-rwx ~/.ssh/*
|
||||||
[postgres@node1]$ cd ~/.ssh
|
[postgres@node1]$ cd ~/.ssh
|
||||||
[postgres@node1]$ scp id_rsa.pub id_rsa authorized_keys user@node2:
|
[postgres@node1]$ scp id_rsa.pub id_rsa authorized_keys user@node2:
|
||||||
|
|
||||||
Login as a user on the other system, and install the files into the postgres
|
Login as a user on the other system, and install the files into the `postgres`
|
||||||
user's account::
|
user's account:
|
||||||
|
|
||||||
[user@node2 ~]$ sudo chown postgres.postgres authorized_keys id_rsa.pub id_rsa
|
[user@node2 ~]$ sudo chown postgres.postgres authorized_keys id_rsa.pub id_rsa
|
||||||
[user@node2 ~]$ sudo mkdir -p ~postgres/.ssh
|
[user@node2 ~]$ sudo mkdir -p ~postgres/.ssh
|
||||||
[user@node2 ~]$ sudo chown postgres.postgres ~postgres/.ssh
|
[user@node2 ~]$ sudo chown postgres.postgres ~postgres/.ssh
|
||||||
[user@node2 ~]$ sudo mv authorized_keys id_rsa.pub id_rsa ~postgres/.ssh
|
[user@node2 ~]$ sudo mv authorized_keys id_rsa.pub id_rsa ~postgres/.ssh
|
||||||
[user@node2 ~]$ sudo chmod -R go-rwx ~postgres/.ssh
|
[user@node2 ~]$ sudo chmod -R go-rwx ~postgres/.ssh
|
||||||
|
|
||||||
Now test that ssh in both directions works. You may have to accept some new
|
Now test that ssh in both directions works. You may have to accept some new
|
||||||
known hosts in the process.
|
known hosts in the process.
|
||||||
|
|||||||
27
TODO
@@ -5,9 +5,15 @@ Known issues in repmgr
|
|||||||
the database server using the ``pg_ctl`` command may accidentally
|
the database server using the ``pg_ctl`` command may accidentally
|
||||||
terminate after their associated ssh session ends.
|
terminate after their associated ssh session ends.
|
||||||
|
|
||||||
|
* PGPASSFILE may not be passed to pg_basebackup
|
||||||
|
|
||||||
Planned feature improvements
|
Planned feature improvements
|
||||||
============================
|
============================
|
||||||
|
|
||||||
|
* Use 'primary' instead of 'master' in documentation and log output
|
||||||
|
for consistency with PostgreSQL documentation. See also commit
|
||||||
|
870b0a53b627eeb9aca1fc14cbafe25b5beafe12.
|
||||||
|
|
||||||
* A better check which standby did receive most of the data
|
* A better check which standby did receive most of the data
|
||||||
|
|
||||||
* Make the fact that a standby may be delayed a factor in the voting
|
* Make the fact that a standby may be delayed a factor in the voting
|
||||||
@@ -18,8 +24,21 @@ Planned feature improvements
|
|||||||
* Create the repmgr user/database on "master register".
|
* Create the repmgr user/database on "master register".
|
||||||
|
|
||||||
* Use pg_basebackup for the data directory, and ALSO rsync for the
|
* Use pg_basebackup for the data directory, and ALSO rsync for the
|
||||||
configuration files.
|
configuration files.
|
||||||
|
|
||||||
* Use pg_basebackup -X s
|
* If no configuration file supplied, search in sensible default locations
|
||||||
NOTE: this can be used by including `-X s` in the configuration parameter
|
(currently: current directory and `pg_config --sysconfdir`); if
|
||||||
`pg_basebackup_options`
|
possible this should include the location provided by the package,
|
||||||
|
if installed.
|
||||||
|
|
||||||
|
* repmgrd: if connection to the upstream node fails on startup, optionally
|
||||||
|
retry for a certain period before giving up; this will cover cases when
|
||||||
|
e.g. primary and standby are both starting up, and the standby comes up
|
||||||
|
before the primary. See github issue #80.
|
||||||
|
|
||||||
|
* make old master node ID available for event notification commands
|
||||||
|
(See github issue #80).
|
||||||
|
|
||||||
|
* Have pg_basebackup use replication slots, if and when support for
|
||||||
|
this is added; see:
|
||||||
|
http://www.postgresql.org/message-id/555DD2B2.7020000@gmx.net
|
||||||
|
|||||||
53
check_dir.c
@@ -23,14 +23,19 @@
|
|||||||
#include <errno.h>
|
#include <errno.h>
|
||||||
#include <stdio.h>
|
#include <stdio.h>
|
||||||
#include <string.h>
|
#include <string.h>
|
||||||
|
#include <ftw.h>
|
||||||
|
|
||||||
/* NB: postgres_fe must be included BEFORE check_dir */
|
/* NB: postgres_fe must be included BEFORE check_dir */
|
||||||
#include "postgres_fe.h"
|
#include <libpq-fe.h>
|
||||||
#include "check_dir.h"
|
#include <postgres_fe.h>
|
||||||
|
|
||||||
|
#include "check_dir.h"
|
||||||
#include "strutil.h"
|
#include "strutil.h"
|
||||||
#include "log.h"
|
#include "log.h"
|
||||||
|
|
||||||
|
static bool _create_pg_dir(char *dir, bool force, bool for_witness);
|
||||||
|
static int unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* make sure the directory either doesn't exist or is empty
|
* make sure the directory either doesn't exist or is empty
|
||||||
* we use this function to check the new data directory and
|
* we use this function to check the new data directory and
|
||||||
@@ -243,6 +248,19 @@ is_pg_dir(char *dir)
|
|||||||
|
|
||||||
bool
|
bool
|
||||||
create_pg_dir(char *dir, bool force)
|
create_pg_dir(char *dir, bool force)
|
||||||
|
{
|
||||||
|
return _create_pg_dir(dir, force, false);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool
|
||||||
|
create_witness_pg_dir(char *dir, bool force)
|
||||||
|
{
|
||||||
|
return _create_pg_dir(dir, force, true);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
static bool
|
||||||
|
_create_pg_dir(char *dir, bool force, bool for_witness)
|
||||||
{
|
{
|
||||||
bool pg_dir = false;
|
bool pg_dir = false;
|
||||||
|
|
||||||
@@ -279,12 +297,24 @@ create_pg_dir(char *dir, bool force)
|
|||||||
|
|
||||||
pg_dir = is_pg_dir(dir);
|
pg_dir = is_pg_dir(dir);
|
||||||
|
|
||||||
/*
|
|
||||||
* we use force to reduce the time needed to restore a node which
|
|
||||||
* turn async after a failover or anything else
|
|
||||||
*/
|
|
||||||
if (pg_dir && force)
|
if (pg_dir && force)
|
||||||
{
|
{
|
||||||
|
|
||||||
|
/*
|
||||||
|
* The witness server does not store any data other than a copy of the
|
||||||
|
* repmgr metadata, so in --force mode we can simply overwrite the
|
||||||
|
* directory.
|
||||||
|
*
|
||||||
|
* For non-witness servers, we'll leave the data in place, both to reduce
|
||||||
|
* the risk of unintentional data loss and to make it possible for the
|
||||||
|
* data directory to be brought up-to-date with rsync.
|
||||||
|
*/
|
||||||
|
if (for_witness)
|
||||||
|
{
|
||||||
|
log_notice(_("deleting existing data directory \"%s\"\n"), dir);
|
||||||
|
nftw(dir, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
|
||||||
|
}
|
||||||
/* Let it continue */
|
/* Let it continue */
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
@@ -306,3 +336,14 @@ create_pg_dir(char *dir, bool force)
|
|||||||
}
|
}
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static int
|
||||||
|
unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf)
|
||||||
|
{
|
||||||
|
int rv = remove(fpath);
|
||||||
|
|
||||||
|
if (rv)
|
||||||
|
perror(fpath);
|
||||||
|
|
||||||
|
return rv;
|
||||||
|
}
|
||||||
|
|||||||
@@ -26,5 +26,6 @@ bool create_dir(char *dir);
|
|||||||
bool set_dir_permissions(char *dir);
|
bool set_dir_permissions(char *dir);
|
||||||
bool is_pg_dir(char *dir);
|
bool is_pg_dir(char *dir);
|
||||||
bool create_pg_dir(char *dir, bool force);
|
bool create_pg_dir(char *dir, bool force);
|
||||||
|
bool create_witness_pg_dir(char *dir, bool force);
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
201
config.c
@@ -27,9 +27,11 @@
|
|||||||
static void parse_event_notifications_list(t_configuration_options *options, const char *arg);
|
static void parse_event_notifications_list(t_configuration_options *options, const char *arg);
|
||||||
static void tablespace_list_append(t_configuration_options *options, const char *arg);
|
static void tablespace_list_append(t_configuration_options *options, const char *arg);
|
||||||
|
|
||||||
|
static char config_file_path[MAXPGPATH];
|
||||||
|
static bool config_file_provided = false;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* parse_config()
|
* load_config()
|
||||||
*
|
*
|
||||||
* Set default options and overwrite with values from provided configuration
|
* Set default options and overwrite with values from provided configuration
|
||||||
* file.
|
* file.
|
||||||
@@ -40,30 +42,21 @@ static void tablespace_list_append(t_configuration_options *options, const char
|
|||||||
* reload_config()
|
* reload_config()
|
||||||
*/
|
*/
|
||||||
bool
|
bool
|
||||||
parse_config(const char *config_file, t_configuration_options *options)
|
load_config(const char *config_file, t_configuration_options *options, char *argv0)
|
||||||
{
|
{
|
||||||
char *s,
|
struct stat config;
|
||||||
buff[MAXLINELENGTH];
|
|
||||||
char config_file_buf[MAXLEN];
|
|
||||||
char name[MAXLEN];
|
|
||||||
char value[MAXLEN];
|
|
||||||
bool config_file_provided = false;
|
|
||||||
FILE *fp;
|
|
||||||
|
|
||||||
/* Sanity checks */
|
/* Sanity checks */
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* If a configuration file was provided, check it exists, otherwise
|
* If a configuration file was provided, check it exists, otherwise
|
||||||
* emit an error
|
* emit an error and terminate
|
||||||
*/
|
*/
|
||||||
if (config_file[0])
|
if (config_file[0])
|
||||||
{
|
{
|
||||||
struct stat config;
|
strncpy(config_file_path, config_file, MAXPGPATH);
|
||||||
|
canonicalize_path(config_file_path);
|
||||||
|
|
||||||
strncpy(config_file_buf, config_file, MAXLEN);
|
if (stat(config_file_path, &config) != 0)
|
||||||
canonicalize_path(config_file_buf);
|
|
||||||
|
|
||||||
if(stat(config_file_buf, &config) != 0)
|
|
||||||
{
|
{
|
||||||
log_err(_("provided configuration file '%s' not found: %s\n"),
|
log_err(_("provided configuration file '%s' not found: %s\n"),
|
||||||
config_file,
|
config_file,
|
||||||
@@ -76,16 +69,53 @@ parse_config(const char *config_file, t_configuration_options *options)
|
|||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* If no configuration file was provided, set to a default file
|
* If no configuration file was provided, attempt to find a default file
|
||||||
* which `parse_config()` will attempt to read if it exists
|
|
||||||
*/
|
*/
|
||||||
else
|
if (config_file_provided == false)
|
||||||
{
|
{
|
||||||
strncpy(config_file_buf, DEFAULT_CONFIG_FILE, MAXLEN);
|
char my_exec_path[MAXPGPATH];
|
||||||
|
char etc_path[MAXPGPATH];
|
||||||
|
|
||||||
|
/* First check if one is in the default sysconfdir */
|
||||||
|
if (find_my_exec(argv0, my_exec_path) < 0)
|
||||||
|
{
|
||||||
|
fprintf(stderr, _("%s: could not find own program executable\n"), argv0);
|
||||||
|
exit(EXIT_FAILURE);
|
||||||
|
}
|
||||||
|
|
||||||
|
get_etc_path(my_exec_path, etc_path);
|
||||||
|
|
||||||
|
snprintf(config_file_path, MAXPGPATH, "%s/repmgr.conf", etc_path);
|
||||||
|
|
||||||
|
log_debug(_("Looking for configuration file in %s\n"), etc_path);
|
||||||
|
|
||||||
|
if (stat(config_file_path, &config) != 0)
|
||||||
|
{
|
||||||
|
/* Not found - default to ./repmgr.conf */
|
||||||
|
strncpy(config_file_path, DEFAULT_CONFIG_FILE, MAXPGPATH);
|
||||||
|
canonicalize_path(config_file_path);
|
||||||
|
log_debug(_("Looking for configuration file in %s\n"), config_file_path);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
return parse_config(options);
|
||||||
|
}
|
||||||
|
|
||||||
fp = fopen(config_file_buf, "r");
|
|
||||||
|
bool
|
||||||
|
parse_config(t_configuration_options *options)
|
||||||
|
{
|
||||||
|
FILE *fp;
|
||||||
|
char *s,
|
||||||
|
buff[MAXLINELENGTH];
|
||||||
|
char name[MAXLEN];
|
||||||
|
char value[MAXLEN];
|
||||||
|
|
||||||
|
/* For sanity-checking provided conninfo string */
|
||||||
|
PQconninfoOption *conninfo_options;
|
||||||
|
char *conninfo_errmsg = NULL;
|
||||||
|
|
||||||
|
fp = fopen(config_file_path, "r");
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Since some commands don't require a config file at all, not having one
|
* Since some commands don't require a config file at all, not having one
|
||||||
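The default-file lookup that the new `load_config()` performs in the hunk above (sysconfdir first, then `./repmgr.conf`) can be sketched as a "first candidate that exists" search. This is only an illustration under stated assumptions: the helper name `first_existing_path` and any paths passed to it are hypothetical and not part of repmgr. The real code derives the sysconfdir via `find_my_exec()` and `get_etc_path()`, and falls back to `DEFAULT_CONFIG_FILE` without checking that it exists, whereas this sketch returns `NULL` when nothing is found.

```c
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not repmgr code): return the first path in
 * `candidates` for which stat() succeeds, or NULL if none exists.
 */
static const char *
first_existing_path(const char *const *candidates, size_t n)
{
    struct stat st;
    size_t      i;

    for (i = 0; i < n; i++)
    {
        /* stat() returns 0 when the path exists and is accessible */
        if (stat(candidates[i], &st) == 0)
            return candidates[i];
    }

    return NULL;
}
```

A caller would pass the candidates in priority order, for example the sysconfdir location followed by the current directory, and treat a `NULL` result as "no configuration file available".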
@@ -99,9 +129,9 @@ parse_config(const char *config_file, t_configuration_options *options)
|
|||||||
*/
|
*/
|
||||||
if (fp == NULL)
|
if (fp == NULL)
|
||||||
{
|
{
|
||||||
if(config_file_provided)
|
if (config_file_provided)
|
||||||
{
|
{
|
||||||
log_err(_("unable to open provided configuration file '%s'; terminating\n"), config_file_buf);
|
log_err(_("unable to open provided configuration file '%s'; terminating\n"), config_file_path);
|
||||||
exit(ERR_BAD_CONFIG);
|
exit(ERR_BAD_CONFIG);
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -149,13 +179,17 @@ parse_config(const char *config_file, t_configuration_options *options)
|
|||||||
{
|
{
|
||||||
bool known_parameter = true;
|
bool known_parameter = true;
|
||||||
|
|
||||||
/* Skip blank lines and comments */
|
|
||||||
if (buff[0] == '\n' || buff[0] == '#')
|
|
||||||
continue;
|
|
||||||
|
|
||||||
/* Parse name/value pair from line */
|
/* Parse name/value pair from line */
|
||||||
parse_line(buff, name, value);
|
parse_line(buff, name, value);
|
||||||
|
|
||||||
|
/* Skip blank lines */
|
||||||
|
if (!strlen(name))
|
||||||
|
continue;
|
||||||
|
|
||||||
|
/* Skip comments */
|
||||||
|
if (name[0] == '#')
|
||||||
|
continue;
|
||||||
|
|
||||||
/* Copy into correct entry in parameters struct */
|
/* Copy into correct entry in parameters struct */
|
||||||
if (strcmp(name, "cluster") == 0)
|
if (strcmp(name, "cluster") == 0)
|
||||||
strncpy(options->cluster_name, value, MAXLEN);
|
strncpy(options->cluster_name, value, MAXLEN);
|
||||||
@@ -239,7 +273,7 @@ parse_config(const char *config_file, t_configuration_options *options)
|
|||||||
* we want to accept those, we'd need to add stricter default checking,
|
* we want to accept those, we'd need to add stricter default checking,
|
||||||
* as currently e.g. an empty `node` value will be converted to '0'.
|
* as currently e.g. an empty `node` value will be converted to '0'.
|
||||||
*/
|
*/
|
||||||
if(known_parameter == true && !strlen(value)) {
|
if (known_parameter == true && !strlen(value)) {
|
||||||
log_err(_("no value provided for parameter '%s'\n"), name);
|
log_err(_("no value provided for parameter '%s'\n"), name);
|
||||||
exit(ERR_BAD_CONFIG);
|
exit(ERR_BAD_CONFIG);
|
||||||
}
|
}
|
||||||
@@ -262,6 +296,12 @@ parse_config(const char *config_file, t_configuration_options *options)
|
|||||||
exit(ERR_BAD_CONFIG);
|
exit(ERR_BAD_CONFIG);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (options->node == 0)
|
||||||
|
{
|
||||||
|
log_err(_("'node' must be an integer greater than zero\n"));
|
||||||
|
exit(ERR_BAD_CONFIG);
|
||||||
|
}
|
||||||
|
|
||||||
if (*options->node_name == '\0')
|
if (*options->node_name == '\0')
|
||||||
{
|
{
|
||||||
log_err(_("required parameter 'node_name' was not found\n"));
|
log_err(_("required parameter 'node_name' was not found\n"));
|
||||||
@@ -274,6 +314,19 @@ parse_config(const char *config_file, t_configuration_options *options)
|
|||||||
exit(ERR_BAD_CONFIG);
|
exit(ERR_BAD_CONFIG);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* Sanity check the provided conninfo string
|
||||||
|
*
|
||||||
|
* NOTE: this verifies the string format and checks for valid options
|
||||||
|
* but does not sanity check values
|
||||||
|
*/
|
||||||
|
conninfo_options = PQconninfoParse(options->conninfo, &conninfo_errmsg);
|
||||||
|
if (conninfo_options == NULL)
|
||||||
|
{
|
||||||
|
log_err(_("Parameter 'conninfo' is invalid: %s"), conninfo_errmsg);
|
||||||
|
exit(ERR_BAD_CONFIG);
|
||||||
|
}
|
||||||
|
PQconninfoFree(conninfo_options);
|
||||||
|
|
||||||
/* The following checks are for valid parameter values */
|
/* The following checks are for valid parameter values */
|
||||||
if (options->master_response_timeout <= 0)
|
if (options->master_response_timeout <= 0)
|
||||||
{
|
{
|
||||||
@@ -305,7 +358,7 @@ trim(char *s)
|
|||||||
*s2 = &s[strlen(s) - 1];
|
*s2 = &s[strlen(s) - 1];
|
||||||
|
|
||||||
/* If string is empty, no action needed */
|
/* If string is empty, no action needed */
|
||||||
if(s2 < s1)
|
if (s2 < s1)
|
||||||
return s;
|
return s;
|
||||||
|
|
||||||
/* Trim and delimit right side */
|
/* Trim and delimit right side */
|
||||||
@@ -331,24 +384,50 @@ parse_line(char *buff, char *name, char *value)
|
|||||||
int j = 0;
|
int j = 0;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* first we find the name of the parameter
|
* Extract parameter name, if present
|
||||||
*/
|
*/
|
||||||
for (; i < MAXLEN; ++i)
|
for (; i < MAXLEN; ++i)
|
||||||
{
|
{
|
||||||
if (buff[i] != '=')
|
|
||||||
name[j++] = buff[i];
|
if (buff[i] == '=')
|
||||||
else
|
|
||||||
break;
|
break;
|
||||||
|
|
||||||
|
switch(buff[i])
|
||||||
|
{
|
||||||
|
/* Ignore whitespace */
|
||||||
|
case ' ':
|
||||||
|
case '\n':
|
||||||
|
case '\r':
|
||||||
|
case '\t':
|
||||||
|
continue;
|
||||||
|
default:
|
||||||
|
name[j++] = buff[i];
|
||||||
|
}
|
||||||
}
|
}
|
||||||
name[j] = '\0';
|
name[j] = '\0';
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Now the value
|
* Ignore any whitespace following the '=' sign
|
||||||
|
*/
|
||||||
|
for (; i < MAXLEN; ++i)
|
||||||
|
{
|
||||||
|
if (buff[i+1] == ' ')
|
||||||
|
continue;
|
||||||
|
if (buff[i+1] == '\t')
|
||||||
|
continue;
|
||||||
|
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Extract parameter value
|
||||||
*/
|
*/
|
||||||
j = 0;
|
j = 0;
|
||||||
for (++i; i < MAXLEN; ++i)
|
for (++i; i < MAXLEN; ++i)
|
||||||
if (buff[i] == '\'')
|
if (buff[i] == '\'')
|
||||||
continue;
|
continue;
|
||||||
|
else if (buff[i] == '#')
|
||||||
|
break;
|
||||||
else if (buff[i] != '\n')
|
else if (buff[i] != '\n')
|
||||||
value[j++] = buff[i];
|
value[j++] = buff[i];
|
||||||
else
|
else
|
||||||
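The line-splitting behaviour introduced in the `parse_line()` hunk above (whitespace dropped from the name, whitespace after `=` skipped, single quotes stripped, value cut at `#`) can be sketched in isolation as follows. This is a simplified stand-in, not repmgr's function: the name `sketch_parse_line`, the explicit `len` parameter, and the trailing-whitespace trim are assumptions added for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for parse_line(): split "name = 'value' # comment"
 * into name and value. Both output buffers must hold at least `len` bytes
 * (len >= 1).
 */
static void
sketch_parse_line(const char *buff, char *name, char *value, size_t len)
{
    size_t i = 0;
    size_t j = 0;

    /* Parameter name: copy up to '=', ignoring any whitespace */
    for (; buff[i] != '\0' && buff[i] != '='; i++)
    {
        if (buff[i] == ' ' || buff[i] == '\t' ||
            buff[i] == '\n' || buff[i] == '\r')
            continue;
        if (j + 1 < len)
            name[j++] = buff[i];
    }
    name[j] = '\0';

    /* No '=' on this line: there is no value to extract */
    if (buff[i] != '=')
    {
        value[0] = '\0';
        return;
    }
    i++;

    /* Ignore whitespace following the '=' sign */
    while (buff[i] == ' ' || buff[i] == '\t')
        i++;

    /* Value: drop single quotes, stop at a comment or end of line */
    j = 0;
    for (; buff[i] != '\0' && buff[i] != '\n' && buff[i] != '#'; i++)
    {
        if (buff[i] == '\'')
            continue;
        if (j + 1 < len)
            value[j++] = buff[i];
    }

    /* Assumption for this sketch: trim whitespace left before a comment */
    while (j > 0 && (value[j - 1] == ' ' || value[j - 1] == '\t' ||
                     value[j - 1] == '\r'))
        j--;
    value[j] = '\0';
}
```

With this shape, a blank line yields an empty name and a comment line yields a name starting with `#`, which is what the "Skip blank lines" and "Skip comments" checks in the `parse_config()` hunk test for.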
@@ -358,7 +437,7 @@ parse_line(char *buff, char *name, char *value)
|
|||||||
}
|
}
|
||||||
|
|
||||||
bool
|
bool
|
||||||
reload_config(char *config_file, t_configuration_options * orig_options)
|
reload_config(t_configuration_options *orig_options)
|
||||||
{
|
{
|
||||||
PGconn *conn;
|
PGconn *conn;
|
||||||
t_configuration_options new_options;
|
t_configuration_options new_options;
|
||||||
@@ -369,7 +448,7 @@ reload_config(char *config_file, t_configuration_options * orig_options)
|
|||||||
*/
|
*/
|
||||||
log_info(_("reloading configuration file and updating repmgr tables\n"));
|
log_info(_("reloading configuration file and updating repmgr tables\n"));
|
||||||
|
|
||||||
parse_config(config_file, &new_options);
|
parse_config(&new_options);
|
||||||
if (new_options.node == -1)
|
if (new_options.node == -1)
|
||||||
{
|
{
|
||||||
log_warning(_("unable to parse new configuration, retaining current configuration\n"));
|
log_warning(_("unable to parse new configuration, retaining current configuration\n"));
|
||||||
@@ -418,7 +497,7 @@ reload_config(char *config_file, t_configuration_options * orig_options)
|
|||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
|
||||||
if(strcmp(orig_options->conninfo, new_options.conninfo) != 0)
|
if (strcmp(orig_options->conninfo, new_options.conninfo) != 0)
|
||||||
{
|
{
|
||||||
/* Test conninfo string */
|
/* Test conninfo string */
|
||||||
conn = establish_db_connection(new_options.conninfo, false);
|
conn = establish_db_connection(new_options.conninfo, false);
|
||||||
@@ -438,56 +517,56 @@ reload_config(char *config_file, t_configuration_options * orig_options)
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
/* cluster_name */
|
/* cluster_name */
|
||||||
if(strcmp(orig_options->cluster_name, new_options.cluster_name) != 0)
|
if (strcmp(orig_options->cluster_name, new_options.cluster_name) != 0)
|
||||||
{
|
{
|
||||||
strcpy(orig_options->cluster_name, new_options.cluster_name);
|
strcpy(orig_options->cluster_name, new_options.cluster_name);
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* conninfo */
|
/* conninfo */
|
||||||
if(strcmp(orig_options->conninfo, new_options.conninfo) != 0)
|
if (strcmp(orig_options->conninfo, new_options.conninfo) != 0)
|
||||||
{
|
{
|
||||||
strcpy(orig_options->conninfo, new_options.conninfo);
|
strcpy(orig_options->conninfo, new_options.conninfo);
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* node */
|
/* node */
|
||||||
if(orig_options->node != new_options.node)
|
if (orig_options->node != new_options.node)
|
||||||
{
|
{
|
||||||
orig_options->node = new_options.node;
|
orig_options->node = new_options.node;
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* failover */
|
/* failover */
|
||||||
if(orig_options->failover != new_options.failover)
|
if (orig_options->failover != new_options.failover)
|
||||||
{
|
{
|
||||||
orig_options->failover = new_options.failover;
|
orig_options->failover = new_options.failover;
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* priority */
|
/* priority */
|
||||||
if(orig_options->priority != new_options.priority)
|
if (orig_options->priority != new_options.priority)
|
||||||
{
|
{
|
||||||
orig_options->priority = new_options.priority;
|
orig_options->priority = new_options.priority;
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* node_name */
|
/* node_name */
|
||||||
if(strcmp(orig_options->node_name, new_options.node_name) != 0)
|
if (strcmp(orig_options->node_name, new_options.node_name) != 0)
|
||||||
{
|
{
|
||||||
strcpy(orig_options->node_name, new_options.node_name);
|
strcpy(orig_options->node_name, new_options.node_name);
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* promote_command */
|
/* promote_command */
|
||||||
if(strcmp(orig_options->promote_command, new_options.promote_command) != 0)
|
if (strcmp(orig_options->promote_command, new_options.promote_command) != 0)
|
||||||
{
|
{
|
||||||
strcpy(orig_options->promote_command, new_options.promote_command);
|
strcpy(orig_options->promote_command, new_options.promote_command);
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* follow_command */
|
/* follow_command */
|
||||||
if(strcmp(orig_options->follow_command, new_options.follow_command) != 0)
|
if (strcmp(orig_options->follow_command, new_options.follow_command) != 0)
|
||||||
{
|
{
|
||||||
strcpy(orig_options->follow_command, new_options.follow_command);
|
strcpy(orig_options->follow_command, new_options.follow_command);
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
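The string parameters in the `reload_config()` hunk above all follow the same compare-then-copy pattern. As a sketch only, that pattern could be factored into a helper like the one below. The name `copy_if_changed` is hypothetical, and the bounded `snprintf()` copy is a substitution made for this example; the code in the diff uses `strcpy()` directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not repmgr code): overwrite `dst` with `src` only
 * when they differ. Returns true if `dst` was updated, so the caller can
 * set its own "config changed" flag.
 */
static bool
copy_if_changed(char *dst, size_t dst_len, const char *src)
{
    if (strcmp(dst, src) == 0)
        return false;

    /* Bounded copy; truncates rather than overflowing dst */
    snprintf(dst, dst_len, "%s", src);
    return true;
}
```

A caller would combine the results, for example by OR-ing each return value into a single changed flag before deciding whether to update the repmgr tables.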
@@ -504,76 +583,76 @@ reload_config(char *config_file, t_configuration_options * orig_options)
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
/* rsync_options */
|
/* rsync_options */
|
||||||
if(strcmp(orig_options->rsync_options, new_options.rsync_options) != 0)
|
if (strcmp(orig_options->rsync_options, new_options.rsync_options) != 0)
|
||||||
{
|
{
|
||||||
strcpy(orig_options->rsync_options, new_options.rsync_options);
|
strcpy(orig_options->rsync_options, new_options.rsync_options);
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* ssh_options */
|
/* ssh_options */
|
||||||
if(strcmp(orig_options->ssh_options, new_options.ssh_options) != 0)
|
if (strcmp(orig_options->ssh_options, new_options.ssh_options) != 0)
|
||||||
{
|
{
|
||||||
strcpy(orig_options->ssh_options, new_options.ssh_options);
|
strcpy(orig_options->ssh_options, new_options.ssh_options);
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* master_response_timeout */
|
/* master_response_timeout */
|
||||||
if(orig_options->master_response_timeout != new_options.master_response_timeout)
|
if (orig_options->master_response_timeout != new_options.master_response_timeout)
|
||||||
{
|
{
|
||||||
orig_options->master_response_timeout = new_options.master_response_timeout;
|
orig_options->master_response_timeout = new_options.master_response_timeout;
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* reconnect_attempts */
|
/* reconnect_attempts */
|
||||||
if(orig_options->reconnect_attempts != new_options.reconnect_attempts)
|
if (orig_options->reconnect_attempts != new_options.reconnect_attempts)
|
||||||
{
|
{
|
||||||
orig_options->reconnect_attempts = new_options.reconnect_attempts;
|
orig_options->reconnect_attempts = new_options.reconnect_attempts;
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* reconnect_intvl */
|
/* reconnect_intvl */
|
||||||
if(orig_options->reconnect_intvl != new_options.reconnect_intvl)
|
if (orig_options->reconnect_intvl != new_options.reconnect_intvl)
|
||||||
{
|
{
|
||||||
orig_options->reconnect_intvl = new_options.reconnect_intvl;
|
orig_options->reconnect_intvl = new_options.reconnect_intvl;
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* pg_ctl_options */
|
/* pg_ctl_options */
|
||||||
if(strcmp(orig_options->pg_ctl_options, new_options.pg_ctl_options) != 0)
|
if (strcmp(orig_options->pg_ctl_options, new_options.pg_ctl_options) != 0)
|
||||||
{
|
{
|
||||||
strcpy(orig_options->pg_ctl_options, new_options.pg_ctl_options);
|
strcpy(orig_options->pg_ctl_options, new_options.pg_ctl_options);
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* pg_basebackup_options */
|
/* pg_basebackup_options */
|
||||||
if(strcmp(orig_options->pg_basebackup_options, new_options.pg_basebackup_options) != 0)
|
if (strcmp(orig_options->pg_basebackup_options, new_options.pg_basebackup_options) != 0)
|
||||||
{
|
{
|
||||||
strcpy(orig_options->pg_basebackup_options, new_options.pg_basebackup_options);
|
strcpy(orig_options->pg_basebackup_options, new_options.pg_basebackup_options);
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* monitor_interval_secs */
|
/* monitor_interval_secs */
|
||||||
if(orig_options->monitor_interval_secs != new_options.monitor_interval_secs)
|
if (orig_options->monitor_interval_secs != new_options.monitor_interval_secs)
|
||||||
{
|
{
|
||||||
orig_options->monitor_interval_secs = new_options.monitor_interval_secs;
|
orig_options->monitor_interval_secs = new_options.monitor_interval_secs;
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* retry_promote_interval_secs */
|
/* retry_promote_interval_secs */
|
||||||
if(orig_options->retry_promote_interval_secs != new_options.retry_promote_interval_secs)
|
if (orig_options->retry_promote_interval_secs != new_options.retry_promote_interval_secs)
|
||||||
{
|
{
|
||||||
orig_options->retry_promote_interval_secs = new_options.retry_promote_interval_secs;
|
orig_options->retry_promote_interval_secs = new_options.retry_promote_interval_secs;
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* use_replication_slots */
|
/* use_replication_slots */
|
||||||
if(orig_options->use_replication_slots != new_options.use_replication_slots)
|
if (orig_options->use_replication_slots != new_options.use_replication_slots)
|
||||||
{
|
{
|
||||||
orig_options->use_replication_slots = new_options.use_replication_slots;
|
orig_options->use_replication_slots = new_options.use_replication_slots;
|
||||||
config_changed = true;
|
config_changed = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
if(config_changed == true)
|
if (config_changed == true)
|
||||||
{
|
{
|
||||||
log_debug(_("reload_config(): configuration has changed\n"));
|
log_debug(_("reload_config(): configuration has changed\n"));
|
||||||
}
|
}
|
||||||
@@ -602,7 +681,7 @@ tablespace_list_append(t_configuration_options *options, const char *arg)
|
|||||||
const char *arg_ptr;
|
const char *arg_ptr;
|
||||||
|
|
||||||
cell = (TablespaceListCell *) pg_malloc0(sizeof(TablespaceListCell));
|
cell = (TablespaceListCell *) pg_malloc0(sizeof(TablespaceListCell));
|
||||||
if(cell == NULL)
|
if (cell == NULL)
|
||||||
{
|
{
|
||||||
log_err(_("unable to allocate memory; terminating\n"));
|
log_err(_("unable to allocate memory; terminating\n"));
|
||||||
exit(ERR_BAD_CONFIG);
|
exit(ERR_BAD_CONFIG);
|
||||||
@@ -670,7 +749,7 @@ parse_event_notifications_list(t_configuration_options *options, const char *arg
|
|||||||
for (arg_ptr = arg; arg_ptr <= (arg + strlen(arg)); arg_ptr++)
|
for (arg_ptr = arg; arg_ptr <= (arg + strlen(arg)); arg_ptr++)
|
||||||
{
|
{
|
||||||
/* ignore whitespace */
|
/* ignore whitespace */
|
||||||
if(*arg_ptr == ' ' || *arg_ptr == '\t')
|
if (*arg_ptr == ' ' || *arg_ptr == '\t')
|
||||||
{
|
{
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
@@ -679,13 +758,13 @@ parse_event_notifications_list(t_configuration_options *options, const char *arg
|
|||||||
* comma (or end-of-string) should mark the end of an event type -
|
* comma (or end-of-string) should mark the end of an event type -
|
||||||
* just as long as there was something preceding it
|
* just as long as there was something preceding it
|
||||||
*/
|
*/
|
||||||
if((*arg_ptr == ',' || *arg_ptr == '\0') && event_type_buf[0] != '\0')
|
if ((*arg_ptr == ',' || *arg_ptr == '\0') && event_type_buf[0] != '\0')
|
||||||
{
|
{
|
||||||
EventNotificationListCell *cell;
|
EventNotificationListCell *cell;
|
||||||
|
|
||||||
cell = (EventNotificationListCell *) pg_malloc0(sizeof(EventNotificationListCell));
|
cell = (EventNotificationListCell *) pg_malloc0(sizeof(EventNotificationListCell));
|
||||||
|
|
||||||
if(cell == NULL)
|
if (cell == NULL)
|
||||||
{
|
{
|
||||||
log_err(_("unable to allocate memory; terminating\n"));
|
log_err(_("unable to allocate memory; terminating\n"));
|
||||||
exit(ERR_BAD_CONFIG);
|
exit(ERR_BAD_CONFIG);
|
||||||
@@ -708,7 +787,7 @@ parse_event_notifications_list(t_configuration_options *options, const char *arg
|
|||||||
dst_ptr = event_type_buf;
|
dst_ptr = event_type_buf;
|
||||||
}
|
}
|
||||||
/* ignore duplicated commas */
|
/* ignore duplicated commas */
|
||||||
else if(*arg_ptr == ',')
|
else if (*arg_ptr == ',')
|
||||||
{
|
{
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
|
|||||||
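The comma-separated parsing in parse_event_notifications_list() above can be sketched in isolation. This is a minimal illustration, not the repmgr implementation: `split_event_list`, `MAX_EVENTS` and `MAX_EVENT_LEN` are hypothetical names, and a fixed-size array stands in for the linked list of `EventNotificationListCell` items.

```c
#include <stdio.h>
#include <string.h>

#define MAX_EVENTS 16		/* hypothetical limit, for illustration only */
#define MAX_EVENT_LEN 64	/* hypothetical limit, for illustration only */

/*
 * Hypothetical helper: split a list such as "a, b,,c" into out[],
 * skipping whitespace and duplicated commas. Returns the item count.
 */
static int
split_event_list(const char *arg, char out[][MAX_EVENT_LEN], int max_items)
{
	char		buf[MAX_EVENT_LEN] = "";
	char	   *dst = buf;
	const char *p;
	int			count = 0;

	/* iterate up to and including the terminating NUL */
	for (p = arg; p <= arg + strlen(arg); p++)
	{
		/* ignore whitespace */
		if (*p == ' ' || *p == '\t')
			continue;

		/* comma or end-of-string ends an item, if one was collected */
		if ((*p == ',' || *p == '\0') && buf[0] != '\0')
		{
			if (count < max_items)
			{
				strcpy(out[count], buf);
				count++;
			}
			memset(buf, 0, sizeof(buf));
			dst = buf;
		}
		/* ignore duplicated commas */
		else if (*p == ',')
		{
			continue;
		}
		/* collect the character, always leaving room for the NUL */
		else if (*p != '\0' && dst < buf + MAX_EVENT_LEN - 1)
		{
			*dst++ = *p;
		}
	}

	return count;
}
```

Iterating one position past the last character lets the end-of-string case share the same branch as the comma case, which appears to be why the loop condition in the diff uses `<=`.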
5
config.h
5
config.h
@@ -83,9 +83,10 @@ typedef struct
|
|||||||
#define T_CONFIGURATION_OPTIONS_INITIALIZER { "", -1, NO_UPSTREAM_NODE, "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", 0, 0, 0, "", { NULL, NULL }, {NULL, NULL} }
|
#define T_CONFIGURATION_OPTIONS_INITIALIZER { "", -1, NO_UPSTREAM_NODE, "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", 0, 0, 0, "", { NULL, NULL }, {NULL, NULL} }
|
||||||
|
|
||||||
|
|
||||||
bool parse_config(const char *config_file, t_configuration_options *options);
|
bool load_config(const char *config_file, t_configuration_options *options, char *argv0);
|
||||||
|
bool reload_config(t_configuration_options *orig_options);
|
||||||
|
bool parse_config(t_configuration_options *options);
|
||||||
void parse_line(char *buff, char *name, char *value);
|
void parse_line(char *buff, char *name, char *value);
|
||||||
char *trim(char *s);
|
char *trim(char *s);
|
||||||
bool reload_config(char *config_file, t_configuration_options *orig_options);
|
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
217
dbutils.c
217
dbutils.c
@@ -82,6 +82,72 @@ establish_db_connection_by_params(const char *keywords[], const char *values[],
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
bool
|
||||||
|
begin_transaction(PGconn *conn)
|
||||||
|
{
|
||||||
|
PGresult *res;
|
||||||
|
|
||||||
|
res = PQexec(conn, "BEGIN");
|
||||||
|
|
||||||
|
if (PQresultStatus(res) != PGRES_COMMAND_OK)
|
||||||
|
{
|
||||||
|
log_err(_("Unable to begin transaction: %s\n"),
|
||||||
|
PQerrorMessage(conn));
|
||||||
|
|
||||||
|
PQclear(res);
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
PQclear(res);
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
bool
|
||||||
|
commit_transaction(PGconn *conn)
|
||||||
|
{
|
||||||
|
PGresult *res;
|
||||||
|
|
||||||
|
res = PQexec(conn, "COMMIT");
|
||||||
|
|
||||||
|
if (PQresultStatus(res) != PGRES_COMMAND_OK)
|
||||||
|
{
|
||||||
|
log_err(_("Unable to commit transaction: %s\n"),
|
||||||
|
PQerrorMessage(conn));
|
||||||
|
PQclear(res);
|
||||||
|
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
PQclear(res);
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
bool
|
||||||
|
rollback_transaction(PGconn *conn)
|
||||||
|
{
|
||||||
|
PGresult *res;
|
||||||
|
|
||||||
|
res = PQexec(conn, "ROLLBACK");
|
||||||
|
|
||||||
|
if (PQresultStatus(res) != PGRES_COMMAND_OK)
|
||||||
|
{
|
||||||
|
log_err(_("Unable to rollback transaction: %s\n"),
|
||||||
|
PQerrorMessage(conn));
|
||||||
|
PQclear(res);
|
||||||
|
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
PQclear(res);
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
bool
|
bool
|
||||||
check_cluster_schema(PGconn *conn)
|
check_cluster_schema(PGconn *conn)
|
||||||
{
|
{
|
||||||
@@ -197,7 +263,7 @@ is_pgup(PGconn *conn, int timeout)
|
|||||||
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Return the id of the active master node, or -1 if no
|
* Return the id of the active master node, or NODE_NOT_FOUND if no
|
||||||
* record available.
|
* record available.
|
||||||
*
|
*
|
||||||
* This reports the value stored in the database only and
|
* This reports the value stored in the database only and
|
||||||
@@ -224,12 +290,12 @@ get_master_node_id(PGconn *conn, char *cluster)
|
|||||||
{
|
{
|
||||||
log_err(_("get_master_node_id(): query failed\n%s\n"),
|
log_err(_("get_master_node_id(): query failed\n%s\n"),
|
||||||
PQerrorMessage(conn));
|
PQerrorMessage(conn));
|
||||||
retval = -1;
|
retval = NODE_NOT_FOUND;
|
||||||
}
|
}
|
||||||
else if (PQntuples(res) == 0)
|
else if (PQntuples(res) == 0)
|
||||||
{
|
{
|
||||||
log_warning(_("get_master_node_id(): no active primary found\n"));
|
log_warning(_("get_master_node_id(): no active primary found\n"));
|
||||||
retval = -1;
|
retval = NODE_NOT_FOUND;
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
@@ -260,7 +326,7 @@ get_server_version(PGconn *conn, char *server_version)
|
|||||||
return -1;
|
return -1;
|
||||||
}
|
}
|
||||||
|
|
||||||
if(server_version != NULL)
|
if (server_version != NULL)
|
||||||
strcpy(server_version, PQgetvalue(res, 0, 0));
|
strcpy(server_version, PQgetvalue(res, 0, 0));
|
||||||
|
|
||||||
return atoi(PQgetvalue(res, 0, 0));
|
return atoi(PQgetvalue(res, 0, 0));
|
||||||
@@ -399,7 +465,7 @@ get_pg_setting(PGconn *conn, const char *setting, char *output)
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if(success == true)
|
if (success == true)
|
||||||
{
|
{
|
||||||
log_debug(_("get_pg_setting(): returned value is '%s'\n"), output);
|
log_debug(_("get_pg_setting(): returned value is '%s'\n"), output);
|
||||||
}
|
}
|
||||||
@@ -458,7 +524,7 @@ get_upstream_connection(PGconn *standby_conn, char *cluster, int node_id,
|
|||||||
return NULL;
|
return NULL;
|
||||||
}
|
}
|
||||||
|
|
||||||
if(!PQntuples(res))
|
if (!PQntuples(res))
|
||||||
{
|
{
|
||||||
log_notice(_("no record found for upstream server"));
|
log_notice(_("no record found for upstream server"));
|
||||||
PQclear(res);
|
PQclear(res);
|
||||||
@@ -467,7 +533,7 @@ get_upstream_connection(PGconn *standby_conn, char *cluster, int node_id,
|
|||||||
|
|
||||||
strncpy(upstream_conninfo, PQgetvalue(res, 0, 0), MAXCONNINFO);
|
strncpy(upstream_conninfo, PQgetvalue(res, 0, 0), MAXCONNINFO);
|
||||||
|
|
||||||
if(upstream_node_id_ptr != NULL)
|
if (upstream_node_id_ptr != NULL)
|
||||||
*upstream_node_id_ptr = atoi(PQgetvalue(res, 0, 1));
|
*upstream_node_id_ptr = atoi(PQgetvalue(res, 0, 1));
|
||||||
|
|
||||||
PQclear(res);
|
PQclear(res);
|
||||||
@@ -509,9 +575,9 @@ get_master_connection(PGconn *standby_conn, char *cluster,
|
|||||||
int i,
|
int i,
|
||||||
node_id;
|
node_id;
|
||||||
|
|
||||||
if(master_id != NULL)
|
if (master_id != NULL)
|
||||||
{
|
{
|
||||||
*master_id = -1;
|
*master_id = NODE_NOT_FOUND;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* find all nodes belonging to this cluster */
|
/* find all nodes belonging to this cluster */
|
||||||
@@ -570,7 +636,7 @@ get_master_connection(PGconn *standby_conn, char *cluster,
|
|||||||
PQclear(res1);
|
PQclear(res1);
|
||||||
log_debug(_("get_master_connection(): current master node is %i\n"), node_id);
|
log_debug(_("get_master_connection(): current master node is %i\n"), node_id);
|
||||||
|
|
||||||
if(master_id != NULL)
|
if (master_id != NULL)
|
||||||
{
|
{
|
||||||
*master_id = node_id;
|
*master_id = node_id;
|
||||||
}
|
}
|
||||||
@@ -709,7 +775,7 @@ get_repmgr_schema(void)
|
|||||||
char *
|
char *
|
||||||
get_repmgr_schema_quoted(PGconn *conn)
|
get_repmgr_schema_quoted(PGconn *conn)
|
||||||
{
|
{
|
||||||
if(strcmp(repmgr_schema_quoted, "") == 0)
|
if (strcmp(repmgr_schema_quoted, "") == 0)
|
||||||
{
|
{
|
||||||
char *identifier = PQescapeIdentifier(conn, repmgr_schema,
|
char *identifier = PQescapeIdentifier(conn, repmgr_schema,
|
||||||
strlen(repmgr_schema));
|
strlen(repmgr_schema));
|
||||||
@@ -728,6 +794,49 @@ create_replication_slot(PGconn *conn, char *slot_name)
|
|||||||
char sqlquery[QUERY_STR_LEN];
|
char sqlquery[QUERY_STR_LEN];
|
||||||
PGresult *res;
|
PGresult *res;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Check whether slot exists already; if it exists and is active, that
|
||||||
|
* means another active standby is using it, which creates an error situation;
|
||||||
|
* if not we can reuse it as-is
|
||||||
|
*/
|
||||||
|
|
||||||
|
sqlquery_snprintf(sqlquery,
|
||||||
|
"SELECT active, slot_type "
|
||||||
|
" FROM pg_replication_slots "
|
||||||
|
" WHERE slot_name = '%s' ",
|
||||||
|
slot_name);
|
||||||
|
|
||||||
|
res = PQexec(conn, sqlquery);
|
||||||
|
if (!res || PQresultStatus(res) != PGRES_TUPLES_OK)
|
||||||
|
{
|
||||||
|
log_err(_("unable to query pg_replication_slots: %s\n"),
|
||||||
|
PQerrorMessage(conn));
|
||||||
|
PQclear(res);
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (PQntuples(res))
|
||||||
|
{
|
||||||
|
if (strcmp(PQgetvalue(res, 0, 1), "physical") != 0)
|
||||||
|
{
|
||||||
|
log_err(_("Slot '%s' exists and is not a physical slot\n"),
|
||||||
|
slot_name);
|
||||||
|
PQclear(res);
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
if (strcmp(PQgetvalue(res, 0, 0), "f") == 0)
|
||||||
|
{
|
||||||
|
PQclear(res);
|
||||||
|
log_debug(_("Replication slot '%s' exists but is inactive; reusing\n"),
|
||||||
|
slot_name);
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
PQclear(res);
|
||||||
|
log_err(_("Slot '%s' already exists as an active slot\n"),
|
||||||
|
slot_name);
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
sqlquery_snprintf(sqlquery,
|
sqlquery_snprintf(sqlquery,
|
||||||
"SELECT * FROM pg_create_physical_replication_slot('%s')",
|
"SELECT * FROM pg_create_physical_replication_slot('%s')",
|
||||||
slot_name);
|
slot_name);
|
||||||
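The slot check added to create_replication_slot() above amounts to a three-way decision, which can be sketched as a pure function. This is a sketch under assumptions: `SlotAction` and `decide_slot_action` are hypothetical names, the string arguments mimic the text values the query returns for `slot_type` and `active`, and a non-physical slot is assumed to be rejected outright.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type, for illustration only */
typedef enum
{
	SLOT_CREATE,	/* no such slot: create it */
	SLOT_REUSE,		/* inactive physical slot: reuse as-is */
	SLOT_ERROR		/* wrong type, or already in use by another standby */
} SlotAction;

/*
 * Hypothetical helper: `exists` says whether pg_replication_slots
 * returned a row; `slot_type` and `active` are that row's text values.
 */
static SlotAction
decide_slot_action(bool exists, const char *slot_type, const char *active)
{
	if (!exists)
		return SLOT_CREATE;

	/* assumption: anything other than a physical slot is an error */
	if (strcmp(slot_type, "physical") != 0)
		return SLOT_ERROR;

	/* boolean columns come back as "t" or "f" in text format */
	if (strcmp(active, "f") == 0)
		return SLOT_REUSE;

	return SLOT_ERROR;
}
```

Keeping the decision separate from the libpq calls makes each outcome easy to check without a running server.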
@@ -930,13 +1039,13 @@ create_node_record(PGconn *conn, char *action, int node, char *type, int upstrea
|
|||||||
char slot_name_buf[MAXLEN];
|
char slot_name_buf[MAXLEN];
|
||||||
PGresult *res;
|
PGresult *res;
|
||||||
|
|
||||||
if(upstream_node == NO_UPSTREAM_NODE)
|
if (upstream_node == NO_UPSTREAM_NODE)
|
||||||
{
|
{
|
||||||
/*
|
/*
|
||||||
* No explicit upstream node id provided for standby - attempt to
|
* No explicit upstream node id provided for standby - attempt to
|
||||||
* get primary node id
|
* get primary node id
|
||||||
*/
|
*/
|
||||||
if(strcmp(type, "standby") == 0)
|
if (strcmp(type, "standby") == 0)
|
||||||
{
|
{
|
||||||
int primary_node_id = get_master_node_id(conn, cluster_name);
|
int primary_node_id = get_master_node_id(conn, cluster_name);
|
||||||
maxlen_snprintf(upstream_node_id, "%i", primary_node_id);
|
maxlen_snprintf(upstream_node_id, "%i", primary_node_id);
|
||||||
@@ -951,7 +1060,7 @@ create_node_record(PGconn *conn, char *action, int node, char *type, int upstrea
|
|||||||
maxlen_snprintf(upstream_node_id, "%i", upstream_node);
|
maxlen_snprintf(upstream_node_id, "%i", upstream_node);
|
||||||
}
|
}
|
||||||
|
|
||||||
if(slot_name != NULL && slot_name[0])
|
if (slot_name != NULL && slot_name[0])
|
||||||
{
|
{
|
||||||
maxlen_snprintf(slot_name_buf, "'%s'", slot_name);
|
maxlen_snprintf(slot_name_buf, "'%s'", slot_name);
|
||||||
}
|
}
|
||||||
@@ -975,7 +1084,7 @@ create_node_record(PGconn *conn, char *action, int node, char *type, int upstrea
|
|||||||
slot_name_buf,
|
slot_name_buf,
|
||||||
priority);
|
priority);
|
||||||
|
|
||||||
if(action != NULL)
|
if (action != NULL)
|
||||||
{
|
{
|
||||||
log_debug(_("%s: %s\n"), action, sqlquery);
|
log_debug(_("%s: %s\n"), action, sqlquery);
|
||||||
}
|
}
|
||||||
@@ -1006,7 +1115,7 @@ delete_node_record(PGconn *conn, int node, char *action)
|
|||||||
" WHERE id = %d",
|
" WHERE id = %d",
|
||||||
get_repmgr_schema_quoted(conn),
|
get_repmgr_schema_quoted(conn),
|
||||||
node);
|
node);
|
||||||
if(action != NULL)
|
if (action != NULL)
|
||||||
{
|
{
|
||||||
log_debug(_("%s: %s\n"), action, sqlquery);
|
log_debug(_("%s: %s\n"), action, sqlquery);
|
||||||
}
|
}
|
||||||
@@ -1037,8 +1146,8 @@ delete_node_record(PGconn *conn, int node, char *action)
|
|||||||
*
|
*
|
||||||
* Note this function may be called with `conn` set to NULL in cases where
|
* Note this function may be called with `conn` set to NULL in cases where
|
||||||
* the master node is not available and it's therefore not possible to write
|
* the master node is not available and it's therefore not possible to write
|
||||||
* an event record. In this case, if `event_notification_command` is set a user-
|
* an event record. In this case, if `event_notification_command` is set, a
|
||||||
* defined notification to be generated; if not, this function will have
|
* user-defined notification will be generated; if not, this function will have
|
||||||
* no effect.
|
* no effect.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
@@ -1051,7 +1160,12 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
|
|||||||
bool success = true;
|
bool success = true;
|
||||||
struct tm ts;
|
struct tm ts;
|
||||||
|
|
||||||
if(conn != NULL)
|
/* Only attempt to write a record if a connection handle was provided.
|
||||||
|
Also check that the repmgr schema has been properly initialised - if
|
||||||
|
not it means no configuration file was provided, which can happen with
|
||||||
|
e.g. `repmgr standby clone`, and we won't know which schema to write to.
|
||||||
|
*/
|
||||||
|
if (conn != NULL && strcmp(repmgr_schema, DEFAULT_REPMGR_SCHEMA_PREFIX) != 0)
|
||||||
{
|
{
|
||||||
int n_node_id = htonl(node_id);
|
int n_node_id = htonl(node_id);
|
||||||
char *t_successful = successful ? "TRUE" : "FALSE";
|
char *t_successful = successful ? "TRUE" : "FALSE";
|
||||||
@@ -1114,7 +1228,7 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
|
|||||||
* current timestamp ourselves. This isn't quite the same
|
* current timestamp ourselves. This isn't quite the same
|
||||||
* format as PostgreSQL, but is close enough for diagnostic use.
|
* format as PostgreSQL, but is close enough for diagnostic use.
|
||||||
*/
|
*/
|
||||||
if(!strlen(event_timestamp))
|
if (!strlen(event_timestamp))
|
||||||
{
|
{
|
||||||
time_t now;
|
time_t now;
|
||||||
|
|
||||||
@@ -1124,7 +1238,7 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
|
|||||||
}
|
}
|
||||||
|
|
||||||
/* an event notification command was provided - parse and execute it */
|
/* an event notification command was provided - parse and execute it */
|
||||||
if(strlen(options->event_notification_command))
|
if (strlen(options->event_notification_command))
|
||||||
{
|
{
|
||||||
char parsed_command[MAXPGPATH];
|
char parsed_command[MAXPGPATH];
|
||||||
const char *src_ptr;
|
const char *src_ptr;
|
||||||
@@ -1140,14 +1254,14 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
|
|||||||
* (If 'event_notifications' was not provided, we assume the script
|
* (If 'event_notifications' was not provided, we assume the script
|
||||||
* should be executed for all events).
|
* should be executed for all events).
|
||||||
*/
|
*/
|
||||||
if(options->event_notifications.head != NULL)
|
if (options->event_notifications.head != NULL)
|
||||||
{
|
{
|
||||||
EventNotificationListCell *cell;
|
EventNotificationListCell *cell;
|
||||||
bool notify_ok = false;
|
bool notify_ok = false;
|
||||||
|
|
||||||
for (cell = options->event_notifications.head; cell; cell = cell->next)
|
for (cell = options->event_notifications.head; cell; cell = cell->next)
|
||||||
{
|
{
|
||||||
if(strcmp(event, cell->event_type) == 0)
|
if (strcmp(event, cell->event_type) == 0)
|
||||||
{
|
{
|
||||||
notify_ok = true;
|
notify_ok = true;
|
||||||
break;
|
break;
|
||||||
@@ -1157,7 +1271,7 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
|
|||||||
/*
|
/*
|
||||||
* Event type not found in the 'event_notifications' list - return early
|
* Event type not found in the 'event_notifications' list - return early
|
||||||
*/
|
*/
|
||||||
if(notify_ok == false)
|
if (notify_ok == false)
|
||||||
{
|
{
|
||||||
log_debug(_("Not executing notification script for event type '%s'\n"), event);
|
log_debug(_("Not executing notification script for event type '%s'\n"), event);
|
||||||
return success;
|
return success;
|
||||||
@@ -1189,7 +1303,7 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
|
|||||||
case 'd':
|
case 'd':
|
||||||
/* %d: details */
|
/* %d: details */
|
||||||
src_ptr++;
|
src_ptr++;
|
||||||
if(details != NULL)
|
if (details != NULL)
|
||||||
{
|
{
|
||||||
strlcpy(dst_ptr, details, end_ptr - dst_ptr);
|
strlcpy(dst_ptr, details, end_ptr - dst_ptr);
|
||||||
dst_ptr += strlen(dst_ptr);
|
dst_ptr += strlen(dst_ptr);
|
||||||
@@ -1235,3 +1349,56 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
|
|||||||
|
|
||||||
return success;
|
return success;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
bool
|
||||||
|
update_node_record_set_upstream(PGconn *conn, char *cluster_name, int this_node_id, int new_upstream_node_id)
|
||||||
|
{
|
||||||
|
PGresult *res;
|
||||||
|
char sqlquery[QUERY_STR_LEN];
|
||||||
|
|
||||||
|
log_debug(_("update_node_record_set_upstream(): Updating node %i's upstream node to %i\n"), this_node_id, new_upstream_node_id);
|
||||||
|
|
||||||
|
sqlquery_snprintf(sqlquery,
|
||||||
|
" UPDATE %s.repl_nodes "
|
||||||
|
" SET upstream_node_id = %i "
|
||||||
|
" WHERE cluster = '%s' "
|
||||||
|
" AND id = %i ",
|
||||||
|
get_repmgr_schema_quoted(conn),
|
||||||
|
new_upstream_node_id,
|
||||||
|
cluster_name,
|
||||||
|
this_node_id);
|
||||||
|
res = PQexec(conn, sqlquery);
|
||||||
|
|
||||||
|
if (PQresultStatus(res) != PGRES_COMMAND_OK)
|
||||||
|
{
|
||||||
|
log_err(_("Unable to set new upstream node id: %s\n"),
|
||||||
|
PQerrorMessage(conn));
|
||||||
|
PQclear(res);
|
||||||
|
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
PQclear(res);
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
PGresult *
|
||||||
|
get_node_record(PGconn *conn, char *cluster, int node_id)
|
||||||
|
{
|
||||||
|
char sqlquery[QUERY_STR_LEN];
|
||||||
|
|
||||||
|
sqlquery_snprintf(sqlquery,
|
||||||
|
"SELECT id, upstream_node_id, conninfo, type, slot_name, active "
|
||||||
|
" FROM %s.repl_nodes "
|
||||||
|
" WHERE cluster = '%s' "
|
||||||
|
" AND id = %i",
|
||||||
|
get_repmgr_schema_quoted(conn),
|
||||||
|
cluster,
|
||||||
|
node_id);
|
||||||
|
|
||||||
|
log_debug("get_node_record(): %s\n", sqlquery);
|
||||||
|
|
||||||
|
return PQexec(conn, sqlquery);
|
||||||
|
}
|
||||||
|
|||||||
@@ -30,6 +30,9 @@ PGconn *establish_db_connection(const char *conninfo,
|
|||||||
PGconn *establish_db_connection_by_params(const char *keywords[],
|
PGconn *establish_db_connection_by_params(const char *keywords[],
|
||||||
const char *values[],
|
const char *values[],
|
||||||
const bool exit_on_error);
|
const bool exit_on_error);
|
||||||
|
bool begin_transaction(PGconn *conn);
|
||||||
|
bool commit_transaction(PGconn *conn);
|
||||||
|
bool rollback_transaction(PGconn *conn);
|
||||||
bool check_cluster_schema(PGconn *conn);
|
bool check_cluster_schema(PGconn *conn);
|
||||||
int is_standby(PGconn *conn);
|
int is_standby(PGconn *conn);
|
||||||
bool is_pgup(PGconn *conn, int timeout);
|
bool is_pgup(PGconn *conn, int timeout);
|
||||||
@@ -63,6 +66,7 @@ bool copy_configuration(PGconn *masterconn, PGconn *witnessconn, char *cluster_
|
|||||||
bool create_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name);
|
bool create_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name);
|
||||||
bool delete_node_record(PGconn *conn, int node, char *action);
|
bool delete_node_record(PGconn *conn, int node, char *action);
|
||||||
bool create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
|
bool create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
|
||||||
|
bool update_node_record_set_upstream(PGconn *conn, char *cluster_name, int this_node_id, int new_upstream_node_id);
|
||||||
|
PGresult * get_node_record(PGconn *conn, char *cluster, int node_id);
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
4
debian/repmgr.repmgrd.default
vendored
4
debian/repmgr.repmgrd.default
vendored
@@ -12,7 +12,7 @@ REPMGRD_ENABLED=no
|
|||||||
#REPMGRD_USER=postgres
|
#REPMGRD_USER=postgres
|
||||||
|
|
||||||
# repmgrd binary
|
# repmgrd binary
|
||||||
#REPMGR_BIN=/usr/bin/repmgr
|
#REPMGRD_BIN=/usr/bin/repmgrd
|
||||||
|
|
||||||
# pid file
|
# pid file
|
||||||
#REPMGR_PIDFILE=/var/run/repmgrd.pid
|
#REPMGRD_PIDFILE=/var/run/repmgrd.pid
|
||||||
|
|||||||
@@ -35,5 +35,6 @@
|
|||||||
#define ERR_BAD_SSH 12
|
#define ERR_BAD_SSH 12
|
||||||
#define ERR_SYS_FAILURE 13
|
#define ERR_SYS_FAILURE 13
|
||||||
#define ERR_BAD_BASEBACKUP 14
|
#define ERR_BAD_BASEBACKUP 14
|
||||||
|
#define ERR_INTERNAL 15
|
||||||
|
|
||||||
#endif /* _ERRCODE_H_ */
|
#endif /* _ERRCODE_H_ */
|
||||||
|
|||||||
26
log.c
26
log.c
@@ -144,12 +144,32 @@ logger_init(t_configuration_options * opts, const char *ident, const char *level
|
|||||||
{
|
{
|
||||||
FILE *fd;
|
FILE *fd;
|
||||||
|
|
||||||
fd = freopen(opts->logfile, "a", stderr);
|
/* Check if we can write to the specified file before redirecting
|
||||||
|
* stderr - if freopen() fails, stderr output will vanish into
|
||||||
|
* the ether and the user won't know what's going on.
|
||||||
|
*/
|
||||||
|
|
||||||
|
fd = fopen(opts->logfile, "a");
|
||||||
if (fd == NULL)
|
if (fd == NULL)
|
||||||
{
|
{
|
||||||
fprintf(stderr, "error reopening stderr to '%s': %s",
|
stderr_log_err(_("Unable to open specified logfile '%s' for writing: %s\n"), opts->logfile, strerror(errno));
|
||||||
opts->logfile, strerror(errno));
|
stderr_log_err(_("Terminating\n"));
|
||||||
|
exit(ERR_BAD_CONFIG);
|
||||||
|
}
|
||||||
|
fclose(fd);
|
||||||
|
|
||||||
|
stderr_log_notice(_("Redirecting logging output to '%s'\n"), opts->logfile);
|
||||||
|
fd = freopen(opts->logfile, "a", stderr);
|
||||||
|
|
||||||
|
/* It's possible freopen() may still fail due to e.g. a race condition;
|
||||||
|
as it's not feasible to restore stderr after a failed freopen(),
|
||||||
|
we'll write to stdout as a last resort.
|
||||||
|
*/
|
||||||
|
if (fd == NULL)
|
||||||
|
{
|
||||||
|
printf(_("Unable to open specified logfile %s for writing: %s\n"), opts->logfile, strerror(errno));
|
||||||
|
printf(_("Terminating\n"));
|
||||||
|
exit(ERR_BAD_CONFIG);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -7,8 +7,11 @@
|
|||||||
#
|
#
|
||||||
# repmgr and repmgrd require these items to be configured:
|
# repmgr and repmgrd require these items to be configured:
|
||||||
|
|
||||||
# Cluster name
|
# Cluster name - this will be used by repmgr to generate its internal
|
||||||
cluster=test
|
# schema (pattern: "repmgr_{cluster}"); while this name will be quoted
|
||||||
|
# to preserve case, we recommend using lower case and avoiding whitespace
|
||||||
|
# to facilitate easier querying of the repmgr views and tables.
|
||||||
|
cluster=example_cluster
|
||||||
|
|
||||||
# Node ID and name
|
# Node ID and name
|
||||||
# (Note: we recommend to avoid naming nodes after their initial
|
# (Note: we recommend to avoid naming nodes after their initial
|
||||||
|
|||||||
9
repmgr.h
9
repmgr.h
@@ -20,11 +20,9 @@
|
|||||||
#ifndef _REPMGR_H_
|
#ifndef _REPMGR_H_
|
||||||
#define _REPMGR_H_
|
#define _REPMGR_H_
|
||||||
|
|
||||||
#include "postgres_fe.h"
|
#include <libpq-fe.h>
|
||||||
#include "libpq-fe.h"
|
#include <postgres_fe.h>
|
||||||
|
#include <getopt_long.h>
|
||||||
|
|
||||||
#include "getopt_long.h"
|
|
||||||
|
|
||||||
#include "strutil.h"
|
#include "strutil.h"
|
||||||
#include "dbutils.h"
|
#include "dbutils.h"
|
||||||
@@ -49,6 +47,7 @@
|
|||||||
|
|
||||||
#define MANUAL_FAILOVER 0
|
#define MANUAL_FAILOVER 0
|
||||||
#define AUTOMATIC_FAILOVER 1
|
#define AUTOMATIC_FAILOVER 1
|
||||||
|
#define NODE_NOT_FOUND -1
|
||||||
#define NO_UPSTREAM_NODE -1
|
#define NO_UPSTREAM_NODE -1
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
357
repmgrd.c
357
repmgrd.c
@@ -88,12 +88,9 @@ static void check_node_configuration(void);
|
|||||||
|
|
||||||
static void standby_monitor(void);
|
static void standby_monitor(void);
|
||||||
static void witness_monitor(void);
|
static void witness_monitor(void);
|
||||||
static bool check_connection(PGconn *conn, const char *type);
|
static bool check_connection(PGconn **conn, const char *type, const char *conninfo);
|
||||||
static bool set_local_node_failed(void);
|
static bool set_local_node_failed(void);
|
||||||
|
|
||||||
static bool update_node_record_set_master(PGconn *conn, int this_node_id, int old_master_node_id);
|
|
||||||
static bool update_node_record_set_upstream(PGconn *conn, int this_node_id, int new_upstream_node_id);
|
|
||||||
|
|
||||||
static void update_shared_memory(char *last_wal_standby_applied);
|
static void update_shared_memory(char *last_wal_standby_applied);
|
||||||
static void update_registration(void);
|
static void update_registration(void);
|
||||||
static void do_master_failover(void);
|
static void do_master_failover(void);
|
||||||
@@ -148,6 +145,8 @@ main(int argc, char **argv)
|
|||||||
{"monitoring-history", no_argument, NULL, 'm'},
|
{"monitoring-history", no_argument, NULL, 'm'},
|
||||||
{"daemonize", no_argument, NULL, 'd'},
|
{"daemonize", no_argument, NULL, 'd'},
|
||||||
{"pid-file", required_argument, NULL, 'p'},
|
{"pid-file", required_argument, NULL, 'p'},
|
||||||
|
{"help", no_argument, NULL, '?'},
|
||||||
|
{"version", no_argument, NULL, 'V'},
|
||||||
{NULL, 0, NULL, 0}
|
{NULL, 0, NULL, 0}
|
||||||
};
|
};
|
||||||
|
|
||||||
@@ -161,21 +160,7 @@ main(int argc, char **argv)
|
|||||||
int server_version_num = 0;
|
int server_version_num = 0;
|
||||||
progname = get_progname(argv[0]);
|
progname = get_progname(argv[0]);
|
||||||
|
|
||||||
if (argc > 1)
|
while ((c = getopt_long(argc, argv, "?Vf:v:mdp:", long_options, &optindex)) != -1)
|
||||||
{
|
|
||||||
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
|
|
||||||
{
|
|
||||||
help(progname);
|
|
||||||
exit(SUCCESS);
|
|
||||||
}
|
|
||||||
if (strcmp(argv[1], "--version") == 0 || strcmp(argv[1], "-V") == 0)
|
|
||||||
{
|
|
||||||
printf("%s %s (PostgreSQL %s)\n", progname, REPMGR_VERSION, PG_VERSION);
|
|
||||||
exit(SUCCESS);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
while ((c = getopt_long(argc, argv, "f:v:mdp:", long_options, &optindex)) != -1)
|
|
||||||
{
|
{
|
||||||
switch (c)
|
switch (c)
|
||||||
{
|
{
|
||||||
@@ -194,6 +179,12 @@ main(int argc, char **argv)
|
|||||||
case 'p':
|
case 'p':
|
||||||
pid_file = optarg;
|
pid_file = optarg;
|
||||||
break;
|
break;
|
||||||
|
case '?':
|
||||||
|
help(progname);
|
||||||
|
exit(SUCCESS);
|
||||||
|
case 'V':
|
||||||
|
printf("%s %s (PostgreSQL %s)\n", progname, REPMGR_VERSION, PG_VERSION);
|
||||||
|
exit(SUCCESS);
|
||||||
default:
|
default:
|
||||||
usage();
|
usage();
|
||||||
exit(ERR_BAD_CONFIG);
|
exit(ERR_BAD_CONFIG);
|
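Review note: with `'?'` now in the option string and mapped to `--help`, the `case '?'` branch receives both an explicit help request and the value `getopt_long()` returns for an unrecognised option. A typo such as `-x` would therefore appear to print help and exit with `SUCCESS` rather than reach `default:`. Below is a minimal sketch of one way to tell the two cases apart, by inspecting the argument that was just consumed. `parse_args` and the `parse_result` codes are hypothetical names used only for illustration and are not part of repmgr.

```c
#include <getopt.h>
#include <string.h>

/* Hypothetical outcome codes for this sketch (not repmgr's exit codes). */
enum parse_result { PARSE_OK, PARSE_HELP, PARSE_VERSION, PARSE_BAD_OPTION };

static enum parse_result
parse_args(int argc, char **argv)
{
	static struct option long_options[] = {
		{"help", no_argument, NULL, '?'},
		{"version", no_argument, NULL, 'V'},
		{"daemonize", no_argument, NULL, 'd'},
		{NULL, 0, NULL, 0}
	};
	int			c;
	int			optindex;

	optind = 1;					/* restart scanning on every call */
	opterr = 0;					/* suppress getopt's own diagnostics */

	while ((c = getopt_long(argc, argv, "?Vd", long_options, &optindex)) != -1)
	{
		switch (c)
		{
			case '?':
				/*
				 * '?' is also returned for unknown options, so check
				 * whether help was actually requested.
				 */
				if (strcmp(argv[optind - 1], "-?") == 0 ||
					strcmp(argv[optind - 1], "--help") == 0)
					return PARSE_HELP;
				return PARSE_BAD_OPTION;
			case 'V':
				return PARSE_VERSION;
			case 'd':
				break;
			default:
				return PARSE_BAD_OPTION;
		}
	}

	return PARSE_OK;
}
```

A grouped form such as `-d?` is not recognised as a help request by this check; whether that matters is a judgment call for the caller.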
||||||
@@ -209,7 +200,7 @@ main(int argc, char **argv)
|
|||||||
* which case we'll need to refactor parse_config() not to abort,
|
* which case we'll need to refactor parse_config() not to abort,
|
||||||
* and return the error message.
|
* and return the error message.
|
||||||
*/
|
*/
|
||||||
parse_config(config_file, &local_options);
|
load_config(config_file, &local_options, argv[0]);
|
||||||
|
|
||||||
if (daemonize)
|
if (daemonize)
|
||||||
{
|
{
|
||||||
@@ -268,7 +259,7 @@ main(int argc, char **argv)
|
|||||||
|
|
||||||
server_version_num = get_server_version(my_local_conn, NULL);
|
server_version_num = get_server_version(my_local_conn, NULL);
|
||||||
|
|
||||||
if(server_version_num < MIN_SUPPORTED_VERSION_NUM)
|
if (server_version_num < MIN_SUPPORTED_VERSION_NUM)
|
||||||
{
|
{
|
||||||
if (server_version_num > 0)
|
if (server_version_num > 0)
|
||||||
{
|
{
|
||||||
@@ -284,9 +275,17 @@ main(int argc, char **argv)
|
|||||||
terminate(ERR_BAD_CONFIG);
|
terminate(ERR_BAD_CONFIG);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Retrieve record for this node from the database */
|
/* Retrieve record for this node from the local database */
|
||||||
node_info = get_node_info(my_local_conn, local_options.cluster_name, local_options.node);
|
node_info = get_node_info(my_local_conn, local_options.cluster_name, local_options.node);
|
||||||
|
|
||||||
|
/* No node record found - exit gracefully */
|
||||||
|
if (node_info.node_id == NODE_NOT_FOUND)
|
||||||
|
{
|
||||||
|
log_err(_("No metadata record found for this node - terminating\n"));
|
||||||
|
log_notice(_("HINT: was this node registered with 'repmgr (master|standby) register'?\n"));
|
||||||
|
terminate(ERR_BAD_CONFIG);
|
||||||
|
}
|
||||||
|
|
||||||
log_debug("node id is %i, upstream is %i\n", node_info.node_id, node_info.upstream_node_id);
|
log_debug("node id is %i, upstream is %i\n", node_info.node_id, node_info.upstream_node_id);
|
||||||
|
|
||||||
/*
|
/*
|
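Review note: the new check relies on `get_node_info()` returning a record whose `node_id` is the `NODE_NOT_FOUND` sentinel when no row exists. The sketch below shows that pattern in isolation. The value `-1` for both macros is an assumption (the diff only shows `NODE_NOT_FOUND` replacing a literal `-1`), and `node_record` and `find_node` are hypothetical, reduced stand-ins rather than repmgr's types.

```c
#include <stddef.h>

#define NODE_NOT_FOUND   (-1)	/* assumed value */
#define NO_UPSTREAM_NODE (-1)	/* assumed value */

/* Hypothetical, reduced stand-in for t_node_info. */
typedef struct
{
	int			node_id;
	int			upstream_node_id;
} node_record;

/* Return the matching record, or one whose node_id is NODE_NOT_FOUND. */
static node_record
find_node(const node_record *nodes, size_t count, int wanted_id)
{
	node_record result = {NODE_NOT_FOUND, NO_UPSTREAM_NODE};
	size_t		i;

	for (i = 0; i < count; i++)
	{
		if (nodes[i].node_id == wanted_id)
			return nodes[i];
	}

	return result;
}
```

A caller then compares `node_id` against `NODE_NOT_FOUND` before using any other field, as the added block in this hunk does.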
||||||
@@ -312,7 +311,7 @@ main(int argc, char **argv)
|
|||||||
check_cluster_configuration(my_local_conn);
|
check_cluster_configuration(my_local_conn);
|
||||||
check_node_configuration();
|
check_node_configuration();
|
||||||
|
|
||||||
if (reload_config(config_file, &local_options))
|
if (reload_config(&local_options))
|
||||||
{
|
{
|
||||||
PQfinish(my_local_conn);
|
PQfinish(my_local_conn);
|
||||||
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
||||||
@@ -321,7 +320,7 @@ main(int argc, char **argv)
|
|||||||
}
|
}
|
||||||
|
|
||||||
/* Log startup event */
|
/* Log startup event */
|
||||||
if(startup_event_logged == false)
|
if (startup_event_logged == false)
|
||||||
{
|
{
|
||||||
create_event_record(master_conn,
|
create_event_record(master_conn,
|
||||||
&local_options,
|
&local_options,
|
||||||
@@ -335,9 +334,9 @@ main(int argc, char **argv)
|
|||||||
log_info(_("starting continuous master connection check\n"));
|
log_info(_("starting continuous master connection check\n"));
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Check that master is still alive.
|
* Check that master is still alive.
|
||||||
* XXX We should also check that the
|
* XXX We should also check that the
|
||||||
* standby servers are sending info
|
* standby servers are sending info
|
||||||
*/
|
*/
|
||||||
|
|
||||||
/*
|
/*
|
||||||
@@ -346,7 +345,7 @@ main(int argc, char **argv)
|
|||||||
*/
|
*/
|
||||||
do
|
do
|
||||||
{
|
{
|
||||||
if (check_connection(master_conn, "master"))
|
if (check_connection(&master_conn, "master", NULL))
|
||||||
{
|
{
|
||||||
sleep(local_options.monitor_interval_secs);
|
sleep(local_options.monitor_interval_secs);
|
||||||
}
|
}
|
||||||
@@ -361,10 +360,10 @@ main(int argc, char **argv)
|
|||||||
if (got_SIGHUP)
|
if (got_SIGHUP)
|
||||||
{
|
{
|
||||||
/*
|
/*
|
||||||
* if we can reload, then could need to change
|
* if we can reload the configuration file, then could need to change
|
||||||
* my_local_conn
|
* my_local_conn
|
||||||
*/
|
*/
|
||||||
if (reload_config(config_file, &local_options))
|
if (reload_config(&local_options))
|
||||||
{
|
{
|
||||||
PQfinish(my_local_conn);
|
PQfinish(my_local_conn);
|
||||||
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
||||||
@@ -425,14 +424,14 @@ main(int argc, char **argv)
|
|||||||
check_cluster_configuration(my_local_conn);
|
check_cluster_configuration(my_local_conn);
|
||||||
check_node_configuration();
|
check_node_configuration();
|
||||||
|
|
||||||
if (reload_config(config_file, &local_options))
|
if (reload_config(&local_options))
|
||||||
{
|
{
|
||||||
PQfinish(my_local_conn);
|
PQfinish(my_local_conn);
|
||||||
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
||||||
update_registration();
|
update_registration();
|
||||||
}
|
}
|
||||||
/* Log startup event */
|
/* Log startup event */
|
||||||
if(startup_event_logged == false)
|
if (startup_event_logged == false)
|
||||||
{
|
{
|
||||||
create_event_record(master_conn,
|
create_event_record(master_conn,
|
||||||
&local_options,
|
&local_options,
|
||||||
@@ -476,7 +475,7 @@ main(int argc, char **argv)
|
|||||||
* if we can reload, then could need to change
|
* if we can reload, then could need to change
|
||||||
* my_local_conn
|
* my_local_conn
|
||||||
*/
|
*/
|
||||||
if (reload_config(config_file, &local_options))
|
if (reload_config(&local_options))
|
||||||
{
|
{
|
||||||
PQfinish(my_local_conn);
|
PQfinish(my_local_conn);
|
||||||
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
||||||
@@ -484,7 +483,7 @@ main(int argc, char **argv)
|
|||||||
}
|
}
|
||||||
got_SIGHUP = false;
|
got_SIGHUP = false;
|
||||||
}
|
}
|
||||||
if(failover_done)
|
if (failover_done)
|
||||||
{
|
{
|
||||||
log_debug(_("standby check loop will terminate\n"));
|
log_debug(_("standby check loop will terminate\n"));
|
||||||
}
|
}
|
||||||
@@ -529,9 +528,9 @@ witness_monitor(void)
|
|||||||
* of a missing master and promotion of a standby by that standby's
|
* of a missing master and promotion of a standby by that standby's
|
||||||
* repmgrd, so we'll loop for a while before giving up.
|
* repmgrd, so we'll loop for a while before giving up.
|
||||||
*/
|
*/
|
||||||
connection_ok = check_connection(master_conn, "master");
|
connection_ok = check_connection(&master_conn, "master", NULL);
|
||||||
|
|
||||||
if(connection_ok == false)
|
if (connection_ok == false)
|
||||||
{
|
{
|
||||||
int connection_retries;
|
int connection_retries;
|
||||||
log_debug(_("old master node ID: %i\n"), master_options.node);
|
log_debug(_("old master node ID: %i\n"), master_options.node);
|
||||||
@@ -582,7 +581,7 @@ witness_monitor(void)
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if(connection_ok == false)
|
if (connection_ok == false)
|
||||||
{
|
{
|
||||||
PQExpBufferData errmsg;
|
PQExpBufferData errmsg;
|
||||||
initPQExpBuffer(&errmsg);
|
initPQExpBuffer(&errmsg);
|
||||||
@@ -637,9 +636,9 @@ witness_monitor(void)
|
|||||||
*/
|
*/
|
||||||
sqlquery_snprintf(sqlquery,
|
sqlquery_snprintf(sqlquery,
|
||||||
"INSERT INTO %s.repl_monitor "
|
"INSERT INTO %s.repl_monitor "
|
||||||
" (master_node, standby_node, "
|
" (primary_node, standby_node, "
|
||||||
" last_monitor_time, last_apply_time, "
|
" last_monitor_time, last_apply_time, "
|
||||||
" last_wal_master_location, last_wal_standby_location, "
|
" last_wal_primary_location, last_wal_standby_location, "
|
||||||
" replication_lag, apply_lag )"
|
" replication_lag, apply_lag )"
|
||||||
" VALUES(%d, %d, "
|
" VALUES(%d, %d, "
|
||||||
" '%s'::TIMESTAMP WITH TIME ZONE, NULL, "
|
" '%s'::TIMESTAMP WITH TIME ZONE, NULL, "
|
||||||
@@ -686,6 +685,7 @@ standby_monitor(void)
|
|||||||
bool did_retry = false;
|
bool did_retry = false;
|
||||||
|
|
||||||
PGconn *upstream_conn;
|
PGconn *upstream_conn;
|
||||||
|
char upstream_conninfo[MAXCONNINFO];
|
||||||
int upstream_node_id;
|
int upstream_node_id;
|
||||||
t_node_info upstream_node;
|
t_node_info upstream_node;
|
||||||
|
|
||||||
@@ -697,7 +697,7 @@ standby_monitor(void)
|
|||||||
* no point in doing much else anyway
|
* no point in doing much else anyway
|
||||||
*/
|
*/
|
||||||
|
|
||||||
if (!check_connection(my_local_conn, "standby"))
|
if (!check_connection(&my_local_conn, "standby", NULL))
|
||||||
{
|
{
|
||||||
PQExpBufferData errmsg;
|
PQExpBufferData errmsg;
|
||||||
|
|
||||||
@@ -723,7 +723,7 @@ standby_monitor(void)
|
|||||||
upstream_conn = get_upstream_connection(my_local_conn,
|
upstream_conn = get_upstream_connection(my_local_conn,
|
||||||
local_options.cluster_name,
|
local_options.cluster_name,
|
||||||
local_options.node,
|
local_options.node,
|
||||||
&upstream_node_id, NULL);
|
&upstream_node_id, upstream_conninfo);
|
||||||
|
|
||||||
type = upstream_node_id == master_options.node
|
type = upstream_node_id == master_options.node
|
||||||
? "master"
|
? "master"
|
||||||
@@ -735,11 +735,11 @@ standby_monitor(void)
|
|||||||
* we cannot reconnect, try to get a new upstream node.
|
* we cannot reconnect, try to get a new upstream node.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
check_connection(upstream_conn, type); /* this takes up to
|
check_connection(&upstream_conn, type, upstream_conninfo);
|
||||||
* local_options.reconnect_attempts
|
/*
|
||||||
* local_options.reconnect_intvl seconds
|
* This takes up to local_options.reconnect_attempts *
|
||||||
*/
|
* local_options.reconnect_intvl seconds
|
||||||
|
*/
|
||||||
|
|
||||||
if (PQstatus(upstream_conn) != CONNECTION_OK)
|
if (PQstatus(upstream_conn) != CONNECTION_OK)
|
||||||
{
|
{
|
||||||
@@ -809,7 +809,7 @@ standby_monitor(void)
|
|||||||
*/
|
*/
|
||||||
upstream_node = get_node_info(my_local_conn, local_options.cluster_name, node_info.upstream_node_id);
|
upstream_node = get_node_info(my_local_conn, local_options.cluster_name, node_info.upstream_node_id);
|
||||||
|
|
||||||
if(upstream_node.type == MASTER)
|
if (upstream_node.type == MASTER)
|
||||||
{
|
{
|
||||||
log_debug(_("failure detected on master node (%i); attempting to promote a standby\n"),
|
log_debug(_("failure detected on master node (%i); attempting to promote a standby\n"),
|
||||||
node_info.upstream_node_id);
|
node_info.upstream_node_id);
|
||||||
@@ -820,7 +820,7 @@ standby_monitor(void)
|
|||||||
log_debug(_("failure detected on upstream node %i; attempting to reconnect to new upstream node\n"),
|
log_debug(_("failure detected on upstream node %i; attempting to reconnect to new upstream node\n"),
|
||||||
node_info.upstream_node_id);
|
node_info.upstream_node_id);
|
||||||
|
|
||||||
if(!do_upstream_standby_failover(upstream_node))
|
if (!do_upstream_standby_failover(upstream_node))
|
||||||
{
|
{
|
||||||
PQExpBufferData errmsg;
|
PQExpBufferData errmsg;
|
||||||
initPQExpBuffer(&errmsg);
|
initPQExpBuffer(&errmsg);
|
||||||
@@ -872,7 +872,7 @@ standby_monitor(void)
|
|||||||
log_err(_("standby node has disappeared, trying to reconnect...\n"));
|
log_err(_("standby node has disappeared, trying to reconnect...\n"));
|
||||||
did_retry = true;
|
did_retry = true;
|
||||||
|
|
||||||
if (!check_connection(my_local_conn, "standby"))
|
if (!check_connection(&my_local_conn, "standby", NULL))
|
||||||
{
|
{
|
||||||
set_local_node_failed();
|
set_local_node_failed();
|
||||||
terminate(0);
|
terminate(0);
|
||||||
@@ -917,7 +917,7 @@ standby_monitor(void)
|
|||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
if(PQntuples(res) == 0)
|
if (PQntuples(res) == 0)
|
||||||
{
|
{
|
||||||
log_err(_("standby_monitor(): no active master found\n"));
|
log_err(_("standby_monitor(): no active master found\n"));
|
||||||
PQclear(res);
|
PQclear(res);
|
||||||
@@ -927,18 +927,19 @@ standby_monitor(void)
|
|||||||
active_master_id = atoi(PQgetvalue(res, 0, 0));
|
active_master_id = atoi(PQgetvalue(res, 0, 0));
|
||||||
PQclear(res);
|
PQclear(res);
|
||||||
|
|
||||||
if(active_master_id != master_options.node)
|
if (active_master_id != master_options.node)
|
||||||
{
|
{
|
||||||
log_notice(_("connecting to active master (node %i)...\n"), active_master_id); \
|
log_notice(_("connecting to active master (node %i)...\n"), active_master_id); \
|
||||||
if(master_conn != NULL)
|
if (master_conn != NULL)
|
||||||
{
|
{
|
||||||
PQfinish(master_conn);
|
PQfinish(master_conn);
|
||||||
}
|
}
|
||||||
master_conn = get_master_connection(my_local_conn,
|
master_conn = get_master_connection(my_local_conn,
|
||||||
local_options.cluster_name,
|
local_options.cluster_name,
|
||||||
&master_options.node, NULL);
|
&master_options.node, NULL);
|
||||||
|
|
||||||
}
|
}
|
||||||
|
if (PQstatus(master_conn) != CONNECTION_OK)
|
||||||
|
PQreset(master_conn);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Cancel any query that is still being executed, so i can insert the
|
* Cancel any query that is still being executed, so i can insert the
|
||||||
@@ -993,9 +994,9 @@ standby_monitor(void)
|
|||||||
*/
|
*/
|
||||||
sqlquery_snprintf(sqlquery,
|
sqlquery_snprintf(sqlquery,
|
||||||
"INSERT INTO %s.repl_monitor "
|
"INSERT INTO %s.repl_monitor "
|
||||||
" (master_node, standby_node, "
|
" (primary_node, standby_node, "
|
||||||
" last_monitor_time, last_apply_time, "
|
" last_monitor_time, last_apply_time, "
|
||||||
" last_wal_master_location, last_wal_standby_location, "
|
" last_wal_primary_location, last_wal_standby_location, "
|
||||||
" replication_lag, apply_lag ) "
|
" replication_lag, apply_lag ) "
|
||||||
" VALUES(%d, %d, "
|
" VALUES(%d, %d, "
|
||||||
" '%s'::TIMESTAMP WITH TIME ZONE, '%s'::TIMESTAMP WITH TIME ZONE, "
|
" '%s'::TIMESTAMP WITH TIME ZONE, '%s'::TIMESTAMP WITH TIME ZONE, "
|
||||||
@@ -1003,7 +1004,7 @@ standby_monitor(void)
|
|||||||
" %llu, %llu) ",
|
" %llu, %llu) ",
|
||||||
get_repmgr_schema_quoted(master_conn),
|
get_repmgr_schema_quoted(master_conn),
|
||||||
master_options.node, local_options.node,
|
master_options.node, local_options.node,
|
||||||
monitor_standby_timestamp, last_wal_standby_applied_timestamp,
|
monitor_standby_timestamp, last_wal_standby_applied_timestamp,
|
||||||
last_wal_master_location, last_wal_standby_received,
|
last_wal_master_location, last_wal_standby_received,
|
||||||
(long long unsigned int)(lsn_master - lsn_standby_received),
|
(long long unsigned int)(lsn_master - lsn_standby_received),
|
||||||
(long long unsigned int)(lsn_standby_received - lsn_standby_applied));
|
(long long unsigned int)(lsn_standby_received - lsn_standby_applied));
|
||||||
@@ -1101,7 +1102,7 @@ do_master_failover(void)
|
|||||||
|
|
||||||
/* Copy details of the failed node */
|
/* Copy details of the failed node */
|
||||||
/* XXX only node_id is actually used later */
|
/* XXX only node_id is actually used later */
|
||||||
if(nodes[i].type == MASTER)
|
if (nodes[i].type == MASTER)
|
||||||
{
|
{
|
||||||
failed_master.node_id = nodes[i].node_id;
|
failed_master.node_id = nodes[i].node_id;
|
||||||
failed_master.xlog_location = nodes[i].xlog_location;
|
failed_master.xlog_location = nodes[i].xlog_location;
|
||||||
@@ -1145,8 +1146,8 @@ do_master_failover(void)
|
|||||||
total_nodes, visible_nodes);
|
total_nodes, visible_nodes);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* am i on the group that should keep alive? if i see less than half of
|
* Am I on the group that should keep alive? If I see less than half of
|
||||||
* total_nodes then i should do nothing
|
* total_nodes then I should do nothing
|
||||||
*/
|
*/
|
||||||
if (visible_nodes < (total_nodes / 2.0))
|
if (visible_nodes < (total_nodes / 2.0))
|
||||||
{
|
{
|
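Review note: the comparison against `total_nodes / 2.0` means a node that sees exactly half of the cluster still proceeds. The sketch below isolates that arithmetic; `sees_enough_nodes` is a hypothetical helper name, not a function in repmgr.

```c
#include <stdbool.h>

/*
 * Mirrors the condition in the hunk, inverted: proceed unless fewer
 * than half of the nodes are visible.
 */
static bool
sees_enough_nodes(int visible_nodes, int total_nodes)
{
	return !(visible_nodes < (total_nodes / 2.0));
}
```

With this condition, 1 of 3 visible nodes is rejected, while 1 of 2 and 2 of 4 are accepted, so in an evenly split cluster both halves would pass this particular test.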
||||||
@@ -1207,7 +1208,7 @@ do_master_failover(void)
|
|||||||
|
|
||||||
/* If position is 0/0, error */
|
/* If position is 0/0, error */
|
||||||
/* XXX do we need to terminate ourselves if the queried node has a problem? */
|
/* XXX do we need to terminate ourselves if the queried node has a problem? */
|
||||||
if(xlog_recptr == InvalidXLogRecPtr)
|
if (xlog_recptr == InvalidXLogRecPtr)
|
||||||
{
|
{
|
||||||
log_err(_("InvalidXLogRecPtr detected on standby node %i\n"), nodes[i].node_id);
|
log_err(_("InvalidXLogRecPtr detected on standby node %i\n"), nodes[i].node_id);
|
||||||
terminate(ERR_FAILOVER_FAIL);
|
terminate(ERR_FAILOVER_FAIL);
|
||||||
@@ -1297,12 +1298,12 @@ do_master_failover(void)
|
|||||||
* empty string; otherwise position is 0/0 and we need to continue
|
* empty string; otherwise position is 0/0 and we need to continue
|
||||||
* looping until a valid LSN is reported
|
* looping until a valid LSN is reported
|
||||||
*/
|
*/
|
||||||
if(xlog_recptr == InvalidXLogRecPtr)
|
if (xlog_recptr == InvalidXLogRecPtr)
|
||||||
{
|
{
|
||||||
if(lsn_format_ok == false)
|
if (lsn_format_ok == false)
|
||||||
{
|
{
|
||||||
/* Unable to parse value returned by `repmgr_get_last_standby_location()` */
|
/* Unable to parse value returned by `repmgr_get_last_standby_location()` */
|
||||||
if(*PQgetvalue(res, 0, 0) == '\0')
|
if (*PQgetvalue(res, 0, 0) == '\0')
|
||||||
{
|
{
|
||||||
log_crit(
|
log_crit(
|
||||||
_("unable to obtain LSN from node %i"), nodes[i].node_id
|
_("unable to obtain LSN from node %i"), nodes[i].node_id
|
||||||
@@ -1434,25 +1435,6 @@ do_master_failover(void)
|
|||||||
/* and reconnect to the local database */
|
/* and reconnect to the local database */
|
||||||
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
||||||
|
|
||||||
/* update node information to reflect new status */
|
|
||||||
if(update_node_record_set_master(my_local_conn, node_info.node_id, failed_master.node_id) == false)
|
|
||||||
{
|
|
||||||
appendPQExpBuffer(&event_details,
|
|
||||||
_("unable to update node record for node %i (promoted to master following failure of node %i)"),
|
|
||||||
node_info.node_id,
|
|
||||||
failed_master.node_id);
|
|
||||||
|
|
||||||
log_err("%s\n", event_details.data);
|
|
||||||
|
|
||||||
create_event_record(NULL,
|
|
||||||
&local_options,
|
|
||||||
node_info.node_id,
|
|
||||||
"repmgrd_failover_promote",
|
|
||||||
false,
|
|
||||||
event_details.data);
|
|
||||||
|
|
||||||
terminate(ERR_DB_QUERY);
|
|
||||||
}
|
|
||||||
|
|
||||||
/* update internal record for this node */
|
/* update internal record for this node */
|
||||||
node_info = get_node_info(my_local_conn, local_options.cluster_name, local_options.node);
|
node_info = get_node_info(my_local_conn, local_options.cluster_name, local_options.node);
|
||||||
@@ -1501,9 +1483,9 @@ do_master_failover(void)
|
|||||||
*/
|
*/
|
||||||
new_master_conn = establish_db_connection(best_candidate.conninfo_str, true);
|
new_master_conn = establish_db_connection(best_candidate.conninfo_str, true);
|
||||||
|
|
||||||
if(local_options.use_replication_slots)
|
if (local_options.use_replication_slots)
|
||||||
{
|
{
|
||||||
if(create_replication_slot(new_master_conn, node_info.slot_name) == false)
|
if (create_replication_slot(new_master_conn, node_info.slot_name) == false)
|
||||||
{
|
{
|
||||||
|
|
||||||
appendPQExpBuffer(&event_details,
|
appendPQExpBuffer(&event_details,
|
||||||
@@ -1536,7 +1518,7 @@ do_master_failover(void)
|
|||||||
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
my_local_conn = establish_db_connection(local_options.conninfo, true);
|
||||||
|
|
||||||
/* update node information to reflect new status */
|
/* update node information to reflect new status */
|
||||||
if(update_node_record_set_upstream(new_master_conn, node_info.node_id, best_candidate.node_id) == false)
|
if (update_node_record_set_upstream(new_master_conn, local_options.cluster_name, node_info.node_id, best_candidate.node_id) == false)
|
||||||
{
|
{
|
||||||
appendPQExpBuffer(&event_details,
|
appendPQExpBuffer(&event_details,
|
||||||
_("Unable to update node record for node %i (following new upstream node %i)"),
|
_("Unable to update node record for node %i (following new upstream node %i)"),
|
||||||
@@ -1604,7 +1586,7 @@ do_upstream_standby_failover(t_node_info upstream_node)
|
|||||||
* Verify that we can still talk to the cluster master even though
|
* Verify that we can still talk to the cluster master even though
|
||||||
* node upstream is not available
|
* node upstream is not available
|
||||||
*/
|
*/
|
||||||
if (!check_connection(master_conn, "master"))
|
if (!check_connection(&master_conn, "master", NULL))
|
||||||
{
|
{
|
||||||
log_err(_("do_upstream_standby_failover(): Unable to connect to last known master node\n"));
|
log_err(_("do_upstream_standby_failover(): Unable to connect to last known master node\n"));
|
||||||
return false;
|
return false;
|
||||||
@@ -1628,7 +1610,7 @@ do_upstream_standby_failover(t_node_info upstream_node)
|
|||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
|
||||||
if(PQntuples(res) == 0)
|
if (PQntuples(res) == 0)
|
||||||
{
|
{
|
||||||
log_err(_("no node with id %i found"), upstream_node_id);
|
log_err(_("no node with id %i found"), upstream_node_id);
|
||||||
PQclear(res);
|
PQclear(res);
|
||||||
@@ -1636,7 +1618,7 @@ do_upstream_standby_failover(t_node_info upstream_node)
|
|||||||
}
|
}
|
||||||
|
|
||||||
/* upstream node is inactive */
|
/* upstream node is inactive */
|
||||||
if(strcmp(PQgetvalue(res, 0, 1), "f") == 0)
|
if (strcmp(PQgetvalue(res, 0, 1), "f") == 0)
|
||||||
{
|
{
|
||||||
/*
|
/*
|
||||||
* Upstream node is an inactive master, meaning no there are no direct
|
* Upstream node is an inactive master, meaning no there are no direct
|
||||||
@@ -1646,7 +1628,7 @@ do_upstream_standby_failover(t_node_info upstream_node)
|
|||||||
* provide an option to either try and find the current master and/or
|
* provide an option to either try and find the current master and/or
|
||||||
* a strategy to connect to a different upstream node
|
* a strategy to connect to a different upstream node
|
||||||
*/
|
*/
|
||||||
if(strcmp(PQgetvalue(res, 0, 4), "master") == 0)
|
if (strcmp(PQgetvalue(res, 0, 4), "master") == 0)
|
||||||
{
|
{
|
||||||
log_err(_("unable to find active master node\n"));
|
log_err(_("unable to find active master node\n"));
|
||||||
PQclear(res);
|
PQclear(res);
|
||||||
@@ -1680,7 +1662,7 @@ do_upstream_standby_failover(t_node_info upstream_node)
|
|||||||
terminate(ERR_BAD_CONFIG);
|
terminate(ERR_BAD_CONFIG);
|
||||||
}
|
}
|
||||||
|
|
||||||
if(update_node_record_set_upstream(master_conn, node_info.node_id, upstream_node_id) == false)
|
if (update_node_record_set_upstream(master_conn, local_options.cluster_name, node_info.node_id, upstream_node_id) == false)
|
||||||
{
|
{
|
||||||
terminate(ERR_BAD_CONFIG);
|
terminate(ERR_BAD_CONFIG);
|
||||||
}
|
}
|
||||||
@@ -1693,7 +1675,7 @@ do_upstream_standby_failover(t_node_info upstream_node)
|
|||||||
|
|
||||||
|
|
||||||
static bool
|
static bool
|
||||||
check_connection(PGconn *conn, const char *type)
|
check_connection(PGconn **conn, const char *type, const char *conninfo)
|
||||||
{
|
{
|
||||||
int connection_retries;
|
int connection_retries;
|
||||||
|
|
||||||
@@ -1704,7 +1686,16 @@ check_connection(PGconn *conn, const char *type)
|
|||||||
*/
|
*/
|
||||||
for (connection_retries = 0; connection_retries < local_options.reconnect_attempts; connection_retries++)
|
for (connection_retries = 0; connection_retries < local_options.reconnect_attempts; connection_retries++)
|
||||||
{
|
{
|
||||||
if (!is_pgup(conn, local_options.master_response_timeout))
|
if (*conn == NULL)
|
||||||
|
{
|
||||||
|
if (conninfo == NULL)
|
||||||
|
{
|
||||||
|
log_err("INTERNAL ERROR: *conn == NULL && conninfo == NULL");
|
||||||
|
terminate(ERR_INTERNAL);
|
||||||
|
}
|
||||||
|
*conn = establish_db_connection(conninfo, false);
|
||||||
|
}
|
||||||
|
if (!is_pgup(*conn, local_options.master_response_timeout))
|
||||||
{
|
{
|
||||||
log_warning(_("connection to %s has been lost, trying to recover... %i seconds before failover decision\n"),
|
log_warning(_("connection to %s has been lost, trying to recover... %i seconds before failover decision\n"),
|
||||||
type,
|
type,
|
||||||
@@ -1722,9 +1713,9 @@ check_connection(PGconn *conn, const char *type)
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (!is_pgup(conn, local_options.master_response_timeout))
|
if (!is_pgup(*conn, local_options.master_response_timeout))
|
||||||
{
|
{
|
||||||
log_err(_("unable to reconnect to %s after %i seconds...\n"),
|
log_err(_("unable to reconnect to %s (timeout %i seconds)...\n"),
|
||||||
type,
|
type,
|
||||||
local_options.master_response_timeout
|
local_options.master_response_timeout
|
||||||
);
|
);
|
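Review note: changing the first parameter from `PGconn *` to `PGconn **` is what lets `check_connection()` replace a `NULL` handle in the caller's variable. The sketch below shows the same pointer-to-pointer pattern without libpq. `fake_conn`, `fake_connect` and `ensure_connection` are hypothetical stand-ins, and returning `false` where the diff calls `terminate(ERR_INTERNAL)` is a simplification for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for PGconn. */
typedef struct
{
	bool		up;
} fake_conn;

/* Stand-in for establish_db_connection(); "connects" if conninfo is non-empty. */
static fake_conn *
fake_connect(const char *conninfo)
{
	fake_conn  *c = malloc(sizeof(*c));

	if (c != NULL)
		c->up = (conninfo != NULL && conninfo[0] != '\0');

	return c;
}

/*
 * Because conn is a pointer to the caller's handle, a newly created
 * connection is visible to the caller after the function returns.
 */
static bool
ensure_connection(fake_conn **conn, const char *conninfo)
{
	if (*conn == NULL)
	{
		if (conninfo == NULL)
			return false;		/* the diff treats this as an internal error */

		*conn = fake_connect(conninfo);
	}

	return *conn != NULL && (*conn)->up;
}
```

Callers that pass `NULL` for `conninfo`, as most call sites in this diff do, rely on the handle already being non-`NULL`.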
||||||
@@ -1749,10 +1740,10 @@ set_local_node_failed(void)
|
|||||||
{
|
{
|
||||||
PGresult *res;
|
PGresult *res;
|
||||||
char sqlquery[QUERY_STR_LEN];
|
char sqlquery[QUERY_STR_LEN];
|
||||||
int active_master_node_id = -1;
|
int active_master_node_id = NODE_NOT_FOUND;
|
||||||
char master_conninfo[MAXLEN];
|
char master_conninfo[MAXLEN];
|
||||||
|
|
||||||
if (!check_connection(master_conn, "master"))
|
if (!check_connection(&master_conn, "master", NULL))
|
||||||
{
|
{
|
||||||
log_err(_("set_local_node_failed(): Unable to connect to last known master node\n"));
|
log_err(_("set_local_node_failed(): Unable to connect to last known master node\n"));
|
||||||
return false;
|
return false;
|
||||||
@@ -1780,7 +1771,7 @@ set_local_node_failed(void)
|
|||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
|
||||||
if(!PQntuples(res))
|
if (!PQntuples(res))
|
||||||
{
|
{
|
||||||
log_err(_("no active master record found\n"));
|
log_err(_("no active master record found\n"));
|
||||||
return false;
|
return false;
|
||||||
@@ -1790,14 +1781,14 @@ set_local_node_failed(void)
|
|||||||
strncpy(master_conninfo, PQgetvalue(res, 0, 1), MAXLEN);
|
strncpy(master_conninfo, PQgetvalue(res, 0, 1), MAXLEN);
|
||||||
PQclear(res);
|
PQclear(res);
|
||||||
|
|
||||||
if(active_master_node_id != master_options.node)
|
if (active_master_node_id != master_options.node)
|
||||||
{
|
{
|
||||||
log_notice(_("current active master is %i; attempting to connect\n"),
|
log_notice(_("current active master is %i; attempting to connect\n"),
|
||||||
active_master_node_id);
|
active_master_node_id);
|
||||||
PQfinish(master_conn);
|
PQfinish(master_conn);
|
||||||
master_conn = establish_db_connection(master_conninfo, false);
|
master_conn = establish_db_connection(master_conninfo, false);
|
||||||
|
|
||||||
if(PQstatus(master_conn) != CONNECTION_OK)
|
if (PQstatus(master_conn) != CONNECTION_OK)
|
||||||
{
|
{
|
||||||
log_err(_("unable to connect to active master\n"));
|
log_err(_("unable to connect to active master\n"));
|
||||||
return false;
|
return false;
|
||||||
@@ -1955,13 +1946,13 @@ lsn_to_xlogrecptr(char *lsn, bool *format_ok)
|
|||||||
|
|
||||||
if (sscanf(lsn, "%X/%X", &xlogid, &xrecoff) != 2)
|
if (sscanf(lsn, "%X/%X", &xlogid, &xrecoff) != 2)
|
||||||
{
|
{
|
||||||
if(format_ok != NULL)
|
if (format_ok != NULL)
|
||||||
*format_ok = false;
|
*format_ok = false;
|
||||||
log_err(_("incorrect log location format: %s\n"), lsn);
|
log_err(_("incorrect log location format: %s\n"), lsn);
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
if(format_ok != NULL)
|
if (format_ok != NULL)
|
||||||
*format_ok = true;
|
*format_ok = true;
|
||||||
|
|
||||||
return (((XLogRecPtr) xlogid * 16 * 1024 * 1024 * 255) + xrecoff);
|
return (((XLogRecPtr) xlogid * 16 * 1024 * 1024 * 255) + xrecoff);
|
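Review note: for reference, here is a self-contained sketch of the parsing shown in this hunk, without the logging call. The multiplier `16 * 1024 * 1024 * 255` is copied from the diff as-is; note that it is not the same as shifting `xlogid` left by 32 bits, so the results are only comparable with other values produced by the same function. `parse_lsn` is a hypothetical name, and `uint64_t` stands in for `XLogRecPtr`.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Parse "XXXXXXXX/XXXXXXXX" into a single integer; report format errors via format_ok. */
static uint64_t
parse_lsn(const char *lsn, bool *format_ok)
{
	unsigned int xlogid;
	unsigned int xrecoff;

	if (sscanf(lsn, "%X/%X", &xlogid, &xrecoff) != 2)
	{
		if (format_ok != NULL)
			*format_ok = false;
		return 0;
	}

	if (format_ok != NULL)
		*format_ok = true;

	/* Same arithmetic as the return statement in the hunk. */
	return ((uint64_t) xlogid * 16 * 1024 * 1024 * 255) + xrecoff;
}
```

Because `0` is returned on a parse failure, `format_ok` is the only way for a caller to distinguish a malformed string from a genuine `0/0` position.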
||||||
@@ -1978,17 +1969,21 @@ usage(void)
|
|||||||
void
|
void
|
||||||
help(const char *progname)
|
help(const char *progname)
|
||||||
{
|
{
|
||||||
printf(_("Usage: %s [OPTIONS]\n"), progname);
|
printf(_("%s: replication management daemon for PostgreSQL\n"), progname);
|
||||||
printf(_("Replicator manager daemon for PostgreSQL.\n"));
|
printf(_("\n"));
|
||||||
printf(_("\nOptions:\n"));
|
printf(_("Usage:\n"));
|
||||||
printf(_(" --help show this help, then exit\n"));
|
printf(_(" %s [OPTIONS]\n"), progname);
|
||||||
printf(_(" --version output version information, then exit\n"));
|
printf(_("\n"));
|
||||||
|
printf(_("Options:\n"));
|
||||||
|
printf(_(" -?, --help show this help, then exit\n"));
|
||||||
|
printf(_(" -V, --version output version information, then exit\n"));
|
||||||
printf(_(" -v, --verbose output verbose activity information\n"));
|
printf(_(" -v, --verbose output verbose activity information\n"));
|
||||||
printf(_(" -m, --monitoring-history track advance or lag of the replication in every standby in repl_monitor\n"));
|
printf(_(" -m, --monitoring-history track advance or lag of the replication in every standby in repl_monitor\n"));
|
||||||
printf(_(" -f, --config-file=PATH path to the configuration file\n"));
|
printf(_(" -f, --config-file=PATH path to the configuration file\n"));
|
||||||
printf(_(" -d, --daemonize detach process from foreground\n"));
|
printf(_(" -d, --daemonize detach process from foreground\n"));
|
||||||
printf(_(" -p, --pid-file=PATH write a PID file\n"));
|
printf(_(" -p, --pid-file=PATH write a PID file\n"));
|
||||||
printf(_("\n%s monitors a cluster of servers.\n"), progname);
|
printf(_("\n"));
|
||||||
|
printf(_("%s monitors a cluster of servers and optionally performs failover.\n"), progname);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
@@ -2089,7 +2084,7 @@ update_registration(void)
|
|||||||
|
|
||||||
log_err("%s\n", errmsg.data);
|
log_err("%s\n", errmsg.data);
|
||||||
|
|
||||||
create_event_record(my_local_conn,
|
create_event_record(master_conn,
|
||||||
&local_options,
|
&local_options,
|
||||||
local_options.node,
|
local_options.node,
|
||||||
"repmgrd_shutdown",
|
"repmgrd_shutdown",
|
||||||
@@ -2231,23 +2226,12 @@ check_and_create_pid_file(const char *pid_file)
|
|||||||
t_node_info
|
t_node_info
|
||||||
get_node_info(PGconn *conn, char *cluster, int node_id)
|
get_node_info(PGconn *conn, char *cluster, int node_id)
|
||||||
{
|
{
|
||||||
char sqlquery[QUERY_STR_LEN];
|
|
||||||
PGresult *res;
|
PGresult *res;
|
||||||
|
|
||||||
t_node_info node_info = {-1, NO_UPSTREAM_NODE, "", InvalidXLogRecPtr, UNKNOWN, false, false};
|
t_node_info node_info = { NODE_NOT_FOUND, NO_UPSTREAM_NODE, "", InvalidXLogRecPtr, UNKNOWN, false, false};
|
||||||
|
|
||||||
sprintf(sqlquery,
|
res = get_node_record(conn, cluster, node_id);
|
||||||
"SELECT id, upstream_node_id, conninfo, type, slot_name, active "
|
|
||||||
" FROM %s.repl_nodes "
|
|
||||||
" WHERE cluster = '%s' "
|
|
||||||
" AND id = %i",
|
|
||||||
get_repmgr_schema_quoted(conn),
|
|
||||||
local_options.cluster_name,
|
|
||||||
node_id);
|
|
||||||
|
|
||||||
log_debug("get_node_info(): %s\n", sqlquery);
|
|
||||||
|
|
||||||
res = PQexec(my_local_conn, sqlquery);
|
|
||||||
if (PQresultStatus(res) != PGRES_TUPLES_OK)
|
if (PQresultStatus(res) != PGRES_TUPLES_OK)
|
||||||
{
|
{
|
||||||
PQExpBufferData errmsg;
|
PQExpBufferData errmsg;
|
||||||
@@ -2260,7 +2244,7 @@ get_node_info(PGconn *conn, char *cluster, int node_id)
|
|||||||
|
|
||||||
log_err("%s\n", errmsg.data);
|
log_err("%s\n", errmsg.data);
|
||||||
|
|
||||||
create_event_record(my_local_conn,
|
create_event_record(NULL,
|
||||||
&local_options,
|
&local_options,
|
||||||
local_options.node,
|
local_options.node,
|
||||||
"repmgrd_shutdown",
|
"repmgrd_shutdown",
|
||||||
@@ -2274,7 +2258,7 @@ get_node_info(PGconn *conn, char *cluster, int node_id)
|
|||||||
if (!PQntuples(res)) {
|
if (!PQntuples(res)) {
|
||||||
log_warning(_("No record found record for node %i\n"), node_id);
|
log_warning(_("No record found record for node %i\n"), node_id);
|
||||||
PQclear(res);
|
PQclear(res);
|
||||||
node_info.node_id = -1;
|
node_info.node_id = NODE_NOT_FOUND;
|
||||||
return node_info;
|
return node_info;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -2296,139 +2280,18 @@ get_node_info(PGconn *conn, char *cluster, int node_id)
|
|||||||
static t_server_type
|
static t_server_type
|
||||||
parse_node_type(const char *type)
|
parse_node_type(const char *type)
|
||||||
{
|
{
|
||||||
if(strcmp(type, "master") == 0)
|
if (strcmp(type, "master") == 0)
|
||||||
{
|
{
|
||||||
return MASTER;
|
return MASTER;
|
||||||
}
|
}
|
||||||
else if(strcmp(type, "standby") == 0)
|
else if (strcmp(type, "standby") == 0)
|
||||||
{
|
{
|
||||||
return STANDBY;
|
return STANDBY;
|
||||||
}
|
}
|
||||||
else if(strcmp(type, "witness") == 0)
|
else if (strcmp(type, "witness") == 0)
|
||||||
{
|
{
|
||||||
return WITNESS;
|
return WITNESS;
|
||||||
}
|
}
|
||||||
|
|
||||||
return UNKNOWN;
|
return UNKNOWN;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
static bool
|
|
||||||
update_node_record_set_master(PGconn *conn, int this_node_id, int old_master_node_id)
|
|
||||||
{
|
|
||||||
PGresult *res;
|
|
||||||
char sqlquery[QUERY_STR_LEN];
|
|
||||||
|
|
||||||
log_debug(_("Setting failed node %i inactive; marking node %i as master\n"), old_master_node_id, this_node_id);
|
|
||||||
|
|
||||||
res = PQexec(conn, "BEGIN");
|
|
||||||
|
|
||||||
if (PQresultStatus(res) != PGRES_COMMAND_OK)
|
|
||||||
{
|
|
||||||
log_err(_("Unable to begin transaction: %s\n"),
|
|
||||||
PQerrorMessage(conn));
|
|
||||||
|
|
||||||
PQclear(res);
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
PQclear(res);
|
|
||||||
|
|
||||||
sqlquery_snprintf(sqlquery,
|
|
||||||
" UPDATE %s.repl_nodes "
|
|
||||||
" SET active = FALSE "
|
|
||||||
" WHERE cluster = '%s' "
|
|
||||||
" AND id = %i ",
|
|
||||||
get_repmgr_schema_quoted(conn),
|
|
||||||
local_options.cluster_name,
|
|
||||||
old_master_node_id);
|
|
||||||
|
|
||||||
res = PQexec(conn, sqlquery);
|
|
||||||
|
|
||||||
if (PQresultStatus(res) != PGRES_COMMAND_OK)
|
|
||||||
{
|
|
||||||
log_err(_("Unable to set old master node %i as inactive: %s\n"),
|
|
||||||
old_master_node_id,
|
|
||||||
PQerrorMessage(conn));
|
|
||||||
PQclear(res);
|
|
||||||
|
|
||||||
PQexec(conn, "ROLLBACK");
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
PQclear(res);
|
|
||||||
|
|
||||||
sqlquery_snprintf(sqlquery,
|
|
||||||
" UPDATE %s.repl_nodes "
|
|
||||||
" SET type = 'master', "
|
|
||||||
" upstream_node_id = NULL "
|
|
||||||
" WHERE cluster = '%s' "
|
|
||||||
" AND id = %i ",
|
|
||||||
get_repmgr_schema_quoted(conn),
|
|
||||||
local_options.cluster_name,
|
|
||||||
this_node_id);
|
|
||||||
|
|
||||||
res = PQexec(conn, sqlquery);
|
|
||||||
|
|
||||||
if (PQresultStatus(res) != PGRES_COMMAND_OK)
|
|
||||||
{
|
|
||||||
log_err(_("Unable to set current node %i as active master: %s\n"),
|
|
||||||
this_node_id,
|
|
||||||
PQerrorMessage(conn));
|
|
||||||
PQclear(res);
|
|
||||||
|
|
||||||
PQexec(conn, "ROLLBACK");
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
PQclear(res);
|
|
||||||
|
|
||||||
res = PQexec(conn, "COMMIT");
|
|
||||||
|
|
||||||
if (PQresultStatus(res) != PGRES_COMMAND_OK)
|
|
||||||
{
|
|
||||||
log_err(_("Unable to set commit transaction: %s\n"),
|
|
||||||
PQerrorMessage(conn));
|
|
||||||
PQclear(res);
|
|
||||||
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
PQclear(res);
|
|
||||||
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
static bool
|
|
||||||
update_node_record_set_upstream(PGconn *conn, int this_node_id, int new_upstream_node_id)
|
|
||||||
{
|
|
||||||
PGresult *res;
|
|
||||||
char sqlquery[QUERY_STR_LEN];
|
|
||||||
|
|
||||||
log_debug(_("update_node_record_set_upstream(): Updating node %i's upstream node to %i\n"), this_node_id, new_upstream_node_id);
|
|
||||||
|
|
||||||
sqlquery_snprintf(sqlquery,
|
|
||||||
" UPDATE %s.repl_nodes "
|
|
||||||
" SET upstream_node_id = %i "
|
|
||||||
" WHERE cluster = '%s' "
|
|
||||||
" AND id = %i ",
|
|
||||||
get_repmgr_schema_quoted(conn),
|
|
||||||
new_upstream_node_id,
|
|
||||||
local_options.cluster_name,
|
|
||||||
this_node_id);
|
|
||||||
res = PQexec(conn, sqlquery);
|
|
||||||
|
|
||||||
if (PQresultStatus(res) != PGRES_COMMAND_OK)
|
|
||||||
{
|
|
||||||
log_err(_("Unable to set new upstream node id: %s\n"),
|
|
||||||
PQerrorMessage(conn));
|
|
||||||
PQclear(res);
|
|
||||||
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
PQclear(res);
|
|
||||||
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
|
|||||||