Compare commits

42 Commits

Author SHA1 Message Date
Jaime Casanova
efd50f11ac Fix HISTORY to show from newest to oldest 2012-07-27 11:30:35 -05:00
Jaime Casanova
45a39084ed Prepare release notes for release 2012-07-21 11:56:00 -05:00
Jaime Casanova
94c73a016f Add a release note that was missing 2012-07-05 09:36:42 -05:00
Jaime Casanova
be5cbe4ddd Improve the version message to actually show the repmgr version not
only postgresql's one
2012-06-25 22:53:16 -05:00
Jaime Casanova
30d35d5b4c Names on history file are without surnames when they are well-known
so, keep it that way
2012-06-13 00:48:34 -05:00
Jaime Casanova
fa889a11ac Remove now finished TODO item about having a sanity check for ssh 2012-06-13 00:43:00 -05:00
Jaime Casanova
f4087d0a32 Add a \n in a message 2012-06-12 23:41:31 -05:00
Jaime Casanova
a55d7a4bd3 getMasterConnection() cannot avoid checking the same node that asks
to find the master.
This was a micro optimization based on the fact that all commands
that needed to detect the master were executed from the standby
but now that we have CLUSTER level commands that is not true anymore
2012-06-12 23:23:49 -05:00
Jaime Casanova
5d8cf6abe0 Allow repmgr to obtain tablespace's locations from pg 9.2 and later
in which we no longer have a spclocation column in pg_tablespaces
2012-06-12 10:49:23 -05:00
Jaime Casanova
9caa243354 Moving the 'Starting backup' message to a better place 2012-06-12 09:44:06 -05:00
Jaime Casanova
6880483947 STANDBY CLONE should be run by a SUPERUSER, otherwise we won't be able
to retrieve data_directory and the other parameters we need by
querying the database.
2012-06-12 09:37:03 -05:00
Jaime Casanova
3d89fdadab Fix a typo in a message 2012-06-12 09:28:27 -05:00
Cédric Villemain
6e9e4e05ae Add test_ssh_connection
The feature was written by Jaime and reworked by me to fix
https://github.com/greg2ndQuadrant/repmgr/issues/5
2012-05-09 15:24:38 -05:00
Jaime Casanova
17a160e970 Correct credits 2012-05-09 14:59:00 -05:00
Jaime Casanova
e0e01aa9db Add Carlo to CREDITS 2012-04-27 02:09:48 -05:00
Jaime Casanova
b09eff9f76 Avoid the possibility of a double free. Fix by Carlo Ascani 2012-04-27 02:08:40 -05:00
Jaime Casanova
3c5d82b9ef Complete CREDITS and HISTORY for release 2012-04-27 02:07:42 -05:00
Jaime Casanova
257dbc4f42 Cleanup of patch that introduces write_primary_conninfo() 2012-04-27 01:23:37 -05:00
Jaime Casanova
2a64099163 Added function "write_primary_conninfo" which now adds the username to the primary_conninfo parameter in recovery.conf 2012-04-27 01:10:24 -05:00
Jaime Casanova
41c05bea7b Fix CLUSTER SHOW, that i have broken 2012-04-26 13:58:39 -05:00
Jaime Casanova
7d76d86e19 Add debug information to CLUSTER SHOW y CLUSTER CLEANUP 2012-04-26 13:31:54 -05:00
Jaime Casanova
36d5b5bc24 I need a local connection to get the master of the cluster 2012-04-26 12:29:04 -05:00
Jaime Casanova
c543402d65 A typo that escaped to my previous review 2012-04-26 12:23:24 -05:00
Jaime Casanova
d0959b953e Cleanup patch about CLUSTER CLEANUP 2012-04-26 12:21:41 -05:00
Jaime Casanova
0660bded0b Fix a switch in which a "break" was missing that makes always that --force option was used end up in the default section and error. 2012-04-19 12:21:49 -05:00
Jaime Casanova
209a0c64d2 Add documentation about both CLUSTER SHOW and CLUSTER CLEANUP commands 2012-04-14 21:56:58 -05:00
Jaime Casanova
fd76ec6283 Adds a CLUSTER CLEANUP command to clean monitor's history,
also include a --keep-history (-k) option to indicate how many
days of history to keep
2012-04-14 21:34:06 -05:00
Jaime Casanova
7d579cf71f Add CLUSTER SHOW command to show the current nodes configured 2012-04-14 21:11:51 -05:00
Jaime Casanova
d790ef740b Add a paragraph in the docs describing how to clean history 2012-04-11 10:54:22 -05:00
Jaime Casanova
aa6633b027 Complete the lists of error codes that repmgr can return in the README.rst 2012-04-11 10:38:22 -05:00
Jaime Casanova
c3bffce379 Run astyle to format code before tagging the release 2012-04-11 10:35:37 -05:00
Jaime Casanova
78aea00a6d Avoid to show what segments are needed for this backup if the rsync failed 2012-04-11 10:34:38 -05:00
Jaime Casanova
91601204b5 Remove last argument from log_err, left in commit 9b8fb7e960.
Also rephrase the sentence

Reported by Jeroen Dekkers
2011-11-28 17:26:19 -05:00
Jaime Casanova
c91ddc2f5e Fix a wrong message.
It was saying the problem is the version of the PostgreSQL server while
it actually is because the MASTER REGISTER command was running on a
standby node
2011-11-10 09:30:42 -05:00
Jaime Casanova
72f74dd7a7 Fix a typo introduced in commit 94c9c3a5c6 2011-11-03 12:54:55 -05:00
Jaime Casanova
901d07fa92 Improve performance of the repl_status view 2011-10-20 23:20:03 -05:00
Greg Smith
f0e609bcd4 Add strnlen on platforms that don't have it, such as OS X 2011-10-20 17:04:29 -05:00
Jaime Casanova
94c9c3a5c6 Let the clone happen in a session with synchronous_commit off. This
is because in pg 9.1 the default configuration can easily allow sync
rep to be activated even if no standby is present and will block
pg_start_backup() and pg_stop_backup() in that case.

Also remove a second connection we were opening to execute
pg_stop_backup(), i'm not sure why that was there but now it was
a problem because it was another session and not the one we set here.
2011-10-03 13:56:31 -05:00
Cédric Villemain
3af5243bcc Fix rsync return code test 2011-08-24 09:14:22 -05:00
Cédric Villemain
85bbae462a Add --ignore-rsync-warning to README 2011-08-22 00:34:01 -05:00
Cédric Villemain
14e49d41c2 Add --ignore-rsync-warning command line option
This fix the rsync return code in case there are vanished files.

Common situation are DROPed tables and TEMPorary object deletion and
are handled by PostgreSQL.
But as it may exist situation where an external process delete files in
the PGDATA the flag is off by default.

XXX 2 items :

 * is -I a good choice ? maybe we need to prevent future --ignore-foo and
   add something like : --ignore=rsync_warning -I rsync_warning
 * the warning message is not enough explicit with the risk involved by
   --force usage
2011-08-22 00:32:40 -05:00
Cédric Villemain
1bd8a703c8 Fix getopt for ignore-rsync-warning
The change was loosed during merge and not checked in master/
2011-06-06 20:56:45 -04:00
38 changed files with 1439 additions and 4907 deletions
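
Commit 5d8cf6abe0 above notes that PostgreSQL 9.2 and later no longer expose a `spclocation` column, so tablespace locations have to be obtained differently depending on the server version. The following is a minimal sketch of that version switch, not the code from the commit: the function name is hypothetical, and it assumes the caller already has the server's `server_version_num` value (for example `90200` for 9.2.0).

```python
def tablespace_location_query(server_version_num: int) -> str:
    """Return a query listing tablespace names and their on-disk locations.

    Hypothetical helper for illustration only; repmgr itself is written in C.
    """
    if server_version_num >= 90200:
        # 9.2 and later: the catalog column is gone, so ask the server
        # through the pg_tablespace_location() function instead.
        return (
            "SELECT spcname, pg_tablespace_location(oid) AS location "
            "FROM pg_tablespace"
        )
    # 9.0 and 9.1: the location is stored directly in the catalog.
    return "SELECT spcname, spclocation AS location FROM pg_tablespace"


query_91 = tablespace_location_query(90104)
query_92 = tablespace_location_query(90200)
```

Branching on the numeric version keeps the comparison simple and avoids parsing version strings such as "9.2beta1".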

3
.gitignore vendored

@@ -1,9 +1,6 @@
*~ *~
*.o *.o
*.so
repmgr repmgr
repmgrd repmgrd
README.htm* README.htm*
README.pdf README.pdf
sql/repmgr_funcs.so
sql/repmgr_funcs.sql

@@ -1,4 +1,4 @@
Copyright (c) 2010-2014, 2ndQuadrant Limited Copyright (c) 2010-2011, 2ndQuadrant Limited
All rights reserved. All rights reserved.
This program is free software: you can redistribute it and/or modify This program is free software: you can redistribute it and/or modify

@@ -10,7 +10,5 @@ Hannu Krosing <hannu@2ndQuadrant.com>
Cédric Villemain <cedric@2ndquadrant.com> Cédric Villemain <cedric@2ndquadrant.com>
Charles Duffy <charles@dyfis.net> Charles Duffy <charles@dyfis.net>
Daniel Farina <daniel@heroku.com> Daniel Farina <daniel@heroku.com>
Shawn Ellis <shawn.ellis17@gmail.com> Marco Nenciarini <marco.nenciarini@2ndquadrant.it>
Jay Taylor <jay@jaytaylor.com> Carlo Ascani <carlo.ascani@2ndquadrant.it>
Christian Kruse <christian@2ndQuadrant.com>
Krzysztof Gajdemski <songo@debian.org.pl>

37
HISTORY

@@ -1,38 +1,3 @@
2.0.1 2014-07-16
Documentation fixes and new QUICKSTART file (Ian)
Explicitly specify directories to ignore when cloning (Ian)
Fix log level for some log messages (Ian)
RHEL/CentOS specfile, init script and Makefile fixes (Nathan Van Overloop)
Debian init script and config file documentation fixes (József Kószó)
Typo fixes (Riegie Godwin Jeyaranchen, PriceChild)
2.0stable 2014-01-30
Documentation fixes (Christian)
General refactoring, code quality improvements and stabilization work (Christian)
Added proper daemonizing (-d/--daemonize) (Christian)
Added PID file handling (-p/--pid-file) (Christian)
New config option: monitor_interval_secs (Christian)
New config option: retry_promote_interval (Christian)
New config option: logfile (Christian)
New config option: pg_bindir (Christian)
New config option: pgctl_options (Christian)
2.0beta2 2013-12-19
Improve autofailover logic and algorithms (Jaime, Andres)
Ignore pg_log when cloning (Jaime)
Add timestamps to log line in stderr (Christian)
Correctly check wal_keep_segments (Jay Taylor)
Add a ssh_options parameter (Jay Taylor)
2.0beta1 2012-07-27
Make CLONE command try to make an exact copy including $PGDATA location (Cedric)
Add detection of master failure (Jaime)
Add the notion of a witness server (Jaime)
Add autofailover capabilities (Jaime)
Add a configuration parameter to indicate the script to execute on failover or follow (Jaime)
Make the monitoring optional and turned off by default, it can be turned on with --monitoring-history switch (Jaime)
Add tunables to specify number of retries to reconnect to master and the time between them (Jaime)
1.2.0 2012-07-27 1.2.0 2012-07-27
Test ssh connection before trying to rsync (Cédric) Test ssh connection before trying to rsync (Cédric)
Add CLUSTER SHOW command (Carlo) Add CLUSTER SHOW command (Carlo)
@@ -44,7 +9,7 @@
1.1.1 2012-04-18 1.1.1 2012-04-18
Add --ignore-rsync-warning (Cédric) Add --ignore-rsync-warning (Cédric)
Add strnlen for compatibility with OS X (Greg) Add strnlen for compatibility with OS X (Greg)
Improve performance of the repl_status view (Jaime) Improve performance of repl_status view (Jaime)
Remove last argument from log_err (Jaime, Reported by Jeroen Dekkers) Remove last argument from log_err (Jaime, Reported by Jeroen Dekkers)
Complete documentation about possible error conditions (Jaime) Complete documentation about possible error conditions (Jaime)
Document how to clean history (Jaime) Document how to clean history (Jaime)

@@ -1,6 +1,6 @@
# #
# Makefile # Makefile
# Copyright (c) 2ndQuadrant, 2010-2014 # Copyright (c) 2ndQuadrant, 2010-2011
repmgrd_OBJS = dbutils.o config.o repmgrd.o log.o strutil.o repmgrd_OBJS = dbutils.o config.o repmgrd.o log.o strutil.o
repmgr_OBJS = dbutils.o check_dir.o config.o repmgr.o log.o strutil.o repmgr_OBJS = dbutils.o check_dir.o config.o repmgr.o log.o strutil.o
@@ -11,18 +11,15 @@ PG_CPPFLAGS = -I$(libpq_srcdir)
PG_LIBS = $(libpq_pgport) PG_LIBS = $(libpq_pgport)
all: repmgrd repmgr all: repmgrd repmgr
$(MAKE) -C sql
repmgrd: $(repmgrd_OBJS) repmgrd: $(repmgrd_OBJS)
$(CC) $(CFLAGS) $(repmgrd_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o repmgrd $(CC) $(CFLAGS) $(repmgrd_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o repmgrd
$(MAKE) -C sql
repmgr: $(repmgr_OBJS) repmgr: $(repmgr_OBJS)
$(CC) $(CFLAGS) $(repmgr_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o repmgr $(CC) $(CFLAGS) $(repmgr_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o repmgr
ifdef USE_PGXS ifdef USE_PGXS
PG_CONFIG = pg_config PGXS := $(shell pg_config --pgxs)
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS) include $(PGXS)
else else
subdir = contrib/repmgr subdir = contrib/repmgr
@@ -33,26 +30,9 @@ endif
# XXX: Try to use PROGRAM construct (see pgxs.mk) someday. Right now # XXX: Try to use PROGRAM construct (see pgxs.mk) someday. Right now
# is overriding pgxs install. # is overriding pgxs install.
install: install_prog install_ext install:
$(INSTALL_PROGRAM) repmgrd$(X) '$(DESTDIR)$(bindir)'
install_prog: $(INSTALL_PROGRAM) repmgr$(X) '$(DESTDIR)$(bindir)'
mkdir -p '$(DESTDIR)$(bindir)'
$(INSTALL_PROGRAM) repmgrd$(X) '$(DESTDIR)$(bindir)/'
$(INSTALL_PROGRAM) repmgr$(X) '$(DESTDIR)$(bindir)/'
install_ext:
$(MAKE) -C sql install
install_rhel:
mkdir -p '$(DESTDIR)/etc/init.d/'
$(INSTALL_PROGRAM) RHEL/repmgrd.init '$(DESTDIR)/etc/init.d/repmgrd'
mkdir -p '$(DESTDIR)/etc/sysconfig/'
$(INSTALL_PROGRAM) RHEL/repmgrd.sysconfig '$(DESTDIR)/etc/sysconfig/repmgrd'
mkdir -p '$(DESTDIR)/etc/repmgr/'
$(INSTALL_PROGRAM) repmgr.conf.sample '$(DESTDIR)/etc/repmgr/'
mkdir -p '$(DESTDIR)/usr/bin/'
$(INSTALL_PROGRAM) repmgrd$(X) '$(DESTDIR)/usr/bin/'
$(INSTALL_PROGRAM) repmgr$(X) '$(DESTDIR)/usr/bin/'
ifneq (,$(DATA)$(DATA_built)) ifneq (,$(DATA)$(DATA_built))
@for file in $(addprefix $(srcdir)/, $(DATA)) $(DATA_built); do \ @for file in $(addprefix $(srcdir)/, $(DATA)) $(DATA_built); do \
@@ -65,18 +45,10 @@ clean:
rm -f *.o rm -f *.o
rm -f repmgrd rm -f repmgrd
rm -f repmgr rm -f repmgr
$(MAKE) -C sql clean
deb: repmgrd repmgr deb: repmgrd repmgr
mkdir -p ./debian/usr/bin mkdir -p ./debian/usr/bin
cp repmgrd repmgr ./debian/usr/bin/ cp repmgrd repmgr ./debian/usr/bin/
mkdir -p ./debian/usr/share/postgresql/9.0/contrib/
cp sql/repmgr_funcs.sql ./debian/usr/share/postgresql/9.0/contrib/
cp sql/uninstall_repmgr_funcs.sql ./debian/usr/share/postgresql/9.0/contrib/
mkdir -p ./debian/usr/lib/postgresql/9.0/lib/
cp sql/repmgr_funcs.so ./debian/usr/lib/postgresql/9.0/lib/
dpkg-deb --build debian dpkg-deb --build debian
mv debian.deb ../postgresql-repmgr-9.0_1.0.0.deb mv debian.deb ../postgresql-repmgr-9.0_1.0.0.deb
rm -rf ./debian/usr

@@ -1,286 +0,0 @@
repmgr: Quickstart guide
========================
repmgr is an open-source tool suite for mananaging replication and failover
among multiple PostgreSQL server nodes. It enhances PostgreSQL's built-in
hot-standby capabilities with a set of administration tools for monitoring
replication, setting up standby servers and performing failover/switchover
operations.
This quickstart guide assumes you are familiar with PostgreSQL replication
setup and Linux/UNIX system administration. For a more detailed tutorial
covering setup on a variety of different systems, see the README.rst file.
Conceptual Overview
-------------------
repmgr provides two binaries:
- `repmgr`: a command-line client to manage replication and repmgr configuration
- `repmgrd`: an optional daemon process which runs on standby nodes to monitor
replication and node status
Each PostgreSQL node requires a repmgr configuration file; additionally
it must be "registered" using the repmgr command-line client. repmgr stores
information about managed nodes in a custom schema on the node's current master
database.
Requirements
------------
repmgr works with PostgreSQL 9.0 and later. All server nodes must be running the
same PostgreSQL major version, and preferably should be running the same minor
version.
repmgr will work on any Linux or UNIX-like environment capable of running
PostgreSQL. `rsync` must also be installed.
Installation
------------
repmgr must be installed on each PostgreSQL server node.
* Packages
- RPM packages for RedHat-based distributions are available from PGDG
- Debian/Ubuntu provide .deb packages.
It is also possible to build .deb packages directly from the repmgr source;
see README.rst for further details.
* Source installation
- repmgr source code is hosted at github (https://github.com/2ndQuadrant/repmgr);
tar.gz files can be downloaded from https://github.com/2ndQuadrant/repmgr/releases .
repmgr can be built easily using PGXS:
sudo make USE_PGXS=1 install
Configuration
-------------
### Server configuration
Password-less SSH logins must be enabled for the database system user (typically `postgres`)
between all server nodes to enable repmgr to copy required files.
### PostgreSQL configuration
The master PostgreSQL node needs to be configured for replication with the
following settings:
wal_level = 'hot_standby' # minimal, archive, hot_standby, or logical
archive_mode = on # allows archiving to be done
archive_command = 'cd .' # command to use to archive a logfile segment
max_wal_senders = 10 # max number of walsender processes
wal_keep_segments = 5000 # in logfile segments, 16MB each; 0 disables
hot_standby = on # "on" allows queries during recovery
Note that repmgr expects a default of 5000 wal_keep_segments, although this
value can be overridden when executing the `repmgr` client.
Additionally, repmgr requires a dedicated PostgreSQL superuser account
and a database in which to store monitoring and replication data. The
database can in principle be any database, including the default postgres
one, however it's probably advisable to create a dedicated repmgr database.
### repmgr configuration
Each PostgreSQL node requires a repmgr configuration file containing
identification and database connection information:
cluster=test
node=1
node_name=node1
conninfo='host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
pg_bindir=/path/to/postgres/bin
* `cluster`: common name for the replication cluster; this must be the same on all nodes
* `node`: a unique, abitrary integer identifier
* `name`: a unique, human-readable name
* `conninfo`: a standard conninfo string enabling repmgr to connect to the
control database; user and name must be the same on all nodes, while other
parameters such as port may differ. The `host` parameter *must* be a hostname
resolvable by all nodes on the cluster.
* `pg_bindir`: (optional) location of PostgreSQL binaries, if not in the default $PATH
Note that the configuration file should *not* be stored inside the PostgreSQL
data directory.
Each node configuration needs to be registered with repmgr, either using the
`repmgr` command line tool, or the `repmgrd` daemon; for details see below. Details
about each node are inserted into the repmgr database (for details see below).
Replication setup and monitoring
--------------------------------
For the purposes of this guide, we'll assume the database user will be
`repmgr_usr` and the database will be `repmgr_db`, and that the following
environment variables are set on each node:
- $HOME: the PostgreSQL system user's home directory
- $PGDATA: the PostgreSQL data directory
Master setup
------------
1. Configure PostgreSQL
- create user and database:
```
CREATE ROLE repmgr_usr LOGIN SUPERUSER;
CREATE DATABASE repmgr_db OWNER repmgr_usr;
```
- configure postgresql.conf for replication (see above)
- update pg_hba.conf:
```
host repmgr_usr repmgr_db 192.168.1.0/24 trust
host replication all 192.168.1.0/24 trust
```
Restart the PostgreSQL server after making these changes.
2. Create the repmgr configuration file:
$ cat $HOME/repmgr/repmgr.conf
cluster=test
node=1
node_name=node1
conninfo='host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
pg_bindir=/path/to/postgres/bin
3. Register the master node with repmgr:
$ repmgr -f $HOME/repmgr/repmgr.conf --verbose master register
[2014-07-04 10:43:42] [INFO] repmgr mgr connecting to master database
[2014-07-04 10:43:42] [INFO] repmgr connected to master, checking its state
[2014-07-04 10:43:42] [INFO] master register: creating database objects inside the repmgr_test schema
[2014-07-04 10:43:43] [NOTICE] Master node correctly registered for cluster test with id 1 (conninfo: host=localhost user=repmgr_usr dbname=repmgr_db)
Slave/standby setup
-------------------
1. Use repmgr to clone the master:
$ repmgr -f $HOME/repmgr/repmgr.conf -D $PGDATA -d repmgr_db -U repmgr_usr -R postgres --verbose standby clone 192.168.1.2
Opening configuration file: ./repmgr.conf
[2014-07-04 10:49:00] [ERROR] Did not find the configuration file './repmgr.conf', continuing
[2014-07-04 10:49:00] [INFO] repmgr connecting to master database
[2014-07-04 10:49:00] [INFO] repmgr connected to master, checking its state
[2014-07-04 10:49:00] [INFO] Successfully connected to primary. Current installation size is 1807 MB
[2014-07-04 10:49:00] [NOTICE] Starting backup...
[2014-07-04 10:49:00] [INFO] creating directory "/path/to/data/"...
(...)
[2014-07-04 10:53:19] [NOTICE] Finishing backup...
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
[2014-07-04 10:53:21] [INFO] repmgr requires primary to keep WAL files 0000000100000000000000AD until at least 0000000100000000000000AD
[2014-07-04 10:53:21] [NOTICE] repmgr standby clone complete
[2014-07-04 10:53:21] [NOTICE] HINT: You can now start your postgresql server
[2014-07-04 10:53:21] [NOTICE] for example : /etc/init.d/postgresql start
-R is the database system user on the master node. At this point it does not matter
if the `repmgr.conf` file is not found.
This will clone the PostgreSQL database files from the master, and additionally
create an appropriate `recovery.conf` file.
2. Start the PostgreSQL server
3. Create the repmgr configuration file:
$ cat $HOME/repmgr/repmgr.conf
cluster=test
node=2
node_name=node2
conninfo='host=repmgr_node2 user=repmgr_usr dbname=repmgr_db'
pg_bindir=/path/to/postgres/bin
4. Register the master node with repmgr:
$ repmgr -f $HOME/repmgr/repmgr.conf --verbose standby register
Opening configuration file: /path/to/repmgr/repmgr.conf
[2014-07-04 11:48:13] [INFO] repmgr connecting to standby database
[2014-07-04 11:48:13] [INFO] repmgr connected to standby, checking its state
[2014-07-04 11:48:13] [INFO] repmgr connecting to master database
[2014-07-04 11:48:13] [INFO] finding node list for cluster 'test'
[2014-07-04 11:48:13] [INFO] checking role of cluster node 'host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
[2014-07-04 11:48:13] [INFO] repmgr connected to master, checking its state
[2014-07-04 11:48:13] [INFO] repmgr registering the standby
[2014-07-04 11:48:13] [INFO] repmgr registering the standby complete
[2014-07-04 11:48:13] [NOTICE] Standby node correctly registered for cluster test with id 2 (conninfo: host=localhost user=repmgr_usr dbname=repmgr_db)
Monitoring
----------
`repmgrd` is a management and monitoring daemon which runs on standby nodes
and which and can automate remote actions. It can be started simply with e.g.:
repmgrd -f $HOME/repmgr/repmgr.conf --verbose > $HOME/repmgr/repmgr.log 2>&1
or alternatively:
repmgrd -f $HOME/repmgr/repmgr.conf --verbose --monitoring-history > $HOME/repmgr/repmgrd.log 2>&1
which will track advance or lag of the replication in every standby in the
`repl_monitor` table.
Example log output:
[2014-07-04 11:55:17] [INFO] repmgrd Connecting to database 'host=localhost user=repmgr_usr dbname=repmgr_db'
[2014-07-04 11:55:17] [INFO] repmgrd Connected to database, checking its state
[2014-07-04 11:55:17] [INFO] repmgrd Connecting to primary for cluster 'test'
[2014-07-04 11:55:17] [INFO] finding node list for cluster 'test'
[2014-07-04 11:55:17] [INFO] checking role of cluster node 'host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
[2014-07-04 11:55:17] [INFO] repmgrd Checking cluster configuration with schema 'repmgr_test'
[2014-07-04 11:55:17] [INFO] repmgrd Checking node 2 in cluster 'test'
[2014-07-04 11:55:17] [INFO] Reloading configuration file and updating repmgr tables
[2014-07-04 11:55:17] [INFO] repmgrd Starting continuous standby node monitoring
Failover
--------
To promote a standby to master, on the standby execute e.g.:
repmgr -f $HOME/repmgr/repmgr.conf --verbose standby promote
repmgr will attempt to connect to the current master to verify that it
is not available (if it is, repmgr will not promote the standby).
Other standby servers need to be told to follow the new master with:
repmgr -f $HOME/repmgr/repmgr.conf --verbose standby follow
See file `autofailover_quick_setup.rst` for details on setting up
automated failover.
repmgr database schema
----------------------
repmgr creates a small schema for its own use in the database specified in
each node's conninfo configuration parameter. This database can in principle
be any database. The schema name is the global `cluster` name prefixed
with `repmgr_`, so for the example setup above the schema name is
`repmgr_test`.
The schema contains two tables:
* `repl_nodes`
stores information about all registered servers in the cluster
* `repl_monitor`
stores monitoring information about each node
and one view, `repl_status`, which summarizes the latest monitoring information
for each node.
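
The quickstart text above describes a `key=value` configuration file and states that the schema name is the cluster name prefixed with `repmgr_`. As a rough sketch of those two rules (not repmgr's own parser, which is written in C), the helpers below read the sample file and derive the schema name. Both function names are hypothetical, and treating `#` lines as comments is an assumption.

```python
def parse_repmgr_conf(text: str) -> dict:
    """Parse key=value lines into a dict (illustrative, not repmgr's parser)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only: conninfo values contain '=' themselves.
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Drop the single quotes that wrap values such as conninfo.
        settings[key.strip()] = value.strip().strip("'")
    return settings


def schema_name(cluster: str) -> str:
    """Schema used by repmgr: the cluster name prefixed with 'repmgr_'."""
    return "repmgr_" + cluster


SAMPLE = """\
cluster=test
node=1
node_name=node1
conninfo='host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
pg_bindir=/path/to/postgres/bin
"""

conf = parse_repmgr_conf(SAMPLE)
schema = schema_name(conf["cluster"])
```

With the sample file from the guide, `schema` comes out as `repmgr_test`, matching the schema name that appears in the example log output.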

@@ -5,7 +5,7 @@ repmgr: Replication Manager for PostgreSQL clusters
Introduction Introduction
============ ============
PostgreSQL 9+ allow us to have replicated Hot Standby servers PostgreSQL 9.0 allow us to have replicated Hot Standby servers
which we can query and/or use for high availability. which we can query and/or use for high availability.
While the main components of the feature are included with While the main components of the feature are included with
@@ -20,17 +20,6 @@ databases as a single cluster. repmgr includes two components:
* repmgrd: management and monitoring daemon that watches the cluster * repmgrd: management and monitoring daemon that watches the cluster
and can automate remote actions. and can automate remote actions.
Supported Releases
------------------
repmgr works with PostgreSQL versions 9.0 and later.
There are currently no incompatibilities when upgrading repmgr from 9.0 to 9.1,
so your 9.0 configuration will work with 9.1
Additional parameters must be added to postgresql.conf to take advantage of
the new 9.1 features such as synchronous replication or hot standby feedback.
Requirements Requirements
------------ ------------
@@ -77,7 +66,7 @@ and run::
And if a previously failed node becomes available again, such as And if a previously failed node becomes available again, such as
the lost node1 above, you can get it to resynchronize by only copying the lost node1 above, you can get it to resynchronize by only copying
over changes made while it was down. That happens with what's over changes made while it was down using. That hapens with what's
called a forced clone, which overwrites existing data rather than called a forced clone, which overwrites existing data rather than
assuming it starts with an empty database directory tree:: assuming it starts with an empty database directory tree::
@@ -131,19 +120,19 @@ If you need to remove the source code temporary files from this directory,
that can be done like this:: that can be done like this::
make USE_PGXS=1 clean make USE_PGXS=1 clean
See below for building notes specific to RedHat Linux variants. See below for building notes specific to RedHat Linux variants.
Using a full source code tree Using a full source code tree
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this method, the repmgr distribution is copied into the PostgreSQL source In this method, the repmgr distribution is copied into the PostgreSQL source
code tree, assumed to be under ${postgresql_sources} for this example. code tree, assumed to be at the ${postgresql_sources} for this example.
The resulting subdirectory must be named ``contrib/repmgr``, without any The resulting subdirectory must be named ``contrib/repmgr``, without any
version number:: version number::
cp repmgr.tar.gz ${postgresql_sources}/contrib cp repmgr.tar.gz ${postgresql_sources}/contrib
cd ${postgresql_sources}/contrib cd ${postgresql_sources}/contrib
tar xvzf repmgr-1.0.tar.gz tar xvzf repmgr-1.0.tar.gz
cd repmgr cd repmgr
make make
@@ -237,7 +226,7 @@ If you already tried to build repmgr before doing this, you'll need to do::
make USE_PGXS=1 clean make USE_PGXS=1 clean
to get rid of leftover files from the wrong architecture. To get rid of leftover files from the wrong architecture.
Notes on Ubuntu, Debian or other Debian-based Builds Notes on Ubuntu, Debian or other Debian-based Builds
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -279,8 +268,8 @@ Confirm software was built correctly
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You should now find the repmgr programs available in the subdirectory where You should now find the repmgr programs available in the subdirectory where
the rest of your PostgreSQL binary files are located. You can confirm the the rest of your PostgreSQL installation is at. You can confirm the software
software is available by checking its version:: is available by checking its version::
repmgr --version repmgr --version
repmgrd --version repmgrd --version
@@ -320,7 +309,7 @@ keys and a maching authorization file to a privledged user on the other system::
[postgres@node1]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys [postgres@node1]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[postgres@node1]$ chmod go-rwx ~/.ssh/* [postgres@node1]$ chmod go-rwx ~/.ssh/*
[postgres@node1]$ cd ~/.ssh [postgres@node1]$ cd ~/.ssh
[postgres@node1]$ scp id_rsa.pub id_rsa authorized_keys user@node2: [postgres@node1]$ scp id_rsa.pub id_rsa authorized_keys postgres@node2:
Login as a user on the other system, and install the files into the postgres Login as a user on the other system, and install the files into the postgres
user's account:: user's account::
@@ -374,10 +363,10 @@ Usage walkthrough
This assumes you've already followed the steps in "Installation Outline" to This assumes you've already followed the steps in "Installation Outline" to
install repmgr and repmgrd on the system. install repmgr and repmgrd on the system.
A typical production installation of ``repmgr`` might involve two PostgreSQL A normal production installation of ``repmgr`` will normally involve two
instances on seperate servers, both running under the ``postgres`` user account different systems running on the same port, typically the default of 5432,
and both using the default port (5432). This walkthrough assumes the following with both using files owned by the ``postgres`` user account. This
setup: walkthrough assumes the following setup:
* A primary (master) server called "node1," running as the "postgres" user * A primary (master) server called "node1," running as the "postgres" user
who is also the owner of the files. This server is operating on port 5432. This who is also the owner of the files. This server is operating on port 5432. This
@@ -389,7 +378,7 @@ setup:
* Another standby server called "node3" with a similar configuration to "node2". * Another standby server called "node3" with a similar configuration to "node2".
* The Postgres installation in each of the above is defined as $PGDATA, * The Postgress installation in each of the above is defined as $PGDATA,
which is represented here as ``/var/lib/pgsql/9.0/data`` which is represented here as ``/var/lib/pgsql/9.0/data``
Creating some sample data Creating some sample data
@@ -514,14 +503,12 @@ following the standard directory structure of a RHEL system. It should contain:
cluster=test cluster=test
node=1 node=1
node_name=earth
conninfo='host=node1 user=repmgr dbname=pgbench' conninfo='host=node1 user=repmgr dbname=pgbench'
On "node2" create the file ``/var/lib/pgsql/repmgr/repmgr.conf`` with:: On "node2" create the file ``/var/lib/pgsql/repmgr/repmgr.conf`` with::
cluster=test cluster=test
node=2 node=2
node_name=mars
conninfo='host=node2 user=repmgr dbname=pgbench' conninfo='host=node2 user=repmgr dbname=pgbench'
The STANDBY CLONE process should have created a recovery.conf file on The STANDBY CLONE process should have created a recovery.conf file on
@@ -625,18 +612,18 @@ Now restore to the original configuration by stopping
primary server, then bringing up "node2" as a standby with a valid primary server, then bringing up "node2" as a standby with a valid
``recovery.conf`` file. ``recovery.conf`` file.
Stop the "node2" server and type the following on "node1" server:: Stop the "node2" server::
repmgr -f /var/lib/pgsql/repmgr/repmgr.conf standby promote repmgr -f /var/lib/pgsql/repmgr/repmgr.conf standby promote
Now the original primary, "node1", is acting again as primary. Now the original primary, "node1" is acting again as primary.
Start the "node2" server and type this on "node2":: Start the "node2" server and type this on "node1"::
repmgr standby clone --force -h node2 -p 5432 -U postgres -R postgres --verbose repmgr standby clone --force -h node2 -p 5432 -U postgres -R postgres --verbose
Verify the roles have reversed by attempting to insert a record on "node1" Verify the roles have reversed by attempting to insert a record on "node"
and on "node2". and on "node1".
The servers are now again acting as primary on "node1" and standby on "node2". The servers are now again acting as primary on "node1" and standby on "node2".
@@ -660,7 +647,7 @@ You can usually leave out changes to the port number in this case too.
* A database exists on "prime" called "testdb." * A database exists on "prime" called "testdb."
* The Postgres installation in each of the above is defined as $PGDATA, * The Postgress installation in each of the above is defined as $PGDATA,
which is represented here with ``/data/prime`` as the "prime" server and which is represented here with ``/data/prime`` as the "prime" server and
``/data/standby`` as the "standby" server. ``/data/standby`` as the "standby" server.
@@ -714,14 +701,12 @@ and it should contain::
cluster=test cluster=test
node=1 node=1
node_name=earth
conninfo='host=127.0.0.1 dbname=testdb' conninfo='host=127.0.0.1 dbname=testdb'
On "standby" create the file ``/home/standby/repmgr/repmgr.conf`` with:: On "standby" create the file ``/home/standby/repmgr/repmgr.conf`` with::
cluster=test cluster=test
node=2 node=2
node_name=mars
conninfo='host=127.0.0.1 dbname=testdb' conninfo='host=127.0.0.1 dbname=testdb'
Next, with "prime" server running, we want to use the ``clone standby`` command Next, with "prime" server running, we want to use the ``clone standby`` command
@@ -839,11 +824,12 @@ Also, if you don't do anything about it the monitor history will keep growing.
For both of those reasons you sometimes want to do some maintenance on the For both of those reasons you sometimes want to do some maintenance on the
``repl_monitor`` table. ``repl_monitor`` table.
If you want to clean the history after a few days you can execute the If you want to clean the history after a few days you can execute a
CLUSTER CLEANUP command in a cron. For example to keep just one day of history truncate/delete (whether you want to completely clean history or want to keep
a few days of history) in a cron. For example to keep just one day of history
you can put this in your crontab:: you can put this in your crontab::
0 1 * * * repmgr cluster cleanup -k 1 -f ~/repmgr.conf 0 1 * * * psql -c "DELETE FROM repmgr_schema.repl_monitor where now() - last_monitor_time >= '1 day'::interval;" postgres
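As a rough illustration of what a retention rule such as ``-k 1`` amounts to, the C sketch below formats the ``DELETE`` statement from the crontab example above for an arbitrary number of days. The function name ``build_cleanup_query`` is hypothetical and is not taken from the repmgr source; the schema, table and column names are simply copied from the example above.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the repmgr source): write into buf the
 * DELETE statement that prunes repl_monitor rows older than keep_days days.
 * Returns 0 on success, -1 if keep_days is not positive or buf is too small.
 */
int
build_cleanup_query(char *buf, size_t buflen, int keep_days)
{
    int n;

    if (keep_days <= 0)
        return -1;

    n = snprintf(buf, buflen,
                 "DELETE FROM repmgr_schema.repl_monitor "
                 "WHERE now() - last_monitor_time >= '%d days'::interval",
                 keep_days);

    /* snprintf reports the length it wanted; reject truncated output */
    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

The resulting string could then be passed to ``psql -c`` or to a libpq call. Rejecting a non-positive ``keep_days`` is a design choice of this sketch, so that a mistyped value cannot silently empty the table.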
Configuration and command reference Configuration and command reference
=================================== ===================================
@@ -876,6 +862,7 @@ The output from this program looks like this::
Usage: Usage:
repmgr [OPTIONS] master {register} repmgr [OPTIONS] master {register}
repmgr [OPTIONS] standby {register|clone|promote|follow} repmgr [OPTIONS] standby {register|clone|promote|follow}
repmgr [OPTIONS] cluster {show|cleanup}
General options: General options:
--help show this help, then exit --help show this help, then exit
@@ -890,11 +877,12 @@ The output from this program looks like this::
Configuration options: Configuration options:
-D, --data-dir=DIR local directory where the files will be copied to -D, --data-dir=DIR local directory where the files will be copied to
-f, --config-file=PATH path to the configuration file -f, --config_file=PATH path to the configuration file
-R, --remote-user=USERNAME database server username for rsync -R, --remote-user=USERNAME database server username for rsync
-w, --wal-keep-segments=VALUE minimum value for the GUC wal_keep_segments (default: 5000) -w, --wal-keep-segments=VALUE minimum value for the GUC wal_keep_segments (default: 5000)
-I, --ignore-rsync-warning ignore rsync partial transfer warning
-F, --force force potentially dangerous operations to happen -F, --force force potentially dangerous operations to happen
-I, --ignore-rsync-warning Ignore partial transfert warning
-k, --keep-history keeps indicated number of days of history
repmgr performs some tasks like clone a node, promote it or making follow another node and then exits. repmgr performs some tasks like clone a node, promote it or making follow another node and then exits.
COMMANDS: COMMANDS:
@@ -903,6 +891,8 @@ The output from this program looks like this::
standby clone [node] - allows creation of a new standby standby clone [node] - allows creation of a new standby
standby promote - allows manual promotion of a specific standby into a new master in the event of a failover standby promote - allows manual promotion of a specific standby into a new master in the event of a failover
standby follow - allows the standby to re-point itself to a new master standby follow - allows the standby to re-point itself to a new master
cluster show - print node informations
cluster cleanup - cleans monitor's history
The ``--verbose`` option can be useful in troubleshooting issues with The ``--verbose`` option can be useful in troubleshooting issues with
the program. the program.
@@ -1013,8 +1003,7 @@ The output from this program looks like this::
--help show this help, then exit --help show this help, then exit
--version output version information, then exit --version output version information, then exit
--verbose output verbose activity information --verbose output verbose activity information
--monitoring-history track advance or lag of the replication in every standby in repl_monitor -f, --config_file=PATH database to connect to
-f, --config-file=PATH path to the configuration file
repmgrd monitors a cluster of servers. repmgrd monitors a cluster of servers.
@@ -1044,10 +1033,6 @@ Lag monitoring
repmgrd helps monitor a set of master and standby servers. You can repmgrd helps monitor a set of master and standby servers. You can
see which node is the current master, as well as how far behind each see which node is the current master, as well as how far behind each
is from current. is from current.
To activate the monitor capabilities of repmgr you must include the
option --monitoring-history when running it::
repmgrd --monitoring-history --config-file=/path/to/repmgr.conf &
To look at the current lag between primary and each node listed To look at the current lag between primary and each node listed
in ``repl_node``, consult the ``repl_status`` view:: in ``repl_node``, consult the ``repl_status`` view::
@@ -1080,21 +1065,16 @@ following
* ERR_DB_QUERY 7: Error executing a database query. * ERR_DB_QUERY 7: Error executing a database query.
* ERR_PROMOTED 8: Exiting program because the node has been promoted to master. * ERR_PROMOTED 8: Exiting program because the node has been promoted to master.
* ERR_BAD_PASSWORD 9: Password used to connect to a database was rejected. * ERR_BAD_PASSWORD 9: Password used to connect to a database was rejected.
* ERR_STR_OVERFLOW 10: A string was larger than expected.
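A wrapper script or supervisor that launches repmgr may want to turn these exit statuses back into text. The following C sketch does that for the four codes listed above only; the numeric values come from the list, while the helper name ``exit_code_description`` is an assumption made for illustration.

```c
#include <stddef.h>

/* Values copied from the list above. */
#define ERR_DB_QUERY      7
#define ERR_PROMOTED      8
#define ERR_BAD_PASSWORD  9
#define ERR_STR_OVERFLOW  10

/*
 * Hypothetical helper: return a short description for an exit status,
 * or NULL for any code that this sketch does not cover.
 */
const char *
exit_code_description(int code)
{
    switch (code)
    {
        case ERR_DB_QUERY:
            return "error executing a database query";
        case ERR_PROMOTED:
            return "node has been promoted to master";
        case ERR_BAD_PASSWORD:
            return "database password was rejected";
        case ERR_STR_OVERFLOW:
            return "a string was larger than expected";
        default:
            return NULL;
    }
}
```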
License and Contributions License and Contributions
========================= =========================
repmgr is licensed under the GPL v3. All of its code and documentation is repmgr is licensed under the GPL v3. All of its code and documentation is
Copyright 2010-2014, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for Copyright 2010-2011, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
details. details.
Main sponsorship of repmgr has been from 2ndQuadrant customers. Contributions to repmgr are welcome, and listed in the file CREDITS.
Additional work has been sponsored by the 4CaaST project for cloud computing,
which has received funding from the European Union's Seventh Framework Programme
(FP7/2007-2013) under grant agreement 258862.
Contributions to repmgr are welcome, and will be listed in the file CREDITS.
2ndQuadrant Limited requires that any contributions provide a copyright 2ndQuadrant Limited requires that any contributions provide a copyright
assignment and a disclaimer of any work-for-hire ownership claims from the assignment and a disclaimer of any work-for-hire ownership claims from the
employer of the developer. This lets us make sure that all of the repmgr employer of the developer. This lets us make sure that all of the repmgr
@@ -1110,35 +1090,3 @@ Code in repmgr is formatted to a consistent style using the following command::
Contributors should reformat their code similarly before submitting code to Contributors should reformat their code similarly before submitting code to
the project, in order to minimize merge conflicts with other work. the project, in order to minimize merge conflicts with other work.
Support and Assistance
======================
2ndQuadrant provides 24x7 production support for repmgr, and can also help you
configure it correctly, verify an installation and train you to run a
robust replication cluster.
There is a mailing list/forum to discuss contributions or issues:
http://groups.google.com/group/repmgr
The #repmgr channel is registered on freenode IRC.
Further information is available at http://www.repmgr.org/
We'd love to hear from you about how you use repmgr. Case studies and
news are always welcome. Send us an email at info@2ndQuadrant.com, or
send a postcard to
repmgr
c/o 2ndQuadrant
7200 The Quorum
Oxford Business Park North
Oxford
OX4 2JZ
Thanks from the repmgr core team
Jaime Casanova
Simon Riggs
Greg Smith
Cedric Villemain


@@ -1,57 +0,0 @@
Summary: repmgr
Name: repmgr
Version: 2.0
Release: 2
License: GPLv3
Group: System Environment/Daemons
URL: http://repmgr.org
Packager: Nathan Van Overloop <nathan.van.overloop@nexperteam.be>
Vendor: 2ndQuadrant Limited
Distribution: centos
Source0: %{name}-%{version}.tar.gz
BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root
%description
repmgr for centos6
%prep
%setup
%build
export PATH=$PATH:/usr/pgsql-9.3/bin/
%{__make} USE_PGXS=1
%install
[ "%{buildroot}" != "/" ] && %{__rm} -rf %{buildroot}
export PATH=$PATH:/usr/pgsql-9.3/bin/
%{__make} USE_PGXS=1 install DESTDIR=%{buildroot} INSTALL="install -p"
%{__make} USE_PGXS=1 install_prog DESTDIR=%{buildroot} INSTALL="install -p"
%{__make} USE_PGXS=1 install_rhel DESTDIR=%{buildroot} INSTALL="install -p"
%clean
[ "%{buildroot}" != "/" ] && %{__rm} -rf %{buildroot}
%files
%defattr(-,root,root)
/usr/bin/repmgr
/usr/bin/repmgrd
/usr/pgsql-9.3/bin/repmgr
/usr/pgsql-9.3/bin/repmgrd
/usr/pgsql-9.3/lib/repmgr_funcs.so
/usr/pgsql-9.3/share/contrib/repmgr.sql
/usr/pgsql-9.3/share/contrib/repmgr_funcs.sql
/usr/pgsql-9.3/share/contrib/uninstall_repmgr.sql
/usr/pgsql-9.3/share/contrib/uninstall_repmgr_funcs.sql
%attr(0755,root,root)/etc/init.d/repmgrd
%attr(0644,root,root)/etc/sysconfig/repmgrd
%attr(0644,root,root)/etc/repmgr/repmgr.conf.sample
%changelog
* Thu Jun 05 2014 Nathan Van Overloop <nathan.van.overloop@nexperteam.be> 2.0.2
- fix witness creation to create db and user if needed
* Fri Apr 04 2014 Nathan Van Overloop <nathan.van.overloop@nexperteam.be> 2.0.1
- initial build for RHEL6


@@ -1,114 +0,0 @@
#!/bin/bash
#
# repmgrd Start up the repmgrd daemon
# repmgrd (replication manager daemon)
#
# chkconfig: - 75 16
# description: repmgrd is the replication manager daemon \
# The repmgrd replication management and monitoring daemon for PostgreSQL.
### BEGIN INIT INFO
# Provides: repmgrd
# Required-Start: $local_fs $remote_fs $network $syslog postgresql
# Required-Stop: $local_fs $remote_fs $network $syslog postgresql
# Should-Start: $syslog postgresql-9.3
# Should-Stop: $syslog postgresql-9.3
# Short-Description: start and stop repmgrd
# Description: Enable repmgrd replication management and monitoring daemon for PostgreSQL
# this is used to monitor a postgresql cluster.
### END INIT INFO
# Source function library.
. /etc/init.d/functions
# Source networking configuration.
. /etc/sysconfig/network
prog=repmgrd
REPMGRD_ENABLED=yes
REPMGRD_OPTS=
REPMGRD_USER=postgres
DAEMONIZE="-d"
# pull in sysconfig settings
[ -f /etc/sysconfig/repmgrd ] && . /etc/sysconfig/repmgrd
LOCKFILE=/var/lock/subsys/$prog
RETVAL=0
case "$REPMGRD_ENABLED" in
[Yy]*)
#nothing to do here
;;
*)
exit 2
;;
esac
if [ -z "$REPMGRD_OPTS" ]
then
echo "Not starting $prog, REPMGRD_OPTS not set in /etc/sysconfig/$prog"
exit 2
fi
start() {
[ "$EUID" != "0" ] && exit 4
[ "$NETWORKING" = "no" ] && exit 1
# Start daemons.
echo -n $"Starting $prog: "
daemon --user $REPMGRD_USER $prog $DAEMONIZE $REPMGRD_OPTS
RETVAL=$?
echo
[ $RETVAL -eq 0 ] && touch $LOCKFILE
return $RETVAL
}
stop() {
[ "$EUID" != "0" ] && exit 4
echo -n $"Shutting down $prog: "
killproc $prog
RETVAL=$?
echo
[ $RETVAL -eq 0 ] && rm -f $LOCKFILE
return $RETVAL
}
status() {
if [ -f "$LOCKFILE" ]; then
echo "$prog is running"
else
RETVAL=3
echo "$prog is stopped"
fi
return $RETVAL
}
# See how we were called.
case "$1" in
start)
start
;;
stop)
stop
;;
status)
status $prog
;;
restart|force-reload)
stop
start
;;
try-restart|condrestart)
if status $prog > /dev/null; then
stop
start
fi
;;
reload)
exit 3
;;
*)
echo $"Usage: $0 {start|stop|status|restart|try-restart|force-reload}"
exit 2
esac


@@ -1,4 +0,0 @@
#default sysconfig file for repmgrd
#custom overrides can be placed here
REPMGRD_OPTS="-f /etc/repmgr/repmgr.conf"

18
TODO

@@ -1,18 +1,14 @@
Known issues in repmgr Known issues in repmgr
====================== ======================
* The check for whether ``wal_keep_segments`` is considered large enough
does a string comparison rather than an integer one. It can give both
false positive (setting is large enough but flagged as too small) and
false negative (setting is too small but not noted as such) errors.
* When running repmgr against a remote machine, operations that start * When running repmgr against a remote machine, operations that start
the database server using the ``pg_ctl`` command may accidentally the database server using the ``pg_ctl`` command may accidentally
terminate after their associated ssh session ends. terminate after their associated ssh session ends.
Planned feature improvements * After running repmgrd as a regular foreground application, hitting
============================ control-C causes the program to crash.
* Timeline increases when promoting a standby
* A better check of which standby has received the most data
* Make the fact that a standby may be delayed a factor in the voting
algorithm
* Include support for delayed standbys
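The ``wal_keep_segments`` item in the known issues list can be reproduced in isolation. The sketch below contrasts a string comparison with an integer comparison of the same setting. Both function names are hypothetical, and the sketch makes no claim about where repmgr actually performs the check; it only shows why comparing the values as text gives wrong answers in both directions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Text comparison: the kind of check the known issue describes. */
bool
wal_keep_ok_strcmp(const char *current, const char *required)
{
    return strcmp(current, required) >= 0;
}

/* Integer comparison: parse the setting first, then compare numerically. */
bool
wal_keep_ok_int(const char *current, long required)
{
    char *end;
    long  value;

    errno = 0;
    value = strtol(current, &end, 10);

    /* reject empty input, trailing garbage and out-of-range values */
    if (errno != 0 || end == current || *end != '\0')
        return false;

    return value >= required;
}
```

With a required value of 5000, a setting of ``10000`` fails the text comparison because ``'1'`` sorts before ``'5'``, while a setting of ``6`` passes it because ``'6'`` sorts after ``'5'``. The integer version accepts the first and rejects the second.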


@@ -1,225 +0,0 @@
=====================================================
PostgreSQL Automatic Fail-Over - User Documentation
=====================================================
Automatic Failover
==================
repmgr allows setups for automatic failover when it detects the failure of the master node.
Following is a quick setup for this.
Installation
============
For convenience, we define:
**node1**
is the fully qualified hostname of the Master server, IP 192.168.1.10
**node2**
is the fully qualified hostname of the Standby server, IP 192.168.1.11
**witness**
is the fully qualified hostname of the server used as the witness, IP 192.168.1.12
**Note:** It is not recommended to use names that describe the role of a server, such as «masterserver»;
such a name leads to confusion once a failover takes place and the Master is
now on the «standbyserver».
Summary
-------
Two PostgreSQL servers are involved in the replication. Automatic failover relies on
a vote to decide which server to promote, so an odd number of servers is required;
a witness repmgrd is therefore installed on a third server, where it uses a PostgreSQL
cluster to communicate with the other repmgrd daemons.
1. Install PostgreSQL in all the servers involved (including the server used for
witness)
2. Install repmgr in all the servers involved (including the server used for witness)
3. Configure the Master PostgreSQL
4. Clone the Master to the Standby using "repmgr standby clone" command
5. Configure repmgr in all the servers involved (including the server used for witness)
6. Register Master and Standby nodes
7. Initialize the witness server
8. Start the repmgrd daemons in all nodes
**Note** A complete High-Availability design needs at least 3 servers to still have
a backup node after a first failure.
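The need for an odd number of voters can be illustrated with a strict-majority test. This is a minimal sketch with a hypothetical function name, not repmgr's actual voting code: it only shows that one surviving node out of two cannot form a majority, whereas two out of three (standby plus witness) can.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true when `reachable` nodes form a strict majority
 * of the `total` nodes registered in the cluster.
 */
bool
has_majority(int reachable, int total)
{
    if (total <= 0 || reachable < 0 || reachable > total)
        return false;

    return 2 * reachable > total;
}
```

With only node1 and node2, the standby sees 1 of 2 nodes after the master fails and cannot tell a dead master from a network split. Adding the witness gives 2 of 3, which is a majority.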
Install PostgreSQL
------------------
You can install PostgreSQL using any of the recommended methods. You should ensure
it's 9.0 or later.
Install repmgr
--------------
Install repmgr following the steps in the README file.
Configure PostgreSQL
--------------------
Log in to node1.
Edit the file postgresql.conf and modify the parameters::
listen_addresses='*'
wal_level = 'hot_standby'
archive_mode = on
archive_command = 'cd .' # we can also use exit 0, anything that
# just does nothing
max_wal_senders = 10
wal_keep_segments = 5000 # 80 GB required on pg_xlog
hot_standby = on
shared_preload_libraries = 'repmgr_funcs'
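The "80 GB" figure in the comment on ``wal_keep_segments`` follows from the WAL segment size. The sketch below does the arithmetic, assuming PostgreSQL's default segment size of 16 MB (a build-time setting, so a custom build may differ); the function and macro names are chosen for illustration.

```c
#include <stdint.h>

/* Default PostgreSQL WAL segment size; an assumption for custom builds. */
#define WAL_SEGMENT_BYTES (16ULL * 1024 * 1024)

/* Disk space, in bytes, that `segments` retained WAL segments can occupy. */
uint64_t
wal_keep_bytes(uint32_t segments)
{
    return (uint64_t) segments * WAL_SEGMENT_BYTES;
}
```

For 5000 segments this gives 83,886,080,000 bytes, roughly 78 GiB, which the comment rounds to 80 GB (5000 x 16 MB = 80,000 MB).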
Edit the file pg_hba.conf and add lines for the replication::
host repmgr repmgr 127.0.0.1/32 trust
host repmgr repmgr 192.168.1.10/30 trust
host replication all 192.168.1.10/30 trust
**Note:** It is also possible to use password authentication (md5); in that case the .pgpass file
should be edited to allow connections between the nodes.
Create the user and database to manage replication::
su - postgres
createuser -s repmgr
createdb -O repmgr repmgr
psql -f /usr/share/postgresql/9.0/contrib/repmgr_funcs.sql repmgr
Restart the PostgreSQL server::
pg_ctl -D $PGDATA restart
And check everything is fine in the server log.
Create the ssh-key for the postgres user and copy it to other servers::
su - postgres
ssh-keygen # /!\ do not use a passphrase /!\
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
exit
rsync -avz ~postgres/.ssh/authorized_keys node2:~postgres/.ssh/
rsync -avz ~postgres/.ssh/authorized_keys witness:~postgres/.ssh/
rsync -avz ~postgres/.ssh/id_rsa* node2:~postgres/.ssh/
rsync -avz ~postgres/.ssh/id_rsa* witness:~postgres/.ssh/
Clone Master
------------
Log in to node2.
Clone node1 (the current Master)::
su - postgres
repmgr -d repmgr -U repmgr -h node1 standby clone
Start the PostgreSQL server::
pg_ctl -D $PGDATA start
And check everything is fine in the server log.
Configure repmgr
----------------
Log in each server and configure repmgr by editing the file
/etc/repmgr/repmgr.conf::
cluster=my_cluster
node=1
node_name=earth
conninfo='host=192.168.1.10 dbname=repmgr user=repmgr'
master_response_timeout=60
reconnect_attempts=6
reconnect_interval=10
failover=automatic
promote_command='promote_command.sh'
follow_command='repmgr standby follow -f /etc/repmgr/repmgr.conf'
**cluster**
is the name of the current replication cluster.
**node**
is the number of the current node (1, 2 or 3 in the current example).
**node_name**
is an identifier for every node.
**conninfo**
is used to connect to the local PostgreSQL server (where the configuration file is) from any node. In the witness server configuration, 'port=5499' must be added to the conninfo.
**master_response_timeout**
is the maximum amount of time we are going to wait before deciding the master has died and starting the failover procedure.
**reconnect_attempts**
is the number of times we will try to reconnect to the master after a failure has been detected and before starting the failover procedure.
**reconnect_interval**
is the amount of time between retries to reconnect to the master after a failure has been detected and before starting the failover procedure.
**failover**
configures the failover behavior: *manual* or *automatic*.
**promote_command**
is the command executed to do the failover (including the PostgreSQL failover itself). The command must return 0 on success.
**follow_command**
is the command executed to point the current standby at another Master. The command must return 0 on success.
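How ``reconnect_attempts`` and ``reconnect_interval`` interact can be sketched as a bounded retry loop. This is an illustration under stated assumptions, not repmgrd's implementation: ``master_recovered`` and ``probe_after_failures`` are hypothetical names, the interval is assumed to be in seconds, and the probe is a stand-in for a real connection check.

```c
#include <stdbool.h>
#include <unistd.h>

/* Callback that reports whether the master answered; ctx is caller data. */
typedef bool (*probe_fn) (void *ctx);

/*
 * Try to reach the master up to reconnect_attempts times, sleeping
 * reconnect_interval seconds between tries. Returns true if the master
 * answered, false if every attempt failed and failover should begin.
 */
bool
master_recovered(probe_fn probe, void *ctx,
                 int reconnect_attempts, int reconnect_interval)
{
    int attempt;

    for (attempt = 0; attempt < reconnect_attempts; attempt++)
    {
        if (probe(ctx))
            return true;

        /* no need to wait after the final failed attempt */
        if (attempt + 1 < reconnect_attempts && reconnect_interval > 0)
            sleep((unsigned int) reconnect_interval);
    }

    return false;
}

/* Stand-in probe for demonstration: fails *ctx times, then succeeds. */
bool
probe_after_failures(void *ctx)
{
    int *failures_left = ctx;

    if (*failures_left > 0)
    {
        (*failures_left)--;
        return false;
    }
    return true;
}
```

With the example values of 6 attempts and an interval of 10, this sketch sleeps five times, so it gives up after about 50 seconds plus the time spent in the probes themselves.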
Register Master and Standby
---------------------------
Log in to node1.
Register the node as Master::
su - postgres
repmgr -f /etc/repmgr/repmgr.conf master register
Log in to node2. Register it as a standby::
su - postgres
repmgr -f /etc/repmgr/repmgr.conf standby register
Initialize witness server
-------------------------
Log in to witness.
Initialize the witness server::
su - postgres
repmgr -d repmgr -U repmgr -h 192.168.1.10 -D $WITNESS_PGDATA -f /etc/repmgr/repmgr.conf witness create
It needs information to connect to the master to copy the configuration of the cluster, and it also needs to know where it should initialize its own $PGDATA.
As part of the process it also asks for the superuser password so it can connect when needed.
Start the repmgrd daemons
-------------------------
Log in to node2 and witness and start the daemon::
su - postgres
repmgrd -f /etc/repmgr/repmgr.conf > /var/log/postgresql/repmgr.log 2>&1
**Note:** The Master does not need a repmgrd daemon.
Suspend Automatic behavior
==========================
Edit the repmgr.conf of the node that you want to remove from automatic processing and change::
failover=manual
Then, signal the repmgrd daemon::
su - postgres
kill -HUP `pidof repmgrd`
Usage
=====
The repmgr documentation is in the README file (how to build, options, etc.)


@@ -1,6 +1,6 @@
/* /*
* check_dir.c - Directories management functions * check_dir.c - Directories management functions
* Copyright (C) 2ndQuadrant, 2010-2014 * Copyright (C) 2ndQuadrant, 2010-2011
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -31,6 +31,8 @@
#include "strutil.h" #include "strutil.h"
#include "log.h" #include "log.h"
static int mkdir_p(char *path, mode_t omode);
/* /*
* make sure the directory either doesn't exist or is empty * make sure the directory either doesn't exist or is empty
* we use this function to check the new data directory and * we use this function to check the new data directory and
@@ -44,9 +46,9 @@
int int
check_dir(char *dir) check_dir(char *dir)
{ {
DIR *chkdir; DIR *chkdir;
struct dirent *file; struct dirent *file;
int result = 1; int result = 1;
errno = 0; errno = 0;
@@ -58,7 +60,7 @@ check_dir(char *dir)
while ((file = readdir(chkdir)) != NULL) while ((file = readdir(chkdir)) != NULL)
{ {
if (strcmp(".", file->d_name) == 0 || if (strcmp(".", file->d_name) == 0 ||
strcmp("..", file->d_name) == 0) strcmp("..", file->d_name) == 0)
{ {
/* skip this and parent directory */ /* skip this and parent directory */
continue; continue;
@@ -71,7 +73,6 @@ check_dir(char *dir)
} }
#ifdef WIN32 #ifdef WIN32
/* /*
* This fix is in mingw cvs (runtime/mingwex/dirent.c rev 1.4), but not in * This fix is in mingw cvs (runtime/mingwex/dirent.c rev 1.4), but not in
* released version * released version
@@ -83,29 +84,29 @@ check_dir(char *dir)
closedir(chkdir); closedir(chkdir);
if (errno != 0) if (errno != 0)
return -1; /* some kind of I/O error? */ return -1; /* some kind of I/O error? */
return result; return result;
} }
/* /*
* Create directory with error log message when failing * Create directory
*/ */
bool bool
create_dir(char *dir) create_directory(char *dir)
{ {
if (mkdir_p(dir, 0700) == 0) if (mkdir_p(dir, 0700) == 0)
return true; return true;
log_err(_("Could not create directory \"%s\": %s\n"), log_err(_("Could not create directory \"%s\": %s\n"),
dir, strerror(errno)); dir, strerror(errno));
return false; return false;
} }
bool bool
set_dir_permissions(char *dir) set_directory_permissions(char *dir)
{ {
return (chmod(dir, 0700) != 0) ? false : true; return (chmod(dir, 0700) != 0) ? false : true;
} }
@@ -123,15 +124,15 @@ set_dir_permissions(char *dir)
* note that on failure, the path arg has been modified to show the particular * note that on failure, the path arg has been modified to show the particular
* directory level we had problems with. * directory level we had problems with.
*/ */
int static int
mkdir_p(char *path, mode_t omode) mkdir_p(char *path, mode_t omode)
{ {
struct stat sb; struct stat sb;
mode_t numask, mode_t numask,
oumask; oumask;
int first, int first,
last, last,
retval; retval;
char *p; char *p;
p = path; p = path;
@@ -150,8 +151,8 @@ mkdir_p(char *path, mode_t omode)
return 1; return 1;
} }
else if (p[1] == ':' && else if (p[1] == ':' &&
((p[0] >= 'a' && p[0] <= 'z') || ((p[0] >= 'a' && p[0] <= 'z') ||
(p[0] >= 'A' && p[0] <= 'Z'))) (p[0] >= 'A' && p[0] <= 'Z')))
{ {
/* local drive */ /* local drive */
p += 2; p += 2;
@@ -222,87 +223,10 @@ bool
is_pg_dir(char *dir) is_pg_dir(char *dir)
{ {
const size_t buf_sz = 8192; const size_t buf_sz = 8192;
char path[buf_sz]; char path[buf_sz];
struct stat sb; struct stat sb;
int r;
/* test pgdata */
xsnprintf(path, buf_sz, "%s/PG_VERSION", dir); xsnprintf(path, buf_sz, "%s/PG_VERSION", dir);
if (stat(path, &sb) == 0)
return true;
/* test tablespace dir */ return (stat(path, &sb) == 0) ? true : false;
sprintf(path, "ls %s/PG_*/ -I*", dir);
r = system(path);
if (r == 0)
return true;
return false;
}
bool
create_pg_dir(char *dir, bool force)
{
bool pg_dir = false;
/* Check this directory could be used as a PGDATA dir */
switch (check_dir(dir))
{
case 0:
/* dir not there, must create it */
log_info(_("creating directory \"%s\"...\n"), dir);
if (!create_dir(dir))
{
log_err(_("couldn't create directory \"%s\"...\n"),
dir);
return false;
}
break;
case 1:
/* Present but empty, fix permissions and use it */
log_info(_("checking and correcting permissions on existing directory %s ...\n"),
dir);
if (!set_dir_permissions(dir))
{
log_err(_("could not change permissions of directory \"%s\": %s\n"),
dir, strerror(errno));
return false;
}
break;
case 2:
/* Present and not empty */
log_warning(_("directory \"%s\" exists but is not empty\n"),
dir);
pg_dir = is_pg_dir(dir);
/*
* we use force to reduce the time needed to restore a node which
* turn async after a failover or anything else
*/
if (pg_dir && force)
{
/* Let it continue */
break;
}
else if (pg_dir && !force)
{
log_warning(_("\nThis looks like a PostgreSQL directory.\n"
"If you are sure you want to clone here, "
"please check there is no PostgreSQL server "
"running and use the --force option\n"));
return false;
}
return false;
default:
/* Trouble accessing directory */
log_err(_("could not access directory \"%s\": %s\n"),
dir, strerror(errno));
return false;
}
return true;
} }


@@ -1,6 +1,6 @@
/* /*
* check_dir.h * check_dir.h
* Copyright (c) 2ndQuadrant, 2010-2014 * Copyright (c) 2ndQuadrant, 2010-2011
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -20,11 +20,9 @@
#ifndef _REPMGR_CHECK_DIR_H_ #ifndef _REPMGR_CHECK_DIR_H_
#define _REPMGR_CHECK_DIR_H_ #define _REPMGR_CHECK_DIR_H_
int mkdir_p(char *path, mode_t omode); int check_dir(char *dir);
int check_dir(char *dir); bool create_directory(char *dir);
bool create_dir(char *dir); bool set_directory_permissions(char *dir);
bool set_dir_permissions(char *dir); bool is_pg_dir(char *dir);
bool is_pg_dir(char *dir);
bool create_pg_dir(char *dir, bool force);
#endif #endif

252
config.c

@@ -1,6 +1,6 @@
/* /*
* config.c - Functions to parse the config file * config.c - Functions to parse the config file
* Copyright (C) 2ndQuadrant, 2010-2014 * Copyright (C) 2ndQuadrant, 2010-2011
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -18,57 +18,36 @@
*/ */
#include "config.h" #include "config.h"
#include "log.h"
#include "strutil.h"
#include "repmgr.h" #include "repmgr.h"
#include "strutil.h"
void void
parse_config(const char *config_file, t_configuration_options * options) parse_config(const char* config_file, t_configuration_options* options)
{ {
char *s, char *s, buff[MAXLINELENGTH];
buff[MAXLINELENGTH]; char name[MAXLEN];
char name[MAXLEN]; char value[MAXLEN];
char value[MAXLEN];
FILE *fp = fopen(config_file, "r"); FILE *fp = fopen (config_file, "r");
/* Initialize */ /* Initialize */
memset(options->cluster_name, 0, sizeof(options->cluster_name)); memset(options->cluster_name, 0, sizeof(options->cluster_name));
options->node = -1; options->node = -1;
memset(options->conninfo, 0, sizeof(options->conninfo)); memset(options->conninfo, 0, sizeof(options->conninfo));
options->failover = MANUAL_FAILOVER;
options->priority = 0;
memset(options->node_name, 0, sizeof(options->node_name));
memset(options->promote_command, 0, sizeof(options->promote_command));
memset(options->follow_command, 0, sizeof(options->follow_command));
memset(options->rsync_options, 0, sizeof(options->rsync_options)); memset(options->rsync_options, 0, sizeof(options->rsync_options));
memset(options->ssh_options, 0, sizeof(options->ssh_options));
memset(options->pg_bindir, 0, sizeof(options->pg_bindir));
memset(options->pgctl_options, 0, sizeof(options->pgctl_options));
/* if nothing has been provided defaults to 60 */
options->master_response_timeout = 60;
/* it defaults to 6 retries with a time between retries of 10s */
options->reconnect_attempts = 6;
options->reconnect_intvl = 10;
options->monitor_interval_secs = 2;
options->retry_promote_interval_secs = 300;
/* /*
* Since some commands don't require a config file at all, not having one * Since some commands don't require a config file at all, not
* isn't necessarily a problem. * having one isn't necessarily a problem.
*/ */
if (fp == NULL) if (fp == NULL)
{ {
log_err(_("Did not find the configuration file '%s', continuing\n"), fprintf(stderr, _("Did not find the configuration file '%s', continuing\n"), config_file);
config_file);
return; return;
} }
/* Read next line */ /* Read next line */
while ((s = fgets(buff, sizeof buff, fp)) != NULL) while ((s = fgets (buff, sizeof buff, fp)) != NULL)
{ {
/* Skip blank lines and comments */ /* Skip blank lines and comments */
if (buff[0] == '\n' || buff[0] == '#') if (buff[0] == '\n' || buff[0] == '#')
@@ -79,138 +58,70 @@ parse_config(const char *config_file, t_configuration_options * options)
/* Copy into correct entry in parameters struct */ /* Copy into correct entry in parameters struct */
if (strcmp(name, "cluster") == 0) if (strcmp(name, "cluster") == 0)
strncpy(options->cluster_name, value, MAXLEN); strncpy (options->cluster_name, value, MAXLEN);
else if (strcmp(name, "node") == 0) else if (strcmp(name, "node") == 0)
options->node = atoi(value); options->node = atoi(value);
else if (strcmp(name, "conninfo") == 0) else if (strcmp(name, "conninfo") == 0)
strncpy(options->conninfo, value, MAXLEN); strncpy (options->conninfo, value, MAXLEN);
else if (strcmp(name, "rsync_options") == 0) else if (strcmp(name, "rsync_options") == 0)
strncpy(options->rsync_options, value, QUERY_STR_LEN); strncpy (options->rsync_options, value, QUERY_STR_LEN);
else if (strcmp(name, "ssh_options") == 0)
strncpy(options->ssh_options, value, QUERY_STR_LEN);
else if (strcmp(name, "loglevel") == 0) else if (strcmp(name, "loglevel") == 0)
strncpy(options->loglevel, value, MAXLEN); strncpy (options->loglevel, value, MAXLEN);
else if (strcmp(name, "logfacility") == 0) else if (strcmp(name, "logfacility") == 0)
strncpy(options->logfacility, value, MAXLEN); strncpy (options->logfacility, value, MAXLEN);
else if (strcmp(name, "failover") == 0)
{
char failoverstr[MAXLEN];
strncpy(failoverstr, value, MAXLEN);
if (strcmp(failoverstr, "manual") == 0)
options->failover = MANUAL_FAILOVER;
else if (strcmp(failoverstr, "automatic") == 0)
options->failover = AUTOMATIC_FAILOVER;
else
{
log_warning(_("value for failover option is incorrect, it should be automatic or manual. Defaulting to manual.\n"));
options->failover = MANUAL_FAILOVER;
}
}
else if (strcmp(name, "priority") == 0)
options->priority = atoi(value);
else if (strcmp(name, "node_name") == 0)
strncpy(options->node_name, value, MAXLEN);
else if (strcmp(name, "promote_command") == 0)
strncpy(options->promote_command, value, MAXLEN);
else if (strcmp(name, "follow_command") == 0)
strncpy(options->follow_command, value, MAXLEN);
else if (strcmp(name, "master_response_timeout") == 0)
options->master_response_timeout = atoi(value);
else if (strcmp(name, "reconnect_attempts") == 0)
options->reconnect_attempts = atoi(value);
else if (strcmp(name, "reconnect_interval") == 0)
options->reconnect_intvl = atoi(value);
else if (strcmp(name, "pg_bindir") == 0)
strncpy(options->pg_bindir, value, MAXLEN);
else if (strcmp(name, "pg_ctl_options") == 0)
strncpy(options->pgctl_options, value, MAXLEN);
else if (strcmp(name, "logfile") == 0)
strncpy(options->logfile, value, MAXLEN);
else if (strcmp(name, "monitor_interval_secs") == 0)
options->monitor_interval_secs = atoi(value);
else if (strcmp(name, "retry_promote_interval_secs") == 0)
options->retry_promote_interval_secs = atoi(value);
else else
log_warning(_("%s/%s: Unknown name/value pair!\n"), name, value); printf ("WARNING: %s/%s: Unknown name/value pair!\n", name, value);
} }
/* Close file */ /* Close file */
fclose(fp); fclose (fp);
/* Check config settings */ /* Check config settings */
if (*options->cluster_name == '\0') if (strnlen(options->cluster_name, MAXLEN)==0)
{ {
log_err(_("Cluster name is missing. Check the configuration file.\n")); fprintf(stderr, "Cluster name is missing. "
"Check the configuration file.\n");
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
if (options->node == -1) if (options->node == -1)
{ {
log_err(_("Node information is missing. Check the configuration file.\n")); fprintf(stderr, "Node information is missing. "
exit(ERR_BAD_CONFIG); "Check the configuration file.\n");
}
if (options->master_response_timeout <= 0)
{
log_err(_("Master response timeout must be greater than zero. Check the configuration file.\n"));
exit(ERR_BAD_CONFIG);
}
if (options->reconnect_attempts < 0)
{
log_err(_("Reconnect attempts must be zero or greater. Check the configuration file.\n"));
exit(ERR_BAD_CONFIG);
}
if (options->reconnect_intvl <= 0)
{
log_err(_("Reconnect interval must be greater than zero. Check the configuration file.\n"));
exit(ERR_BAD_CONFIG);
}
if (*options->pg_bindir == '\0')
{
log_err(_("pg_bindir config value not found. Check the configuration file.\n"));
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
} }
char * char *
trim(char *s) trim (char *s)
{ {
/* Initialize start, end pointers */ /* Initialize start, end pointers */
char *s1 = s, char *s1 = s, *s2 = &s[strlen (s) - 1];
*s2 = &s[strlen(s) - 1];
/* Trim and delimit right side */ /* Trim and delimit right side */
while ((isspace(*s2)) && (s2 >= s1)) while ( (isspace (*s2)) && (s2 >= s1) )
--s2; --s2;
*(s2 + 1) = '\0'; *(s2+1) = '\0';
/* Trim left side */ /* Trim left side */
while ((isspace(*s1)) && (s1 < s2)) while ( (isspace (*s1)) && (s1 < s2) )
++s1; ++s1;
/* Copy finished string */ /* Copy finished string */
memmove(s, s1, s2 - s1 + 1); strcpy (s, s1);
s[s2 - s1 + 1] = '\0';
return s; return s;
} }
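For comparison, here is a minimal sketch of an in-place trim that also copes with empty and all-whitespace input, where `&s[strlen(s) - 1]` would point before the start of the buffer. The name `trim_in_place` is hypothetical and not part of repmgr; it is meant only to illustrate the length bookkeeping, not to stand in for the project's `trim()`.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not repmgr's trim()): strip leading and trailing
 * whitespace from s in place and return s.
 */
char *
trim_in_place(char *s)
{
	char	   *start = s;
	size_t		len;

	/*
	 * Skip leading whitespace. The cast keeps isspace() well defined
	 * when char is signed and the byte value is negative.
	 */
	while (*start != '\0' && isspace((unsigned char) *start))
		++start;

	/* Measure what is left, then drop trailing whitespace. */
	len = strlen(start);
	while (len > 0 && isspace((unsigned char) start[len - 1]))
		--len;

	/* Shift the kept bytes to the front; memmove tolerates overlap. */
	memmove(s, start, len);
	s[len] = '\0';
	return s;
}
```

Working with a start pointer and a length, rather than two pointers, means the number of bytes to move and the position of the terminator come from the same value.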
void void
parse_line(char *buff, char *name, char *value) parse_line(char *buff, char *name, char *value)
{ {
int i = 0; int i = 0;
int j = 0; int j = 0;
/* /*
* first we find the name of the parameter * first we find the name of the parameter
*/ */
for (; i < MAXLEN; ++i) for ( ; i < MAXLEN; ++i)
{ {
if (buff[i] != '=') if (buff[i] != '=')
name[j++] = buff[i]; name[j++] = buff[i];
@@ -223,7 +134,7 @@ parse_line(char *buff, char *name, char *value)
* Now the value * Now the value
*/ */
j = 0; j = 0;
for (++i; i < MAXLEN; ++i) for ( ++i ; i < MAXLEN; ++i)
if (buff[i] == '\'') if (buff[i] == '\'')
continue; continue;
else if (buff[i] != '\n') else if (buff[i] != '\n')
@@ -233,100 +144,3 @@ parse_line(char *buff, char *name, char *value)
value[j] = '\0'; value[j] = '\0';
trim(value); trim(value);
} }
bool
reload_config(char *config_file, t_configuration_options * orig_options)
{
PGconn *conn;
t_configuration_options new_options;
/*
* Re-read the configuration file: repmgr.conf
*/
log_info(_("Reloading configuration file and updating repmgr tables\n"));
parse_config(config_file, &new_options);
if (new_options.node == -1)
{
log_warning(_("Cannot load new configuration, will keep current one.\n"));
return false;
}
if (strcmp(new_options.cluster_name, orig_options->cluster_name) != 0)
{
log_warning(_("Cannot change cluster name, will keep current configuration.\n"));
return false;
}
if (new_options.node != orig_options->node)
{
log_warning(_("Cannot change node number, will keep current configuration.\n"));
return false;
}
if (strcmp(new_options.node_name, orig_options->node_name) != 0)
{
log_warning(_("Cannot change standby name, will keep current configuration.\n"));
return false;
}
if (new_options.failover != MANUAL_FAILOVER && new_options.failover != AUTOMATIC_FAILOVER)
{
log_warning(_("New value for failover is not valid. Should be MANUAL or AUTOMATIC.\n"));
return false;
}
if (new_options.master_response_timeout <= 0)
{
log_warning(_("New value for master_response_timeout is not valid. Should be greater than zero.\n"));
return false;
}
if (new_options.reconnect_attempts < 0)
{
log_warning(_("New value for reconnect_attempts is not valid. Should be greater than or equal to zero.\n"));
return false;
}
if (new_options.reconnect_intvl < 0)
{
log_warning(_("New value for reconnect_interval is not valid. Should be greater than or equal to zero.\n"));
return false;
}
/* Test conninfo string */
conn = establish_db_connection(new_options.conninfo, false);
if (!conn || (PQstatus(conn) != CONNECTION_OK))
{
log_warning(_("conninfo string is not valid, will keep current configuration.\n"));
return false;
}
PQfinish(conn);
/* Configuration seems ok, will load new values */
strcpy(orig_options->cluster_name, new_options.cluster_name);
orig_options->node = new_options.node;
strcpy(orig_options->conninfo, new_options.conninfo);
orig_options->failover = new_options.failover;
orig_options->priority = new_options.priority;
strcpy(orig_options->node_name, new_options.node_name);
strcpy(orig_options->promote_command, new_options.promote_command);
strcpy(orig_options->follow_command, new_options.follow_command);
strcpy(orig_options->rsync_options, new_options.rsync_options);
strcpy(orig_options->ssh_options, new_options.ssh_options);
orig_options->master_response_timeout = new_options.master_response_timeout;
orig_options->reconnect_attempts = new_options.reconnect_attempts;
orig_options->reconnect_intvl = new_options.reconnect_intvl;
/*
* XXX These ones can change with a simple SIGHUP?
*
* strcpy (orig_options->loglevel, new_options.loglevel); strcpy
* (orig_options->logfacility, new_options.logfacility);
*
* logger_shutdown(); XXX do we have progname here ? logger_init(progname,
* orig_options.loglevel, orig_options.logfacility);
*/
return true;
}
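The numeric settings above are converted with `atoi()`, which returns 0 for text with no leading digits and gives the caller no way to tell a typo from a real zero. A minimal sketch of a stricter conversion follows, assuming a hypothetical helper named `parse_int_option` (not part of repmgr) that rejects empty values, trailing junk and out-of-range numbers.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a base-10 int from text.
 * Returns 0 and stores the value in *out on success, -1 otherwise.
 */
int
parse_int_option(const char *text, int *out)
{
	char	   *end;
	long		v;

	errno = 0;
	v = strtol(text, &end, 10);

	/* No digits consumed, or something follows the number. */
	if (end == text || *end != '\0')
		return -1;

	/* Value does not fit in a long, or not in an int. */
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -1;

	*out = (int) v;
	return 0;
}
```

A caller could log a warning and keep the previous value when the helper returns -1, instead of silently storing 0.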


@@ -1,6 +1,6 @@
/* /*
* config.h * config.h
* Copyright (c) 2ndQuadrant, 2010-2014 * Copyright (c) 2ndQuadrant, 2010-2011
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -25,33 +25,16 @@
typedef struct typedef struct
{ {
char cluster_name[MAXLEN]; char cluster_name[MAXLEN];
int node; int node;
char conninfo[MAXLEN]; char conninfo[MAXLEN];
int failover; char loglevel[MAXLEN];
int priority; char logfacility[MAXLEN];
char node_name[MAXLEN]; char rsync_options[QUERY_STR_LEN];
char promote_command[MAXLEN]; } t_configuration_options;
char follow_command[MAXLEN];
char loglevel[MAXLEN];
char logfacility[MAXLEN];
char rsync_options[QUERY_STR_LEN];
char ssh_options[QUERY_STR_LEN];
int master_response_timeout;
int reconnect_attempts;
int reconnect_intvl;
char pg_bindir[MAXLEN];
char pgctl_options[MAXLEN];
char logfile[MAXLEN];
int monitor_interval_secs;
int retry_promote_interval_secs;
} t_configuration_options;
#define T_CONFIGURATION_OPTIONS_INITIALIZER { "", -1, "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", -1, -1, -1, "", "", "", 0, 0 } void parse_config(const char* config_file, t_configuration_options* options);
void parse_line(char *buff, char *name, char *value);
void parse_config(const char *config_file, t_configuration_options * options); char *trim(char *s);
void parse_line(char *buff, char *name, char *value);
char *trim(char *s);
bool reload_config(char *config_file, t_configuration_options * orig_options);
#endif #endif
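A positional initializer such as `T_CONFIGURATION_OPTIONS_INITIALIZER` has to be kept in the same order as the struct fields by hand. As an illustration only, the sketch below uses C99 designated initializers on a cut-down, hypothetical struct; `demo_options`, `DEMO_MAXLEN` and `DEMO_MANUAL_FAILOVER` are stand-ins invented for this example and do not exist in repmgr.

```c
#define DEMO_MAXLEN 64				/* stand-in for MAXLEN */
#define DEMO_MANUAL_FAILOVER 0		/* stand-in for MANUAL_FAILOVER */

/* Cut-down, hypothetical version of the options struct. */
typedef struct
{
	char		cluster_name[DEMO_MAXLEN];
	int			node;
	int			failover;
	int			master_response_timeout;
	char		pg_bindir[DEMO_MAXLEN];
} demo_options;

/* Each default is tied to a field name, so reordering fields is safe. */
#define DEMO_OPTIONS_INITIALIZER { \
	.cluster_name = "", \
	.node = -1, \
	.failover = DEMO_MANUAL_FAILOVER, \
	.master_response_timeout = -1, \
	.pg_bindir = "" \
}

/* Return a struct filled with the defaults above. */
demo_options
demo_options_defaults(void)
{
	demo_options opts = DEMO_OPTIONS_INITIALIZER;

	return opts;
}
```

Fields left out of a designated initializer are zero-initialized, so a newly added member gets a defined value even before the macro is updated.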

dbutils.c

@@ -1,6 +1,6 @@
/* /*
* dbutils.c - Database connection/management functions * dbutils.c - Database connection/management functions
* Copyright (C) 2ndQuadrant, 2010-2014 * Copyright (C) 2ndQuadrant, 2010-2011
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -17,30 +17,21 @@
* *
*/ */
#include <unistd.h>
#include <time.h>
#include <sys/time.h>
#include "repmgr.h" #include "repmgr.h"
#include "strutil.h" #include "strutil.h"
#include "log.h" #include "log.h"
PGconn * PGconn *
establish_db_connection(const char *conninfo, const bool exit_on_error) establishDBConnection(const char *conninfo, const bool exit_on_error)
{ {
/* Make a connection to the database */ /* Make a connection to the database */
PGconn *conn = NULL; PGconn *conn = PQconnectdb(conninfo);
char connection_string[MAXLEN];
strcpy(connection_string, conninfo);
strcat(connection_string, " fallback_application_name='repmgr'");
conn = PQconnectdb(connection_string);
/* Check to see that the backend connection was successfully made */ /* Check to see that the backend connection was successfully made */
if ((PQstatus(conn) != CONNECTION_OK)) if ((PQstatus(conn) != CONNECTION_OK))
{ {
log_err(_("Connection to database failed: %s\n"), log_err(_("Connection to database failed: %s\n"),
PQerrorMessage(conn)); PQerrorMessage(conn));
if (exit_on_error) if (exit_on_error)
{ {
@@ -53,17 +44,16 @@ establish_db_connection(const char *conninfo, const bool exit_on_error)
} }
PGconn * PGconn *
establish_db_connection_by_params(const char *keywords[], const char *values[], establishDBConnectionByParams(const char *keywords[], const char *values[],const bool exit_on_error)
const bool exit_on_error)
{ {
/* Make a connection to the database */ /* Make a connection to the database */
PGconn *conn = PQconnectdbParams(keywords, values, true); PGconn *conn = PQconnectdbParams(keywords, values, true);
/* Check to see that the backend connection was successfully made */ /* Check to see that the backend connection was successfully made */
if ((PQstatus(conn) != CONNECTION_OK)) if ((PQstatus(conn) != CONNECTION_OK))
{ {
log_err(_("Connection to database failed: %s\n"), log_err(_("Connection to database failed: %s\n"),
PQerrorMessage(conn)); PQerrorMessage(conn));
if (exit_on_error) if (exit_on_error)
{ {
PQfinish(conn); PQfinish(conn);
@@ -74,133 +64,58 @@ establish_db_connection_by_params(const char *keywords[], const char *values[],
return conn; return conn;
} }
int bool
is_standby(PGconn *conn) is_standby(PGconn *conn)
{ {
PGresult *res; PGresult *res;
int result = 0; bool result;
res = PQexec(conn, "SELECT pg_is_in_recovery()"); res = PQexec(conn, "SELECT pg_is_in_recovery()");
if (res == NULL || PQresultStatus(res) != PGRES_TUPLES_OK)
{
log_err(_("Can't query server mode: %s"),
PQerrorMessage(conn));
result = -1;
}
else if (PQntuples(res) == 1 && strcmp(PQgetvalue(res, 0, 0), "t") == 0)
result = 1;
PQclear(res);
return result;
}
int
is_witness(PGconn *conn, char *schema, char *cluster, int node_id)
{
PGresult *res;
int result = 0;
char sqlquery[QUERY_STR_LEN];
sqlquery_snprintf(sqlquery, "SELECT witness from %s.repl_nodes where cluster = '%s' and id = %d",
schema, cluster, node_id);
res = PQexec(conn, sqlquery);
if (PQresultStatus(res) != PGRES_TUPLES_OK) if (PQresultStatus(res) != PGRES_TUPLES_OK)
{ {
log_err(_("Can't query server mode: %s"), PQerrorMessage(conn)); log_err(_("Can't query server mode: %s"),
result = -1; PQerrorMessage(conn));
PQclear(res);
PQfinish(conn);
exit(ERR_DB_QUERY);
} }
else if (PQntuples(res) == 1 && strcmp(PQgetvalue(res, 0, 0), "t") == 0)
result = 1; if (strcmp(PQgetvalue(res, 0, 0), "f") == 0)
result = false;
else
result = true;
PQclear(res); PQclear(res);
return result; return result;
} }
/* check the PQStatus and try to 'select 1' to confirm good connection */
bool
is_pgup(PGconn *conn, int timeout)
{
char sqlquery[QUERY_STR_LEN];
/* Check the connection status twice in case it changes after reset */
bool twice = false;
for (;;)
{
if (PQstatus(conn) != CONNECTION_OK)
{
if (twice)
return false;
PQreset(conn); /* reconnect */
twice = true;
}
else
{
/*
* Send a SELECT 1 just to check if the connection is OK
*/
if (!cancel_query(conn, timeout))
goto failed;
if (wait_connection_availability(conn, timeout) != 1)
goto failed;
sqlquery_snprintf(sqlquery, "SELECT 1");
if (PQsendQuery(conn, sqlquery) == 0)
{
log_warning(_("PQsendQuery: Query could not be sent to primary. %s\n"),
PQerrorMessage(conn));
goto failed;
}
if (wait_connection_availability(conn, timeout) != 1)
goto failed;
break;
failed:
/*
* we need to retry, because we might just have lost the
* connection once
*/
if (twice)
return false;
PQreset(conn); /* reconnect */
twice = true;
}
}
return true;
}
/* /*
* If the PostgreSQL version is 9 or later, returns the major version; * If the PostgreSQL version is 9 or later, returns the major version;
* if it is 8 or earlier, returns an empty string * if it is 8 or earlier, returns an empty string
*/ */
char * char *
pg_version(PGconn *conn, char *major_version) pg_version(PGconn *conn, char* major_version)
{ {
PGresult *res; PGresult *res;
int major_version1; int major_version1;
char *major_version2; char *major_version2;
res = PQexec(conn, res = PQexec(conn,
"WITH pg_version(ver) AS " "WITH pg_version(ver) AS "
"(SELECT split_part(version(), ' ', 2)) " "(SELECT split_part(version(), ' ', 2)) "
"SELECT split_part(ver, '.', 1), split_part(ver, '.', 2) " "SELECT split_part(ver, '.', 1), split_part(ver, '.', 2) "
"FROM pg_version"); "FROM pg_version");
if (PQresultStatus(res) != PGRES_TUPLES_OK) if (PQresultStatus(res) != PGRES_TUPLES_OK)
{ {
log_err(_("Version check PQexec failed: %s"), log_err(_("Version check PQexec failed: %s"),
PQerrorMessage(conn)); PQerrorMessage(conn));
PQclear(res); PQclear(res);
return NULL; PQfinish(conn);
exit(ERR_DB_QUERY);
} }
major_version1 = atoi(PQgetvalue(res, 0, 0)); major_version1 = atoi(PQgetvalue(res, 0, 0));
@@ -210,7 +125,7 @@ pg_version(PGconn *conn, char *major_version)
{ {
/* form a major version string */ /* form a major version string */
xsnprintf(major_version, MAXVERSIONSTR, "%d.%s", major_version1, xsnprintf(major_version, MAXVERSIONSTR, "%d.%s", major_version1,
major_version2); major_version2);
} }
else else
strcpy(major_version, ""); strcpy(major_version, "");
@@ -221,92 +136,59 @@ pg_version(PGconn *conn, char *major_version)
} }
int bool
guc_set(PGconn *conn, const char *parameter, const char *op, guc_setted(PGconn *conn, const char *parameter, const char *op,
const char *value) const char *value)
{ {
PGresult *res; PGresult *res;
char sqlquery[QUERY_STR_LEN]; char sqlquery[QUERY_STR_LEN];
int retval = 1;
sqlquery_snprintf(sqlquery, "SELECT true FROM pg_settings " sqlquery_snprintf(sqlquery, "SELECT true FROM pg_settings "
" WHERE name = '%s' AND setting %s '%s'", " WHERE name = '%s' AND setting %s '%s'",
parameter, op, value); parameter, op, value);
res = PQexec(conn, sqlquery); res = PQexec(conn, sqlquery);
if (PQresultStatus(res) != PGRES_TUPLES_OK) if (PQresultStatus(res) != PGRES_TUPLES_OK)
{ {
log_err(_("GUC setting check PQexec failed: %s"), log_err(_("GUC setting check PQexec failed: %s"),
PQerrorMessage(conn)); PQerrorMessage(conn));
retval = -1; PQclear(res);
PQfinish(conn);
exit(ERR_DB_QUERY);
} }
else if (PQntuples(res) == 0) if (PQntuples(res) == 0)
{ {
retval = 0; PQclear(res);
return false;
} }
PQclear(res); PQclear(res);
return retval; return true;
}
/**
* Just like guc_set except with an extra parameter containing the name of
* the pg datatype so that the comparison can be done properly.
*/
int
guc_set_typed(PGconn *conn, const char *parameter, const char *op,
const char *value, const char *datatype)
{
PGresult *res;
char sqlquery[QUERY_STR_LEN];
int retval = 1;
sqlquery_snprintf(sqlquery, "SELECT true FROM pg_settings "
" WHERE name = '%s' AND setting::%s %s '%s'::%s",
parameter, datatype, op, value, datatype);
res = PQexec(conn, sqlquery);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
log_err(_("GUC setting check PQexec failed: %s"),
PQerrorMessage(conn));
retval = -1;
}
else if (PQntuples(res) == 0)
{
retval = 0;
}
PQclear(res);
return retval;
} }
const char * const char *
get_cluster_size(PGconn *conn) get_cluster_size(PGconn *conn)
{ {
PGresult *res; PGresult *res;
const char *size = NULL; const char *size;
char sqlquery[QUERY_STR_LEN]; char sqlquery[QUERY_STR_LEN];
sqlquery_snprintf( sqlquery_snprintf(
sqlquery, sqlquery,
"SELECT pg_size_pretty(SUM(pg_database_size(oid))::bigint) " "SELECT pg_size_pretty(SUM(pg_database_size(oid))::bigint) "
" FROM pg_database "); " FROM pg_database ");
res = PQexec(conn, sqlquery); res = PQexec(conn, sqlquery);
if (PQresultStatus(res) != PGRES_TUPLES_OK) if (PQresultStatus(res) != PGRES_TUPLES_OK)
{ {
log_err(_("Get cluster size PQexec failed: %s"), log_err(_("Get cluster size PQexec failed: %s"),
PQerrorMessage(conn)); PQerrorMessage(conn));
PQclear(res);
PQfinish(conn);
exit(ERR_DB_QUERY);
} }
else size = PQgetvalue(res, 0, 0);
{
size = PQgetvalue(res, 0, 0);
}
PQclear(res); PQclear(res);
return size; return size;
} }
@@ -315,23 +197,24 @@ get_cluster_size(PGconn *conn)
* get a connection to master by reading repl_nodes, creating a connection * get a connection to master by reading repl_nodes, creating a connection
* to each node (one at a time) and finding if it is a master or a standby * to each node (one at a time) and finding if it is a master or a standby
* *
* NB: master_conninfo_out may be NULL. If it is non-null, it is assumed to * NB: master_conninfo_out may be NULL. If it is non-null, it is assumed to
* point to allocated memory of MAXCONNINFO in length, and the master server * point to allocated memory of MAXCONNINFO in length, and the master server
* connection string is placed there. * connection string is placed there.
*/ */
PGconn * PGconn *
get_master_connection(PGconn *standby_conn, char *schema, char *cluster, getMasterConnection(PGconn *standby_conn, char *cluster,
int *master_id, char *master_conninfo_out) int *master_id, char *master_conninfo_out)
{ {
PGconn *master_conn = NULL; PGconn *master_conn = NULL;
PGresult *res1; PGresult *res1;
PGresult *res2; PGresult *res2;
char sqlquery[QUERY_STR_LEN]; char sqlquery[QUERY_STR_LEN];
char master_conninfo_stack[MAXCONNINFO]; char master_conninfo_stack[MAXCONNINFO];
char *master_conninfo = &*master_conninfo_stack; char *master_conninfo = &*master_conninfo_stack;
char schema_quoted[MAXLEN]; char schema_str[MAXLEN];
char schema_quoted[MAXLEN];
int i; int i;
/* /*
* If the caller wanted to get a copy of the connection info string, sub * If the caller wanted to get a copy of the connection info string, sub
@@ -345,9 +228,10 @@ get_master_connection(PGconn *standby_conn, char *schema, char *cluster,
* *
* Assemble the unquoted schema name * Assemble the unquoted schema name
*/ */
maxlen_snprintf(schema_str, "repmgr_%s", cluster);
{ {
char *identifier = PQescapeIdentifier(standby_conn, schema, char *identifier = PQescapeIdentifier(standby_conn, schema_str,
strlen(schema)); strlen(schema_str));
maxlen_snprintf(schema_quoted, "%s", identifier); maxlen_snprintf(schema_quoted, "%s", identifier);
PQfreemem(identifier); PQfreemem(identifier);
@@ -355,44 +239,45 @@ get_master_connection(PGconn *standby_conn, char *schema, char *cluster,
/* find all nodes belonging to this cluster */ /* find all nodes belonging to this cluster */
log_info(_("finding node list for cluster '%s'\n"), log_info(_("finding node list for cluster '%s'\n"),
cluster); cluster);
sqlquery_snprintf(sqlquery, "SELECT id, conninfo FROM %s.repl_nodes " sqlquery_snprintf(sqlquery, "SELECT * FROM %s.repl_nodes "
" WHERE cluster = '%s' and not witness", " WHERE cluster = '%s'",
schema_quoted, cluster); schema_quoted, cluster);
res1 = PQexec(standby_conn, sqlquery); res1 = PQexec(standby_conn, sqlquery);
if (PQresultStatus(res1) != PGRES_TUPLES_OK) if (PQresultStatus(res1) != PGRES_TUPLES_OK)
{ {
log_err(_("Can't get nodes info: %s\n"), log_err(_("Can't get nodes info: %s\n"),
PQerrorMessage(standby_conn)); PQerrorMessage(standby_conn));
PQclear(res1); PQclear(res1);
return NULL; PQfinish(standby_conn);
exit(ERR_DB_QUERY);
} }
for (i = 0; i < PQntuples(res1); i++) for (i = 0; i < PQntuples(res1); i++)
{ {
/* initialize with the values of the current node being processed */ /* initialize with the values of the current node being processed */
*master_id = atoi(PQgetvalue(res1, i, 0)); *master_id = atoi(PQgetvalue(res1, i, 0));
strncpy(master_conninfo, PQgetvalue(res1, i, 1), MAXCONNINFO); strncpy(master_conninfo, PQgetvalue(res1, i, 2), MAXCONNINFO);
log_info(_("checking role of cluster node '%s'\n"), log_info(_("checking role of cluster node '%s'\n"),
master_conninfo); master_conninfo);
master_conn = establish_db_connection(master_conninfo, false); master_conn = establishDBConnection(master_conninfo, false);
if (PQstatus(master_conn) != CONNECTION_OK) if (PQstatus(master_conn) != CONNECTION_OK)
continue; continue;
/* /*
* Can't use the is_standby() function here because on error that * Can't use the is_standby() function here because on error that
* function closes the connection passed and exits. This still needs * function closes the connection passed and exits. This still
* to close master_conn first. * needs to close master_conn first.
*/ */
res2 = PQexec(master_conn, "SELECT pg_is_in_recovery()"); res2 = PQexec(master_conn, "SELECT pg_is_in_recovery()");
if (PQresultStatus(res2) != PGRES_TUPLES_OK) if (PQresultStatus(res2) != PGRES_TUPLES_OK)
{ {
log_err(_("Can't get recovery state from this node: %s\n"), log_err(_("Can't get recovery state from this node: %s\n"),
PQerrorMessage(master_conn)); PQerrorMessage(master_conn));
PQclear(res2); PQclear(res2);
PQfinish(master_conn); PQfinish(master_conn);
continue; continue;
@@ -414,116 +299,15 @@ get_master_connection(PGconn *standby_conn, char *schema, char *cluster,
} }
} }
/* /* If we finish this loop without finding a master then
* If we finish this loop without finding a master then we don't have * we don't have the info or the master has failed (or we
* the info or the master has failed (or we reached max_connections or * reached max_connections or superuser_reserved_connections,
* superuser_reserved_connections, anything else I'm missing?). * anything else I'm missing?).
* *
* Probably we will need to check the error to know if we need to start * Probably we will need to check the error to know if we need
* failover procedure or just fix some situation on the standby. * to start failover procedure or just fix some situation on the
* standby.
*/ */
PQclear(res1); PQclear(res1);
return NULL; return NULL;
} }
/*
* wait until current query finishes ignoring any results, this could be an
* async command or a cancelation of a query
* return 1 if OK; 0 if the connection could not be read; -1 on timeout or select() error
*/
int
wait_connection_availability(PGconn *conn, long long timeout)
{
PGresult *res;
fd_set read_set;
int sock = PQsocket(conn);
struct timeval tmout,
before,
after;
struct timezone tz;
/* recalc to microseconds */
timeout *= 1000000;
while (timeout > 0)
{
if (PQconsumeInput(conn) == 0)
{
log_warning(_("wait_connection_availability: could not receive data from connection. %s\n"),
PQerrorMessage(conn));
return 0;
}
if (PQisBusy(conn) == 0)
{
do
{
res = PQgetResult(conn);
PQclear(res);
} while (res != NULL);
break;
}
tmout.tv_sec = 0;
tmout.tv_usec = 250000;
FD_ZERO(&read_set);
FD_SET(sock, &read_set);
gettimeofday(&before, &tz);
if (select(sock + 1, &read_set, NULL, NULL, &tmout) == -1)
{
log_warning(
_("wait_connection_availability: select() returned with error: %s"),
strerror(errno));
return -1;
}
gettimeofday(&after, &tz);
timeout -= (after.tv_sec * 1000000 + after.tv_usec) -
(before.tv_sec * 1000000 + before.tv_usec);
}
if (timeout >= 0)
{
return 1;
}
log_warning(_("wait_connection_availability: timeout reached"));
return -1;
}
bool
cancel_query(PGconn *conn, int timeout)
{
char errbuf[ERRBUFF_SIZE];
PGcancel *pgcancel;
if (wait_connection_availability(conn, timeout) != 1)
return false;
pgcancel = PQgetCancel(conn);
if (pgcancel == NULL)
return false;
/*
* PQcancel can only return 0 if socket()/connect()/send() fails, in any
* of those cases we can assume something bad happened to the connection
*/
if (PQcancel(pgcancel, errbuf, ERRBUFF_SIZE) == 0)
{
log_warning(_("Can't stop current query: %s\n"), errbuf);
PQfreeCancel(pgcancel);
return false;
}
PQfreeCancel(pgcancel);
return true;
}


@@ -1,6 +1,6 @@
/* /*
* dbutils.h * dbutils.h
* Copyright (c) 2ndQuadrant, 2010-2014 * Copyright (c) 2ndQuadrant, 2010-2011
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -22,25 +22,16 @@
#include "strutil.h" #include "strutil.h"
PGconn *establish_db_connection(const char *conninfo, PGconn *establishDBConnection(const char *conninfo, const bool exit_on_error);
const bool exit_on_error); PGconn *establishDBConnectionByParams(const char *keywords[],
PGconn *establish_db_connection_by_params(const char *keywords[], const char *values[],
const char *values[], const bool exit_on_error);
const bool exit_on_error); bool is_standby(PGconn *conn);
int is_standby(PGconn *conn); char *pg_version(PGconn *conn, char* major_version);
int is_witness(PGconn *conn, char *schema, char *cluster, int node_id); bool guc_setted(PGconn *conn, const char *parameter, const char *op,
bool is_pgup(PGconn *conn, int timeout); const char *value);
char *pg_version(PGconn *conn, char *major_version); const char *get_cluster_size(PGconn *conn);
int guc_set(PGconn *conn, const char *parameter, const char *op, PGconn *getMasterConnection(PGconn *standby_conn, char *cluster,
const char *value); int *master_id, char *master_conninfo_out);
int guc_set_typed(PGconn *conn, const char *parameter, const char *op,
const char *value, const char *datatype);
const char *get_cluster_size(PGconn *conn);
PGconn *get_master_connection(PGconn *standby_conn, char *schema, char *cluster,
int *master_id, char *master_conninfo_out);
int wait_connection_availability(PGconn *conn, long long timeout);
bool cancel_query(PGconn *conn, int timeout);
#endif #endif
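With `is_standby()` declared as returning `int`, the result has three states: -1 on error, 0 for no, 1 for yes. Since -1 is truthy in C, a bare `if (is_standby(conn))` would treat a failed query as a standby. The sketch below shows one way a caller might make the three cases explicit. The enum and the function `role_from_standby_check` are hypothetical names introduced for this example.

```c
/* Hypothetical explicit role type for callers. */
typedef enum
{
	ROLE_UNKNOWN,				/* the check itself failed */
	ROLE_PRIMARY,
	ROLE_STANDBY
} node_role;

/*
 * Map a tri-state result (-1 error, 0 no, 1 yes), as returned by an
 * is_standby()-style check, to an explicit role.
 */
node_role
role_from_standby_check(int standby_check)
{
	/* Test the error case first, because -1 is also "true" in C. */
	if (standby_check < 0)
		return ROLE_UNKNOWN;

	return standby_check ? ROLE_STANDBY : ROLE_PRIMARY;
}
```

The same ordering applies to other checks that use this return convention, such as `is_witness()` and `guc_set()`.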


@@ -1,9 +1,9 @@
Package: repmgr-auto Package: repmgr
Version: 2.0beta2 Version: 1.0-1
Section: database Section: database
Priority: optional Priority: optional
Architecture: all Architecture: all
Depends: rsync, postgresql-9.0 | postgresql-9.1 | postgresql-9.2 | postgresql-9.3 Depends: rsync, postgresql-9.0
Maintainer: Jaime Casanova <jaime@2ndQuadrant.com> Maintainer: Greg Smith <greg@2ndQuadrant.com>
Description: PostgreSQL replication setup, management and monitoring Description: PostgreSQL replication setup, management and monitoring
has two main executables has two main executables


@@ -1,18 +0,0 @@
# default settings for repmgrd. This file is sourced by /bin/sh from
# /etc/init.d/repmgrd
# disable repmgrd by default so it won't get started upon installation
# valid values: yes/no
REPMGRD_ENABLED=no
# Options for repmgrd (required)
#REPMGRD_OPTS="--config-file /path/to/repmgr.conf"
# User to run repmgrd as
#REPMGRD_USER=postgres
# repmgrd binary
#REPMGRD_BIN=/usr/bin/repmgrd
# pid file
#REPMGRD_PIDFILE=/var/run/repmgrd.pid


@@ -1,101 +0,0 @@
#!/bin/sh
### BEGIN INIT INFO
# Provides: repmgrd
# Required-Start: $local_fs $remote_fs $network $syslog postgresql
# Required-Stop: $local_fs $remote_fs $network $syslog postgresql
# Should-Start: $syslog postgresql
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Start/stop repmgrd
# Description: Enable repmgrd replication management and monitoring daemon for PostgreSQL
### END INIT INFO
set -e
DESC="PostgreSQL replication management and monitoring daemon"
NAME=repmgrd
REPMGRD_ENABLED=no
REPMGRD_OPTS=
REPMGRD_USER=postgres
REPMGRD_BIN=/usr/bin/repmgrd
REPMGRD_PIDFILE=/var/run/repmgrd.pid
# Read configuration variable file if it is present
[ -r /etc/default/$NAME ] && . /etc/default/$NAME
test -x $REPMGRD_BIN || exit 0
case "$REPMGRD_ENABLED" in
[Yy]*)
:
;;
*)
exit 0
;;
esac
# Define LSB log_* functions.
. /lib/lsb/init-functions
if [ -z "$REPMGRD_OPTS" ]
then
log_warning_msg "Not starting $NAME, REPMGRD_OPTS not set in /etc/default/$NAME"
exit 0
fi
do_start()
{
# Return
# 0 if daemon has been started
# 1 if daemon was already running
# other if daemon could not be started or a failure occurred
start-stop-daemon --start --quiet --background --chuid $REPMGRD_USER --make-pidfile --pidfile $REPMGRD_PIDFILE --exec $REPMGRD_BIN -- $REPMGRD_OPTS
}
do_stop()
{
# Return
# 0 if daemon has been stopped
# 1 if daemon was already stopped
# other if daemon could not be stopped or a failure occurred
start-stop-daemon --stop --quiet --retry=TERM/30/KILL/5 --pidfile $REPMGRD_PIDFILE --exec $REPMGRD_BIN
}
case "$1" in
start)
log_daemon_msg "Starting $DESC" "$NAME"
do_start
case "$?" in
0) log_end_msg 0 ;;
1) log_progress_msg "already started"
log_end_msg 0 ;;
*) log_end_msg 1 ;;
esac
;;
stop)
log_daemon_msg "Stopping $DESC" "$NAME"
do_stop
case "$?" in
0) log_end_msg 0 ;;
1) log_progress_msg "already stopped"
log_end_msg 0 ;;
*) log_end_msg 1 ;;
esac
;;
restart|force-reload)
$0 stop
$0 start
;;
status)
status_of_proc -p $REPMGRD_PIDFILE $REPMGRD_BIN $NAME && exit 0 || exit $?
;;
*)
echo "Usage: $0 {start|stop|restart|force-reload|status}" >&2
exit 3
;;
esac
exit 0


@@ -1,6 +1,6 @@
/* /*
* errcode.h * errcode.h
* Copyright (C) 2ndQuadrant, 2010-2014 * Copyright (C) 2ndQuadrant, 2011
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -33,8 +33,5 @@
#define ERR_PROMOTED 8 #define ERR_PROMOTED 8
#define ERR_BAD_PASSWORD 9 #define ERR_BAD_PASSWORD 9
#define ERR_STR_OVERFLOW 10 #define ERR_STR_OVERFLOW 10
#define ERR_FAILOVER_FAIL 11
#define ERR_BAD_SSH 12
#define ERR_SYS_FAILURE 13
#endif /* _ERRCODE_H_ */ #endif /* _ERRCODE_H_ */

log.c

@@ -1,6 +1,6 @@
/* /*
* log.c - Logging methods * log.c - Logging methods
* Copyright (C) 2ndQuadrant, 2010-2014 * Copyright (C) 2ndQuadrant, 2010-2011
* *
* This module is a set of methods for logging (currently only syslog) * This module is a set of methods for logging (currently only syslog)
* *
@@ -25,10 +25,8 @@
#ifdef HAVE_SYSLOG #ifdef HAVE_SYSLOG
#include <syslog.h> #include <syslog.h>
#endif
#include <stdarg.h> #include <stdarg.h>
#include <time.h> #endif
#include "log.h" #include "log.h"
@@ -39,44 +37,20 @@
/* #define REPMGR_DEBUG */ /* #define REPMGR_DEBUG */
void static int detect_log_level(const char* level);
stderr_log_with_level(const char *level_name, int level, const char *fmt, ...) static int detect_log_facility(const char* facility);
int log_type = REPMGR_STDERR;
int log_level = LOG_NOTICE;
bool logger_init(const char* ident, const char* level, const char* facility)
{ {
time_t t;
struct tm *tm;
char buff[100];
va_list ap;
if (log_level >= level) int l;
{ int f;
time(&t);
tm = localtime(&t);
strftime(buff, 100, "[%Y-%m-%d %H:%M:%S]", tm);
fprintf(stderr, "%s [%s] ", buff, level_name);
va_start(ap, fmt);
vfprintf(stderr, fmt, ap);
va_end(ap);
fflush(stderr);
}
}
static int detect_log_level(const char *level);
static int detect_log_facility(const char *facility);
int log_type = REPMGR_STDERR;
int log_level = LOG_NOTICE;
bool
logger_init(t_configuration_options * opts, const char *ident, const char *level, const char *facility)
{
int l;
int f;
#ifdef HAVE_SYSLOG #ifdef HAVE_SYSLOG
int syslog_facility = DEFAULT_SYSLOG_FACILITY; int syslog_facility = DEFAULT_SYSLOG_FACILITY;
#endif #endif
#ifdef REPMGR_DEBUG #ifdef REPMGR_DEBUG
@@ -133,33 +107,21 @@ logger_init(t_configuration_options * opts, const char *ident, const char *level
if (log_type == REPMGR_SYSLOG) if (log_type == REPMGR_SYSLOG)
{ {
setlogmask(LOG_UPTO(log_level)); setlogmask (LOG_UPTO (log_level));
openlog(ident, LOG_CONS | LOG_PID | LOG_NDELAY, syslog_facility); openlog (ident, LOG_CONS | LOG_PID | LOG_NDELAY, syslog_facility);
stderr_log_notice(_("Setup syslog (level: %s, facility: %s)\n"), level, facility); stderr_log_notice(_("Setup syslog (level: %s, facility: %s)\n"), level, facility);
} }
#endif #endif
if (*opts->logfile)
{
FILE *fd;
fd = freopen(opts->logfile, "a", stderr);
if (fd == NULL)
{
fprintf(stderr, "error reopening stderr to '%s': %s",
opts->logfile, strerror(errno));
}
}
return true; return true;
} }
bool bool logger_shutdown(void)
logger_shutdown(void)
{ {
#ifdef HAVE_SYSLOG #ifdef HAVE_SYSLOG
if (log_type == REPMGR_SYSLOG) if (log_type == REPMGR_SYSLOG)
closelog(); closelog();
@@ -173,15 +135,13 @@ logger_shutdown(void)
* options, which might increase requested logging over what's specified * options, which might increase requested logging over what's specified
* in the regular configuration file. * in the regular configuration file.
*/ */
void void logger_min_verbose(int minimum)
logger_min_verbose(int minimum)
{ {
if (log_level < minimum) if (log_level < minimum)
log_level = minimum; log_level = minimum;
} }
int int detect_log_level(const char* level)
detect_log_level(const char *level)
{ {
if (!strcmp(level, "DEBUG")) if (!strcmp(level, "DEBUG"))
return LOG_DEBUG; return LOG_DEBUG;
@@ -203,42 +163,40 @@ detect_log_level(const char *level)
return 0; return 0;
} }
int int detect_log_facility(const char* facility)
detect_log_facility(const char *facility)
{ {
int local = 0; int local = 0;
if (!strncmp(facility, "LOCAL", 5) && strlen(facility) == 6) if (!strncmp(facility, "LOCAL", 5) && strlen(facility) == 6)
{ {
local = atoi(&facility[5]); local = atoi (&facility[5]);
switch (local) switch (local)
{ {
case 0: case 0:
return LOG_LOCAL0; return LOG_LOCAL0;
break; break;
case 1: case 1:
return LOG_LOCAL1; return LOG_LOCAL1;
break; break;
case 2: case 2:
return LOG_LOCAL2; return LOG_LOCAL2;
break; break;
case 3: case 3:
return LOG_LOCAL3; return LOG_LOCAL3;
break; break;
case 4: case 4:
return LOG_LOCAL4; return LOG_LOCAL4;
break; break;
case 5: case 5:
return LOG_LOCAL5; return LOG_LOCAL5;
break; break;
case 6: case 6:
return LOG_LOCAL6; return LOG_LOCAL6;
break; break;
case 7: case 7:
return LOG_LOCAL7; return LOG_LOCAL7;
break; break;
} }
} }
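
The `detect_log_facility()` hunk above maps `LOCAL0` to `LOCAL7` onto syslog facilities with `atoi()` and a `switch`. Because `atoi()` returns 0 for non-numeric input, a value such as `LOCALx` would appear to fall into `case 0` in the visible code. The following is a minimal sketch of a stricter table-driven lookup. The name `parse_syslog_facility` and the `-1` error return are assumptions made for illustration and are not part of repmgr; the `USER` case follows the value list in the `repmgr.conf` comments.

```c
#include <assert.h>
#include <string.h>
#include <syslog.h>

/*
 * Hypothetical helper (not part of repmgr): map "LOCAL0".."LOCAL7" or "USER"
 * to the matching syslog facility constant. Returns -1 for anything else;
 * that error convention is an assumption of this sketch.
 */
int
parse_syslog_facility(const char *facility)
{
    static const int local_facilities[] = {
        LOG_LOCAL0, LOG_LOCAL1, LOG_LOCAL2, LOG_LOCAL3,
        LOG_LOCAL4, LOG_LOCAL5, LOG_LOCAL6, LOG_LOCAL7
    };

    assert(facility != NULL);

    if (strcmp(facility, "USER") == 0)
        return LOG_USER;

    /* Accept exactly "LOCAL" followed by one digit in the range 0-7. */
    if (strncmp(facility, "LOCAL", 5) == 0 && strlen(facility) == 6 &&
        facility[5] >= '0' && facility[5] <= '7')
        return local_facilities[facility[5] - '0'];

    return -1;
}
```

Checking the digit explicitly rejects `LOCALx` and `LOCAL8`, and the table replaces the eight near-identical `case` branches.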

53
log.h

@@ -1,6 +1,6 @@
/* /*
* log.h * log.h
* Copyright (c) 2ndQuadrant, 2010-2014 * Copyright (c) 2ndQuadrant, 2010-2011
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -25,19 +25,15 @@
#define REPMGR_SYSLOG 1 #define REPMGR_SYSLOG 1
#define REPMGR_STDERR 2 #define REPMGR_STDERR 2
void
stderr_log_with_level(const char *level_name, int level, const char *fmt,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
/* Standard error logging */ /* Standard error logging */
#define stderr_log_debug(...) stderr_log_with_level("DEBUG", LOG_DEBUG, __VA_ARGS__) #define stderr_log_debug(...) if (log_level >= LOG_DEBUG) fprintf(stderr, __VA_ARGS__)
#define stderr_log_info(...) stderr_log_with_level("INFO", LOG_INFO, __VA_ARGS__) #define stderr_log_info(...) if (log_level >= LOG_INFO) fprintf(stderr, __VA_ARGS__)
#define stderr_log_notice(...) stderr_log_with_level("NOTICE", LOG_NOTICE, __VA_ARGS__) #define stderr_log_notice(...) if (log_level >= LOG_NOTICE) fprintf(stderr, __VA_ARGS__)
#define stderr_log_warning(...) stderr_log_with_level("WARNING", LOG_WARNING, __VA_ARGS__) #define stderr_log_warning(...) if (log_level >= LOG_WARNING) fprintf(stderr, __VA_ARGS__)
#define stderr_log_err(...) stderr_log_with_level("ERROR", LOG_ERR, __VA_ARGS__) #define stderr_log_err(...) if (log_level >= LOG_ERR) fprintf(stderr, __VA_ARGS__)
#define stderr_log_crit(...) stderr_log_with_level("CRITICAL", LOG_CRIT, __VA_ARGS__) #define stderr_log_crit(...) if (log_level >= LOG_CRIT) fprintf(stderr, __VA_ARGS__)
#define stderr_log_alert(...) stderr_log_with_level("ALERT", LOG_ALERT, __VA_ARGS__) #define stderr_log_alert(...) if (log_level >= LOG_ALERT) fprintf(stderr, __VA_ARGS__)
#define stderr_log_emerg(...) stderr_log_with_level("EMERGENCY", LOG_EMERG, __VA_ARGS__) #define stderr_log_emerg(...) if (log_level >= LOG_EMERG) fprintf(stderr, __VA_ARGS__)
#ifdef HAVE_SYSLOG #ifdef HAVE_SYSLOG
@@ -90,16 +86,17 @@ __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
if (log_type == REPMGR_SYSLOG) syslog(LOG_ALERT, __VA_ARGS__); \ if (log_type == REPMGR_SYSLOG) syslog(LOG_ALERT, __VA_ARGS__); \
else stderr_log_alert(__VA_ARGS__); \ else stderr_log_alert(__VA_ARGS__); \
} }
#else #else
#define LOG_EMERG 0 /* system is unusable */ #define LOG_EMERG 0 /* system is unusable */
#define LOG_ALERT 1 /* action must be taken immediately */ #define LOG_ALERT 1 /* action must be taken immediately */
#define LOG_CRIT 2 /* critical conditions */ #define LOG_CRIT 2 /* critical conditions */
#define LOG_ERR 3 /* error conditions */ #define LOG_ERR 3 /* error conditions */
#define LOG_WARNING 4 /* warning conditions */ #define LOG_WARNING 4 /* warning conditions */
#define LOG_NOTICE 5 /* normal but significant condition */ #define LOG_NOTICE 5 /* normal but significant condition */
#define LOG_INFO 6 /* informational */ #define LOG_INFO 6 /* informational */
#define LOG_DEBUG 7 /* debug-level messages */ #define LOG_DEBUG 7 /* debug-level messages */
#define log_debug(...) stderr_log_debug(__VA_ARGS__) #define log_debug(...) stderr_log_debug(__VA_ARGS__)
#define log_info(...) stderr_log_info(__VA_ARGS__) #define log_info(...) stderr_log_info(__VA_ARGS__)
@@ -109,18 +106,16 @@ __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
#define log_crit(...) stderr_log_crit(__VA_ARGS__) #define log_crit(...) stderr_log_crit(__VA_ARGS__)
#define log_alert(...) stderr_log_alert(__VA_ARGS__) #define log_alert(...) stderr_log_alert(__VA_ARGS__)
#define log_emerg(...) stderr_log_emerg(__VA_ARGS__) #define log_emerg(...) stderr_log_emerg(__VA_ARGS__)
#endif #endif
/* Logger initialisation and shutdown */ /* Logger initialisation and shutdown */
bool logger_shutdown(void); bool logger_shutdown(void);
bool logger_init(const char* ident, const char* level, const char* facility);
void logger_min_verbose(int minimum);
bool logger_init(t_configuration_options * opts, const char *ident, extern int log_type;
const char *level, const char *facility); extern int log_level;
void logger_min_verbose(int minimum);
extern int log_type;
extern int log_level;
#endif #endif
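
One column of the `log.h` hunk above defines the `stderr_log_*` macros as a bare `if (log_level >= ...) fprintf(stderr, __VA_ARGS__)`. A macro of that shape can capture a caller's `else`: in `if (failed) stderr_log_err(...); else ...`, the `else` binds to the `if` hidden inside the macro. The sketch below shows the common `do { ... } while (0)` wrapper that avoids this. `demo_log_level`, `DEMO_LOG_ERR`, `demo_log_err` and `report` are hypothetical names used only for this illustration, not repmgr identifiers.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_LOG_ERR 3          /* same numeric value as syslog's LOG_ERR */

int demo_log_level = 5;         /* hypothetical stand-in for log_level */

/* Wrapped form: the macro expands to exactly one statement. */
#define demo_log_err(...) \
    do { \
        if (demo_log_level >= DEMO_LOG_ERR) \
            fprintf(stderr, __VA_ARGS__); \
    } while (0)

/* Returns 1 when the caller's else branch ran, 0 when the if branch ran. */
int
report(int failed)
{
    int else_ran = 0;

    assert(failed >= 0);        /* precondition of this demo only */

    if (failed)
        demo_log_err("operation failed (code %d)\n", failed);
    else
        else_ran = 1;           /* pairs with the outer if, as written */

    return else_ran;
}
```

With the wrapper, `report()` behaves the same whether or not the message is actually printed, because the log level check stays inside the macro's own block.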

2288
repmgr.c

File diff suppressed because it is too large

21
repmgr.conf Normal file

@@ -0,0 +1,21 @@
###################################################
# Replication Manager configuration file
###################################################
# Cluster name
cluster=test
# Node ID
node=2
# Connection information
conninfo='host=192.168.204.104'
rsync_options=--archive --checksum --compress --progress --rsh=ssh
# Log level: possible values are DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG
# Default: NOTICE
loglevel=NOTICE
# Logging facility: possible values are STDERR or - for Syslog integration - one of LOCAL0, LOCAL1, ..., LOCAL7, USER
# Default: STDERR
logfacility=STDERR


@@ -1,62 +0,0 @@
###################################################
# Replication Manager configuration file
###################################################
# Cluster name
cluster=test
# Node ID
node=2
node_name=standby2
# Connection information
conninfo='host=192.168.204.104'
rsync_options=--archive --checksum --compress --progress --rsh="ssh -o \"StrictHostKeyChecking no\""
ssh_options=-o "StrictHostKeyChecking no"
# How many seconds we wait for master response before declaring master failure
master_response_timeout=60
# How many times we try to reconnect to master before starting the failover procedure
reconnect_attempts=6
reconnect_interval=10
# Autofailover options
failover=manual
priority=-1
promote_command='repmgr standby promote -f /path/to/repmgr.conf'
follow_command='repmgr standby follow -f /path/to/repmgr.conf -W'
# Log level: possible values are DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG
# Default: NOTICE
loglevel=NOTICE
# Logging facility: possible values are STDERR or - for Syslog integration - one of LOCAL0, LOCAL1, ..., LOCAL7, USER
# Default: STDERR
logfacility=STDERR
# path to pg_ctl executable
pg_bindir=/usr/bin/
#
# you may add command line arguments for pg_ctl
#
# pg_ctl_options='-s'
#
# redirect stderr to a logfile
#
# logfile='/var/log/repmgr.log'
#
# change monitoring interval; default is 2s
#
# monitor_interval_secs=2
#
# change wait time for master; before we bail out and exit when the
# master disappears, we wait 6 * retry_promote_interval_secs seconds;
# by default this would be half an hour (since sleep_delay default
# value is 300)
#
# retry_promote_interval_secs=300


@@ -1,6 +1,6 @@
/* /*
* repmgr.h * repmgr.h
* Copyright (c) 2ndQuadrant, 2010-2014 * Copyright (c) 2ndQuadrant, 2010-2011
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -30,7 +30,6 @@
#define PRIMARY_MODE 0 #define PRIMARY_MODE 0
#define STANDBY_MODE 1 #define STANDBY_MODE 1
#define WITNESS_MODE 2
#include "config.h" #include "config.h"
#define MAXFILENAME 1024 #define MAXFILENAME 1024
@@ -43,34 +42,25 @@
#define DEFAULT_DBNAME "postgres" #define DEFAULT_DBNAME "postgres"
#define DEFAULT_REPMGR_SCHEMA_PREFIX "repmgr_" #define DEFAULT_REPMGR_SCHEMA_PREFIX "repmgr_"
#define MANUAL_FAILOVER 0
#define AUTOMATIC_FAILOVER 1
/* Run time options type */ /* Run time options type */
typedef struct typedef struct
{ {
char dbname[MAXLEN]; char dbname[MAXLEN];
char host[MAXLEN]; char host[MAXLEN];
char username[MAXLEN]; char username[MAXLEN];
char dest_dir[MAXFILENAME]; char dest_dir[MAXFILENAME];
char config_file[MAXFILENAME]; char config_file[MAXFILENAME];
char remote_user[MAXLEN]; char remote_user[MAXLEN];
char wal_keep_segments[MAXLEN]; char wal_keep_segments[MAXLEN];
bool verbose; bool verbose;
bool force; bool force;
bool wait_for_master; bool ignore_rsync_warn;
bool ignore_rsync_warn;
char masterport[MAXLEN]; char masterport[MAXLEN];
char localport[MAXLEN];
/* parameter used by CLUSTER CLEANUP */ /* parameter used by CLUSTER CLEANUP */
int keep_history; int keep_history;
} t_runtime_options;
char min_recovery_apply_delay[MAXLEN];
} t_runtime_options;
#define T_RUNTIME_OPTIONS_INITIALIZER { "", "", "", "", "", "", DEFAULT_WAL_KEEP_SEGMENTS, false, false, false, false, "", "", 0, "" }
#endif #endif


@@ -1,7 +1,7 @@
/* /*
* repmgr.sql * repmgr.sql
* *
* Copyright (C) 2ndQuadrant, 2010-2014 * Copyright (C) 2ndQuadrant, 2011
* *
*/ */
@@ -14,11 +14,8 @@ CREATE SCHEMA repmgr;
*/ */
CREATE TABLE repl_nodes ( CREATE TABLE repl_nodes (
id integer primary key, id integer primary key,
cluster text not null, -- Name to identify the cluster cluster text not null, -- Name to identify the cluster
name text not null, conninfo text not null
conninfo text not null,
priority integer not null,
witness boolean not null default false
); );
ALTER TABLE repl_nodes OWNER TO repmgr; ALTER TABLE repl_nodes OWNER TO repmgr;
@@ -31,12 +28,13 @@ CREATE TABLE repl_monitor (
standby_node INTEGER NOT NULL, standby_node INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL, last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_wal_primary_location TEXT NOT NULL, last_wal_primary_location TEXT NOT NULL,
last_wal_standby_location TEXT, -- In case of a witness server this will be NULL last_wal_standby_location TEXT NOT NULL,
replication_lag BIGINT NOT NULL, replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL apply_lag BIGINT NOT NULL
); );
ALTER TABLE repl_monitor OWNER TO repmgr; ALTER TABLE repl_monitor OWNER TO repmgr;
/* /*
* This view shows the latest monitor info about every node. * This view shows the latest monitor info about every node.
* Interesting thing to see: * Interesting thing to see:
@@ -48,14 +46,14 @@ ALTER TABLE repl_monitor OWNER TO repmgr;
* time_lag: how many seconds are we from being up-to-date with master * time_lag: how many seconds are we from being up-to-date with master
*/ */
CREATE VIEW repl_status AS CREATE VIEW repl_status AS
SELECT primary_node, standby_node, name AS standby_name, last_monitor_time, last_wal_primary_location, WITH monitor_info AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY primary_node, standby_node
ORDER BY last_monitor_time desc)
FROM repl_monitor)
SELECT primary_node, standby_node, last_monitor_time, last_wal_primary_location,
last_wal_standby_location, pg_size_pretty(replication_lag) replication_lag, last_wal_standby_location, pg_size_pretty(replication_lag) replication_lag,
pg_size_pretty(apply_lag) apply_lag, pg_size_pretty(apply_lag) apply_lag,
age(now(), last_monitor_time) AS time_lag age(now(), last_monitor_time) AS time_lag
FROM repl_monitor JOIN repl_nodes ON standby_node = id FROM monitor_info a
WHERE (standby_node, last_monitor_time) IN (SELECT standby_node, MAX(last_monitor_time) WHERE row_number = 1;
FROM repl_monitor GROUP BY 1);
ALTER VIEW repl_status OWNER TO repmgr; ALTER VIEW repl_status OWNER TO repmgr;
CREATE INDEX idx_repl_status_sort ON repl_monitor(last_monitor_time, standby_node);

1484
repmgrd.c

File diff suppressed because it is too large


@@ -1,20 +0,0 @@
#
# Makefile
# Copyright (c) 2ndQuadrant, 2010
#
MODULE_big = repmgr_funcs
DATA_built=repmgr_funcs.sql
DATA=uninstall_repmgr_funcs.sql
OBJS=repmgr_funcs.o
ifdef USE_PGXS
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
else
subdir = contrib/repmgr/sql
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
include $(top_srcdir)/contrib/contrib-global.mk
endif


@@ -1,232 +0,0 @@
/*
* repmgr_funcs.c
* Copyright (c) 2ndQuadrant, 2010
*
* Shared memory state management and some backend functions in SQL
*/
#include "postgres.h"
#include "fmgr.h"
#include "access/xlog.h"
#include "miscadmin.h"
#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/procarray.h"
#include "storage/shmem.h"
#include "storage/spin.h"
#include "utils/builtins.h"
#include "utils/timestamp.h"
/* same definition as the one in xlog_internal.h */
#define MAXFNAMELEN 64
PG_MODULE_MAGIC;
/*
* Global shared state
*/
typedef struct repmgrSharedState
{
LWLockId lock; /* protects search/modification */
char location[MAXFNAMELEN]; /* last known xlog location */
TimestampTz last_updated;
} repmgrSharedState;
/* Links to shared memory state */
static repmgrSharedState *shared_state = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
void _PG_init(void);
void _PG_fini(void);
static void repmgr_shmem_startup(void);
static Size repmgr_memsize(void);
static bool repmgr_set_standby_location(char *locationstr);
Datum repmgr_update_standby_location(PG_FUNCTION_ARGS);
Datum repmgr_get_last_standby_location(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(repmgr_update_standby_location);
PG_FUNCTION_INFO_V1(repmgr_get_last_standby_location);
Datum repmgr_update_last_updated(PG_FUNCTION_ARGS);
Datum repmgr_get_last_updated(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(repmgr_update_last_updated);
PG_FUNCTION_INFO_V1(repmgr_get_last_updated);
/*
* Module load callback
*/
void
_PG_init(void)
{
/*
* In order to create our shared memory area, we have to be loaded via
* shared_preload_libraries. If not, fall out without hooking into any of
* the main system. (We don't throw error here because it seems useful to
* allow the repmgr functions to be created even when the module isn't
* active. The functions must protect themselves against being called
* then, however.)
*/
if (!process_shared_preload_libraries_in_progress)
return;
/*
* Request additional shared resources. (These are no-ops if we're not in
* the postmaster process.) We'll allocate or attach to the shared
* resources in repmgr_shmem_startup().
*/
RequestAddinShmemSpace(repmgr_memsize());
RequestAddinLWLocks(1);
/*
* Install hooks.
*/
prev_shmem_startup_hook = shmem_startup_hook;
shmem_startup_hook = repmgr_shmem_startup;
}
/*
* Module unload callback
*/
void
_PG_fini(void)
{
/* Uninstall hooks. */
shmem_startup_hook = prev_shmem_startup_hook;
}
/*
* shmem_startup hook: allocate or attach to shared memory,
*/
static void
repmgr_shmem_startup(void)
{
bool found;
if (prev_shmem_startup_hook)
prev_shmem_startup_hook();
/* reset in case this is a restart within the postmaster */
shared_state = NULL;
/*
* Create or attach to the shared memory state, including hash table
*/
LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
shared_state = ShmemInitStruct("repmgr shared state",
sizeof(repmgrSharedState),
&found);
if (!found)
{
/* First time through ... */
shared_state->lock = LWLockAssign();
snprintf(shared_state->location,
sizeof(shared_state->location), "%X/%X", 0, 0);
}
LWLockRelease(AddinShmemInitLock);
}
/*
* Estimate shared memory space needed.
*/
static Size
repmgr_memsize(void)
{
return MAXALIGN(sizeof(repmgrSharedState));
}
static bool
repmgr_set_standby_location(char *locationstr)
{
/* Safety check... */
if (!shared_state)
return false;
LWLockAcquire(shared_state->lock, LW_EXCLUSIVE);
strncpy(shared_state->location, locationstr, MAXFNAMELEN);
LWLockRelease(shared_state->lock);
return true;
}
/* SQL Functions */
/* Read last xlog location reported by this standby from shared memory */
Datum
repmgr_get_last_standby_location(PG_FUNCTION_ARGS)
{
char location[MAXFNAMELEN];
/* Safety check... */
if (!shared_state)
PG_RETURN_NULL();
LWLockAcquire(shared_state->lock, LW_SHARED);
strncpy(location, shared_state->location, MAXFNAMELEN);
LWLockRelease(shared_state->lock);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
/* Set update last xlog location reported by this standby to shared memory */
Datum
repmgr_update_standby_location(PG_FUNCTION_ARGS)
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
/* Safety check... */
if (!shared_state)
PG_RETURN_BOOL(false);
locationstr = text_to_cstring(location);
PG_RETURN_BOOL(repmgr_set_standby_location(locationstr));
}
/* update and return last updated with current timestamp */
Datum
repmgr_update_last_updated(PG_FUNCTION_ARGS)
{
TimestampTz last_updated = GetCurrentTimestamp();
/* Safety check... */
if (!shared_state)
PG_RETURN_NULL();
LWLockAcquire(shared_state->lock, LW_EXCLUSIVE); /* writing shared state needs the exclusive lock */
shared_state->last_updated = last_updated;
LWLockRelease(shared_state->lock);
PG_RETURN_TIMESTAMPTZ(last_updated);
}
/* get last updated timestamp */
Datum
repmgr_get_last_updated(PG_FUNCTION_ARGS)
{
TimestampTz last_updated;
/* Safety check... */
if (!shared_state)
PG_RETURN_NULL();
LWLockAcquire(shared_state->lock, LW_EXCLUSIVE);
last_updated = shared_state->last_updated;
LWLockRelease(shared_state->lock);
PG_RETURN_TIMESTAMPTZ(last_updated);
}


@@ -1,23 +0,0 @@
/*
* repmgr_function.sql
* Copyright (c) 2ndQuadrant, 2010-2014
*
*/
-- SET SEARCH_PATH TO 'repmgr';
CREATE FUNCTION repmgr_update_standby_location(text) RETURNS boolean
AS 'MODULE_PATHNAME', 'repmgr_update_standby_location'
LANGUAGE C STRICT;
CREATE FUNCTION repmgr_get_last_standby_location() RETURNS text
AS 'MODULE_PATHNAME', 'repmgr_get_last_standby_location'
LANGUAGE C STRICT;
CREATE FUNCTION repmgr_update_last_updated() RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'repmgr_update_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION repmgr_get_last_updated() RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'repmgr_get_last_updated'
LANGUAGE C STRICT;


@@ -1,11 +0,0 @@
/*
* uninstall_repmgr_funcs.sql
* Copyright (c) 2ndQuadrant, 2010-2014
*
*/
DROP FUNCTION repmgr_update_standby_location(text);
DROP FUNCTION repmgr_get_last_standby_location();
DROP FUNCTION repmgr_update_last_updated();
DROP FUNCTION repmgr_get_last_updated();


@@ -1,7 +1,7 @@
/* /*
* strutil.c * strutil.c
* *
* Copyright (C) 2ndQuadrant, 2010-2014 * Copyright (C) 2ndQuadrant, 2011
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -25,21 +25,29 @@
#include "log.h" #include "log.h"
#include "strutil.h" #include "strutil.h"
static int static int xvsnprintf(char *str, size_t size, const char *format, va_list ap);
xvsnprintf(char *str, size_t size, const char *format, va_list ap)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0))); /* Add strnlen on platforms that don't have it, like OS X */
#ifndef strnlen
size_t
strnlen(const char *s, size_t n)
{
const char *end = (const char *) memchr(s, '\0', n);
return(end ? end - s : n);
}
#endif
static int static int
xvsnprintf(char *str, size_t size, const char *format, va_list ap) xvsnprintf(char *str, size_t size, const char *format, va_list ap)
{ {
int retval; int retval;
retval = vsnprintf(str, size, format, ap); retval = vsnprintf(str, size, format, ap);
if (retval >= (int) size) if (retval >= size)
{ {
log_err(_("Buffer of size not large enough to format entire string '%s'\n"), log_err(_("Buffer of size not large enough to format entire string '%s'\n"),
str); str);
exit(ERR_STR_OVERFLOW); exit(ERR_STR_OVERFLOW);
} }
@@ -48,10 +56,10 @@ xvsnprintf(char *str, size_t size, const char *format, va_list ap)
int int
xsnprintf(char *str, size_t size, const char *format,...) xsnprintf(char *str, size_t size, const char *format, ...)
{ {
va_list arglist; va_list arglist;
int retval; int retval;
va_start(arglist, format); va_start(arglist, format);
retval = xvsnprintf(str, size, format, arglist); retval = xvsnprintf(str, size, format, arglist);
@@ -62,7 +70,7 @@ xsnprintf(char *str, size_t size, const char *format,...)
int int
sqlquery_snprintf(char *str, const char *format,...) sqlquery_snprintf(char *str, const char *format, ...)
{ {
va_list arglist; va_list arglist;
int retval; int retval;
@@ -75,8 +83,7 @@ sqlquery_snprintf(char *str, const char *format,...)
} }
int int maxlen_snprintf(char *str, const char *format, ...)
maxlen_snprintf(char *str, const char *format,...)
{ {
va_list arglist; va_list arglist;
int retval; int retval;
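
The `xvsnprintf()` hunk above differs only in how the `int` result of `vsnprintf()` is compared with the `size_t` buffer size (`retval >= (int) size` versus `retval >= size`). On typical platforms the uncast comparison converts `retval` to an unsigned type, so a negative return compares as a very large value, while the cast form lets a negative return pass the check. The sketch below handles both cases explicitly. `checked_snprintf` is a hypothetical helper written for this illustration; unlike the repmgr function it returns `-1` rather than calling `exit()`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of repmgr): format into str and report
 * truncation. Returns the number of characters written (excluding the
 * terminating NUL), or -1 on an output error or if the result did not fit.
 */
int
checked_snprintf(char *str, size_t size, const char *format, ...)
{
    va_list ap;
    int     retval;

    assert(str != NULL && format != NULL);

    va_start(ap, format);
    retval = vsnprintf(str, size, format, ap);
    va_end(ap);

    /* Test for a negative result first, then compare as size_t. */
    if (retval < 0 || (size_t) retval >= size)
        return -1;

    return retval;
}
```

A caller that wants the original behaviour could log the problem and call `exit(ERR_STR_OVERFLOW)` when the helper returns `-1`.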


@@ -1,6 +1,6 @@
/* /*
* strutil.h * strutil.h
* Copyright (C) 2ndQuadrant, 2010-2014 * Copyright (C) 2ndQuadrant, 2010-2011
* *
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
@@ -31,16 +31,13 @@
#define MAXCONNINFO 1024 #define MAXCONNINFO 1024
extern int extern int xsnprintf(char *str, size_t size, const char *format, ...);
xsnprintf(char *str, size_t size, const char *format,...) extern int sqlquery_snprintf(char *str, const char *format, ...);
__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4))); extern int maxlen_snprintf(char *str, const char *format, ...);
extern int /* Add strnlen on platforms that don't have it, like OS X */
sqlquery_snprintf(char *str, const char *format,...) #ifndef strnlen
__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3))); extern size_t strnlen(const char *s, size_t n);
#endif
extern int #endif /* _STRUTIL_H_ */
maxlen_snprintf(char *str, const char *format,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3)));
#endif /* _STRUTIL_H_ */


@@ -1,7 +1,7 @@
/* /*
* uninstall_repmgr.sql * uninstall_repmgr.sql
* *
* Copyright (C) 2ndQuadrant, 2010-2014 * Copyright (C) 2ndQuadrant, 2010-2011
* *
*/ */


@@ -1,6 +1,4 @@
#ifndef _VERSION_H_ #ifndef _VERSION_H_
#define _VERSION_H_ #define _VERSION_H_
#define REPMGR_VERSION "1.2.0"
#define REPMGR_VERSION "2.1dev"
#endif #endif