Compare commits

...

585 Commits

Author SHA1 Message Date
Ian Barwick
3988653d6c Add missing line break in logging output
Something else we don't have to worry about in repmgr4.
2019-02-27 09:52:45 +09:00
Ian Barwick
3f9b10a02c Fix directories to exclude in clone from Barman
Backport fix from repmgr4.
2019-02-22 16:31:58 +09:00
Ian Barwick
df34e7e8c5 Prevent "invalid LSN ..." infinite loop when node is passive 2019-02-22 15:49:02 +09:00
Ian Barwick
668b2c9b59 repmgrd: use PQping() as a first test of whether an upstream node is available
It's possible the upstream node may be temporarily not accepting connections
but is still running, so we only confirm that connections are not possible once
PQping() reports a negative result.

This feature has been adapted from repmgr4.
2019-02-22 14:04:37 +09:00
Ian Barwick
9629fb6eb5 repmgrd: remove superfluous query buffer
Query can be sent as-is, no need to copy it to a buffer.
2019-02-21 16:14:08 +09:00
Ian Barwick
967b7c6876 repmgrd: improve logging during failover
Ensure relevant decision making information is visible at the
default log level (INFO), and also that where log messages are
specific to a particular node, that node's ID is noted.
2019-02-21 15:12:23 +09:00
Ian Barwick
120dd5b82d Make the default log level INFO
This ensures that repmgrd outputs a reasonable amount of logging
information at the default log level.
2019-02-21 14:43:14 +09:00
Ian Barwick
243b5d2b48 doc: update README
Add note about repmgr4.1
2019-02-21 14:38:13 +09:00
Ian Barwick
24a354c0a7 Prevent "invalid LSN returned from node..." infinite loop
Currently in repmgrd3, if a repmgrd enters failover, but one or more other
repmgrds do not (e.g. partial primary invisibility), the repmgrd in failover
may enter an infinite loop waiting for the repmgrd(s) not in failover to
update shared memory.
2019-02-21 14:18:50 +09:00
Ian Barwick
a4f572a1ff Bump version
3.4.0
2019-02-21 13:07:48 +09:00
Ian Barwick
9ae8f5780b Fix declaration of "create_checkpoint()" 2018-02-09 12:17:13 +09:00
Chris Fraser
dd9df04334 Create checkpoint after pg_ctl promote (#378)
Creates a Postgres checkpoint after `pg_ctl promote` runs on the former standby and before `pg_rewind` runs on the former master. This fixes the race condition that was reported in https://github.com/2ndQuadrant/repmgr/issues/372
2018-02-09 12:14:26 +09:00
Marco Nenciarini
5411225b6f fix copy&paste error in README.md 2017-11-24 11:38:33 +01:00
Ian Barwick
cf90bc3224 docs: Update README
Add note about .deb packages for 3.3.
2017-11-24 15:08:53 +09:00
Ian Barwick
6ba9077ba5 Initialise variable to empty string 2017-10-04 10:00:15 +09:00
Ian Barwick
ead4866719 Update required formatting standard. 2017-10-04 10:00:07 +09:00
Ian Barwick
a0937e959f "standby clone": fail if recovery.conf could not be created.
Addresses GitHub #296
2017-10-04 10:00:02 +09:00
Ian Barwick
00391ba95d README: clarify possible values for 'wal_level'
Per gripe in GitHub #251
2017-10-04 09:59:58 +09:00
Ian Barwick
01edae1b20 Improve handling of PostgreSQL output during "standby switchover"
Adapted from suggestion by GitHub user "ikusimakusi":

  https://github.com/2ndQuadrant/repmgr/pull/268
2017-10-04 09:59:50 +09:00
Ian Barwick
b92d0cc696 "standby switchover": don't use --ignore-external-config-files
It's deprecated. Do however pass "--copy-external-config-files" if
specified.

Per GitHub #310.
2017-10-04 09:59:45 +09:00
Ian Barwick
2264848601 Use actual program name rather than "repmgr" when executing remote commands
It's possible some distribution packages may assign a different name to the
"repmgr" binary (typically appending a version number), so we shouldn't hard-code
into the command string.

However it's reasonable to assume the "repmgr" binary will have the same name
across a replication cluster so we won't engage in any contortions to account
for possible variations.

Per GitHub #323
2017-10-04 09:59:38 +09:00
Ian Barwick
657125a3fb Note that "conninfo" must not be a connection URI.
Per GitHub #321.
2017-10-04 09:59:33 +09:00
Ian Barwick
8fefb799ee Update README 2017-10-04 09:59:29 +09:00
Ian Barwick
37b458dfcd Various documentation updates 2017-10-04 09:59:25 +09:00
Ian Barwick
72b14a7274 Fix link to downloads on repmgr.org
The directory listing doesn't seem to be working since the recent server
migration.
2017-10-04 09:59:21 +09:00
Ian Barwick
19684f965b "standby switchover": actually abort if SSH connection not possible 2017-10-04 09:59:17 +09:00
Abhijit Menon-Sen
9690aeb030 Fix typo 2017-08-09 17:37:57 +09:00
Ian Barwick
774a3abf24 Minor style fix 2017-08-09 17:35:49 +09:00
Ian Barwick
95d6f08ff4 repmgr: initialise witness connection parameter buffers as empty strings
There's a risk we may be accessing uninitialised memory if any of the
parameters are not supplied.

Addresses GitHub #315.
2017-08-01 14:00:11 +09:00
Ian Barwick
33af998a1e Update README
Fix text.
2017-08-01 14:00:06 +09:00
Ian Barwick
18a56b266b repmgr: fix generation of default "dbname"
If not explicitly provided, "dbname" was being set early to the default
"username" value, which was leading to different behaviour to libpq
applications, where "dbname" defaults to "username" at connection
time.
2017-06-28 22:19:45 +09:00
Ian Barwick
b7d1e7a091 Minor fixes to get_server_version() 2017-06-28 22:19:41 +09:00
Ian Barwick
c7f9fbf524 Clarify repmgr/pgbouncer fencing document
It's intended as a self-contained demonstration.
2017-06-28 22:19:37 +09:00
Ian Barwick
e0ea9c3be4 repmgr: fix standby register --force when updating existing node record 2017-06-15 21:58:50 +09:00
Ian Barwick
318f1dac40 Update HISTORY 2017-05-29 11:43:30 +09:00
Ian Barwick
bda4b0995c Patch Makefile from downstream
Per GitHub #282

Ref:
  https://anonscm.debian.org/cgit/pkg-postgresql/repmgr.git/tree/debian/patches/makefile-no-libs.patch
2017-05-22 22:26:27 +09:00
Ian Barwick
c14449f0a7 Fix build on PostgreSQL older than the current libpq
Make sure that the includedir_internal directory is used before the
includedir_server, otherwise the build may fail for PostgreSQL
version lower than the libpq version.

Backpatched from downstream:
  https://anonscm.debian.org/cgit/pkg-postgresql/repmgr.git/tree/debian/patches/makefile-libpq-internal.patch

Per GitHub #282
2017-05-22 22:26:21 +09:00
Ian Barwick
557e34b70c Add basic regression test from downstream
Per GitHub #282
Ref: https://anonscm.debian.org/cgit/pkg-postgresql/repmgr.git/tree/debian/patches/regress.patch
2017-05-22 22:26:15 +09:00
Ian Barwick
333083869b repmgrd: fix more PostgreSQL 10 WAL function renamings 2017-05-22 22:25:53 +09:00
Ian Barwick
57fae00844 repmgrd: support latest round of PostgreSQL 10 WAL function renamings 2017-05-22 11:02:22 +09:00
Ian Barwick
3de336f1c0 Update HISTORY 2017-05-22 08:44:54 +09:00
Ian Barwick
5493b57443 repmgr: parse --no-slot in pg_basebackup_options
From PostgreSQL 10 we'll need to know whether this is present when
performing sanity checks for available replication slots.

Add a sanity check for conflicting presence of -S/--slot while we're
at it so we can abort early.
2017-05-22 08:40:49 +09:00
Ian Barwick
e53f1bf844 repmgrd: support further renamed WAL function for PostgreSQL 10 2017-05-22 08:38:12 +09:00
Ian Barwick
90638811c8 repmgr: support further renamed WAL function for PostreSQL 10
pg_xlogfile_name() -> pg_walfile_name()
2017-05-22 08:37:44 +09:00
Ian Barwick
892e3b93d1 repmgr: support --wal-method (replacing --xlog-method) for pg_basebackup in PostgreSQL 10 2017-05-22 08:35:30 +09:00
Ian Barwick
6f15a7e52e repmgr: initial support for PostgreSQL 10
Handle directory name change from pg_xlog to pg_wal.

Note that some functions with "xlog" in their name may also change.
2017-05-22 08:26:17 +09:00
Ian Barwick
98998f73bf repmgrd: remove unnecessary inclusions
Per gripe in GitHub #303
2017-05-22 08:13:39 +09:00
Ian Barwick
34ac2d8141 Bump version
3.3.2
2017-05-15 08:46:21 +09:00
Ian Barwick
c820b61f28 Update HISTORY 2017-05-15 08:45:11 +09:00
Ian Barwick
9e620656c5 Minor log/comment fixes 2017-05-15 08:25:52 +09:00
Ian Barwick
2fa277cc53 repmgr: fix --replication-user option when using conninfo string
In "standby clone", if a conninfo string was provided, this was passed
as-is to pg_basebackup - rewrite conninfo string to include the
value passed with --replication-user, if provided.
2017-05-05 09:28:53 +09:00
Ian Barwick
6a4f5944a1 repmgr: add missing '-P' option
Addresses GitHub #293
2017-05-05 09:22:52 +09:00
Ian Barwick
c02a12a113 repmgrd: actually call repmgr_update_last_updated()
Function was created but never actually used, resulting in incorrect
values for "communication_time_lag" in the "repl_status" view.

This appears to have been an oversight in the original commit
( c3b58658ad ).

Addresses GitHub #290
2017-05-05 09:22:46 +09:00
Ian Barwick
01b3933922 repmgr: master register - remove superfluous transaction start
Also make quotation mark usage in logging output consistent.
2017-05-05 09:22:09 +09:00
Ian Barwick
39b3b32814 repmgr: improve detection of pg_rewind on remote server
If `pg_bindir` is not explicitly provided, the remote `ls` command
will be `ls pg_rewind`, which will very likely not find pg_rewind.
In this case execute `which pg_rewind` to confirm it's in the default
path.

Addresses GitHub #267.
2017-05-05 09:22:03 +09:00
Ian Barwick
846e0f73b2 repmgr: avoid spurious cluster name errors during 'standby switchover'
'standby restore-config' doesn't require a configuration file, but
pass it anyway.

Addresses GitHub #269
2017-05-05 09:21:56 +09:00
Ian Barwick
7467525c8d Add log_detail() method 2017-05-05 09:21:51 +09:00
Simon Riggs
b27a94ccbe Typo in failover docs, reported by VaoTsun 2017-05-05 09:21:47 +09:00
Simon Riggs
2e69d155da Typo in README.md reported by dhx 2017-05-05 09:21:43 +09:00
Ian Barwick
870a367d3b repmgr: prevent spurious error message when running 'standby switchover'
When 'repmgr standby follow' is run on a dormant server, with connection
parameters for the upstream node provided (which is done during the
switchover process to reintegrate the stopped former master into the
replication cluster), a spurious error message is generated about
a slot which cannot be deleted as it's active. During the switchover
process the current master's (former standby's) slot on the former master
is deleted at a later point so can be skipped here.

The error message is annoying but harmless and has no effect on the
switchover process.

Addresses GitHub #285.
2017-04-10 23:00:10 +09:00
Ian Barwick
9c28d3626b Update HISTORY 2017-03-20 14:20:52 +09:00
Ian Barwick
0916d8f2ad repmgr: disallow node ids which are not positive signed 32 bit integers
Fixes GitHub #280
2017-03-20 11:26:34 +09:00
Ian Barwick
1964f890be repmgr: enable master register --force with node foreign key dependency
Fixes GitHub #273
2017-03-20 10:03:38 +09:00
Ian Barwick
976a61005e Only attempt to set synchronous transaction mode with valid connection 2017-03-17 20:33:29 +09:00
Ian Barwick
0c82278fd4 repmgr: fix command line parsing with hostname as an additional argument
Check explicitly whether -h/--hostname provided, otherwise PGHOST,
if set, will be misinterpreted.
2017-03-17 20:21:13 +09:00
Ian Barwick
0abfde3773 repmgr: ensure that data dir permissions set correctly when cloning in barman mode 2017-03-17 16:37:49 +09:00
Ian Barwick
1746831486 repmgr: fix standby clone with barman
As of Barman commit 5ff62d3255, `pg_notify`
is also excluded from Barman backups.
2017-03-17 16:32:35 +09:00
Ian Barwick
8c8e368a69 repmgr: set "synchronous_commit" to "local" by default
Rather than set this for individual connections, we'll change the setting
each time a connection is made (except replication connections), which will
obviate the need to take this into consideration when making connections
in the application code.

Resolves GitHub #276.
2017-03-17 11:32:25 +09:00
Ian Barwick
0ef532dcff repmgr: improve standby clone when synchronous replication in use
Fixes GitHub #277
2017-03-16 16:46:08 +09:00
Ian Barwick
478407fd86 repmgr: fix standby clone with barman
As of Barman commit 5ff62d3255, `pg_notify`
is also excluded from Barman backups.
2017-03-16 10:29:50 +09:00
Ian Barwick
05bfdfab2c repmgr: fix command line parsing with hostname as an additional argument
Check explicitly whether -h/--hostname provided, otherwise PGHOST,
if set, will be misinterpreted.
2017-03-14 22:55:27 +09:00
Ian Barwick
29740dc41b Bump version
3.3.1
2017-03-13 16:00:44 +09:00
Ian Barwick
ad6ecef2ab repmgr: clean up standby follow code 2017-03-13 13:27:02 +09:00
Ian Barwick
5318d37462 README: fix typos 2017-03-13 13:20:56 +09:00
Ian Barwick
7244dda20f repmgr: fix typo 2017-03-13 13:20:35 +09:00
Ian Barwick
e651284927 Update documentation/sample configuration with references to --wal-method 2017-03-13 13:20:20 +09:00
Ian Barwick
72a2ac284a repmgr: improve logging of rsync actions
In particular, copy_remote_files() would report any kind of non-zero
exit status from rsync as an error, even though when cloning data
directories and tablespaces we explicitly ignore the "vanished
files" status (code 24) as it's expected behaviour for files in these
locations to disappear during the rsync copy process.

Conflicts:
	HISTORY
2017-03-13 13:16:44 +09:00
Ian Barwick
cec01c6620 repmgr: improve standby follow log output
In particular suppress any error messages encountered when trying to
connect to the old upstream node, as these are not critical and
will lead to confusion.
2017-03-13 13:14:19 +09:00
Ian Barwick
989f683bc6 repmgr: have "standby follow" delete old replication slot, if possible
Addresses GitHub #272
2017-03-13 13:14:05 +09:00
Ian Barwick
fa30382f2c When retrieving a node record, set upstream_node_id correctly.
-1 (NO_UPSTREAM_NODE) should be returned if the record's column is NULL.
2017-03-13 12:16:22 +09:00
Martin
defc2653e0 There where 2 barman configuration parameters missing in the repmgr.conf
sample file.

Added with some comments
2017-02-15 17:17:39 -03:00
Ian Barwick
67e8ca73b5 repmgrd: fix XLogRecPtr conversion function 2017-01-11 15:03:13 +09:00
Ian Barwick
a1a1d64e1f repmgrd: fix usage description
Matches the one provided by repmgr.
2017-01-11 15:03:07 +09:00
Ian Barwick
76509038cc repmgrd: prevent invalid apply lag value being written to the monitoring table 2017-01-11 15:03:02 +09:00
Ian Barwick
7f8e50c882 Update copyright notice to 2017
Also standardize case to "(c)"
2017-01-11 15:02:55 +09:00
Ian Barwick
5deb6c8ce4 rempgr: don't link to backend functions
The intent was to avoid maintaining duplicate code, but this approach
makes it difficult to build Debian packages (see GitHub #261).

As the functions in question are quite compact and unlikely to change,
we'll just use the adapted versions provided for 9.5 and earlier.
2017-01-04 16:55:09 +09:00
Ian Barwick
175ee8acfc README: update version information 2017-01-04 10:59:13 +09:00
Ian Barwick
d1491f51a3 Remove erroneously added configuration item from repmgr.conf.sample
Per GitHub #262
2017-01-04 09:35:39 +09:00
Ian Barwick
bc9febdc48 repmgr: fix error message string
Per GitHub #263.
2017-01-04 09:25:50 +09:00
Ian Barwick
b6cf22ac90 Update HISTORY 2016-12-27 14:49:50 +09:00
Ian Barwick
d89a73cbf4 repmgrd: fix log messages and code comments to reflect what is actually happening
Sometimes we're setting node status to active.
2016-12-27 14:48:34 +09:00
Ian Barwick
1f09e92e3f Typo fix 2016-12-26 15:23:04 +09:00
Ian Barwick
1bdc72a07b Updates to README and --help output 2016-12-26 15:23:00 +09:00
Ian Barwick
a6f1c6e483 repmgr: use provided --replication-user in pg_basebackup mode 2016-12-26 15:22:56 +09:00
Ian Barwick
6e14f0bc5d repmgr: in rsync mode don't delete backup_history file
Make behaviour consistent with pg_basebackup.
2016-12-26 10:39:05 +09:00
Ian Barwick
a336d22bd9 repmgr: miscelleanous code cleanup 2016-12-26 10:39:00 +09:00
Ian Barwick
e88a8a9708 repmgr: standby register --wait-sync=0 can loop infinitely 2016-12-23 11:29:19 +09:00
Ian Barwick
8f3f4eb4a3 Update version number for 3.3 branch 2016-12-22 16:53:39 +09:00
Ian Barwick
dc18e5b791 repmgr: escape conninfo parameters when cloning from Barman 2016-12-22 12:21:44 +09:00
Ian Barwick
9da0914976 Consolidate various backported functions 2016-12-22 12:11:54 +09:00
Ian Barwick
666e71a589 repmgr: escape parameter values in primary_conninfo, if required
Addresses GitHub #253.
2016-12-22 11:40:46 +09:00
Ian Barwick
062af91d36 Update HISTORY 2016-12-21 17:06:58 +09:00
Ian Barwick
571ad698db Add documentation for standby register --force and --upstream-conninfo 2016-12-21 17:06:07 +09:00
Ian Barwick
742f7e167f repmgr: support option --replication-user in STANDBY CLONE too 2016-12-14 17:00:03 +09:00
Ian Barwick
1fb2801639 repmgr: add option --replication-user 2016-12-14 15:58:11 +09:00
Ian Barwick
e3031f0204 repmgr: fix log message 2016-12-14 12:39:47 +09:00
Ian Barwick
79748f28f1 repmgr: improve STANDBY REGISTER sanity checks and log messages 2016-12-14 12:38:42 +09:00
Ian Barwick
46740b64a9 repmgr: enable forced registration of a node with a downstream cascaded standby 2016-12-14 12:22:41 +09:00
Ian Barwick
6557099832 repmgr: initial support for standby register --force
This functionality is intended for those cases where a cascading replication
cluster is being automatically provisioned and it might be necessary to
clone multiple levels in parallel.

As always, use of `--force` implies you know what you are doing.
2016-12-12 15:51:19 +09:00
Ian Barwick
083e288ac3 repmgr: add usage warnings for --no-conninfo-password 2016-12-08 23:01:46 +09:00
Ian Barwick
f5e3d7c041 Merge branch 'no-passwords'
GitHub pull request #255
2016-12-08 22:56:04 +09:00
Abhijit Menon-Sen
402e02f4b7 Add an option to suppress passwords in recovery.conf/primary_conninfo 2016-12-08 22:36:18 +09:00
Ian Barwick
a21b16f960 Update HISTORY 2016-12-07 17:12:21 +09:00
Ian Barwick
be58af701b repmgr: add option --upstream-conninfo
When executing `repmgr standby clone`, this enables the primary_conninfo
string set in recovery.conf to be explictly defined (rather than generated
from the upstream node's conninfo string and connection parameters).

This is primarily intended for those cases when an L2 cascaded standby
is being cloned from the cluster primary, and its intended upstream
might not yet be available.
2016-12-07 17:05:57 +09:00
Ian Barwick
eb2cdf8a98 repmgr: improve handling of command line parameter errors
Previously providing a parameter which requires a value (e.g. -f/--config-file)
would result in a misleading error like "Unknown option -f".

Rather than suppress getopt's error messages, we'll rely on these to inform
about missing required values or unknown options (as other PostgreSQL tools
do), though we will still provide our own list of command line errors and
warnings countered above and beyond getopt's sanity checks.
2016-12-05 14:40:32 +09:00
Ian Barwick
7cc0400c03 repmgr: add option --log-to-file; remove timestamp from log output by default
Log lines will still be prefixed with timestamp if `-log-to-file` used.
2016-12-05 14:09:41 +09:00
Ian Barwick
9788b2bd29 repmgr: don't display timestamp in log output
Differentiate between repmgr and repmgrd output
2016-12-05 10:44:30 +09:00
Ian Barwick
227f0190f7 repmgr: initial support for PostgreSQL 10
Handle directory name change from pg_xlog to pg_wal.

Note that some functions with "xlog" in their name may also change.
2016-12-02 12:06:08 +09:00
Ian Barwick
d6dbc70916 Update HISTORY 2016-11-25 08:17:01 +09:00
Ian Barwick
d2f4eda224 Note use of phbouncer %include directive 2016-11-23 09:33:30 +09:00
Ian Barwick
2588853e83 Refactor sample script in repmgrd-node-fencing.md 2016-11-23 09:21:14 +09:00
Ian Barwick
b54f98ed8a repmgr: always log to STDERR even if log facility defined 2016-11-23 08:55:55 +09:00
Ian Barwick
26f73686e5 repmgrd: fixes to configuration reload mechanism 2016-11-02 23:23:26 +09:00
Ian Barwick
e274a2cbcb repmgrd: clean up some redundant code 2016-11-02 16:52:44 +09:00
Ian Barwick
d502bbe614 repmgrd: enable logging configuration to be changed 2016-11-01 20:34:50 +09:00
Ian Barwick
2594411820 reload_config(): document items which can change 2016-11-01 20:08:57 +09:00
Ian Barwick
d22535de00 repmgr: pointless parsing the configuration here 2016-11-01 19:28:03 +09:00
Ian Barwick
fce1f0cd4a Refactor reload_config()
Remove any non-repmgrd specific items.

parse_config() already sanity-checks the values so no need to
recheck. Refactor parse_config() so when called by reload_config()
it won't exit if errors are encountered.
2016-11-01 19:05:21 +09:00
Ian Barwick
bb842c3989 repmgr: improve replication status checking during switchover
When checking the new standby's record in pg_stat_replication, keep
polling until the expected status is reported, and only give up
after a timeout was exceeded.

Previously repmgr would report an error if status was "startup",
even though this is not a problem.
2016-11-01 17:42:35 +09:00
Ian Barwick
556ff3c311 repmgrd: clarify master_response_timeout 2016-11-01 15:56:57 +09:00
Ian Barwick
251486546d repmgr: when unregistering a witness, remove record on that witness to
Fixes GitHub #250.
2016-10-30 16:28:12 +09:00
Ian Barwick
53d3e71cd3 README: formatting 2016-10-27 10:42:42 +09:00
Ian Barwick
b986ce81b2 README: link to latest release notes 2016-10-27 10:40:31 +09:00
Ian Barwick
7ddb060bdc README: clarify that the repmgr metadatabase must be part of the replication cluster 2016-10-26 20:29:30 +09:00
Ian Barwick
6b02faf37c Update HISTORY 2016-10-24 09:04:04 +09:00
Ian Barwick
0cde0068dd repmgrd: improve witness failover logging
Log the new master node as INFO rather than DEBUG.
2016-10-20 10:56:16 +09:00
Ian Barwick
20d66df0ef repmgr: prevent involuntary cloning where no repmgr schema present on master 2016-10-19 20:26:55 +09:00
Ian Barwick
3f7c30b84d repmgr: require a valid repmgr cluster name unless -F/--force supplied
Addresses issue mentioned in GitHub #242.
2016-10-19 16:16:12 +09:00
Ian Barwick
a63baf7fcb repmgr: rename local variable config_file_found to remote_config_file_found
We also have a global `config_file_found` - avoid confusion with that.
2016-10-19 15:30:02 +09:00
Ian Barwick
e19c643389 repmgr: refactor check_upstream_config() for clarity
Also ensure the check for sufficient walsenders is only carried out
when using pg_basebackup.
2016-10-19 15:06:21 +09:00
Ian Barwick
f058833451 repmgr: in Barman clone mode, don't try and create the data directory twice
In Barman mode the data directory is created early containing a temporary
directory needed to hold temporary files while cloning from the Barman
server. In other modes we might not know the data directory location until
connecting to the source server, so its creation happens later. In Barman
mode ensure that step is skipped.
2016-10-19 10:00:50 +09:00
Ian Barwick
96c14adfdb repmgr: ensure data directory defaults to that of the source node
As long as -D/--pgdata is not supplied, and barman mode is not used
for cloning.
2016-10-17 16:34:49 +09:00
Ian Barwick
50119056a5 repmgr: clean up runtime options structure
Place elements in a sensible order and split the associated initializer
macro over multiple lines for easier editing.

Also move a few related global variables into to the structure to keep
everything in the same place.
2016-10-11 12:54:56 +09:00
Ian Barwick
a279c42df9 "pg_logical/snapshot" -> "pg_logical/snapshots" 2016-10-11 09:32:55 +09:00
Ian Barwick
f70b6ea136 Merge branch 'fix-barman-mode' of git://github.com/gciolli/repmgr
Fixes GitHub #245.
2016-10-11 09:21:31 +09:00
Ian Barwick
e4cb6d7130 repmgr: simplify LSN parsing
Also silences a compiler warning.
2016-10-10 22:56:49 +09:00
Gianni Ciolli
502c056753 When restoring from Barman, create pg_logical subdirectories
Nodes cloned from Barman backups were missing these subdirectories,
and so they were unable to start when promoted.
2016-10-10 10:18:33 +02:00
Ian Barwick
871ec47ff5 Fix repmgr cluster crosscheck output
Show actual node ID rather than incremental number.

Fixes GitHub #244.
2016-10-10 16:20:17 +09:00
Ian Barwick
f435abb3ec Update README
Also reorder HISTORY entries.
2016-10-10 15:10:06 +09:00
Ian Barwick
a217b4d0a9 repmgr: standardize SSH-related error messages 2016-10-07 07:42:15 +09:00
Ian Barwick
2dcb75f889 Add 'cluster crosscheck' to help output detail
Per GitHub #243.
2016-10-06 07:38:45 +09:00
Ian Barwick
b509ce6382 Minor README fix 2016-10-05 16:47:37 +09:00
Ian Barwick
1150bf272a Update README
`--ignore-external-config-files` deprecated
2016-10-05 15:09:07 +09:00
Ian Barwick
09ac6cd145 Update history 2016-10-05 13:57:10 +09:00
Ian Barwick
2fae788bc4 Add documentation for repmgrd failover process and failed node fencing
Addresses GitHub #200.
2016-10-05 11:25:36 +09:00
Ian Barwick
eb90f864c9 repmgr: consistent error message style 2016-10-05 10:31:25 +09:00
Ian Barwick
ba89758366 Update barman-wal-restore documentation
Barman 2.0 provides this in a separate, more convenient `barman-cli` package;
document this and add note about previous `barman-wal-restore.py` script.
2016-10-03 15:59:03 +09:00
Ian Barwick
84595fe711 Tweak repmgr.conf.sample
Put `monitor_interval_secs` at the start of the `repmgrd` section, as it's
a very fundamental configuration item.
2016-10-03 15:57:33 +09:00
Ian Barwick
9523894808 Bump dev version number
3.3dev
2016-09-30 15:14:35 +09:00
Ian Barwick
df09af4d57 Update README
`repmgr cluster show --csv` now only reflects node availability,
and no longer overloads this with node type (master/standby) information.
2016-09-30 15:07:34 +09:00
Ian Barwick
2c1cbc6bf9 Fix witness server initialisation 2016-09-30 13:43:47 +09:00
Ian Barwick
ed22fe326e Document and expand pg_ctl override configuration options
These are now prefixed with "service_" to emphasize that they're
OS-level commands, not repmgr ones; also added reload and promote
commands:

    service_start_command
    service_stop_command
    service_restart_command
    service_reload_command
    service_promote_command

GitHub #169
2016-09-30 11:58:45 +09:00
Ian Barwick
46500e1408 Documentation update and miscellaneous code cleanup 2016-09-30 09:30:22 +09:00
Ian Barwick
c3971513b6 Refactor show diagnose to handle node IDs correctly.
See previous comments for `show matrix`.
2016-09-30 01:46:01 +09:00
Ian Barwick
a2910eded9 Refactor show matrix to handle node IDs correctly.
Previously the code assumed repmgr node IDs to be sequential,
which is not guaranteed to be the case. With a non-sequential
list of node IDs, an incorrect node id would be displayed,
and memory accessed beyond the bounds of the matrix array.

The refactored code is considerably less elegant than the original
but will correctly handle a non-sequential sequence of node IDs.
2016-09-29 18:51:55 +09:00
Ian Barwick
dc70e2d804 Remove superfluous PQfinish() call
Connection was previously closed, if this error condition is triggered
a segfault will occur.
2016-09-29 11:53:13 +09:00
Ian Barwick
ea45158f50 Refactor cluster diagnose
- use the remote user setting, like other SSH-based remote operations
  (avoid hardcoding the user name)
- enable `repmgr cluster matrix'  to accept the cluster name, node id
  and the database connection information instead of requiring repmgr.conf;
  this means we don't have to assume that repmgr.conf is in one
  of the default locations
2016-09-29 11:41:14 +09:00
Gianni Ciolli
84d1e16edd Factor out "build_cluster_diagnose" from "do_cluster_diagnose"
We separate the code that builds the cube from the code that displays
it, in preparation for reusing the cube somewhere else, e.g. for
automatic failover detection.
2016-09-29 01:23:09 +09:00
Gianni Ciolli
57815af3ac Factor out "build_cluster_matrix" from "do_cluster_matrix"
We separate the code that builds the matrix from the code that
displays it, in preparation for reusing the matrix somewhere else,
e.g. for automatic failover detection.
2016-09-29 01:22:54 +09:00
Gianni Ciolli
a4a2e48ab4 README rewording 2016-09-29 01:21:06 +09:00
Gianni Ciolli
5189488b92 Add "cluster diagnose" mode
This mode merges the output of "cluster matrix" from each node to
improve node state knowledge.
2016-09-29 01:19:36 +09:00
Gianni Ciolli
263128a740 Bug fix 2016-09-29 00:59:26 +09:00
Ian Barwick
f775750334 cluster matrix: make remote execution of cluster show more configurable
- use the remote user setting, like other SSH-based remote operations
  (avoid hardcoding the user name)
- enable `repmgr cluster show` to accept the cluster name and the
  database connection information instead of requiring repmgr.conf;
  this means we don't have to assume that repmgr.conf is in one
  of the default locations
2016-09-29 00:54:33 +09:00
Ian Barwick
41ec45a4cc Remove ssh_hostname support
Currently repmgr assumes the SSH hostname will be the same as the
database hostname, and it's easy enough now to extract this
from the node's conninfo string.

We can consider re-adding this in the next release if required.
2016-09-29 00:24:04 +09:00
Gianni Ciolli
9b5b9acb82 Add "cluster matrix" mode and "ssh_hostname" parameter
- The "cluster matrix" command supports CSV mode via the --csv
  switch.
- Add the optional ssh_hostname configuration parameter, which is
  required by "cluster matrix".
- A corresponding ssh_hostname column has been added to the repl_nodes
  table and to the repl_show_nodes view.
2016-09-28 23:40:35 +09:00
Ian Barwick
77de5dbeeb Update HISTORY 2016-09-28 23:37:42 +09:00
Gunnlaugur Thor Briem
465f1a73a5 Fix inconsistent repmgr.conf path in failover cfg
The examples of `promote_command` and `follow_command` reference the
`repmgr.conf` file under a different path from the rest of the README.
This makes them consistent with the rest of the README.
2016-09-28 23:14:49 +09:00
Ian Barwick
c4f84bd777 Update README
Document new `--copy-external-config-files` option.
2016-09-28 16:40:13 +09:00
Ian Barwick
da4dc26505 repmgr: various minor logging fixes 2016-09-28 15:51:26 +09:00
Ian Barwick
19670db1d4 repmgr: add missing 'break' statements in switch structure 2016-09-28 15:41:29 +09:00
Ian Barwick
b9f52e74eb repmgr: add --help output and documentation for --wait-sync 2016-09-28 14:22:26 +09:00
Ian Barwick
fa10fd8493 repmgr: add option --wait-sync for standby register
Causes repmgr to wait for the updated node record to propagate
to the standby before exiting. This can be used to ensure that
actions which depend on the standby's node record being synchronised
(such as starting repmgrd) are not carried out prematurely.

Addresses GitHub #103
2016-09-28 14:08:42 +09:00
Ian Barwick
b7f20ee1f7 repmgrd: don't start if node is inactive and failover=automatic
If failover=automatic, it would be reasonable to expect repmgrd to
consider this node as a promotion candidate, however this will not
happen if it is marked inactive. This often happens when a failed
primary is recloned as a standby but not re-registered, and if
repmgrd would run it would give the incorrect impression that
failover capability is available.

Addresses GitHub #153.
2016-09-28 10:59:20 +09:00
Ian Barwick
bbb2e2f017 Initial implementation of improved configuration file copying
GitHub #210.
2016-09-27 22:19:45 +09:00
Ian Barwick
52328b8f33 repmgr: update README 2016-09-27 10:27:27 +09:00
Ian Barwick
65c2be3441 Remove redundant code 2016-09-26 10:44:34 +09:00
Ian Barwick
b17593ff4d repmgr: improve handling application name during standby clone
Addresses issues noticed while investigating GitHub #238
2016-09-26 10:41:01 +09:00
Ian Barwick
7c1776655b repmgr: correctly set application name during standby follow
Addresses issue mentioned in GitHub #238
2016-09-25 21:33:11 +09:00
Ian Barwick
789470b227 repmgr: rename 'upstream_conninfo' etc. to 'recovery_conninfo' etc.
These variables are used only for generating `primary_conninfo` in recovery.conf,
rename to make their purpose clearer.
2016-09-25 19:18:54 +09:00
Martin
5d3c0d6163 Make it clear in the error message that the type of connection that
failed was an SSH connection, not a libpq one.
2016-09-23 09:15:09 -03:00
Ian Barwick
44d4ca46b0 repmgr: place write_primary_conninfo() next to the other recovery.conf functions
Also terminate the buffer created, for tidiness.
2016-09-22 10:10:26 +09:00
Ian Barwick
114c1bddcb repmgr: only require 'wal_keep_segments' to be set in certain corner cases
Now that repmgr uses pg_basebackup's `--xlog-method=stream` setting by
default, and enables provision of `restore_command`, there's no reason
to require `wal_keep_segments` to be set in the default use-case.

`repmgr standby clone` will now only fail with an error if `wal_keep_segments`
is zero and one of the following cases applies:

* `--rsync-only` clone with no `restore_command` set
* clone with pg_basebackup and `--xlog-method=fetch`
* -w/--wal-keep-segments specified on the command line

If, for whatever reason, it's necessary to perform a standby clone
with `wal_keep_segments=0` in one of the above cases, specifying
`-w/--wal-keep-segments=0` on the command line will effectively
override the check.

GitHub #204
2016-09-21 16:42:14 +09:00
Ian Barwick
5090b8cab1 Update HISTORY 2016-09-21 13:39:42 +09:00
Ian Barwick
5e338473f7 repmgr: before cloning a standby, verify sufficient free walsenders available
Previously repmgr only checked that 'max_wal_senders' is a positive value.
It will now additionally verify that the requisite number of replication
connections can actually be made before commencing with a cloning operation
using pg_basebackup.

GitHub #214
2016-09-21 12:36:17 +09:00
Ian Barwick
e043d5c9a9 Clarify requirements for passwordless SSH access
This is already effectively optional; in 3.2 we will ensure it becomes
fully optional (mainly by deprecating --ignore-external-config-files
and replacing it with --copy-external-config-files).
2016-09-21 08:43:45 +09:00
Ian Barwick
03911488aa repmgr: in standby switchover, quote file paths in remotely executed commands
Per suggestion from GitHub user sebasmannem (#229)
2016-09-21 08:09:44 +09:00
Ian Barwick
3e51a85e07 repmgr: consolidate error messages during replication slot generation
Return error messages to the called so they can be logged as events;
prevent log message duplication in one case.
2016-09-20 09:16:33 +09:00
Ian Barwick
036c59526a repmgr: in switchover mode, prevent checks for local config file if not provided
In switchover mode, if no remote repmgr config file is provided with `-C`,
repmgr attempts to look for a file with the same path as the local
file (provided with `-f/--config-file`). However if this was not specified,
repmgr would execute `ls` with an empty filepath on the remote host, which
appeared to succeed, causing subsequent remote repmgr command executions
to fail as a blank value was provided for `-f/--config-file`.

Fixes GitHub #229.
2016-09-19 21:24:49 +09:00
Ian Barwick
2c55accbdd README: clarify supported versions 2016-09-19 20:59:35 +09:00
Ian Barwick
3ce231a571 Ensure get_pg_settings() returns false if parameter not found
Previously, if e.g. a non-superuser connection is used to get a value
like `data_directory`, which is available to superusers only, it
would return true.
2016-09-19 14:08:27 +09:00
Ian Barwick
178b380f34 Explictly specify pg_catalog path in all system queries 2016-09-19 14:08:20 +09:00
Abhijit Menon-Sen
4d36712901 Another typo fix 2016-09-17 14:15:50 +05:30
Abhijit Menon-Sen
7c3c30ae4a More typo fixes from Andreas 2016-09-17 14:02:26 +05:30
Abhijit Menon-Sen
7e6491a6d6 Fix typo, thanks to Andreas Kretschmer 2016-09-16 21:05:09 +05:30
Ian Barwick
ac8910000f repmgr: Explicitly set permissions on recovery.conf to 0600
Per GitHub #236 (dmarck)
2016-09-09 11:39:55 +09:00
Ian Barwick
cc3c2f5073 Merge branch 'conninfo-fixes'
Refactor recovery.conf generation to take into account the node being
cloned from might not be the intended upstream node, e.g. "grandchild"
node being cloned direct from the master ("grandparent") rather than
the intended parent node.

This extends functionality introduced with Barman support and ensures
that behaviour of cascaded standby cloning is consistent, regardless
of cloning method.
2016-09-05 13:16:31 +09:00
Ian Barwick
171df20386 repmgr: avoid (null) in local_command debug output 2016-09-01 12:14:36 +09:00
Ian Barwick
2105837ef4 repmgr: in standby clone, always add user-supplied connection information to upstream default parameters
Also misc other cleanup and documentation.
2016-09-01 12:06:19 +09:00
Ian Barwick
d12ecba63c repmgr: fix Barman error message in standy clone
Report the Barman catalogue name for the managed server, not the
hostname of the Barman server itself.
2016-08-31 13:50:16 +09:00
Ian Barwick
5276cb279c repmgr: add correct return codes for get_tablespace_data()
Remove unused variable retval.
2016-08-31 13:46:44 +09:00
Ian Barwick
719ad3cf95 repmgr: properly remove temporary directory created when cloning from Barman 2016-08-31 13:37:17 +09:00
Ian Barwick
e87399afc1 Clean up create_recovery_file() function
Caller is responsible for providing a t_conninfo_param_list from which
the value for "primary_conninfo" is created.
2016-08-31 13:03:59 +09:00
Ian Barwick
1d05345aa3 Add error code ERR_BARMAN
Indicates unrecoverable error condition when accessing the barman server
2016-08-31 11:41:55 +09:00
Ian Barwick
a8afa843ee Add parameter checks and help output for --no-upstream-connection 2016-08-31 11:35:23 +09:00
Ian Barwick
5c4b477d84 Consolidate various checks in do_standby_clone() 2016-08-31 11:03:44 +09:00
Ian Barwick
f8fe801225 use repmgr db connection with barman 2016-08-31 09:53:18 +09:00
Ian Barwick
d7456d879d Add option --no-upstream-connection and parse Barman conninfo string 2016-08-31 09:30:03 +09:00
Ian Barwick
751469a08d Check SSH and create data dir early
so we can init the barman stuff, which we want to verify the
upstream conninfo
2016-08-31 09:28:50 +09:00
Ian Barwick
afa5c1469b Actually differentiate between clone source node and defined upstream node 2016-08-31 09:28:36 +09:00
Ian Barwick
1778eeab9c Initial change for differentiating between host we're cloning from and the defined upstream 2016-08-31 09:25:11 +09:00
Gianni Ciolli
95de5ef976 Bug fix 2016-08-31 09:23:36 +09:00
Gianni Ciolli
c0eea90402 Cascading replication support for Barman mode
If upstream_node is specified, we point the standby to that node;
otherwise we point it to the current primary node.
2016-08-31 09:23:25 +09:00
Gianni Ciolli
135fa2e1b9 Fixing primary_conninfo generation (#232)
After introducing Barman mode, it is no longer true that STANDBY CLONE
can derive primary_conninfo from the connection to the master. Now we
ask Barman how to connect to a valid cluster node, and then we fetch
the conninfo for the current master from repmgr metadata.
2016-08-30 10:45:22 +09:00
Gianni Ciolli
2a8861be8b Fixing inaccurate option checking (#230)
The incompatibility between Barman mode and use_replication_slots was
not checked properly, and did not cause repmgr to exit.
2016-08-30 08:33:41 +09:00
Ian Barwick
a55c224510 README: add note about cloned standby configuration. 2016-08-25 09:33:52 +09:00
Ian Barwick
844b9f54e4 Merge branch 'github-234-repmgrd-no-schema' 2016-08-25 09:21:03 +09:00
Ian Barwick
8de84707d9 Always use PQstatus to check connection status
This addresses GitHib #234.
2016-08-25 08:35:47 +09:00
Ian Barwick
3ea61689eb repmgr.conf.sample: change db and user names in conninfo string
They now match the examples given in README.md
2016-08-23 14:11:04 +09:00
Ian Barwick
efb106f8a0 Pass ssh_options when executing a remote command.
This resolves GitHub #233.
2016-08-22 14:34:48 +09:00
Ian Barwick
5baec14a1e Add repmgr.conf setting 'barman_config'
Enables provision of a non-default Barman configuration file.
2016-08-18 15:06:06 +09:00
Ian Barwick
fe469fe188 Merge pull request #191 from gciolli/feature-barman-support
Add Barman support to repmgr standby clone
2016-08-18 15:04:46 +09:00
Ian Barwick
5a7ce552f0 "renumber" deprecated command line options without a short version
This will keep them out of the main list.
2016-08-17 14:51:32 +09:00
Ian Barwick
ef7bed1b3d repmgrd: refactor standby monitoring status query and code
This had grown somewhat complex with addition of handling for
various corner cases. Much of the work has now been delegated
to the query itself.
2016-08-16 19:15:58 +09:00
Ian Barwick
6bd1c6a36d Skip largely pointless master reconnection attempt.
Experimental - see notes in code.
2016-08-16 13:25:39 +09:00
Ian Barwick
9831cabd4d Minor refactoring of do_master_failover()
- rename some variables for clarity
- ensure all structures are initialised correctly
- update code comments
2016-08-16 11:23:59 +09:00
Ian Barwick
d244fb29d7 Update HISTORY 2016-08-15 13:15:55 +09:00
Ian Barwick
4a349f7224 repmgr: emit warning if a deprecated command line option is used 2016-08-15 12:27:55 +09:00
Ian Barwick
fb6109b3e6 Update README.md
Note default usage of `pg_basebackup --xlog-method=stream`.
2016-08-15 10:16:22 +09:00
Martín Marqués
b314f5aaf4 Merge pull request #225 from ryno83/patch-1
Update README.md
2016-08-12 19:52:25 +02:00
Renaud Fortier
7fc340a8e2 Update README.md
I think this will improve the readme.
2016-08-12 10:28:31 -04:00
Ian Barwick
e4c8bd981b Update HISTORY 2016-08-12 09:59:10 +09:00
Ian Barwick
a310417a49 Refactor standby monitoring query
Addresses GitHub #224
2016-08-11 17:28:16 +09:00
Ian Barwick
9a07686ceb When the output of a remote command isn't required, ensure it's consumed anyway
This fixes a regression introduced with commit 85f68e9f77

Also clean up some code made redundant by same.
2016-08-11 08:46:55 +09:00
Ian Barwick
45aa0724c4 Update HISTORY
Also remove code comment obsoleted by previous commit
2016-08-09 15:32:56 +09:00
Ian Barwick
a558e9379e Merge branch 'gciolli-fix_219'
PR #220 resolving GitHub #219
2016-08-09 15:29:59 +09:00
Gianni Ciolli
85f68e9f77 Only collect remote command output if the caller requires it
This addresses GitHub #216 and #167.
2016-08-09 14:14:52 +09:00
Ian Barwick
00e55c0672 Update HISTORY 2016-08-09 12:29:19 +09:00
Ian Barwick
84ab37c600 Improve handling of failover events when failover is set to manual
- prevent repmgrd from repeatedly executing the failover code
- add event notification 'standby_disconnect_manual'
- update documentation

This addresses GitHub #221.
2016-08-09 12:09:09 +09:00
Ian Barwick
6a198401db Fix repmgrd's command line help option parsing
As in commit d0c05e6f46, properly distinguish between
the command line option -? and getopt's unknown option marker '?'
2016-08-08 21:17:56 +09:00
Ian Barwick
cb78802027 repmgrd: prevent endless loops in failover with manual node
The LSN reported by the shared memory function defaults to "0/0"
(InvalidXLogRecPtr) - this indicates that the repmgrd on that node
hasn't been able to update it yet. However during failover several
places in the code assumed this is an error, which would cause
an endless loop waiting for updates which would never come.

To get around this without changing function definitions, we can
store an explicit message in the shared memory location field so the
caller can tell whether the other node hasn't yet updated the field,
or encountered situation which means it should not be considered
as a promotion candidate (which in most cases will be because
`failover` is set to `manual`.

Resolves GitHub #222.
2016-08-08 14:29:24 +09:00
Gianni Ciolli
48f637486d Now STANDBY SWITCHOVER and STANDBY FOLLOW log an event notification on
success and also on some failures, precisely those when it makes sense
or it is reasonably possible to do so.
2016-08-05 14:19:04 +02:00
Ian Barwick
73280a426b Update HISTORY 2016-08-05 16:43:52 +09:00
Ian Barwick
b8ee321d5f Merge branch 'gciolli-feature-better-auto-deb'
GitHub pull request #218
2016-08-05 16:42:27 +09:00
Gianni Ciolli
ccdc0f9871 Improved "repmgr-auto" Debian package
* Version set to 3.2dev

* Binaries are placed in PGBINDIR and then linked from /usr/bin,
  instead of being placed into /usr/bin directly. This is necessary
  for the switchover command, because it requires pg_rewind, which is
  placed in PGBINDIR too.
2016-08-05 09:34:32 +02:00
Gianni Ciolli
3bccd79510 Two bugfixes to Barman mode
- we forgot to recreate an empty pg_replslot;
- we parse tablespace info more safely.
2016-08-04 14:40:24 +02:00
Gianni Ciolli
a4ee10ca22 Rewording sentence in README 2016-08-04 14:40:24 +02:00
Gianni Ciolli
7ca9ff6d54 Reindenting do_standby_clone (no code change). 2016-08-04 14:40:24 +02:00
Gianni Ciolli
b660eb7988 Rename from pg_restore_command to restore_command
This follows the recent upstream change.
2016-08-04 14:40:23 +02:00
Gianni Ciolli
6a4546a7b3 Updated README to reflect the new barman-wal-restore.py script 2016-08-04 14:40:23 +02:00
Gianni Ciolli
2f529e20c1 Barman support, draft #1
TODO: we need to check what happens with configuration files placed in
non-standard locations.
2016-08-04 14:40:23 +02:00
Gianni Ciolli
9853581d12 Introduce a Cell List to store tablespace data 2016-08-04 14:40:23 +02:00
Gianni Ciolli
ecdae9671f Factor out tablespace metadata retrieval
The purpose of this commit is to prepare the terrain for non-default
tablespace support in the forthcoming Barman support.
2016-08-04 14:40:23 +02:00
Gianni Ciolli
1f3e937bbe Add local_command function to run commands locally 2016-08-04 14:40:23 +02:00
Ian Barwick
89aeccedc2 Various bugfixes and code documentation improvements 2016-08-04 12:31:24 +09:00
Ian Barwick
d9bda915bb Update documentation and --help output for witness register
This completes the implementation of GitHub #186
2016-08-04 10:36:26 +09:00
Ian Barwick
c565be4ab6 Improve database connection status checking 2016-08-04 10:36:21 +09:00
Ian Barwick
c26fd21351 Implement repmgr standby register command 2016-08-04 10:36:16 +09:00
Ian Barwick
6b57d0e680 Separate witness registration into do_witness_register() 2016-08-04 10:36:12 +09:00
Ian Barwick
6faf029c93 Add witness unregister command info in help output 2016-08-02 18:39:04 +09:00
Ian Barwick
c42437a4f2 standby/witness unregister - enable even if node isn't running
If the `--node` option is provided with the id of the node to unregister,
the action can be executed on any node.

This addresses GitHub #211.
2016-08-02 17:09:27 +09:00
Ian Barwick
d0c05e6f46 Clean up command line option handling and help output
- properly distinguish between the command line option -? and getopt's
  unknown option marker '?'
- remove deprecated command line options --initdb-no-pwprompt and
  -l/--local-port
- add witness command summary in help output
2016-08-02 14:40:13 +09:00
Ian Barwick
050f007cc2 Initial implementation of witness unregister 2016-08-02 12:22:54 +09:00
Ian Barwick
371d80ff35 Document repmgr cluster show --csv 2016-08-01 16:10:10 +09:00
Ian Barwick
e0a61afb7d Suppress connection error display in repmgr cluster show
This prevents connection error messages being mixed in
with `repmgr cluster show` output. Error message output can
still be enabled with the --verbose flag.

Fixes GitHub #215
2016-08-01 14:57:40 +09:00
Ian Barwick
bbc88ce05c Miscellaneous code cleanup and typo fixes 2016-07-28 16:32:07 +09:00
Ian Barwick
61e907cf70 Update README
Default log level is NOTICE, not INFO.
2016-07-27 12:11:00 +09:00
Ian Barwick
02668ee045 Parse the contents of the "pg_basebackup_options" parameter in repmgr.conf
This is to ensure that when repmgr executes pg_basebackup it doesn't
add any options which would conflict with user-supplied options.

This is related to GitHub #206, where the -S/--slot option has been
added for 9.6 - it's important to check this doesn't conflict with
-X/--xlog-method.

While we're at it, rename the ErrorList handling code to ItemList
etc. so we can use it for generic non-error-related lists.
2016-07-26 16:12:43 +09:00
Ian Barwick
36eb26f86d Mark some variables as static. 2016-07-26 11:40:26 +09:00
Ian Barwick
cbc2c7b3e6 From PostgreSQL 9.6, use pg_basebackup's -S/--slot option if appropriate
GitHub #206
2016-07-26 10:35:30 +09:00
Ian Barwick
8a28dadde4 Rename RECOVERY_FILE to RECOVERY_COMMAND_FILE
This is for consistency with the PostgreSQL source code (see:
src/backend/access/transam/xlog.c ), but as it's not exported
we need to define it ourselves anyway.
2016-07-26 09:19:52 +09:00
Ian Barwick
3eda7373ad Prevent duplicated parameters being passed to pg_basebackup
This was not a problem, but ugly.
2016-07-25 20:44:36 +09:00
Ian Barwick
34e574ac66 Update README
Add caveats for pg_rewind with 9.4 and 9.3 in particular, also fix typo.
2016-07-21 08:58:44 +09:00
Ian Barwick
e8fcc3d7a6 repmgr: set default user for -R/--remote-user 2016-07-18 14:30:30 +09:00
Ian Barwick
eba0f1d7ae Update README with note about using screen et al 2016-07-18 11:01:42 +09:00
Ian Barwick
db32565b36 Add notes about setting pg_bindir for Debian/Ubuntu-based distributions.
repmgr doesn't know about pg_ctlcluster.

Per GitHub query #196.
2016-07-15 16:00:31 +09:00
Ian Barwick
94befc3230 repmgr standby clone historically accepts a hostname as third parameter 2016-07-15 11:17:05 +09:00
Ian Barwick
340899f082 Update README
Add note about 2ndQuadrant RPM repository.
2016-07-13 09:55:01 +09:00
Ian Barwick
76681c0850 Update code comments 2016-07-12 10:56:31 +09:00
Ian Barwick
eebaef59a3 Remove unused error code ERR_BAD_PASSWORD 2016-07-12 09:36:50 +09:00
Ian Barwick
ddaaa28449 README: update error code list 2016-07-12 09:35:52 +09:00
Ian Barwick
e81bf869ec Update README with details about conninfo parameter handling
From 3.1.4 `repmgr` will behave like other PostgreSQL utilities
when handling database connection parameters, in particular
accepting a conninfo string and honouring libpq connection defaults.
2016-07-12 09:27:22 +09:00
Ian Barwick
fa62d715c2 Make more consistent use of conninfo parameters
Removed the existing keyword array which has a fixed, limited number
of parameters and replace it with a dynamic array which can be
used to store as many parameters as reported by libpq.
2016-07-11 15:24:53 +09:00
Ian Barwick
72af24e1d6 Add missing space when setting "application_name" 2016-07-08 09:46:59 +09:00
Ian Barwick
61d617ae93 Improve default host/dbname handling
repmgr disallows socket connections anyway (the whole point of providing
the host is to connect to a remote machine) so don't show that as
a fallback default in the -?/--help output.
2016-07-08 08:34:15 +09:00
Ian Barwick
319fba8b1f Enable a conninfo string to be passed to repmgr in the -d/--dbname parameter
This matches the behaviour of other PostgreSQL utilities such as pg_basebackup,
psql et al.

Note that unlike psql, but like pg_basebackup, repmgr does not accept a
"left-over" parameter as a conninfo string; this could be added later.

Parameters specified in the conninfo string will override any parameters
supplied correcly (e.g. `-d "host=foo"` will override `-h bar`).
2016-07-08 08:31:14 +09:00
Ian Barwick
c92ea1d057 Generate "primary_conninfo" using the primary connection's parameters
Having successfully connected to the primary, we can use the actual parameters
reported by libpq to create "primary_conninfo", rather than the limited
subset previously defined by repmgr. Assuming that the user can
pass a conninfo string to repmgr (see following commit), this makes it
possible to provide other connection parameters, e.g. related to
SSL usage.
2016-07-08 08:11:11 +09:00
Ian Barwick
b2ca6fd35e Remove now-superfluous wildcard in rmtree() call 2016-07-07 09:51:55 +09:00
Martin
c880187e89 Fix alignment and syntax 2016-07-06 15:52:58 -03:00
Jarkko Oranen
4724da41ad Add sample configuration for systemd support 2016-07-06 15:52:58 -03:00
Jarkko Oranen
d44885b330 Allow overriding start, stop and restart commands issued by repmgr
This commit introduces three new options:
  - start_command
  - stop_command
  - restart_command

If these are set, repmgr will issue the specified command instead
of the default pg_ctl commands
2016-07-06 15:52:58 -03:00
Ian Barwick
72f9b0145a Use rmtree instead of executing "rm -rf"
This completes the task mentioned in b6b6439819
2016-07-06 15:55:56 +09:00
Ian Barwick
5e03ef40cb Update HISTORY 2016-07-06 15:48:14 +09:00
Ian Barwick
091541619d Fix repmgrd monitoring calculation when in archive recovery 2016-07-06 09:27:31 +09:00
Ian Barwick
5e9db47d12 Fix query in get_node_record_by_name() 2016-07-05 21:06:31 +09:00
Ian Barwick
e8a0cd33b5 Ensure all node record structures are initialised 2016-07-05 11:33:06 +09:00
Ian Barwick
8cd79fd7dd Revert "repmgr: add option -B/--remote-pg_bindir for standby switchover"
This reverts commit c30447ac90.
2016-07-04 11:30:36 +09:00
Ian Barwick
013b4b4b8a Update README/TODO about following non-master server 2016-07-01 12:15:37 +09:00
Ian Barwick
c5a721a3cf TODO: remove resolved item 2016-07-01 12:07:10 +09:00
Ian Barwick
a6294b7da0 Update README.md
Add note about logging configuration and settings in `pg_ctl_options`
for switchover operations.
2016-07-01 10:47:20 +09:00
Ian Barwick
a0f02e454c repmgr: during switchover check demotion candidate's pidfile
If the pidfile is still there after apparent shutdown, or we're
unable to access the server at all, something has gone wrong and
the switchover should be aborted.
2016-07-01 09:42:55 +09:00
Ian Barwick
69d9d137e0 repmgr: change default pg_ctl shutdown mode to "fast"
This matches the default pg_ctl behaviour (from 9.5) and is the
more sensible option for performing time-critical operations such
as switchover.
2016-07-01 08:52:26 +09:00
Ian Barwick
60bceae905 Use PQping only to check for server shutdown
Functionally there's no difference between that and attempting to
make an actual connection, so use one method only, which also
simplifies the code.
2016-07-01 07:50:27 +09:00
Ian Barwick
746c9793ed Better detect completion of demotion candidate shutdown
If a connection attempt fails, keep pinging the server until it
finally away, or the timeout kicks in.

Addresses issue reported in GitHub #188 and previously noted in
repmgr.c
2016-06-30 21:35:00 +09:00
Ian Barwick
c30447ac90 repmgr: add option -B/--remote-pg_bindir for standby switchover
This enables the switchover operation to function if the remote server
(current primary) has a different binary directory to the current
server, and addresses the issue reported in GitHub #172.
2016-06-30 12:51:54 +09:00
Ian Barwick
097024a32f repmgr: add new error code ERR_SWITCHOVER_FAIL 2016-06-29 12:11:53 +09:00
Ian Barwick
66b7dbbed7 repmgr: use make_pg_path() consistently
Per comment from gciolli.
2016-06-29 11:33:33 +09:00
Ian Barwick
74f6f97f26 repmgrd: log whether in standby or witness monitor loop
This is mainly for development and debugging purposes.
2016-06-29 10:31:57 +09:00
Ian Barwick
968c2f1954 Add notes about connect_timeout conninfo parameter.
Per suggestion in GitHub #148
2016-06-27 13:57:40 +09:00
Ian Barwick
bd76d0eb92 Update postgresql.org links to https 2016-06-27 12:32:10 +09:00
Ian Barwick
f1ee6e19b6 Ensure configuration options correctly initialised in repmgrd.c
Per GitHub #150.

Also remove unused variable.
2016-06-27 11:26:05 +09:00
Ian Barwick
fbb65b4a43 Remove RHEL packaging files.
There's no point in maintaining in parallel to the PGDG packages.
See also notes in GitHub #156.
2016-06-24 10:19:20 +09:00
Ian Barwick
3fac975de6 Prevent multiple nodes being registered with the same name.
Fixes GitHub #192.
2016-06-24 09:25:41 +09:00
Ian Barwick
a2b5ba595a repmgrd: reword log message for clarity 2016-06-23 09:47:35 +09:00
Ian Barwick
c16ab3c889 Fix handling of global PGconn variables in repmgrd
Don't call PQfinish before calling terminate(), elsewhere always
set to NULL after calling PQfinish().

This fixes GitHub #182.
2016-06-21 17:30:22 +09:00
Ian Barwick
dd5b6f9f12 Whitespace fixes 2016-06-21 16:04:41 +09:00
Ian Barwick
303bb22ee1 Note potential replication lag check improvement 2016-06-20 12:23:34 +09:00
Ian Barwick
5d8b1a3a31 monitoring: ensure that invalid replication_lag value is not inserted.
Per Github #189.
2016-06-20 10:55:25 +09:00
Ian Barwick
3d6c349d88 Rename "pg_restore_command" to "restore_command"
The 'pg_' prefix could cause confusion with actual binaries
(pg_ctl, pg_basebackup etc.) and there's no obvious reason
why we need it.
2016-06-17 14:43:02 +09:00
Ian Barwick
1ade1acb22 Report standby location as last apply location when in archive recovery
Otherwise the monitoring table's 'last_wal_standby_location' will stay at
the location of the last streaming WAL received.

This complements the bugfix applied in e814c1120e.
2016-06-15 15:41:10 +09:00
Ian Barwick
66fd003ab4 Schema-qualify pg_catalog objects 2016-06-10 17:58:10 +09:00
Ian Barwick
0d42b771f5 Remove unnecessary get_server_version() calls in do_standby_clone()
We cache the value at the start of the function, and it's reasonable to
assume that the server version is not going to suddenly change.
2016-06-06 23:41:47 +09:00
Ian Barwick
005640be51 Fix PQconninfoParse() return type check 2016-06-05 10:20:42 +09:00
Martin
b6ebd34e2f Some other indentation fixes found 2016-06-03 20:20:43 -03:00
Martin
951879f80d Typo noticed by Brett Maton. 2016-06-03 20:20:43 -03:00
Martin
46ff9fb587 No code change, just indentation was incorrect in the failover part
making it hard to read.
2016-06-03 20:20:43 -03:00
Ian Barwick
cc610f995d Whitespace and typo fixes 2016-06-03 21:25:22 +09:00
Martin
384618cb33 There was a missing table in sql/repmgr2_repmgr3.sql which made events
error when trying to insert them.

This is just a copy and paste from the table creation in repmgr.c

This fixes #184 reported by Andreas Kretschmer
2016-06-02 14:23:14 -03:00
Martin
0dd617cfca The ALTER TABLE to set the foreign key as DEFERRABLE only worked on
9.4+, as there is no ALTER CONSTRAINT in 9.3.

This new ALTER TABLE does the same in two hops by removing the foreign
key and creating it again in the same ALTER TABLE.

This fixes #183
2016-06-02 13:56:06 -03:00
Martin
f18d629bd2 Typo in a comment.
Reported by @nucfisher
2016-05-30 09:41:38 -03:00
Ian Barwick
afc904f876 Fix typos and whitespace 2016-05-30 16:01:38 +09:00
Ian Barwick
3bcea46c3b Clarify handling of tablespace_map file. 2016-05-30 10:48:08 +09:00
Ian Barwick
d7e85f7565 Clarify purpose of remapping code 2016-05-30 07:47:09 +09:00
Ian Barwick
b14d8ddb74 Ensure read_backup_label() does not exit on error
This would leave an unstopped backup; we'll let do_standby_clone()
do any cleanup necessary before exiting with an error.
2016-05-24 17:04:54 +09:00
Ian Barwick
9b2a907b09 Convert erroneously forgotten printf debug to proper logging 2016-05-24 16:44:22 +09:00
Martín Marqués
f63d42fe77 Merge pull request #178 from gciolli/master
Debian auto-build version upgrade
2016-05-23 14:48:28 -03:00
Gianni Ciolli
560066fa9d Basic CSV mode for "repmgr cluster show", enabled by --csv command
line option
2016-05-23 19:00:25 +02:00
Martin
3937670d14 Added logrotate configuration file (for those who would like one ;))
and changed parameters in sysconfig file so that we use /var/log for
logs and /var/run for pid files.
2016-05-23 13:30:05 -03:00
Gianni Ciolli
0daa7381b3 Debian auto-build version upgrade 2016-05-22 22:10:31 +02:00
Martín Marqués
e53545af4f Merge pull request #177 from petere/fix-compiler-warning
Fix compiler warning
2016-05-22 09:55:50 -03:00
Peter Eisentraut
45178c19d8 Fix compiler warning
For a char * variable, '\0' is just a strange way to write NULL, and
clang warns about it.
2016-05-21 21:04:19 -04:00
Martin
cf46834041 Add new option pg_restore_command.
This can be used so that repmgr standby clone adds the string
specified in repmgr.conf as a restore_command in recovery.conf.

We can use this option for integration with barman by setting the
parameter to an appropriate get-wal call.
2016-05-17 15:21:40 -03:00
Martin
c30609426a Fix several inconsistencies added in d5d8eb2bcb8862607799d602af620e5ca98bc837 2016-05-17 15:21:40 -03:00
Martin
1c49c4159c Add -X stream parameter to pg_basebackup to ensure that we get
all the WALs needed without needing to set wal_keep_segments to
a ridiculously high value.

This is not necessary on 9.6 if we are using replication slots,
as all WAL segments needed will be kept on the primary until
they are consumed by the slot.
2016-05-17 15:21:40 -03:00
Martin
b6b6439819 I copied over the rmtree function (and other functions needed by this one)
from the postgresql code so we use that instead of issuing system calls
with rm -rf ....

I also eliminated the rm -rf for pg_xlog.

Will later do the same with the other system call to remove files
in pg_replslot/
2016-05-17 15:21:40 -03:00
Ian Barwick
9a05999abb Fix log formatting 2016-05-17 17:24:02 +09:00
Ian Barwick
4c463a66b7 Update HISTORY 2016-05-17 10:37:14 +09:00
Ian Barwick
209de699ce README.md: improve documentation of repl_status view 2016-05-16 13:51:40 +09:00
Ian Barwick
e814c1120e repmgrd: handle situations where streaming replication is inactive 2016-05-12 22:17:44 +09:00
Ian Barwick
247823db4d Remove extraneous PQfinish() 2016-05-12 14:05:44 +09:00
Ian Barwick
beda22d5f9 Correct check for wal_level in 9.3 2016-05-12 13:06:10 +09:00
Ian Barwick
2eb00a3e6f Remove unneeded column 2016-05-12 09:56:29 +09:00
Ian Barwick
0a798bf6e4 Comment fixes and formatting tweaks 2016-05-12 09:52:22 +09:00
Ian Barwick
21b2ff1a1f repmgrd: better handling of missing upstream_node_id
Ensure we default to master node.
2016-05-12 09:20:33 +09:00
Ian Barwick
57f9432692 Add missing newlines in log messages 2016-05-11 21:47:40 +09:00
Ian Barwick
54d3c7a4ca repmgrd: avoid additional connection to local instance in do_master_failover() 2016-05-11 09:55:38 +09:00
Ian Barwick
7fd44a3d74 Suppress gnu_printf format warning 2016-05-11 09:23:06 +09:00
Ian Barwick
b0f6b7bad7 repmgrd: rename variable for clarity 2016-05-11 08:29:55 +09:00
Ian Barwick
4dbbf40196 Don't follow the promotion candidate standby if the primary reappears 2016-05-10 13:58:59 +09:00
Ian Barwick
d5e24689a4 Don't terminate a standby's repmgrd if self-promotion fails due to master reappearing
Per GitHub #173
2016-05-10 11:45:03 +09:00
Martin
10e47441a2 The commit fixes problems not taking in account while working on the
issue with rsync returning non-zero status on vanishing files on commit
83e5f98171.

Alvaro Herrera gave me some tips which pointed me in the correct
direction.

This was reported by sungjae lee <sj860908@gmail.com>
2016-05-06 17:34:46 -03:00
Ian Barwick
274a30efa5 Fix pre-9.6 wal_level check 2016-04-12 16:17:50 +09:00
Ian Barwick
db63b5bb1c Fix hint message formatting 2016-04-12 16:08:07 +09:00
Ian Barwick
e100728b93 Update HISTORY 2016-04-12 15:51:42 +09:00
Ian Barwick
d104f2a914 Update HISTORY 2016-04-12 15:51:42 +09:00
Ian Barwick
2946c097f0 repmgrd: rename some variables to better match the system functions they're populated from 2016-04-12 15:51:42 +09:00
Ian Barwick
a538ceb0ea Remove duplicate inclusion from header file 2016-04-12 15:51:42 +09:00
Ian Barwick
5a2a8d1c82 Update HISTORY 2016-04-12 15:51:41 +09:00
Ian Barwick
b5a7efa58e Preserve failover slots when cloning a standby, if enabled 2016-04-12 15:51:38 +09:00
Ian Barwick
9f6f58e4ed MAXFILENAME -> MAXPGPATH 2016-04-06 11:19:00 +09:00
Craig Ringer
c22f4eaf6f WIP support for preserving failover slots 2016-04-06 11:18:54 +09:00
Martin
925d82f7a4 Add to the documentation the need to have archive_mode and
archive_command set in postgresql.conf

Fixes #154
2016-04-05 21:37:14 -03:00
Ian Barwick
1db577e294 Update Makefile
Add include file dependencies (see caveat in file).

Also update comments.
2016-04-06 09:20:00 +09:00
Martin
a886fddccc We were not checking the return code after rsyncing the tablespaces.
This fixes #168
2016-04-05 15:30:42 -03:00
Martin
83e5f98171 Ignore rsync error code for vanished files.
It's very common to come over vanish files during a backup or rsync
o the data directory (dropped index, temp tables, etc.)

This fixes #149
2016-04-05 15:22:40 -03:00
Ian Barwick
eb31a56186 Enable long option --pgdata as alias for -D/--data-dir
pg_ctl provides -D/--pgdata; we want to be as close to the core utilities
as possible.
2016-04-05 22:30:19 +09:00
Ian Barwick
8cd2c6fd05 Add comment about wal_level setting interpretation 2016-04-04 14:57:20 +09:00
Ian Barwick
e3e1c5de4e Use "immediately_reserve" parameter in pg_create_physical_replication_slot (9.6) 2016-04-04 12:56:00 +09:00
Ian Barwick
f9a150504a Enable repmgr to be compiled with PostgreSQL 9.6 2016-04-04 12:41:03 +09:00
Ian Barwick
5bc809466c Make self-referencing foreign key on repl_nodes table deferrable 2016-04-01 15:19:22 +09:00
Ian Barwick
5d32026b79 Improve debugging output for node resyncing
We'll need this for testing.
2016-04-01 11:29:35 +09:00
Ian Barwick
2a8d6f72c6 Make witness server node update an atomic operation
If the connection to the primary is lost, roll back to the previously
known state.

TRUNCATE is of course not MVCC-friendly, but that shouldn't matter here
as only one process should ever be looking at this table.
2016-04-01 11:07:17 +09:00
Ian Barwick
190cc7dcb4 Rename copy_configuration () to witness_copy_node_records()
As it's witness-specific. Per suggestion from Martín.
2016-04-01 08:44:23 +09:00
Ian Barwick
819937d4bd Replace MAXFILENAME with MAXPGPATH 2016-03-31 20:10:39 +09:00
Ian Barwick
57299cb978 Comment out configuration items in sample config file
The configured values are either the defaults, or examples which
may not work in a real environment. If this file is being used as
a template, the onus is on the user to uncomment and check all
desired parameters.
2016-03-31 15:14:15 +09:00
Ian Barwick
59f503835b Merge branch 'gciolli-master' 2016-03-31 15:00:29 +09:00
Ian Barwick
33e626cd75 Merge branch 'master' of https://github.com/gciolli/repmgr into gciolli-master 2016-03-31 15:00:10 +09:00
Ian Barwick
491ec37adf Update HISTORY 2016-03-31 14:58:50 +09:00
Ian Barwick
c93790fc96 Check directory entity filetype in a more portable way 2016-03-30 20:20:36 +09:00
Ian Barwick
ecabe2c294 Fix pg_ctl path generation in do_standby_switchover() 2016-03-30 16:45:47 +09:00
Gianni Ciolli
2ba57e5938 Rewording comment for clarity. 2016-03-30 09:27:37 +02:00
Ian Barwick
2eec17e25f Add headers as dependencies in Makefile 2016-03-30 10:28:41 +09:00
Ian Barwick
c48c248c15 Regularly sync witness server repl_nodes table.
Although the witness server will resync the repl_nodes table following
a failover, other operations (e.g. removing or cloning a standby)
were previously not reflected in the witness server's copy of this
table.

As a short-term workaround, automatically resync the table at regular
intervals (defined by the configuration file parameter
"witness_repl_nodes_sync_interval_secs", default 30 seconds).
2016-03-29 16:49:28 +09:00
Martín Marqués
958e45f2b8 Merge pull request #165 from dhyannataraj/master
Better use /24 network mask in this example
2016-03-28 13:32:09 -03:00
Nikolay Shaplov
daafd70383 Better use /24 network mask in this example 2016-03-28 18:56:50 +03:00
Ian Barwick
c828598bfb It's unlikely this situation will occur on a witness server
Which is why the error message is for master/standby only.
2016-03-28 15:53:25 +09:00
Ian Barwick
b55519c4a2 Add hint about registering the server after cloning it.
This step is easy to forget.
2016-03-02 09:41:06 +09:00
Ian Barwick
4cafd443e1 README: Add note about 'repmgr_funcs' 2016-02-25 16:36:19 +09:00
Ian Barwick
d400d7f9ac repmgrd: fix error message 2016-02-24 15:33:36 +09:00
Ian Barwick
62bb3db1f8 Fix code comment 2016-02-24 15:27:08 +09:00
Ian Barwick
d9961bbb17 Minor fixes to README.md 2016-02-23 14:25:26 +09:00
Ian Barwick
e1b8982c14 Ensure witness node is registered before the repl_nodes table is copied
This fixes a bug introduced into the previous commit, where the
witness node was registered last to prevent a spurious node record
being created even if witness server creation failed.
2016-02-23 07:36:41 +09:00
Martin
2fe3b3c2a3 Fix a few paragraphs from the README.md. 2016-02-22 09:20:11 -03:00
Ian Barwick
c6e1bc205a Prevent repmgr/repmgrd running as root 2016-02-22 14:58:17 +09:00
Ian Barwick
7241391ddc Better handling of errors during witness creation
Ensure witness is only registered after all steps for creation
have been successfully completed.

Also write an event record if connection could not be made to
the witness server after initial creation.

This addresses GitHub issue #146.
2016-02-16 14:28:33 +09:00
Ian Barwick
c8f449f178 witness creation: extract database and user names from the local conninfo string
99.9% of the time they'll be the same as the primary connection, but
it's more consistent to use the provided local conninfo string
(from which the port is already extracted).
2016-02-09 15:59:28 +09:00
Ian Barwick
49420c437f README.md: update witness server section 2016-02-09 15:58:51 +09:00
Ian Barwick
827ffef5f9 Add '-P/--pwprompt' option for "repmgr create witness"
Optionally prompt for superuser and repmgr user when creating a witness.
This ensures a password can be provided if the primary's pg_hba.conf
mandates it.

This deprecates '--initdb-no-pwprompt'; and changes the default behaviour of
"repmgr create witness", which previously required a superuser password
unless '--initdb-no-pwprompt' was supplied.

This behaviour is more consistent with other PostgreSQL utilities such
as createuser.

Partial fix for GitHub issue #145.
2016-02-09 15:11:50 +09:00
Ian Barwick
16296bb1c3 Bump dev version number 2016-02-01 15:10:03 +09:00
Ian Barwick
c9c18d6216 Update comments in SQL upgrade file 2016-02-01 13:49:58 +09:00
Ian Barwick
d21f506614 Add placeholders for old FAILOVER.rst and QUICKSTART.md pointing to new README.md 2016-02-01 08:02:37 +09:00
Ian Barwick
fbad18085e README.md: formatting and spelling fixes 2016-01-29 17:09:12 +09:00
Ian Barwick
ca08b1c3bb REAMDE.md: formatting 2016-01-29 16:55:47 +09:00
Ian Barwick
3d95fab0ac README.md: minor corrections 2016-01-29 16:47:12 +09:00
Ian Barwick
12d6ce4629 Bump version 2016-01-28 16:09:06 +09:00
Ian Barwick
dfb34ae7b6 Update HISTORY 2016-01-28 11:44:28 +09:00
Ian Barwick
98c4eb002a README: add replication slot example 2016-01-28 11:43:42 +09:00
Ian Barwick
faed8a65f7 Add new README file
Replaces existing legacy documentation.
2016-01-28 11:32:19 +09:00
Ian Barwick
a81cf04614 Update various comments 2016-01-28 11:13:38 +09:00
Ian Barwick
ca6cbcf965 Add sanity checks to be sure pg_rewind can be used before executing a switchover 2016-01-28 09:25:00 +09:00
Ian Barwick
16c1e13019 Sanity check for presence of pg_rewind on remote server 2016-01-28 07:25:23 +09:00
Ian Barwick
1375adcac8 Standardize capitalisation in log messages 2016-01-28 07:24:45 +09:00
Ian Barwick
e859a58405 Change some repmgrd log messages to NOTICE
So key events during failover on promoted and following standbys
logged at the same level.
2016-01-27 18:39:27 +09:00
Ian Barwick
1a6d830314 Various logging, help output and comment tweaks 2016-01-27 17:10:54 +09:00
Ian Barwick
a96f478a43 Allow negative values in configuration parameters, where appropriate.
Make the code match the documentation.

As pointed out by GitHub user phyber (#142).

Also various other minor improvements to error reporting during
config file parsing.
2016-01-27 09:10:19 +09:00
Ian Barwick
8f20ab16dd Display default log level in -?/--help output 2016-01-25 14:39:21 +09:00
Ian Barwick
3ec436f30d Better document default values in repmgr.conf.sample 2016-01-22 15:22:22 +09:00
Ian Barwick
61e00bf1c7 Improve handling of default connection parameters 2016-01-22 14:14:14 +09:00
Ian Barwick
5d71869fc1 --help output: put default values in quotation marks
Similar to psql.
2016-01-20 15:56:26 +09:00
Ian Barwick
7598e08b6f Display default connection options in --help output 2016-01-20 15:46:41 +09:00
Ian Barwick
ba71e1eedf Use maxlen_snprintf() in do_witness_create() 2016-01-20 15:13:52 +09:00
Ian Barwick
a4c07b23fb Document --pg_rewind option in help output 2016-01-20 15:09:03 +09:00
Ian Barwick
0c36f921f7 help() -> do_help()
For consistency and easier location of the function body.
2016-01-20 15:02:03 +09:00
Ian Barwick
8ac5a5444e Enable repmgr standby switchover for 9.3/9.4 by recloning
A bit of a hack and unsuited for large databases - install
pg_rewind instead. Or upgrade to 9.5.
2016-01-20 14:51:16 +09:00
Ian Barwick
f60e7346e2 Don't copy 'recovery.done' or 'recovery.conf' when cloning a standby
'recovery.conf' will be overwritten, but we don't want a 'recovery.done'
for another server.
2016-01-20 14:34:13 +09:00
Ian Barwick
855ca8fe1a Support separately-compiled pg_rewind for "repmgr standby switchover" in 9.3/9.4 2016-01-20 14:21:02 +09:00
Ian Barwick
daa79d1a0f Remove any recovery.done file copied in by pg_rewind
It'll be the old standby/new primary's old recovery.conf file,
which we don't want floating about confusing things.
2016-01-20 13:18:52 +09:00
Ian Barwick
211768d911 Support pg_rewind when executing "repmgr standby switchover"
9.5 and later.
2016-01-20 13:05:47 +09:00
Ian Barwick
f982708b35 Add function test_db_connection()
The difference between this and establish_db_connection() is that
it outputs any connection failure as a [NOTICE] rather than an
[ERROR]; it's intended for use in e.g. polling a server to wait
for it to come up/go down, while preventing [ERROR] log lines
which may cause confusion.
2016-01-20 07:56:03 +09:00
Ian Barwick
995083d66c Change event record type for repmgr standby follow to 'standby_follow' for consistency 2016-01-19 14:53:16 +09:00
Ian Barwick
be58d6af96 Update TODO 2016-01-19 13:57:46 +09:00
Ian Barwick
a52e97e622 Update FAQ 2016-01-19 13:53:18 +09:00
Ian Barwick
cc1ea00333 Ensure event logging doesn't generate an error when cloning from a standby 2016-01-19 13:51:49 +09:00
Ian Barwick
ec3596521f Fix comment typo 2016-01-19 11:39:41 +09:00
Ian Barwick
66245ccc03 Add warning if -r used with -c 2016-01-14 23:12:41 +09:00
Ian Barwick
c7542063be Only display -c/--fast-checkpoint hint when using pg_basebackup 2016-01-14 22:51:02 +09:00
Ian Barwick
2633d994ef Add FAQ item 2016-01-14 21:00:30 +09:00
Ian Barwick
5359d45463 Remove deprecated -l/--local-port from --help output
We'll still parse it for backwards compatibility
2016-01-14 20:51:03 +09:00
Ian Barwick
efa60d142c Update HISTORY 2016-01-14 11:49:55 +09:00
Ian Barwick
f3d0ab9ab9 Improve "archive_mode" configuration check
There's no compelling reason to require "archive_mode" to be enabled
for streaming replication. It is of course a good idea to archive WAL
using e.g. barman ( http://www.pgbarman.org/ ) as part of a comprehensive
backup strategy, but repmgr and streaming replication work fine without
it.

Per GitHub #141.

Also revise the configuration check for "archive_command" to be
triggered only when "archive_mode" is not "off", as from PostgreSQL
9.5 onwards "archive_mode" can also be "on" or "always".
2016-01-14 09:33:08 +09:00
Ian Barwick
7e6bac1be6 Display a couple of repetitive log messages in verbose mode only 2016-01-08 11:10:34 +09:00
Ian Barwick
b72058dba8 Update copyright notice to 2016 2016-01-05 15:57:46 +09:00
Ian Barwick
79d1332f9c Update HISTORY 2016-01-05 13:54:17 +09:00
Ian Barwick
cde721e3fc repmgr: -r/--rsync-only does not require a parameter 2016-01-05 11:04:20 +09:00
Ian Barwick
7b2439b824 repmgrd: -v/--verbose option does not require a parameter 2016-01-05 10:45:47 +09:00
Martin
787cd94142 Changing the "Maintainer" field in the Debian control file to a generic
value.
Who ever wants to build a deb from the tar-ball can modify the control file
and add his contact information.
2016-01-04 13:33:25 -03:00
Martin
056e64f635 Bug in the control file: Version should not have a 'v' in front 2016-01-04 13:28:51 -03:00
Martín Marqués
6b5a609d30 Merge pull request #72 from Bengrunt/patch-1
Updated makefile for deb creation

There's still a bug in the Version field of the Control file (it shouldn't have a 'v' in front of the version).

Will fix that immediately after.
2016-01-04 13:26:59 -03:00
Ian Barwick
7a4d84379c Prevent invalid replication_lag values being written to the monitoring table
A fix for this was introduced with commit ee9270fe8d
and removed in 4f1c67a1bf.

Refactor the original fix to simply omit attempting to write an invalid entry
to the monitoring table.
2016-01-04 13:31:50 +09:00
Ian Barwick
490e12b1af Clean up whitespace and comments 2016-01-04 11:58:33 +09:00
Martín Marqués
7b9df3ac8f Merge pull request #133 from martinmarques/fix-standby-follows-other-node-repmgrd-fails
Fix standby follows other node repmgrd fails
2015-12-29 13:25:09 -03:00
Martín Marqués
d6bf870316 Merge pull request #131 from martinmarques/fix-failed-standby
Fix failed standby
2015-12-29 13:24:08 -03:00
Ian Barwick
b15e8debe1 No need to manually create repmgr schema. 2015-12-29 11:56:39 +09:00
Martín Marqués
310faf1bd9 Merge pull request #137 from martinmarques/create_view_repl_show_nodes
We need to update the repmgr.sql file with the new view and create a
2015-12-22 13:29:47 -03:00
Martin
35caeaa66a We need to update the repmgr.sql file with the new view and create a
repmgr3.0_repmgr3.1.sql for the upgrade process when passing to 3.1
2015-12-22 13:27:42 -03:00
Ian Barwick
ba300c58f7 Make "cluster show" output dynamic
Calculate the width of the "Name" and "Upstream" columns dynamically.

Based on pull request #135 by sengaya, edited and modified by myself
to include a psql-like separator line.
2015-12-22 15:03:04 +09:00
Ian Barwick
f2370de2fa Minor fixes related to changes in 56b9ca79 2015-12-22 13:35:52 +09:00
Ian Barwick
3920deb803 Merge branch 'create_view_repl_show_nodes' of https://github.com/martinmarques/repmgr 2015-12-22 13:24:42 +09:00
Ian Barwick
e452bf6601 Short option -c does not take a value 2015-12-22 12:35:42 +09:00
Ian Barwick
167b4efbb3 Add note about why 'hot_standby=on' is currently required 2015-12-22 10:47:44 +09:00
Martin
56b9ca7992 Re-write part of commit from #116 so we have the query as a view to
use from the database.
Use the view instead of the query in cluster_show()
2015-12-18 20:53:52 -03:00
Martín Marqués
9c002c7e38 Merge pull request #116 from renard/follow
Add more information in "cluster show"
2015-12-18 18:31:07 -03:00
Ian Barwick
cfec04d19f Modify log output to hint 2015-12-18 17:24:04 +09:00
Martin
4f1c67a1bf This doesn't really mean the standby s following a new master, so we are
removing it.
Basically, on startup the standby will start receiving again from the
begining of the WAL and so received will be lower then applied.

A proper code is needed to make sure the standby is still following the
correct master (as per node information)
2015-12-17 12:17:03 -03:00
Martín Marqués
2f4fd2b7fa Merge pull request #110 from kjoe/master
Debian init script repmgrd process stop fix

This means rejecting pull request on #121
2015-12-12 08:46:56 -03:00
Martín Marqués
aca2b9547f Change where we activate back the standby node that was failed.
We will do it where we are sending the message that says that the
standby has recovered, eliminating some complexity
2015-12-11 09:36:48 -03:00
Martín Marqués
c9db7f57d2 Fix bug discovered last week which prevents recovered standby from being
used in the cluster.
Main issue was that if the local repmgrd was not able to connect locally,
it would set the local node as failed (active = false). This is fine, because
we actually don't know if the node is active (actually, it's not active ATM)
so it's best to keep it out of the cluster.
The problem is that if the postgres service comes back up, and is able to
recover by it self, then we should ack that fact and set it as active.
There was another issue related with repmgrd being terminated if the postgres
service was downs. This is not the correct thing to do: we should keep
trying to connect to the local standby.
2015-12-07 16:14:19 -03:00
Martín Marqués
96ac39ba0f Fix bug discovered last week which prevents recovered standby from being
used in the cluster.
Main issue was that if the local repmgrd was not able to connect locally,
it would set the local node as failed (active = false). This is fine, because
we actually don't know if the node is active (actually, it's not active ATM)
so it's best to keep it out of the cluster.
The problem is that if the postgres service comes back up, and is able to
recover by it self, then we should ack that fact and set it as active.
There was another issue related with repmgrd being terminated if the postgres
service was downs. This is not the correct thing to do: we should keep
trying to connect to the local standby.
2015-12-07 15:59:28 -03:00
Ian Barwick
88a3378203 Update HISTORY 2015-11-30 16:56:34 +09:00
Ian Barwick
4db0efab47 pg_replslot will only exist in 9.4 and later
We need to clean this up regardless of whether "use_replication_slots"
is set.
2015-11-30 15:40:43 +09:00
Ian Barwick
864d57953a Ensure pg_replslot directory is cleaned up after "standby clone" with rsync
This ensures the directory is in the same state as it would be
after cloning the standby with pg_basebackup, i.e. empty.
2015-11-30 15:31:43 +09:00
Ian Barwick
84d2a292b2 Update TODO 2015-11-30 14:26:00 +09:00
Ian Barwick
62d53b7622 Drop a previously created replication slot if base backup fails for any reason
Per Github #129
2015-11-30 12:39:19 +09:00
Ian Barwick
77d52adb53 Use existing config file options when attempting to connect to server during switchover 2015-11-30 12:20:43 +09:00
Ian Barwick
7a3e2f2a3a Update help output with switchover-related commands 2015-11-30 12:20:37 +09:00
Ian Barwick
120688013e Add "standby switchover" mode
Perform a switchover by:
 - stopping current primary node
 - promoting this standby node to primary
 - forcing previous primary node to follow this node

Caveats:
 - repmgrd must not be running, otherwise it may
   attempt a failover
   (TODO: find some way of notifying repmgrd of planned
    activity like this)
 - currently only set up for two-node operation; any other
   standbys will probably become downstream cascaded standbys
   of the old primary once it's restarted
 - as we're executing repmgr remotely (on the old primary),
   we'll need the location of its configuration file; this
   can be provided explicitly with -C/--remote-config-file,
   otherwise repmgr will look in default locations on the
   remote server
 - this does not yet support "rewinding" stopped nodes
   which will be unable to catch up with the primary

TODO:
 - update help, docs
 - make connection test timeouts/intervals configurable
2015-11-30 12:20:24 +09:00
Ian Barwick
f6d1db5edb Ensure all failures encountered during a base backup jump to the stop_backup label 2015-11-30 12:12:59 +09:00
Ian Barwick
02729d299b Put "starting backup" notice after any slot creation 2015-11-30 12:12:49 +09:00
Abhijit Menon-Sen
88a6a1376e If we're using replication slots, we need to create them earlier
Otherwise, if the backup takes a long time, we might lose WAL we need
long before we create the slot.
2015-11-30 12:12:49 +09:00
Ian Barwick
67df082ee9 Ensure 'master register --force' can't create more than one active primary node record 2015-11-26 10:55:16 +09:00
Ian Barwick
9ed71d6317 Remove unusable setting
Not a configuration item or command line option;
variable is always false.
2015-11-26 10:20:54 +09:00
Ian Barwick
933647d6de Make t_node_info generally available
And have it include all the fields from the repl_nodes table.
2015-11-25 12:57:18 +09:00
Ian Barwick
f99018b202 Add item about hash indexes. 2015-11-24 13:26:36 +09:00
Ian Barwick
ced87373cd Remove hint about hash indexes entirely.
Anyone needing them, particularly in a replication context, should
know what they're doing anyway.

See also: http://www.postgresql.org/docs/current/interactive/sql-createindex.html#AEN74175

"Also, changes to hash indexes are not replicated over streaming or file-based
 replication after the initial base backup, so they give wrong answers to
 queries that subsequently use them. For these reasons, hash index use is presently
 discouraged."
2015-11-24 13:16:46 +09:00
Ian Barwick
1db22546a9 Add missing 'break' 2015-11-23 20:16:42 +09:00
Ian Barwick
7ae0df9c85 Remove unused variable 2015-11-23 16:42:53 +09:00
Ian Barwick
7a80f7a096 Shift some common but not terribly informative log messages to verbose mode only 2015-11-23 14:23:40 +09:00
Ian Barwick
8710e067d0 Logging fixes 2015-11-23 14:07:14 +09:00
Ian Barwick
793950eabd Update TODO 2015-11-23 10:53:59 +09:00
Ian Barwick
d1b4280182 Add /etc/repmgr.conf as a default configuration file location
Also refactor configuration file handling while we're at it.

Previously a configuration file would be ignored if it couldn't
be opened, however that is now treated as an error.
2015-11-19 15:16:18 +09:00
Ian Barwick
64d038c823 Simplify logger_init() parameters
We're passing the t_configuration_options structure anyway, no need to
pass items it contains as separate parameters.
2015-11-19 14:05:20 +09:00
Ian Barwick
46dd734b3d Update code comments 2015-11-19 12:55:13 +09:00
Ian Barwick
0a2e4466aa Update TODO 2015-11-19 11:38:44 +09:00
Ian Barwick
17ab86f7ac Don't display warnings about unused command line parameters in --terse mode 2015-11-19 11:09:36 +09:00
Ian Barwick
d433982af7 repmgr: don't error out on superfluous command line options
When parsing command line arguments in check_parameters_for_action(),
create warnings for paramters supplied but not required (e.g. -D/--data-dir
for MASTER REGISTER), rather than fail with error(s), as the
presence of the parameters won't cause any problems.

Errors will still be raised for required-but-missing parameters, of course.
2015-11-19 10:19:15 +09:00
Ian Barwick
869b6a7a06 Remove implemented TODO item 2015-11-19 09:38:20 +09:00
Ian Barwick
9018dc65de Metadata update also handled by repmgr 2015-11-18 13:17:51 +09:00
Ian Barwick
9cbd8df089 When following a new primary, have repmgr (not repmgrd) create the new slot 2015-11-18 13:06:56 +09:00
Ian Barwick
67a81d1d47 Minor log message fixes 2015-11-18 09:10:22 +09:00
Ian Barwick
ab70007b75 Add a TODO item 2015-11-17 17:13:26 +09:00
Ian Barwick
0145aa0fc3 Update TODO 2015-11-17 15:42:36 +09:00
Ian Barwick
493c307b23 Remove implemented items from TODO list
* repmgr: add explicit --log-level flag, repurpose --verbose flag to
  show extra detailed/repetitive output only (see item below too)

  -> e0cbdd5b31

* debug output: show some repetitive output only if --verbose flag set to prevent
  excessive log growth

  -> 8ab1901a93
2015-11-17 15:35:13 +09:00
Ian Barwick
fc6225a511 Refactor get_master_connection() and update description
Use 'remote_conn' instead of 'master_conn', as the connection
handle can potentially be used for any node.
2015-11-17 13:59:28 +09:00
Ian Barwick
e3111d37ba get_master_connection(): order node list by node type and priority
This should make it more likely that the actual primary is first
in the retrieved list, reducing the number of connections to
other nodes in the cluster which need to be made.
2015-11-17 13:59:28 +09:00
Ian Barwick
2a1a9f2e61 Code formatting 2015-11-17 11:01:53 +09:00
Ian Barwick
71a667ecb8 Fix variable argument handling with log_hint()/log_verbose() 2015-11-17 07:34:20 +09:00
Ian Barwick
3ab91730c3 get_master_connection(): possible to use is_standby() now 2015-11-16 17:49:02 +09:00
Ian Barwick
dd7f9b79ae Tidy up logging output in dbutils.c
Log all executed SQL if verbose mode is enabled.
2015-11-16 17:39:42 +09:00
Ian Barwick
8ab1901a93 Repurpose -v/--verbose; add -t/--terse option (repmgr only)
repmgr and particularly repmgrd currently produce substantial
amounts of log output. Much of this is only useful when troubleshooting
or debugging.

Previously the -v/--verbose option just forced the log level to
INFO. With repmgrd this is pretty pointless - just set the log
level in the configuration file. With repmgr the configuration
file can be overriden by the new -L/--log-level option.

-v/--verbose now provides an additional, chattier/pedantic level
of logging ("Opening *this* logfile", "Executing *this* query",
"running in *this* loop") which is helpful for understanding
repmgr/repmgrd's behaviour, particularly for troubleshooting.
What additional verbose logging is generated will of course a
also depends on the log level set, so e.g. someone trying to
work out which configuration file is actually being opened
can use '--log-level=INFO --verbose' without being bothered
by an avalanche of extra verbose debugging output.

-t/--terse option will silence certain non-essential output, at
the moment any HINTs.

Note that -v/--verbose and -t/--terse are not mutually exclusive
(suggestions for better names welcome).
2015-11-16 13:06:32 +09:00
Ian Barwick
e0cbdd5b31 Add -L/--log-level command line option to repmgr
Overrides any setting in the config file. This will replace the
-v/--verbose option.
2015-11-13 20:54:58 +09:00
Ian Barwick
d62aaeedd0 Change directory warning to a hint 2015-11-13 20:28:27 +09:00
Ian Barwick
05cc7091b5 Explicitly mark static functions as static 2015-11-13 19:46:12 +09:00
Ian Barwick
d192d5665c detect_log_level(): return -1 to indicate invalid log level
0 is EMERG, which is not actually used but is valid. Prior to this
change, repmgr would complain about an invalid log level if set to
this.
2015-11-13 19:39:43 +09:00
Ian Barwick
3848b9011b README.md: add note about setting repmgr user search path 2015-11-13 17:05:45 +09:00
Ian Barwick
487aadc4b9 Add TODO items 2015-11-13 14:51:27 +09:00
Ian Barwick
3f5920a395 Add hint about -c/--fast-checkpoint
When cloning a server without this option, and pg_start_backup() takes time
to complete, repmgr appears to hang and give no indication of what may
or may not be happening. The hint provides an explanation for any
delay and possible action which could be taken to mitigate it.
2015-11-13 14:47:10 +09:00
Ian Barwick
617ea8cb78 Add log_hint() function for logging hints
There are a few places where additional hints are written as log
output, usually LOG_NOTICE. Create an explicit function to provide
hints in a standardized manner; by storing the log level of the
previous logger call, we can ensure the hint is only displayed when
the log message itself would be.

Part of an ongoing effort to better control repmgr's logging output.
2015-11-13 14:29:11 +09:00
Ian Barwick
142517fcca Always use catalog path when calling system functions
Removes any risk of issues due to search path mangling etc.
2015-11-11 11:17:47 +09:00
Ian Barwick
d722e2c74b Clean up help output
master/primary register only has one option
2015-11-10 10:49:29 +09:00
Ian Barwick
abb02cab76 Improve configuration file parsing
Related to Github #127.

- use the previously introduced repmgr_atoi() function to parse
  integers better
- collate all detected errors and output as a list, rather than
  failing on the first error.
2015-11-09 14:56:35 +09:00
Ian Barwick
8e66e4811c Rename variable 'reconnect_intvl' to 'reconnect_interval'
For consistency with the configuration file parameter name
2015-11-09 11:04:42 +09:00
Ian Barwick
ce5a541960 Use strtol() to parse config file arguments too 2015-11-09 11:02:04 +09:00
Ian Barwick
e12be52fa8 Use strtol() in place of atoi() to better verify integer parameters
Per GitHub #127
2015-11-09 11:00:53 +09:00
Ian Barwick
c0911d3286 cluster cleanup: standardize error message and return code 2015-11-05 15:54:58 +09:00
Ian Barwick
6e94432282 Fix log wording 2015-11-05 15:42:07 +09:00
Ian Barwick
29d9232e2f Add informtative logging output for 'repmgr cluster cleanup'
Per Github issue #126.
2015-11-05 15:38:40 +09:00
Ian Barwick
8973812144 "How many" -> "Number of" 2015-11-05 13:51:05 +09:00
Ian Barwick
e775a962ad "in 9.4" -> "from 9.4" 2015-11-05 13:49:36 +09:00
Ian Barwick
12204f7e56 Add note about logfile rotation and repmgrd 2015-11-03 15:34:43 +09:00
Ian Barwick
684f7590b7 Point out existence of the FAQ 2015-11-03 15:22:55 +09:00
Ian Barwick
9d589a780d Wording tweak to prevent ambiguity. 2015-11-03 15:18:57 +09:00
Ian Barwick
83e6d15410 Add TODO item 2015-10-30 17:03:01 +09:00
Ian Barwick
6a10fe0cd9 Replace "slave" with "standby" for consistency
"standby" is used everywhere except in these two error messages.
2015-10-30 16:59:39 +09:00
Ian Barwick
c664682c05 Remove duplicated TODO item 2015-10-30 16:58:38 +09:00
Ian Barwick
44acc8d719 Update HISTORY 2015-10-28 16:13:41 +09:00
Ian Barwick
b911483d5e Specify relevant node in error message 2015-10-28 16:10:08 +09:00
Ian Barwick
ee9270fe8d Terminate repmgrd if standby is no longer connected to upstream 2015-10-28 16:05:35 +09:00
Ian Barwick
d0a4eebeec Update FAQ entry about witness port specification 2015-10-27 18:26:38 +09:00
Ian Barwick
0f5e71f029 Add FAQ item about repmgr permissions in pg_hba.conf 2015-10-27 18:24:02 +09:00
Ian Barwick
dbd90d45f5 Update TODO 2015-10-27 14:41:04 +09:00
Ian Barwick
c8d0fb401f Add note about default log level 2015-10-27 14:39:30 +09:00
Ian Barwick
afda3419cc Put declarations at top of file 2015-10-27 13:17:18 +09:00
Ian Barwick
a86fa4ad4a Explicitly set default value for 'use_replication_slots' 2015-10-27 13:01:20 +09:00
Ian Barwick
7e3007f6e8 Add 'primary_response_timeout' as synonym for 'master_response_timeout'
We'll switch terminology in a future release and maintain
'master_response_timeout' for backwards compatibility
2015-10-27 10:23:54 +09:00
Ian Barwick
8c797a8fea Clarify items in the sample repmgr.conf file
Based on customer feedback.
2015-10-26 16:49:45 +09:00
Ian Barwick
56cec22f22 Use pg_malloc0() instead of malloc()
See also d08bd352c1
2015-10-26 15:34:33 +09:00
Ian Barwick
b61649a3e3 Bump dev version number 2015-10-26 15:14:14 +09:00
Ian Barwick
ded716e403 Improve logging and event notifications when following new upstream node 2015-10-23 09:36:42 +09:00
Ian Barwick
d639dc3342 Add note about checking replication slots when following upstream node 2015-10-23 09:21:01 +09:00
Ian Barwick
17ed81ebb7 Improve log messages when following new primary 2015-10-23 09:17:20 +09:00
Ian Barwick
b00c507ee4 Minor formatting tweak 2015-10-23 08:51:26 +09:00
Ian Barwick
55d8b2ad9c Clarify some items in sample config file
Also change "master" to "primary" in the comments for consistency
with main PostgreSQL terminology. We'll need to add aliases
for the configuration parameters at some point...
2015-10-23 08:49:45 +09:00
Ian Barwick
c918aaad4a Update TODO 2015-10-23 08:49:44 +09:00
Ian Barwick
6e7eee4c01 Only log some debug items if verbose flag is set. 2015-10-23 08:29:35 +09:00
Ian Barwick
5c59e8fc5b Add missing space 2015-10-22 13:13:01 +09:00
Ian Barwick
eba0b6bb1e Update TODO 2015-10-14 16:57:24 +09:00
Ian Barwick
3bc0b80a71 Reword log level error message to be more like the Postgres one 2015-10-07 10:54:47 +09:00
Ian Barwick
06b9e0a8ec Clarify purpose of get_repmgr_schema() 2015-10-07 10:54:47 +09:00
Martín Marqués
120be2db1c Fix bug which prevents repmgrd from starting when the cluster name has
upper case letters.
2015-10-06 20:54:28 -03:00
Ian Barwick
12bd7da836 Clean up --help output
There are a confusing number of command line options, some
of which are only valid for particular operations, e.g. "standby clone".
2015-10-05 11:53:45 +09:00
Ian Barwick
2fd905cf9e Update HISTORY 2015-10-02 14:40:52 +09:00
Sébastien Gross
dd7ebdc1c7 Add more information in "cluster show"
- Add display of current node name.
- Add display of upstream node name.
2015-09-23 16:02:10 +02:00
József Kószó
1636805fa1 Debian init script repmgrd process stop fix 2015-09-21 03:13:19 +02:00
Benjamin Guillon
899d789699 Fixed version number and package dependencies in deb control file. 2015-06-04 11:26:36 +02:00
Benjamin Guillon
cd7a3215df Fixed tabs/spaces issues caused by lazy copy/paster in github web UI when creating the PR. 2015-06-04 11:25:45 +02:00
Benjamin Guillon
f8fd344d9f Updated makefile for deb creation
Added the ability to fetch the installed postgresql version when building the deb package so that install paths are correct.
2015-06-04 09:51:03 +02:00
48 changed files with 11967 additions and 3069 deletions

View File

@@ -2,7 +2,7 @@ License and Contributions
=========================
`repmgr` is licensed under the GPL v3. All of its code and documentation is
Copyright 2010-2015, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
Copyright 2010-2017, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
details.
The development of repmgr has primarily been sponsored by 2ndQuadrant customers.
@@ -21,9 +21,11 @@ copy of the relevant Copyright Assignment Form.
Code style
----------
Code in repmgr is formatted to a consistent style using the following command:
Code in repmgr should be formatted to the same standards as the main PostgreSQL
project. For more details see:
astyle --style=ansi --indent=tab --suffix=none *.c *.h
https://www.postgresql.org/docs/current/static/source-format.html
Contributors should reformat their code similarly before submitting code to
the project, in order to minimize merge conflicts with other work.
the project, in order to minimize merge conflicts with other work.

View File

@@ -1,4 +1,4 @@
Copyright (c) 2010-2015, 2ndQuadrant Limited
Copyright (c) 2010-2017, 2ndQuadrant Limited
All rights reserved.
This program is free software: you can redistribute it and/or modify

View File

@@ -1,237 +1 @@
====================================================
PostgreSQL Automatic Failover - User Documentation
====================================================
Automatic Failover
==================
repmgr allows for automatic failover when it detects the failure of the master node.
Following is a quick setup for this.
Installation
============
For convenience, we define:
**node1**
is the fully qualified domain name of the Master server, IP 192.168.1.10
**node2**
is the fully qualified domain name of the Standby server, IP 192.168.1.11
**witness**
is the fully qualified domain name of the server used as a witness, IP 192.168.1.12
**Note:** We don't recommend using names with the status of a server like «masterserver»,
because it would be confusing once a failover takes place and the Master is
now on the «standbyserver».
Summary
-------
2 PostgreSQL servers are involved in the replication. Automatic failover needs
a vote to decide what server it should promote, so an odd number is required.
A witness-repmgrd is installed in a third server where it uses a PostgreSQL
cluster to communicate with other repmgrd daemons.
1. Install PostgreSQL in all the servers involved (including the witness server)
2. Install repmgr in all the servers involved (including the witness server)
3. Configure the Master PostreSQL
4. Clone the Master to the Standby using "repmgr standby clone" command
5. Configure repmgr in all the servers involved (including the witness server)
6. Register Master and Standby nodes
7. Initiate witness server
8. Start the repmgrd daemons in all nodes
**Note** A complete High-Availability design needs at least 3 servers to still have
a backup node after a first failure.
Install PostgreSQL
------------------
You can install PostgreSQL using any of the recommended methods. You should ensure
it's 9.0 or later.
Install repmgr
--------------
Install repmgr following the steps in the README file.
Configure PostreSQL
-------------------
Log in to node1.
Edit the file postgresql.conf and modify the parameters::
listen_addresses='*'
wal_level = 'hot_standby'
archive_mode = on
archive_command = 'cd .' # we can also use exit 0, anything that
# just does nothing
max_wal_senders = 10
wal_keep_segments = 5000 # 80 GB required on pg_xlog
hot_standby = on
shared_preload_libraries = 'repmgr_funcs'
Edit the file pg_hba.conf and add lines for the replication::
host repmgr repmgr 127.0.0.1/32 trust
host repmgr repmgr 192.168.1.10/30 trust
host replication all 192.168.1.10/30 trust
**Note:** It is also possible to use a password authentication (md5), .pgpass file
should be edited to allow connection between each node.
Create the user and database to manage replication::
su - postgres
createuser -s repmgr
createdb -O repmgr repmgr
psql -f /usr/share/postgresql/9.0/contrib/repmgr_funcs.sql repmgr
Restart the PostgreSQL server::
pg_ctl -D $PGDATA restart
And check everything is fine in the server log.
Create the ssh-key for the postgres user and copy it to other servers::
su - postgres
ssh-keygen # /!\ do not use a passphrase /!\
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
exit
rsync -avz ~postgres/.ssh/authorized_keys node2:~postgres/.ssh/
rsync -avz ~postgres/.ssh/authorized_keys witness:~postgres/.ssh/
rsync -avz ~postgres/.ssh/id_rsa* node2:~postgres/.ssh/
rsync -avz ~postgres/.ssh/id_rsa* witness:~postgres/.ssh/
Clone Master
------------
Log in to node2.
Clone node1 (the current Master)::
su - postgres
repmgr -d repmgr -U repmgr -h node1 standby clone
Start the PostgreSQL server::
pg_ctl -D $PGDATA start
And check everything is fine in the server log.
Configure repmgr
----------------
Log in to each server and configure repmgr by editing the file
/etc/repmgr/repmgr.conf::
cluster=my_cluster
node=1
node_name=earth
conninfo='host=192.168.1.10 dbname=repmgr user=repmgr'
master_response_timeout=60
reconnect_attempts=6
reconnect_interval=10
failover=automatic
promote_command='promote_command.sh'
follow_command='repmgr standby follow -f /etc/repmgr/repmgr.conf'
**cluster**
is the name of the current replication.
**node**
is the number of the current node (1, 2 or 3 in the current example).
**node_name**
is an identifier for every node.
**conninfo**
is used to connect to the local PostgreSQL server (where the configuration file is) from any node. In the witness server configuration you need to add a 'port=5499' to the conninfo.
**master_response_timeout**
is the maximum amount of time we are going to wait before deciding the master has died and start the failover procedure.
**reconnect_attempts**
is the number of times we will try to reconnect to master after a failure has been detected and before start the failover procedure.
**reconnect_interval**
is the amount of time between retries to reconnect to master after a failure has been detected and before start the failover procedure.
**failover**
configure behavior: *manual* or *automatic*.
**promote_command**
the command executed to do the failover (including the PostgreSQL failover itself). The command must return 0 on success.
**follow_command**
the command executed to address the current standby to another Master. The command must return 0 on success.
Register Master and Standby
---------------------------
Log in to node1.
Register the node as Master::
su - postgres
repmgr -f /etc/repmgr/repmgr.conf master register
Log in to node2. Register it as a standby::
su - postgres
repmgr -f /etc/repmgr/repmgr.conf standby register
Initialize witness server
-------------------------
Log in to witness.
Initialize the witness server::
su - postgres
repmgr -d repmgr -U repmgr -h 192.168.1.10 -D $WITNESS_PGDATA -f /etc/repmgr/repmgr.conf witness create
The witness server needs the following information from the command
line:
* Connection details for the current master, to copy the cluster
configuration.
* A location for initializing its own $PGDATA.
repmgr will also ask for the superuser password on the witness database so
it can reconnect when needed (the command line option --initdb-no-pwprompt
will set up a password-less superuser).
By default the witness server will listen on port 5499; this value can be
overridden by explicitly providing the port number in the conninfo string
in repmgr.conf. (Note that it is also possible to specify the port number
with the -l/--local-port option, however this option is now deprecated and
will be overridden by a port setting in the conninfo string).
Start the repmgrd daemons
-------------------------
Log in to node2 and witness::
su - postgres
repmgrd -f /etc/repmgr/repmgr.conf --daemonize -> /var/log/postgresql/repmgr.log 2>&1
**Note:** The Master does not need a repmgrd daemon.
Suspend Automatic behavior
==========================
Edit the repmgr.conf of the node to remove from automatic processing and change::
failover=manual
Then, signal repmgrd daemon::
su - postgres
kill -HUP $(pidof repmgrd)
Usage
=====
The repmgr documentation is in the README file (how to build, options, etc.)
The contents of this file have been incorporated into the main README.md document.

51
FAQ.md
View File

@@ -34,6 +34,11 @@ General
replication slots, setting a higher figure will make adding new nodes
easier.
- Does `repmgr` support hash indexes?
No. Hash indexes and replication do not mix well and their use is
explicitly discouraged; see:
https://www.postgresql.org/docs/current/interactive/sql-createindex.html#AEN74175
`repmgr`
--------
@@ -96,8 +101,9 @@ General
is intended to support running the witness server as a separate
instance on a normal node server, rather than on its own dedicated server.
To specify a port for the witness server, supply the port number to
repmgr with the `-l/--local-port` command line option.
To specify different port for the witness server, supply the port number
in the `conninfo` string in `repmgr.conf`
(repmgr 3.0.1 and earlier: use the `-l/--local-port` option)
- Do I need to include `shared_preload_libraries = 'repmgr_funcs'`
in `postgresql.conf` if I'm not using `repmgrd`?
@@ -106,6 +112,31 @@ General
If you later decide to run `repmgrd`, you just need to add
`shared_preload_libraries = 'repmgr_funcs'` and restart PostgreSQL.
- I've provided replication permission for the `repmgr` user in `pg_hba.conf`
but `repmgr`/`repmgrd` complains it can't connect to the server... Why?
`repmgr`/`repmgrd` need to be able to connect to the repmgr database
with a normal connection to query metadata. The `replication` connection
permission is for PostgreSQL's streaming replication and doesn't
necessarily need to be the `repmgr` user.
- When cloning a standby, why do I need to provide the connection parameters
for the primary server on the command line, not in the configuration file?
Cloning a standby is a one-time action; the role of the server being cloned
from could change, so fixing it in the configuration file would create
confusion. If `repmgr` needs to establish a connection to the primary
server, it can retrieve this from the `repl_nodes` table or if necessary
scan the replication cluster until it locates the active primary.
- Why is there no foreign key on the `node_id` column in the `repl_events`
table?
Under some circumstances event notifications can be generated for servers
which have not yet been registered; it's also useful to retain a record
of events which includes servers removed from the replication cluster
which no longer have an entry in the `repl_nodes` table.
`repmgrd`
---------
@@ -121,6 +152,9 @@ General
In `repmgr.conf`, set its priority to a value of 0 or less.
Additionally, if `failover` is set to `manual`, the node will never
be considered as a promotion candidate.
- Does `repmgrd` support delayed standbys?
`repmgrd` can monitor delayed standbys - those set up with
@@ -134,3 +168,16 @@ General
Note that after registering a delayed standby, `repmgrd` will only start
once the metadata added in the master node has been replicated.
- How can I get `repmgrd` to rotate its logfile?
Configure your system's `logrotate` service to do this; see example
in README.md
- I've recloned a failed master as a standby, but `repmgrd` refuses to start?
Check you registered the standby after recloning. If unregistered the standby
cannot be considered as a promotion candidate even if `failover` is set to
`automatic`, which is probably not what you want. `repmgrd` will start if
`failover` is set to `manual` so the node's replication status can still
be monitored, if desired.

156
HISTORY
View File

@@ -1,4 +1,154 @@
3.0.2 2015-09-
3.4.0 2019-02-
default log level is now INFO (Ian)
repmgr: fix `standby register --force` when updating existing node record (Ian)
repmgrd: set LSN shared memory value at standby startup (Ian)
repmgrd: improve logging during failover (Ian)
3.3.2 2017-06-01
Add support for PostgreSQL 10 (Ian)
repmgr: ensure --replication-user option is honoured when passing database
connection parameters as a conninfo string (Ian)
repmgr: improve detection of pg_rewind on remote server (Ian)
repmgr: add DETAIL log output for additional clarification of error messages (Ian)
repmgr: suppress various spurious error messages in `standby follow` and
`standby switchover` (Ian)
repmgr: add missing `-P` option (Ian)
repmgrd: monitoring statistic reporting fixes (Ian)
3.3.1 2017-03-13
repmgrd: prevent invalid apply lag value being written to the
monitoring table (Ian)
repmgrd: fix error in XLogRecPtr conversion when calculating
monitoring statistics (Ian)
repmgr: if replication slots in use, where possible delete slot on old
upstream node after following new upstream (Ian)
repmgr: improve logging of rsync actions (Ian)
repmgr: improve `standby clone` when synchronous replication in use (Ian)
repmgr: stricter checking of allowed node id values
repmgr: enable `master register --force` when there is a foreign key
dependency from a standby node (Ian)
3.3 2016-12-27
repmgr: always log to STDERR even if log facility defined (Ian)
repmgr: add --log-to-file to log repmgr output to the defined
log facility (Ian)
repmgr: improve handling of command line parameter errors (Ian)
repmgr: add option --upstream-conninfo to explicitly set
'primary_conninfo' in recovery.conf (Ian)
repmgr: enable a standby to be registered which isn't running (Ian)
repmgr: enable `standby register --force` to update a node record
with cascaded downstream node records (Ian)
repmgr: add option `--no-conninfo-password` (Abhijit, Ian)
repmgr: add initial support for PostgreSQL 10.0 (Ian)
repmgr: escape values in primary_conninfo if needed (Ian)
3.2.1 2016-10-24
repmgr: require a valid repmgr cluster name unless -F/--force
supplied (Ian)
repmgr: check master server is registered with repmgr before
cloning (Ian)
repmgr: ensure data directory defaults to that of the source node (Ian)
repmgr: various fixes to Barman cloning mode (Gianni, Ian)
repmgr: fix `repmgr cluster crosscheck` output (Ian)
3.2 2016-10-05
repmgr: add support for cloning from a Barman backup (Gianni)
repmgr: add commands `standby matrix` and `standby crosscheck` (Gianni)
repmgr: suppress connection error display in `repmgr cluster show`
unless `--verbose` supplied (Ian)
repmgr: add commands `witness register` and `witness unregister` (Ian)
repmgr: enable `standby unregister` / `witness unregister` to be
executed for a node which is not running (Ian)
repmgr: remove deprecated command line options --initdb-no-pwprompt and
-l/--local-port (Ian)
repmgr: before cloning with pg_basebackup, check that sufficient free
walsenders are available (Ian)
repmgr: add option `--wait-sync` for `standby register` which causes
repmgr to wait for the registered node record to synchronise to
the standby (Ian)
repmgr: add option `--copy-external-config-files` for files outside
of the data directory (Ian)
repmgr: only require `wal_keep_segments` to be set in certain corner
cases (Ian)
repmgr: better support cloning from a node other than the one to
stream from (Ian)
repmgrd: add configuration options to override the default pg_ctl
commands (Jarkko Oranen, Ian)
repmgrd: don't start if node is inactive and failover=automatic (Ian)
packaging: improve "repmgr-auto" Debian package (Gianni)
3.1.5 2016-08-15
repmgrd: in a failover situation, prevent endless looping when
attempting to establish the status of a node with
`failover=manual` (Ian)
repmgrd: improve handling of failover events on standbys with
`failover=manual`, and create a new event notification
for this, `standby_disconnect_manual` (Ian)
repmgr: add further event notifications (Gianni)
repmgr: when executing `standby switchover`, don't collect remote
command output unless required (Gianni, Ian)
repmgrd: improve standby monitoring query (Ian, based on suggestion
from Álvaro)
repmgr: various command line handling improvements (Ian)
3.1.4 2016-07-12
repmgr: new configuration option for setting "restore_command"
in the recovery.conf file generated by repmgr (Martín)
repmgr: add --csv option to "repmgr cluster show" (Gianni)
repmgr: enable provision of a conninfo string as the -d/--dbname
parameter, similar to other PostgreSQL utilities (Ian)
repmgr: during switchover operations improve detection of
demotion candidate shutdown (Ian)
various bugfixes and documentation updates (Ian, Martín)
3.1.3 2016-05-17
repmgrd: enable monitoring when a standby is catching up by
replaying archived WAL (Ian)
repmgrd: when upstream_node_id is NULL, assume upstream node
to be current master (Ian)
repmgrd: check for reappearance of the master node if standby
promotion fails (Ian)
improve handling of rsync failure conditions (Martín)
3.1.2 2016-04-12
Fix pg_ctl path generation in do_standby_switchover() (Ian)
Regularly sync witness server repl_nodes table (Ian)
Documentation improvements (Gianni, dhyannataraj)
(Experimental) ensure repmgr handles failover slots when copying
in rsync mode (Craig, Ian)
rsync mode handling fixes (Martín)
Enable repmgr to compile against 9.6devel (Ian)
3.1.1 2016-02-24
Add '-P/--pwprompt' option for "repmgr create witness" (Ian)
Prevent repmgr/repmgrd running as root (Ian)
3.1.0 2016-02-01
Add "repmgr standby switchover" command (Ian)
Revised README file (Ian)
Remove requirement for 'archive_mode' to be enabled (Ian)
Improve -?/--help output, showing default values if relevant (Ian)
Various bugfixes to command line/configuration parameter handling (Ian)
3.0.3 2016-01-04
Create replication slot if required before base backup is run (Abhijit)
standy clone: when using rsync, clean up "pg_replslot" directory (Ian)
Improve --help output (Ian)
Improve config file parsing (Ian)
Various logging output improvements, including explicit HINTS (Ian)
Add --log-level to explicitly set log level on command line (Ian)
Repurpose --verbose to display extra log output (Ian)
Add --terse to hide hints and other non-critical output (Ian)
Reference internal functions with explicit catalog path (Ian)
When following a new primary, have repmgr (not repmgrd) create the new slot (Ian)
Add /etc/repmgr.conf as a default configuration file location (Ian)
Prevent repmgrd's -v/--verbose option expecting a parameter (Ian)
Prevent invalid replication_lag values being written to the monitoring table (Ian)
Improve repmgrd behaviour when monitored standby node is temporarily
unavailable (Martín)
3.0.2 2015-10-02
Improve handling of --help/--version options; and improve help output (Ian)
Improve handling of situation where logfile can't be opened (Ian)
Always pass -D/--pgdata option to pg_basebackup (Ian)
@@ -12,7 +162,9 @@
Update tablespace remapping in --rsync-only mode for 9.5 and later (Ian)
Deprecate `-l/--local-port` option - the port can be extracted
from the conninfo string in repmgr.conf (Ian)
Add STANDBY UNREGISTE (Vik Fearing)
Add STANDBY UNREGISTER (Vik Fearing)
Don't fail with error when registering master if schema already defined (Ian)
Fixes to whitespace handling when parsing config file (Ian)
3.0.1 2015-04-16
Prevent repmgrd from looping infinitely if node was not registered (Ian)

View File

@@ -1,24 +1,34 @@
#
# Makefile
# Copyright (c) 2ndQuadrant, 2010-2015
# Copyright (c) 2ndQuadrant, 2010-2017
HEADERS = $(wildcard *.h)
repmgrd_OBJS = dbutils.o config.o repmgrd.o log.o strutil.o
repmgr_OBJS = dbutils.o check_dir.o config.o repmgr.o log.o strutil.o
repmgr_OBJS = dbutils.o check_dir.o config.o repmgr.o log.o strutil.o dirmod.o compat.o
DATA = repmgr.sql uninstall_repmgr.sql
REGRESS = repmgr_funcs repmgr_test
PG_CPPFLAGS = -I$(libpq_srcdir)
PG_LIBS = $(libpq_pgport)
PG_CPPFLAGS = -I$(includedir_internal) -I$(libpq_srcdir)
PG_LIBS = $(libpq_pgport)
all: repmgrd repmgr
all: repmgrd repmgr
$(MAKE) -C sql
repmgrd: $(repmgrd_OBJS)
$(CC) $(CFLAGS) $(repmgrd_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o repmgrd
$(CC) -o repmgrd $(CFLAGS) $(repmgrd_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX)
$(MAKE) -C sql
repmgr: $(repmgr_OBJS)
$(CC) $(CFLAGS) $(repmgr_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o repmgr
$(CC) -o repmgr $(CFLAGS) $(repmgr_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX)
# Make all objects depend on all include files. This is a bit of a
# shotgun approach, but the codebase is small enough that a complete rebuild
# is very fast anyway.
$(repmgr_OBJS): $(HEADERS)
$(repmgrd_OBJS): $(HEADERS)
ifdef USE_PGXS
PG_CONFIG = pg_config
@@ -31,8 +41,8 @@ include $(top_builddir)/src/Makefile.global
include $(top_srcdir)/contrib/contrib-global.mk
endif
# XXX: Try to use PROGRAM construct (see pgxs.mk) someday. Right now
# is overriding pgxs install.
# XXX: This overrides the pgxs install target - we're building two binaries,
# which is not supported by pgxs.mk's PROGRAM construct.
install: install_prog install_ext
install_prog:
@@ -43,6 +53,12 @@ install_prog:
install_ext:
$(MAKE) -C sql install
# Distribution-specific package building targets
# ----------------------------------------------
#
# XXX we recommend using the PGDG-supplied packages where possible;
# see README.md for details.
install_rhel:
mkdir -p '$(DESTDIR)/etc/init.d/'
$(INSTALL_PROGRAM) RHEL/repmgrd.init '$(DESTDIR)/etc/init.d/repmgrd'
@@ -67,16 +83,23 @@ clean:
rm -f repmgr
$(MAKE) -C sql clean
# Get correct version numbers and install paths, depending on your postgres version
PG_VERSION = $(shell pg_config --version | cut -d ' ' -f 2 | cut -d '.' -f 1,2)
REPMGR_VERSION = $(shell grep REPMGR_VERSION version.h | cut -d ' ' -f 3 | cut -d '"' -f 2)
PKGLIBDIR = $(shell pg_config --pkglibdir)
SHAREDIR = $(shell pg_config --sharedir)
PGBINDIR = /usr/lib/postgresql/$(PG_VERSION)/bin
deb: repmgrd repmgr
mkdir -p ./debian/usr/bin
cp repmgrd repmgr ./debian/usr/bin/
mkdir -p ./debian/usr/share/postgresql/9.0/contrib/
cp sql/repmgr_funcs.sql ./debian/usr/share/postgresql/9.0/contrib/
cp sql/uninstall_repmgr_funcs.sql ./debian/usr/share/postgresql/9.0/contrib/
mkdir -p ./debian/usr/lib/postgresql/9.0/lib/
cp sql/repmgr_funcs.so ./debian/usr/lib/postgresql/9.0/lib/
mkdir -p ./debian/usr/bin ./debian$(PGBINDIR)
cp repmgrd repmgr ./debian$(PGBINDIR)
ln -s ../..$(PGBINDIR)/repmgr ./debian/usr/bin/repmgr
mkdir -p ./debian$(SHAREDIR)/contrib/
cp sql/repmgr_funcs.sql ./debian$(SHAREDIR)/contrib/
cp sql/uninstall_repmgr_funcs.sql ./debian$(SHAREDIR)/contrib/
mkdir -p ./debian$(PKGLIBDIR)/
cp sql/repmgr_funcs.so ./debian$(PKGLIBDIR)/
dpkg-deb --build debian
mv debian.deb ../postgresql-repmgr-9.0_1.0.0.deb
mv debian.deb ../postgresql-repmgr-$(PG_VERSION)_$(REPMGR_VERSION).deb
rm -rf ./debian/usr

View File

@@ -1,118 +1 @@
repmgr quickstart guide
=======================
This quickstart guide provides some annotated examples on basic
`repmgr` setup. It assumes you are familiar with PostgreSQL replication
concepts setup and Linux/UNIX system administration.
For the purposes of this guide, we'll assume the database user will be
`repmgr_usr` and the database will be `repmgr_db`.
Master setup
------------
1. Configure PostgreSQL
- create user and database:
```
CREATE ROLE repmgr_usr LOGIN SUPERUSER;
CREATE DATABASE repmgr_db OWNER repmgr_usr;
```
- configure `postgresql.conf` for replication (see README.md for sample
settings)
- update `pg_hba.conf`, e.g.:
```
host repmgr_db repmgr_usr 192.168.1.0/24 trust
host replication repmgr_usr 192.168.1.0/24 trust
```
Restart the PostgreSQL server after making these changes.
2. Create the `repmgr` configuration file:
$ cat /path/to/repmgr/node1/repmgr.conf
cluster=test
node=1
node_name=node1
conninfo='host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
pg_bindir=/path/to/postgres/bin
(For an annotated `repmgr.conf` file, see `repmgr.conf.sample` in the
repository's root directory).
3. Register the master node with `repmgr`:
$ repmgr -f /path/to/repmgr/node1/repmgr.conf --verbose master register
[2015-03-03 17:45:53] [INFO] repmgr connecting to master database
[2015-03-03 17:45:53] [INFO] repmgr connected to master, checking its state
[2015-03-03 17:45:53] [INFO] master register: creating database objects inside the repmgr_test schema
[2015-03-03 17:45:53] [NOTICE] Master node correctly registered for cluster test with id 1 (conninfo: host=localhost user=repmgr_usr dbname=repmgr_db)
Standby setup
-------------
1. Use `repmgr standby clone` to clone a standby from the master:
repmgr -D /path/to/standby/data -d repmgr_db -U repmgr_usr --verbose standby clone 192.168.1.2
[2015-03-03 18:18:21] [NOTICE] No configuration file provided and default file './repmgr.conf' not found - continuing with default values
[2015-03-03 18:18:21] [NOTICE] repmgr Destination directory ' /path/to/standby/data' provided
[2015-03-03 18:18:21] [INFO] repmgr connecting to upstream node
[2015-03-03 18:18:21] [INFO] repmgr connected to upstream node, checking its state
[2015-03-03 18:18:21] [INFO] Successfully connected to upstream node. Current installation size is 27 MB
[2015-03-03 18:18:21] [NOTICE] Starting backup...
[2015-03-03 18:18:21] [INFO] creating directory " /path/to/standby/data"...
[2015-03-03 18:18:21] [INFO] Executing: 'pg_basebackup -l "repmgr base backup" -h localhost -p 9595 -U repmgr_usr -D /path/to/standby/data '
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
[2015-03-03 18:18:23] [NOTICE] repmgr standby clone (using pg_basebackup) complete
[2015-03-03 18:18:23] [NOTICE] HINT: You can now start your postgresql server
[2015-03-03 18:18:23] [NOTICE] for example : pg_ctl -D /path/to/standby/data start
Note that the `repmgr.conf` file is not required when cloning a standby.
However we recommend providing a valid `repmgr.conf` if you wish to use
replication slots, or want `repmgr` to log the clone event to the
`repl_events` table.
This will clone the PostgreSQL database files from the master, including its
`postgresql.conf` and `pg_hba.conf` files, and additionally automatically create
the `recovery.conf` file containing the correct parameters to start streaming
from the primary node.
2. Start the PostgreSQL server
3. Create the `repmgr` configuration file:
$ cat /path/node2/repmgr/repmgr.conf
cluster=test
node=2
node_name=node2
conninfo='host=repmgr_node2 user=repmgr_usr dbname=repmgr_db'
pg_bindir=/path/to/postgres/bin
4. Register the standby node with `repmgr`:
$ repmgr -f /path/to/repmgr/node2/repmgr.conf --verbose standby register
[2015-03-03 18:24:34] [NOTICE] Opening configuration file: /path/to/repmgr/node2/repmgr.conf
[2015-03-03 18:24:34] [INFO] repmgr connecting to standby database
[2015-03-03 18:24:34] [INFO] repmgr connecting to master database
[2015-03-03 18:24:34] [INFO] finding node list for cluster 'test'
[2015-03-03 18:24:34] [INFO] checking role of cluster node '1'
[2015-03-03 18:24:34] [INFO] repmgr connected to master, checking its state
[2015-03-03 18:24:34] [INFO] repmgr registering the standby
[2015-03-03 18:24:34] [INFO] repmgr registering the standby complete
[2015-03-03 18:24:34] [NOTICE] Standby node correctly registered for cluster test with id 2 (conninfo: host=localhost user=repmgr_usr dbname=repmgr_db)
This concludes the basic `repmgr` setup of master and standby. The records
created in the `repl_nodes` table should look something like this:
repmgr_db=# SELECT * from repmgr_test.repl_nodes;
id | type | upstream_node_id | cluster | name | conninfo | slot_name | priority | active
----+---------+------------------+---------+-------+----------------------------------------------------+-----------+----------+--------
1 | primary | | test | node1 | host=repmgr_node1 user=repmgr_usr dbname=repmgr_db | | 0 | t
2 | standby | 1 | test | node2 | host=repmgr_node2 user=repmgr_usr dbname=repmgr_db | | 0 | t
(2 rows)
The contents of this file have been incorporated into the main README.md document.

1981
README.md

File diff suppressed because it is too large Load Diff

View File

@@ -1,61 +0,0 @@
Summary: repmgr
Name: repmgr
Version: 3.0
Release: 1
License: GPLv3
Group: System Environment/Daemons
URL: http://repmgr.org
Packager: Ian Barwick <ian@2ndquadrant.com>
Vendor: 2ndQuadrant Limited
Distribution: centos
Source0: %{name}-%{version}.tar.gz
BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root
%description
repmgr is a utility suite which greatly simplifies
the process of setting up and managing replication
using streaming replication within a cluster of
PostgreSQL servers.
%prep
%setup
%build
export PATH=$PATH:/usr/pgsql-9.3/bin/
%{__make} USE_PGXS=1
%install
[ "%{buildroot}" != "/" ] && %{__rm} -rf %{buildroot}
export PATH=$PATH:/usr/pgsql-9.3/bin/
%{__make} USE_PGXS=1 install DESTDIR=%{buildroot} INSTALL="install -p"
%{__make} USE_PGXS=1 install_prog DESTDIR=%{buildroot} INSTALL="install -p"
%{__make} USE_PGXS=1 install_rhel DESTDIR=%{buildroot} INSTALL="install -p"
%clean
[ "%{buildroot}" != "/" ] && %{__rm} -rf %{buildroot}
%files
%defattr(-,root,root)
/usr/bin/repmgr
/usr/bin/repmgrd
/usr/pgsql-9.3/bin/repmgr
/usr/pgsql-9.3/bin/repmgrd
/usr/pgsql-9.3/lib/repmgr_funcs.so
/usr/pgsql-9.3/share/contrib/repmgr.sql
/usr/pgsql-9.3/share/contrib/repmgr_funcs.sql
/usr/pgsql-9.3/share/contrib/uninstall_repmgr.sql
/usr/pgsql-9.3/share/contrib/uninstall_repmgr_funcs.sql
%attr(0755,root,root)/etc/init.d/repmgrd
%attr(0644,root,root)/etc/sysconfig/repmgrd
%attr(0644,root,root)/etc/repmgr/repmgr.conf.sample
%changelog
* Tue Mar 10 2015 Ian Barwick ian@2ndquadrant.com>
- build for repmgr 3.0
* Thu Jun 05 2014 Nathan Van Overloop <nathan.van.overloop@nexperteam.be> 2.0.2
- fix witness creation to create db and user if needed
* Fri Apr 04 2014 Nathan Van Overloop <nathan.van.overloop@nexperteam.be> 2.0.1
- initial build for RHEL6

View File

@@ -1,133 +0,0 @@
#!/bin/sh
#
# chkconfig: - 75 16
# description: Enable repmgrd replication management and monitoring daemon for PostgreSQL
# processname: repmgrd
# pidfile="/var/run/${NAME}.pid"
# Source function library.
INITD=/etc/rc.d/init.d
. $INITD/functions
# Get function listing for cross-distribution logic.
TYPESET=`typeset -f|grep "declare"`
# Get network config.
. /etc/sysconfig/network
DESC="PostgreSQL replication management and monitoring daemon"
NAME=repmgrd
REPMGRD_ENABLED=no
REPMGRD_OPTS=
REPMGRD_USER=postgres
REPMGRD_BIN=/usr/pgsql-9.3/bin/repmgrd
REPMGRD_PIDFILE=/var/run/repmgrd.pid
REPMGRD_LOCK=/var/lock/subsys/${NAME}
REPMGRD_LOG=/var/lib/pgsql/9.3/data/pg_log/repmgrd.log
# Read configuration variable file if it is present
[ -r /etc/sysconfig/$NAME ] && . /etc/sysconfig/$NAME
# For SELinux we need to use 'runuser' not 'su'
if [ -x /sbin/runuser ]
then
SU=runuser
else
SU=su
fi
test -x $REPMGRD_BIN || exit 0
case "$REPMGRD_ENABLED" in
[Yy]*)
break
;;
*)
exit 0
;;
esac
if [ -z "${REPMGRD_OPTS}" ]
then
echo "Not starting ${NAME}, REPMGRD_OPTS not set in /etc/sysconfig/${NAME}"
exit 0
fi
start()
{
REPMGRD_START=$"Starting ${NAME} service: "
# Make sure startup-time log file is valid
if [ ! -e "${REPMGRD_LOG}" -a ! -h "${REPMGRD_LOG}" ]
then
touch "${REPMGRD_LOG}" || exit 1
chown ${REPMGRD_USER}:postgres "${REPMGRD_LOG}"
chmod go-rwx "${REPMGRD_LOG}"
[ -x /sbin/restorecon ] && /sbin/restorecon "${REPMGRD_LOG}"
fi
echo -n "${REPMGRD_START}"
$SU -l $REPMGRD_USER -c "${REPMGRD_BIN} ${REPMGRD_OPTS} -p ${REPMGRD_PIDFILE} &" >> "${REPMGRD_LOG}" 2>&1 < /dev/null
sleep 2
pid=`head -n 1 "${REPMGRD_PIDFILE}" 2>/dev/null`
if [ "x${pid}" != "x" ]
then
success "${REPMGRD_START}"
touch "${REPMGRD_LOCK}"
echo $pid > "${REPMGRD_PIDFILE}"
echo
else
failure "${REPMGRD_START}"
echo
script_result=1
fi
}
stop()
{
echo -n $"Stopping ${NAME} service: "
if [ -e "${REPMGRD_LOCK}" ]
then
killproc ${NAME}
ret=$?
if [ $ret -eq 0 ]
then
echo_success
rm -f "${REPMGRD_PIDFILE}"
rm -f "${REPMGRD_LOCK}"
else
echo_failure
script_result=1
fi
else
# not running; per LSB standards this is "ok"
echo_success
fi
echo
}
# See how we were called.
case "$1" in
start)
start
;;
stop)
stop
;;
status)
status -p $REPMGRD_PIDFILE $NAME
script_result=$?
;;
restart)
stop
start
;;
*)
echo $"Usage: $0 {start|stop|status|restart}"
exit 2
esac
exit $script_result

View File

@@ -1,21 +0,0 @@
# default settings for repmgrd. This file is source by /bin/sh from
# /etc/init.d/repmgrd
# disable repmgrd by default so it won't get started upon installation
# valid values: yes/no
REPMGRD_ENABLED=no
# Options for repmgrd (required)
#REPMGRD_OPTS="--verbose -d -f /var/lib/pgsql/repmgr/repmgr.conf"
# User to run repmgrd as
#REPMGRD_USER=postgres
# repmgrd binary
#REPMGRD_BIN=/usr/bin/repmgrd
# pid file
#REPMGRD_PIDFILE=/var/lib/pgsql/repmgr/repmgrd.pid
# log file
#REPMGRD_LOG=/var/lib/pgsql/repmgr/repmgrd.log

36
TODO
View File

@@ -7,6 +7,7 @@ Known issues in repmgr
* PGPASSFILE may not be passed to pg_basebackup
Planned feature improvements
============================
@@ -37,4 +38,37 @@ Planned feature improvements
before the primary. See github issue #80.
* make old master node ID available for event notification commands
(See github issue #80).
(See github issue #80).
* repmgr standby clone: possibility to use barman instead of performing a new base backup
* possibility to transform a failed master into a new standby with pg_rewind
* "repmgr standby switchover" to promote a standby in a controlled manner
and convert the existing primary into a standby
* make repmgrd more robust
* repmgr: when cloning a standby using pg_basebackup and replication slots are
requested, activate the replication slot using pg_receivexlog to negate the
need to set `wal_keep_segments` just for the initial clone (9.4 and 9.5).
* repmgr: enable "standby follow" to point a standby at another standby, not
just the replication cluster master (see GitHub #130)
Usability improvements
======================
* repmgr: add interrupt handler, so that if the program is interrupted
while running a backup, an attempt can be made to execute pg_stop_backup()
on the primary, to prevent an orphaned backup state existing.
* repmgr: when unregistering a node, delete any entries in the repl_monitoring
table.
* repmgr: for "standby unregister", accept connection parameters for the
primary and perform metadata updates (and slot removal) directly on
the primary, to allow a shutdown standby to be unregistered
(currently the standby must still be running, which means the replication
slot can't be dropped).

View File

@@ -1,6 +1,6 @@
/*
* check_dir.c - Directories management functions
* Copyright (C) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -320,10 +320,10 @@ _create_pg_dir(char *dir, bool force, bool for_witness)
}
else if (pg_dir && !force)
{
log_warning(_("\nThis looks like a PostgreSQL directory.\n"
"If you are sure you want to clone here, "
"please check there is no PostgreSQL server "
"running and use the --force option\n"));
log_hint(_("This looks like a PostgreSQL directory.\n"
"If you are sure you want to clone here, "
"please check there is no PostgreSQL server "
"running and use the -F/--force option\n"));
return false;
}

View File

@@ -1,6 +1,6 @@
/*
* check_dir.h
* Copyright (c) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by

107
compat.c Normal file
View File

@@ -0,0 +1,107 @@
/*
*
* compat.c
* Provides a couple of useful string utility functions adapted
* from the backend code, which are not publicly exposed. They're
* unlikely to change but it would be worth keeping an eye on them
* for any fixes/improvements
*
* Copyright (c) 2ndQuadrant, 2010-2017
*
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
*/
#include "repmgr.h"
#include "compat.h"
/*
* Append the given string to the buffer, with suitable quoting for passing
* the string as a value, in a keyword/pair value in a libpq connection
* string
*
* This function is adapted from src/fe_utils/string_utils.c (before 9.6
* located in: src/bin/pg_dump/dumputils.c)
*/
void
appendConnStrVal(PQExpBuffer buf, const char *str)
{
const char *s;
bool needquotes;
/*
* If the string is one or more plain ASCII characters, no need to quote
* it. This is quite conservative, but better safe than sorry.
*/
needquotes = true;
for (s = str; *s; s++)
{
if (!((*s >= 'a' && *s <= 'z') || (*s >= 'A' && *s <= 'Z') ||
(*s >= '0' && *s <= '9') || *s == '_' || *s == '.'))
{
needquotes = true;
break;
}
needquotes = false;
}
if (needquotes)
{
appendPQExpBufferChar(buf, '\'');
while (*str)
{
/* ' and \ must be escaped by to \' and \\ */
if (*str == '\'' || *str == '\\')
appendPQExpBufferChar(buf, '\\');
appendPQExpBufferChar(buf, *str);
str++;
}
appendPQExpBufferChar(buf, '\'');
}
else
appendPQExpBufferStr(buf, str);
}
/*
* Adapted from: src/fe_utils/string_utils.c
*/
void
appendShellString(PQExpBuffer buf, const char *str)
{
const char *p;
appendPQExpBufferChar(buf, '\'');
for (p = str; *p; p++)
{
if (*p == '\n' || *p == '\r')
{
fprintf(stderr,
_("shell command argument contains a newline or carriage return: \"%s\"\n"),
str);
exit(ERR_BAD_CONFIG);
}
if (*p == '\'')
appendPQExpBufferStr(buf, "'\"'\"'");
else
appendPQExpBufferChar(buf, *p);
}
appendPQExpBufferChar(buf, '\'');
}

29
compat.h Normal file
View File

@@ -0,0 +1,29 @@
/*
* compat.h
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
*/
#ifndef _COMPAT_H_
#define _COMPAT_H_
extern void
appendConnStrVal(PQExpBuffer buf, const char *str);
extern void
appendShellString(PQExpBuffer buf, const char *str);
#endif

757
config.c

File diff suppressed because it is too large Load Diff

View File

@@ -1,6 +1,7 @@
/*
* config.h
* Copyright (c) 2ndQuadrant, 2010-2015
*
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -24,6 +25,7 @@
#include "strutil.h"
#define CONFIG_FILE_NAME "repmgr.conf"
typedef struct EventNotificationListCell
{
@@ -56,37 +58,90 @@ typedef struct
int node;
int upstream_node;
char conninfo[MAXLEN];
char barman_server[MAXLEN];
char barman_config[MAXLEN];
int failover;
int priority;
char node_name[MAXLEN];
/* commands executed by repmgrd */
char promote_command[MAXLEN];
char follow_command[MAXLEN];
/* Overrides for pg_ctl commands */
char service_stop_command[MAXLEN];
char service_start_command[MAXLEN];
char service_restart_command[MAXLEN];
char service_reload_command[MAXLEN];
char service_promote_command[MAXLEN];
char loglevel[MAXLEN];
char logfacility[MAXLEN];
char rsync_options[QUERY_STR_LEN];
char ssh_options[QUERY_STR_LEN];
int master_response_timeout;
int reconnect_attempts;
int reconnect_intvl;
int reconnect_interval;
char pg_bindir[MAXLEN];
char pg_ctl_options[MAXLEN];
char pg_basebackup_options[MAXLEN];
char restore_command[MAXLEN];
char logfile[MAXLEN];
int monitor_interval_secs;
int retry_promote_interval_secs;
int witness_repl_nodes_sync_interval_secs;
int use_replication_slots;
char event_notification_command[MAXLEN];
EventNotificationList event_notifications;
TablespaceList tablespace_mapping;
} t_configuration_options;
#define T_CONFIGURATION_OPTIONS_INITIALIZER { "", -1, NO_UPSTREAM_NODE, "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", 0, 0, 0, "", { NULL, NULL }, {NULL, NULL} }
/*
* The following will initialize the structure with a minimal set of options;
* actual defaults are set in parse_config() before parsing the configuration file
*/
#define T_CONFIGURATION_OPTIONS_INITIALIZER { "", UNKNOWN_NODE_ID, NO_UPSTREAM_NODE, "", "", "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", "", 0, 0, 0, 0, "", { NULL, NULL }, { NULL, NULL } }
typedef struct ItemListCell
{
struct ItemListCell *next;
char *string;
} ItemListCell;
bool load_config(const char *config_file, t_configuration_options *options, char *argv0);
bool reload_config(t_configuration_options *orig_options);
typedef struct ItemList
{
ItemListCell *head;
ItemListCell *tail;
} ItemList;
typedef struct TablespaceDataListCell
{
struct TablespaceDataListCell *next;
char *name;
char *oid;
char *location;
/* optional payload */
FILE *f;
} TablespaceDataListCell;
typedef struct TablespaceDataList
{
TablespaceDataListCell *head;
TablespaceDataListCell *tail;
} TablespaceDataList;
void set_progname(const char *argv0);
const char * progname(void);
bool load_config(const char *config_file, bool verbose, t_configuration_options *options, char *argv0);
void _parse_config(t_configuration_options *options, ItemList *error_list);
bool parse_config(t_configuration_options *options);
bool reload_config(t_configuration_options *orig_options);
void parse_line(char *buff, char *name, char *value);
char *trim(char *s);
void item_list_append(ItemList *item_list, char *error_message);
int repmgr_atoi(const char *s,
const char *config_item,
ItemList *error_list,
bool allow_negative);
extern bool config_file_found;
#endif

1125
dbutils.c

File diff suppressed because it is too large Load Diff

101
dbutils.h
View File

@@ -1,6 +1,7 @@
/*
* dbutils.h
* Copyright (c) 2ndQuadrant, 2010-2015
*
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -20,13 +21,76 @@
#ifndef _REPMGR_DBUTILS_H_
#define _REPMGR_DBUTILS_H_
#include "access/xlogdefs.h"
#include "pqexpbuffer.h"
#include "config.h"
#include "strutil.h"
#define format_lsn(x) (uint32) (x >> 32), (uint32) x
typedef enum {
UNKNOWN = 0,
MASTER,
STANDBY,
WITNESS
} t_server_type;
/*
* Struct to store node information
*/
typedef struct s_node_info
{
int node_id;
int upstream_node_id;
t_server_type type;
char name[MAXLEN];
char conninfo_str[MAXLEN];
char slot_name[MAXLEN];
int priority;
bool active;
bool is_ready;
bool is_visible;
XLogRecPtr xlog_location;
} t_node_info;
#define T_NODE_INFO_INITIALIZER { \
NODE_NOT_FOUND, \
NO_UPSTREAM_NODE, \
UNKNOWN, \
"", \
"", \
"", \
DEFAULT_PRIORITY, \
true, \
false, \
false, \
InvalidXLogRecPtr \
}
/*
* Struct to store replication slot information
*/
typedef struct s_replication_slot
{
char slot_name[MAXLEN];
char slot_type[MAXLEN];
bool active;
} t_replication_slot;
extern char repmgr_schema[MAXLEN];
PGconn *_establish_db_connection(const char *conninfo,
const bool exit_on_error,
const bool log_notice,
const bool verbose_only);
PGconn *establish_db_connection(const char *conninfo,
const bool exit_on_error);
const bool exit_on_error);
PGconn *establish_db_connection_quiet(const char *conninfo);
PGconn *test_db_connection(const char *conninfo);
PGconn *establish_db_connection_by_params(const char *keywords[],
const char *values[],
const bool exit_on_error);
@@ -45,7 +109,7 @@ int guc_set(PGconn *conn, const char *parameter, const char *op,
const char *value);
int guc_set_typed(PGconn *conn, const char *parameter, const char *op,
const char *value, const char *datatype);
bool get_conninfo_value(const char *conninfo, const char *keyword, char *output);
PGconn *get_upstream_connection(PGconn *standby_conn, char *cluster,
int node_id,
int *upstream_node_id_ptr,
@@ -57,16 +121,31 @@ int wait_connection_availability(PGconn *conn, long long timeout);
bool cancel_query(PGconn *conn, int timeout);
char *get_repmgr_schema(void);
char *get_repmgr_schema_quoted(PGconn *conn);
bool create_replication_slot(PGconn *conn, char *slot_name);
bool start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint);
bool stop_backup(PGconn *conn, char *last_wal_segment);
bool create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg);
int get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record);
bool drop_replication_slot(PGconn *conn, char *slot_name);
bool start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint, int server_version_num);
bool stop_backup(PGconn *conn, char *last_wal_segment, int server_version_num);
bool set_config(PGconn *conn, const char *config_param, const char *config_value);
bool set_config_bool(PGconn *conn, const char *config_param, bool state);
bool copy_configuration(PGconn *masterconn, PGconn *witnessconn, char *cluster_name);
bool create_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name);
bool witness_copy_node_records(PGconn *masterconn, PGconn *witnessconn, char *cluster_name);
bool create_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name, bool active);
bool delete_node_record(PGconn *conn, int node, char *action);
bool create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
int get_node_record(PGconn *conn, char *cluster, int node_id, t_node_info *node_info);
int get_node_record_by_name(PGconn *conn, char *cluster, const char *node_name, t_node_info *node_info);
bool update_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name, bool active);
bool update_node_record_status(PGconn *conn, char *cluster_name, int this_node_id, char *type, int upstream_node_id, bool active);
bool update_node_record_set_upstream(PGconn *conn, char *cluster_name, int this_node_id, int new_upstream_node_id);
PGresult * get_node_record(PGconn *conn, char *cluster, int node_id);
bool create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
void create_checkpoint(PGconn *conn);
int get_node_replication_state(PGconn *conn, char *node_name, char *output);
t_server_type parse_node_type(const char *type);
int get_data_checksum_version(const char *data_directory);
/* backported from repmgr 4.x */
XLogRecPtr parse_lsn(const char *str);
XLogRecPtr get_last_wal_receive_location(PGconn *conn);
bool is_server_available(const char *conninfo);
#endif

View File

@@ -1,9 +1,9 @@
Package: repmgr-auto
Version: 2.0beta2
Version: 3.2dev
Section: database
Priority: optional
Architecture: all
Depends: rsync, postgresql-9.0 | postgresql-9.1 | postgresql-9.2 | postgresql-9.3 | postgresql-9.4
Maintainer: Jaime Casanova <jaime@2ndQuadrant.com>
Depends: rsync, postgresql-9.3 | postgresql-9.4 | postgresql-9.5
Maintainer: Self built package <user@localhost>
Description: PostgreSQL replication setup, magament and monitoring
has two main executables

View File

@@ -59,7 +59,7 @@ do_stop()
# 0 if daemon has been stopped
# 1 if daemon was already stopped
# other if daemon could not be stopped or a failure occurred
start-stop-daemon --stop --quiet --retry=TERM/30/KILL/5 --pidfile $REPMGRD_PIDFILE --exec $REPMGRD_BIN
start-stop-daemon --stop --quiet --retry=TERM/30/KILL/5 --pidfile $REPMGRD_PIDFILE --name "$(basename $REPMGRD_BIN)"
}
case "$1" in

194
dirmod.c Normal file
View File

@@ -0,0 +1,194 @@
/*
*
* dirmod.c
* directory handling functions
*
* Copyright (c) 2ndQuadrant, 2010-2017
*
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
*/
#include "postgres_fe.h"
/* Don't modify declarations in system headers */
#include <unistd.h>
#include <dirent.h>
#include <sys/stat.h>
/*
* pgfnames
*
* return a list of the names of objects in the argument directory. Caller
* must call pgfnames_cleanup later to free the memory allocated by this
* function.
*/
char **
pgfnames(const char *path)
{
DIR *dir;
struct dirent *file;
char **filenames;
int numnames = 0;
int fnsize = 200; /* enough for many small dbs */
dir = opendir(path);
if (dir == NULL)
{
return NULL;
}
filenames = (char **) palloc(fnsize * sizeof(char *));
while (errno = 0, (file = readdir(dir)) != NULL)
{
if (strcmp(file->d_name, ".") != 0 && strcmp(file->d_name, "..") != 0)
{
if (numnames + 1 >= fnsize)
{
fnsize *= 2;
filenames = (char **) repalloc(filenames,
fnsize * sizeof(char *));
}
filenames[numnames++] = pstrdup(file->d_name);
}
}
if (errno)
{
fprintf(stderr, _("could not read directory \"%s\": %s\n"),
path, strerror(errno));
}
filenames[numnames] = NULL;
if (closedir(dir))
{
fprintf(stderr, _("could not close directory \"%s\": %s\n"),
path, strerror(errno));
}
return filenames;
}
/*
* pgfnames_cleanup
*
* deallocate memory used for filenames
*/
void
pgfnames_cleanup(char **filenames)
{
char **fn;
for (fn = filenames; *fn; fn++)
pfree(*fn);
pfree(filenames);
}
/*
* rmtree
*
* Delete a directory tree recursively.
* Assumes path points to a valid directory.
* Deletes everything under path.
* If rmtopdir is true deletes the directory too.
* Returns true if successful, false if there was any problem.
* (The details of the problem are reported already, so caller
* doesn't really have to say anything more, but most do.)
*/
bool
rmtree(const char *path, bool rmtopdir)
{
bool result = true;
char pathbuf[MAXPGPATH];
char **filenames;
char **filename;
struct stat statbuf;
/*
* we copy all the names out of the directory before we start modifying
* it.
*/
filenames = pgfnames(path);
if (filenames == NULL)
return false;
/* now we have the names we can start removing things */
for (filename = filenames; *filename; filename++)
{
snprintf(pathbuf, MAXPGPATH, "%s/%s", path, *filename);
/*
* It's ok if the file is not there anymore; we were just about to
* delete it anyway.
*
* This is not an academic possibility. One scenario where this
* happens is when bgwriter has a pending unlink request for a file in
* a database that's being dropped. In dropdb(), we call
* ForgetDatabaseFsyncRequests() to flush out any such pending unlink
* requests, but because that's asynchronous, it's not guaranteed that
* the bgwriter receives the message in time.
*/
if (lstat(pathbuf, &statbuf) != 0)
{
if (errno != ENOENT)
{
result = false;
}
continue;
}
if (S_ISDIR(statbuf.st_mode))
{
/* call ourselves recursively for a directory */
if (!rmtree(pathbuf, true))
{
/* we already reported the error */
result = false;
}
}
else
{
if (unlink(pathbuf) != 0)
{
if (errno != ENOENT)
{
result = false;
}
}
}
}
if (rmtopdir)
{
if (rmdir(path) != 0)
{
result = false;
}
}
pgfnames_cleanup(filenames);
return result;
}

23
dirmod.h Normal file
View File

@@ -0,0 +1,23 @@
/*
* dirmod.h
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
*/
#ifndef _DIRMOD_H_
#define _DIRMOD_H_
#endif

View File

@@ -0,0 +1,75 @@
repmgrd's failover algorithm
============================
When implementing automatic failover, there are two factors which are critical in
ensuring the desired result is achieved:
- has the master node genuinely failed?
- which is the best node to promote to the new master?
This document outlines repmgrd's decision-making process during automatic failover
for standbys directly connected to the master node.
Master node failure detection
-----------------------------
If a `repmgrd` instance running on a PostgreSQL standby node is unable to connect to
the master node, this doesn't neccesarily mean that the master is down and a
failover is required. Factors such as network connectivity issues could mean that
even though the standby node is isolated, the replication cluster as a whole
is functioning correctly, and promoting the standby without further verification
could result in a "split-brain" situation.
In the event that `repmgrd` is unable to connect to the master node, it will attempt
to reconnect to the master server several times (as defined by the `reconnect_attempts`
parameter in `repmgr.conf`), with reconnection attempts occuring at the interval
specified by `reconnect_interval`. This happens to verify that the master is definitively
not accessible (e.g. that connection was not lost due to a brief network glitch).
Appropriate values for these settings will depend very much on the replication
cluster environment. There will necessarily be a trade-off between the time it
takes to assume the master is not reachable, and the reliability of that conclusion.
A standby in a different physical location to the master will probably need a longer
check interval to rule out possible network issues, whereas one located in the same
rack with a direct connection between servers could perform the check very quickly.
Note that it's possible the master comes back online after this point is reached,
but before a new master has been selected; in this case it will be noticed
during the selection of a new master and no actual failover will take place.
Promotion candidate selection
-----------------------------
Once `repmgrd` has decided the master is definitively unreachable, following checks
will be carried out:
* attempts to connect to all other nodes in the cluster (including the witness
node, if defined) to establish the state of the cluster, including their
current LSN
* If less than half of the nodes are visible (from the viewpoint
of this node), `repmgrd` will not take any further action. This is to ensure that
e.g. if a replication cluster is spread over multiple data centres, a split-brain
situation does not occur if there is a network failure between datacentres. Note
that if nodes are split evenly between data centres, a witness server can be
used to establish the "majority" data centre.
* `repmgrd` polls all visible servers and waits for each node to return a valid LSN;
it updates the LSN previously stored for this node if it has increased since
the initial check
* once all LSNs have been retrieved, `repmgrd` will check for the highest LSN; if
its own node has the highest LSN, it will attempt to promote itself (using the
command defined in `promote_command` in `repmgr.conf`. Note that if using
`repmgr standby promote` as the promotion command, and the original master becomes available
before the promotion takes effect, `repmgr` will return an error and no promotion
will take place, and `repmgrd` will resume monitoring as usual.
* if the node is not the promotion candidate, `repmgrd` will execute the
`follow_command` defined in `repmgr.conf`. If using `repmgr standby follow` here,
`repmgr` will attempt to detect the new master node and attach to that.

View File

@@ -0,0 +1,160 @@
Fencing a failed master node with repmgrd and pgbouncer
=======================================================
With automatic failover, it's essential to ensure that a failed master
remains inaccessible to your application, even if it comes back online
again, to avoid a split-brain situation.
By using `pgbouncer` together with `repmgrd`, it's possible to combine
automatic failover with a process to isolate the failed master from
your application and ensure that all connections which should go to
the master are directed there smoothly without having to reconfigure
your application. (Note that as a connection pooler, `pgbouncer` can
benefit your application in other ways, but those are beyond the scope
of this document).
* * *
> *WARNING*: automatic failover is tricky to get right. This document
> demonstrates one possible implementation method, however you should
> carefully configure and test any setup to suit the needs of your own
> replication cluster/application.
* * *
In a failover situation, `repmgrd` promotes a standby to master by executing
the command defined in `promote_command`. Normally this would be something like:
repmgr standby promote -f /etc/repmgr.conf
By wrapping this in a custom script which adjusts the `pgbouncer` configuration
on all nodes, it's possible to fence the failed master and redirect write
connections to the new master.
The script consists of three sections:
* commands to pause `pgbouncer` on all nodes
* the promotion command itself
* commands to reconfigure and restart `pgbouncer` on all nodes
Note that it requires password-less SSH access between all nodes to be able to
update the `pgbouncer` configuration files.
For the purposes of this demonstration, we'll assume there are 3 nodes (master
and two standbys), with `pgbouncer` listening on port 6432 handling connections
to a database called `appdb`. The `postgres` system user must have write
access to the `pgbouncer` configuration files on all nodes. We'll assume
there's a main `pgbouncer` configuration file, `/etc/pgbouncer.ini`, which uses
the `%include` directive (available from PgBouncer 1.6) to include a separate
configuration file, `/etc/pgbouncer.database.ini`, which will be modified by
`repmgr`.
* * *
> *NOTE*: in this self-contained demonstration, `pgbouncer` is running on the
> database servers, however in a production environment it will make more
> sense to run `pgbouncer` on either separate nodes or the application server.
* * *
`/etc/pgbouncer.ini` should look something like this:
[pgbouncer]
logfile = /var/log/pgbouncer/pgbouncer.log
pidfile = /var/run/pgbouncer/pgbouncer.pid
listen_addr = *
listen_port = 6532
unix_socket_dir = /tmp
auth_type = trust
auth_file = /etc/pgbouncer.auth
admin_users = postgres
stats_users = postgres
pool_mode = transaction
max_client_conn = 100
default_pool_size = 20
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3
log_connections = 1
log_disconnections = 1
log_pooler_errors = 1
%include /etc/pgbouncer.database.ini
The actual script is as follows; adjust the configurable items as appropriate:
`/var/lib/postgres/repmgr/promote.sh`
#!/usr/bin/env bash
set -u
set -e
# Configurable items
PGBOUNCER_HOSTS="node1 node2 node3"
PGBOUNCER_DATABASE_INI="/etc/pgbouncer.database.ini"
PGBOUNCER_DATABASE="appdb"
PGBOUNCER_PORT=6432
REPMGR_DB="repmgr"
REPMGR_USER="repmgr"
REPMGR_SCHEMA="repmgr_test"
# 1. Pause running pgbouncer instances
for HOST in $PGBOUNCER_HOSTS
do
psql -t -c "pause" -h $HOST -p $PGBOUNCER_PORT -U postgres pgbouncer
done
# 2. Promote this node from standby to master
repmgr standby promote -f /etc/repmgr.conf
# 3. Reconfigure pgbouncer instances
PGBOUNCER_DATABASE_INI_NEW="/tmp/pgbouncer.database.ini"
for HOST in $PGBOUNCER_HOSTS
do
# Recreate the pgbouncer config file
echo -e "[databases]\n" > $PGBOUNCER_DATABASE_INI_NEW
psql -d $REPMGR_DB -U $REPMGR_USER -t -A \
-c "SELECT '${PGBOUNCER_DATABASE}-rw= ' || conninfo || ' application_name=pgbouncer_${HOST}' \
FROM ${REPMGR_SCHEMA}.repl_nodes \
WHERE active = TRUE AND type='master'" >> $PGBOUNCER_DATABASE_INI_NEW
psql -d $REPMGR_DB -U $REPMGR_USER -t -A \
-c "SELECT '${PGBOUNCER_DATABASE}-ro= ' || conninfo || ' application_name=pgbouncer_${HOST}' \
FROM ${REPMGR_SCHEMA}.repl_nodes \
WHERE node_name='${HOST}'" >> $PGBOUNCER_DATABASE_INI_NEW
rsync $PGBOUNCER_DATABASE_INI_NEW $HOST:$PGBOUNCER_DATABASE_INI
psql -tc "reload" -h $HOST -p $PGBOUNCER_PORT -U postgres pgbouncer
psql -tc "resume" -h $HOST -p $PGBOUNCER_PORT -U postgres pgbouncer
done
# Clean up generated file
rm $PGBOUNCER_DATABASE_INI_NEW
echo "Reconfiguration of pgbouncer complete"
Script and template file should be installed on each node where
`repmgrd` is running.
Finally, set `promote_command` in `repmgr.conf` on each node to
point to the custom promote script:
promote_command=/var/lib/postgres/repmgr/promote.sh
and reload/restart any running `repmgrd` instances for the changes to take
effect.

View File

@@ -1,6 +1,6 @@
/*
* errcode.h
* Copyright (C) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -29,12 +29,17 @@
#define ERR_DB_CON 6
#define ERR_DB_QUERY 7
#define ERR_PROMOTED 8
#define ERR_BAD_PASSWORD 9
#define ERR_STR_OVERFLOW 10
#define ERR_FAILOVER_FAIL 11
#define ERR_BAD_SSH 12
#define ERR_SYS_FAILURE 13
#define ERR_BAD_BASEBACKUP 14
#define ERR_INTERNAL 15
#define ERR_MONITORING_FAIL 16
#define ERR_BAD_BACKUP_LABEL 17
#define ERR_SWITCHOVER_FAIL 18
#define ERR_BARMAN 19
#define ERR_REGISTRATION_SYNC 20
#endif /* _ERRCODE_H_ */

18
expected/repmgr_funcs.out Normal file
View File

@@ -0,0 +1,18 @@
/*
* repmgr_function.sql
* Copyright (c) 2ndQuadrant, 2010-2017
*
*/
-- SET SEARCH_PATH TO 'repmgr';
CREATE FUNCTION repmgr_update_standby_location(text) RETURNS boolean
AS '$libdir/repmgr_funcs', 'repmgr_update_standby_location'
LANGUAGE C STRICT;
CREATE FUNCTION repmgr_get_last_standby_location() RETURNS text
AS '$libdir/repmgr_funcs', 'repmgr_get_last_standby_location'
LANGUAGE C STRICT;
CREATE FUNCTION repmgr_update_last_updated() RETURNS TIMESTAMP WITH TIME ZONE
AS '$libdir/repmgr_funcs', 'repmgr_update_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION repmgr_get_last_updated() RETURNS TIMESTAMP WITH TIME ZONE
AS '$libdir/repmgr_funcs', 'repmgr_get_last_updated'
LANGUAGE C STRICT;

24
expected/repmgr_test.out Normal file
View File

@@ -0,0 +1,24 @@
select * from repmgr_update_standby_location('');
repmgr_update_standby_location
--------------------------------
f
(1 row)
select * from repmgr_get_last_standby_location();
repmgr_get_last_standby_location
----------------------------------
(1 row)
select * from repmgr_update_last_updated();
repmgr_update_last_updated
----------------------------
(1 row)
select * from repmgr_get_last_updated();
repmgr_get_last_updated
-------------------------
(1 row)

189
log.c
View File

@@ -1,6 +1,6 @@
/*
* log.c - Logging methods
* Copyright (C) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This module is a set of methods for logging (currently only syslog)
*
@@ -39,39 +39,142 @@
/* #define REPMGR_DEBUG */
void
static int detect_log_facility(const char *facility);
static void _stderr_log_with_level(const char *level_name, int level, const char *fmt, va_list ap)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0)));
int log_type = REPMGR_STDERR;
int log_level = LOG_INFO;
int last_log_level = LOG_INFO;
int verbose_logging = false;
int terse_logging = false;
/*
* Global variable to be set by the main application to ensure any log output
* emitted before logger_init is called, is output in the correct format
*/
int logger_output_mode = OM_DAEMON;
extern void
stderr_log_with_level(const char *level_name, int level, const char *fmt, ...)
{
time_t t;
struct tm *tm;
char buff[100];
va_list ap;
va_list arglist;
va_start(arglist, fmt);
_stderr_log_with_level(level_name, level, fmt, arglist);
va_end(arglist);
}
static void
_stderr_log_with_level(const char *level_name, int level, const char *fmt, va_list ap)
{
char buf[100];
/*
* Store the requested level so that if there's a subsequent
* log_hint() or log_detail(), we can suppress that if appropriate.
*/
last_log_level = level;
if (log_level >= level)
{
time(&t);
tm = localtime(&t);
strftime(buff, 100, "[%Y-%m-%d %H:%M:%S]", tm);
fprintf(stderr, "%s [%s] ", buff, level_name);
va_start(ap, fmt);
/* Format log line prefix with timestamp if in daemon mode */
if (logger_output_mode == OM_DAEMON)
{
time_t t;
struct tm *tm;
time(&t);
tm = localtime(&t);
strftime(buf, 100, "[%Y-%m-%d %H:%M:%S]", tm);
fprintf(stderr, "%s [%s] ", buf, level_name);
}
else
{
fprintf(stderr, "%s: ", level_name);
}
vfprintf(stderr, fmt, ap);
va_end(ap);
fflush(stderr);
}
}
void
log_hint(const char *fmt, ...)
{
va_list ap;
static int detect_log_level(const char *level);
static int detect_log_facility(const char *facility);
if (terse_logging == false)
{
va_start(ap, fmt);
_stderr_log_with_level("HINT", last_log_level, fmt, ap);
va_end(ap);
}
}
void
log_detail(const char *fmt, ...)
{
va_list ap;
if (terse_logging == false)
{
va_start(ap, fmt);
_stderr_log_with_level("DETAIL", last_log_level, fmt, ap);
va_end(ap);
}
}
void
log_verbose(int level, const char *fmt, ...)
{
va_list ap;
va_start(ap, fmt);
if (verbose_logging == true)
{
switch(level)
{
case LOG_EMERG:
_stderr_log_with_level("EMERG", level, fmt, ap);
break;
case LOG_ALERT:
_stderr_log_with_level("ALERT", level, fmt, ap);
break;
case LOG_CRIT:
_stderr_log_with_level("CRIT", level, fmt, ap);
break;
case LOG_ERR:
_stderr_log_with_level("ERR", level, fmt, ap);
break;
case LOG_WARNING:
_stderr_log_with_level("WARNING", level, fmt, ap);
break;
case LOG_NOTICE:
_stderr_log_with_level("NOTICE", level, fmt, ap);
break;
case LOG_INFO:
_stderr_log_with_level("INFO", level, fmt, ap);
break;
case LOG_DEBUG:
_stderr_log_with_level("DEBUG", level, fmt, ap);
break;
}
}
va_end(ap);
}
int log_type = REPMGR_STDERR;
int log_level = LOG_NOTICE;
bool
logger_init(t_configuration_options * opts, const char *ident, const char *level, const char *facility)
logger_init(t_configuration_options *opts, const char *ident)
{
char *level = opts->loglevel;
char *facility = opts->logfacility;
int l;
int f;
@@ -95,12 +198,19 @@ logger_init(t_configuration_options * opts, const char *ident, const char *level
printf("Assigned level for logger: %d\n", l);
#endif
if (l > 0)
if (l >= 0)
log_level = l;
else
stderr_log_warning(_("Cannot detect log level %s (use any of DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG)\n"), level);
stderr_log_warning(_("Invalid log level \"%s\" (available values: DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG)\n"), level);
}
/*
* STDERR only logging requested - finish here without setting up any further
* logging facility.
*/
if (logger_output_mode == OM_COMMAND_LINE)
return true;
if (facility && *facility)
{
@@ -161,9 +271,10 @@ logger_init(t_configuration_options * opts, const char *ident, const char *level
stderr_log_notice(_("Redirecting logging output to '%s'\n"), opts->logfile);
fd = freopen(opts->logfile, "a", stderr);
/* It's possible freopen() may still fail due to e.g. a race condition;
as it's not feasible to restore stderr after a failed freopen(),
we'll write to stdout as a last resort.
/*
* It's possible freopen() may still fail due to e.g. a race condition;
* as it's not feasible to restore stderr after a failed freopen(),
* we'll write to stdout as a last resort.
*/
if (fd == NULL)
{
@@ -174,9 +285,9 @@ logger_init(t_configuration_options * opts, const char *ident, const char *level
}
return true;
}
bool
logger_shutdown(void)
{
@@ -189,17 +300,32 @@ logger_shutdown(void)
}
/*
* Set a minimum logging level. Intended for command line verbosity
* options, which might increase requested logging over what's specified
* in the regular configuration file.
* Indicate whether extra-verbose logging is required. This will
* generate a lot of output, particularly debug logging, and should
* not be permanently enabled in production.
*
* NOTE: in previous repmgr versions, this option forced the log
* level to INFO.
*/
void
logger_min_verbose(int minimum)
logger_set_verbose(void)
{
if (log_level < minimum)
log_level = minimum;
verbose_logging = true;
}
/*
* Indicate whether some non-critical log messages can be omitted.
* Currently this includes warnings about irrelevant command line
* options and hints.
*/
void logger_set_terse(void)
{
terse_logging = true;
}
int
detect_log_level(const char *level)
{
@@ -220,17 +346,16 @@ detect_log_level(const char *level)
if (!strcmp(level, "EMERG"))
return LOG_EMERG;
return 0;
return -1;
}
int
static int
detect_log_facility(const char *facility)
{
int local = 0;
if (!strncmp(facility, "LOCAL", 5) && strlen(facility) == 6)
{
local = atoi(&facility[5]);
switch (local)

28
log.h
View File

@@ -1,6 +1,6 @@
/*
* log.h
* Copyright (c) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -25,7 +25,10 @@
#define REPMGR_SYSLOG 1
#define REPMGR_STDERR 2
void
#define OM_COMMAND_LINE 1
#define OM_DAEMON 2
extern void
stderr_log_with_level(const char *level_name, int level, const char *fmt,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
@@ -112,15 +115,28 @@ __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));
#endif
int detect_log_level(const char *level);
/* Logger initialisation and shutdown */
bool logger_init(t_configuration_options * opts, const char *ident);
bool logger_shutdown(void);
bool logger_init(t_configuration_options * opts, const char *ident,
const char *level, const char *facility);
void logger_set_verbose(void);
void logger_set_terse(void);
void logger_min_verbose(int minimum);
void log_detail(const char *fmt, ...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
void log_hint(const char *fmt, ...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
void log_verbose(int level, const char *fmt, ...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3)));
extern int log_type;
extern int log_level;
extern int verbose_logging;
extern int terse_logging;
extern int logger_output_mode;
#endif
#endif /* _REPMGR_LOG_H_ */

7460
repmgr.c

File diff suppressed because it is too large Load Diff

View File

@@ -2,6 +2,10 @@
# Replication Manager sample configuration file
###################################################
# Some configuration items will be set with a default value; this
# is noted for each item. Where no default value is shown, the
# parameter will be treated as empty or false.
# Required configuration items
# ============================
#
@@ -11,17 +15,32 @@
# schema (pattern: "repmgr_{cluster}"); while this name will be quoted
# to preserve case, we recommend using lower case and avoiding whitespace
# to facilitate easier querying of the repmgr views and tables.
cluster=example_cluster
#cluster=example_cluster
# Node ID and name
# (Note: we recommend to avoid naming nodes after their initial
# replication funcion, as this will cause confusion when e.g.
# "standby2" is promoted to master)
node=2
node_name=node2
# replication function, as this will cause confusion when e.g.
# "standby2" is promoted to primary)
#node=2 # a unique integer
#node_name=node2 # an arbitrary (but unique) string; we recommend using
# the server's hostname or another identifier unambiguously
# associated with the server to avoid confusion
# Database connection information
conninfo='host=192.168.204.104 dbname=repmgr_db user=repmgr_usr'
# Database connection information as a conninfo string (this must be a
# keyword/value string, not a connection URI).
#
# https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING
#
# All servers in the cluster must be able to access the database
# using this connection string.
#
#conninfo='host=192.168.204.104 dbname=repmgr user=repmgr'
#
# If repmgrd is in use, consider explicitly setting `connect_timeout` in the
# conninfo string to determine the length of time which elapses before
# a network connection attempt is abandoned; for details see:
#
# https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNECT-CONNECT-TIMEOUT
# Optional configuration items
# ============================
@@ -29,33 +48,40 @@ conninfo='host=192.168.204.104 dbname=repmgr_db user=repmgr_usr'
# Replication settings
# ---------------------
# when using cascading replication and a standby is to be connected to an
# upstream standby, specify that node's ID with 'upstream_node'. The node
# must exist before the new standby can be registered. If a standby is
# to connect directly to a master node, this parameter is not required.
#
# upstream_node=1
# When using cascading replication, a standby can connect to another
# upstream standby node which is specified by setting 'upstream_node'.
# In that case, the upstream node must exist before the new standby
# can be registered. If 'upstream_node' is not set, then the standby
# will connect directly to the primary node.
#upstream_node=1
# physical replication slots - PostgreSQL 9.4 and later only
# use physical replication slots - PostgreSQL 9.4 and later only
# (default: 0)
#
# use_replication_slots=0
#use_replication_slots=0
# NOTE: 'max_replication_slots' should be configured for at least the
# number of standbys which will connect to the primary.
# Logging and monitoring settings
# -------------------------------
# Log level: possible values are DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG
# (default: NOTICE)
loglevel=NOTICE
# (default: INFO)
#loglevel=INFO
# Note that logging facility settings will only apply to `repmgrd` by default;
# `repmgr` will always write to STDERR unless the switch `--log-to-file` is
# supplied, in which case it will log to the same destination as `repmgrd`.
# This is mainly intended for those cases when `repmgr` is executed directly
# by `repmgrd`.
# Logging facility: possible values are STDERR or - for Syslog integration - one of LOCAL0, LOCAL1, ..., LOCAL7, USER
# (default: STDERR)
logfacility=STDERR
#logfacility=STDERR
# stderr can be redirected to an arbitrary file:
#
# logfile='/var/log/repmgr.log'
#logfile='/var/log/repmgr/repmgr.log'
# event notifications can be passed to an arbitrary external program
# together with the following parameters:
@@ -69,12 +95,12 @@ logfacility=STDERR
# the values provided for "%t" and "%d" will probably contain spaces,
# so should be quoted in the provided command configuration, e.g.:
#
# event_notification_command='/path/to/some/script %n %e %s "%t" "%d"'
#event_notification_command='/path/to/some/script %n %e %s "%t" "%d"'
# By default, all notifications will be passed; the notification types
# can be filtered to explicitly named ones:
#
# event_notifications=master_register,standby_register,witness_create
#event_notifications=master_register,standby_register,witness_create
# Environment/command settings
@@ -82,18 +108,53 @@ logfacility=STDERR
# path to PostgreSQL binary directory (location of pg_ctl, pg_basebackup etc.)
# (if not provided, defaults to system $PATH)
# pg_bindir=/usr/bin/
#pg_bindir=/usr/bin/
#
# Debian/Ubuntu users: you will probably need to set this to the directory
# where `pg_ctl` is located, e.g. /usr/lib/postgresql/9.5/bin/
# service control commands
#
# repmgr provides options to override the default pg_ctl commands
# used to stop, start, restart, reload and promote the PostgreSQL cluster
#
# NOTE: These commands must be runnable on remote nodes as well for switchover
# to function correctly.
#
# If you use sudo, the user repmgr runs as (usually 'postgres') must have
# passwordless sudo access to execute the command
#
# For example, to use systemd, you may use the following configuration:
#
# # this is required when running sudo over ssh without -t:
# Defaults:postgres !requiretty
# postgres ALL = NOPASSWD: /usr/bin/systemctl stop postgresql-9.5, \
# /usr/bin/systemctl start postgresql-9.5, \
# /usr/bin/systemctl restart postgresql-9.5
#
# service_start_command = systemctl start postgresql-9.5
# service_stop_command = systemctl stop postgresql-9.5
# service_restart_command = systemctl restart postgresql-9.5
# service_reload_command = pg_ctlcluster 9.5 main reload
# service_promote_command = pg_ctlcluster 9.5 main promote
# external command options
# rsync_options=--archive --checksum --compress --progress --rsh="ssh -o \"StrictHostKeyChecking no\""
# ssh_options=-o "StrictHostKeyChecking no"
#rsync_options=--archive --checksum --compress --progress --rsh="ssh -o \"StrictHostKeyChecking no\""
#ssh_options=-o "StrictHostKeyChecking no"
# external command arguments
# external command arguments. Values shown are examples.
# pg_ctl_options='-s'
# pg_basebackup_options='--xlog-method=s'
#pg_ctl_options='-s'
#pg_basebackup_options='--label=repmgr_backup'
# This is the host name of the barman server, which is used for connecting over
# to the barman server (passwordless ssh keys should be in place)
#barman_server='backup_server'
# If you are placing the barman.conf file in a non-standard path, or using
# a name other than barman.conf, use this parameter to specify the path and
# name of the barman configuration file.
#barman_config='/path/to/barman.conf'
# Standby clone settings
# ----------------------
@@ -104,34 +165,51 @@ logfacility=STDERR
#
# tablespace_mapping=/path/to/original/tablespace=/path/to/new/tablespace
# You can specify a restore_command to be used in the recovery.conf that
# will be placed in the cloned standby
#
# restore_command = cp /path/to/archived/wals/%f %p
# Failover settings (repmgrd)
# ---------------------------
#
# These settings are only applied when repmgrd is running.
# These settings are only applied when repmgrd is running. Values shown
# are defaults.
# How many seconds we wait for master response before declaring master failure
master_response_timeout=60
# monitoring interval in seconds; default is 2
#monitor_interval_secs=2
# How many time we try to reconnect to master before starting failover procedure
reconnect_attempts=6
reconnect_interval=10
# Maximum number of seconds to wait for a response from the primary server
# before deciding it has failed.
#master_response_timeout=60
# Number of attempts at what interval (in seconds) to try and
# connect to a server to establish its status (e.g. master
# during failover)
#reconnect_attempts=6
#reconnect_interval=10
# Autofailover options
failover=automatic # one of 'automatic', 'manual'
priority=100 # a value of zero or less prevents the node being promoted to master
promote_command='repmgr standby promote -f /path/to/repmgr.conf'
follow_command='repmgr standby follow -f /path/to/repmgr.conf -W'
#failover=manual # one of 'automatic', 'manual' (default: manual)
# defines the action to take in the event of upstream failure
#
# 'automatic': repmgrd will automatically attempt to promote the
# node or follow the new upstream node
# 'manual': repmgrd will take no action and the mode will require
# manual attention to reattach it to replication
# monitoring interval; default is 2s
#
# monitor_interval_secs=2
#priority=100 # indicate a preferred priorty for promoting nodes
# a value of zero or less prevents the node being promoted to primary
# (default: 100)
# change wait time for master; before we bail out and exit when the master
#promote_command='repmgr standby promote -f /path/to/repmgr.conf'
#follow_command='repmgr standby follow -f /path/to/repmgr.conf -W'
# change wait time for primary; before we bail out and exit when the primary
# disappears, we wait 'reconnect_attempts' * 'retry_promote_interval_secs'
# seconds; by default this would be half an hour, as 'retry_promote_interval_secs'
# default value is 300)
#
# retry_promote_interval_secs=300
#retry_promote_interval_secs=300
# Number of seconds after which the witness server resyncs the repl_nodes table
#witness_repl_nodes_sync_interval_secs=15

213
repmgr.h
View File

@@ -1,6 +1,6 @@
/*
* repmgr.h
* Copyright (c) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -23,24 +23,21 @@
#include <libpq-fe.h>
#include <postgres_fe.h>
#include <getopt_long.h>
#include "pqexpbuffer.h"
#include "strutil.h"
#include "dbutils.h"
#include "errcode.h"
#include "config.h"
#include "dirmod.h"
#define MIN_SUPPORTED_VERSION "9.3"
#define MIN_SUPPORTED_VERSION_NUM 90300
#include "config.h"
#define MAXFILENAME 1024
#define ERRBUFF_SIZE 512
#define DEFAULT_CONFIG_FILE "./repmgr.conf"
#define DEFAULT_WAL_KEEP_SEGMENTS "5000"
#define DEFAULT_WAL_KEEP_SEGMENTS "0"
#define DEFAULT_DEST_DIR "."
#define DEFAULT_MASTER_PORT "5432"
#define DEFAULT_DBNAME "postgres"
#define DEFAULT_REPMGR_SCHEMA_PREFIX "repmgr_"
#define DEFAULT_PRIORITY 100
#define FAILOVER_NODES_MAX_CHECK 50
@@ -49,62 +46,202 @@
#define AUTOMATIC_FAILOVER 1
#define NODE_NOT_FOUND -1
#define NO_UPSTREAM_NODE -1
#define UNKNOWN_NODE_ID -1
/* command line options without short versions */
#define OPT_HELP 1
#define OPT_CHECK_UPSTREAM_CONFIG 2
#define OPT_RECOVERY_MIN_APPLY_DELAY 3
#define OPT_COPY_EXTERNAL_CONFIG_FILES 4
#define OPT_CONFIG_ARCHIVE_DIR 5
#define OPT_PG_REWIND 6
#define OPT_CSV 8
#define OPT_NODE 9
#define OPT_WITHOUT_BARMAN 10
#define OPT_NO_UPSTREAM_CONNECTION 11
#define OPT_REGISTER_WAIT 12
#define OPT_CLUSTER 13
#define OPT_LOG_TO_FILE 14
#define OPT_UPSTREAM_CONNINFO 15
#define OPT_NO_CONNINFO_PASSWORD 16
#define OPT_REPLICATION_USER 17
typedef enum {
UNKNOWN = 0,
MASTER,
STANDBY,
WITNESS
} t_server_type;
/* deprecated command line options */
#define OPT_INITDB_NO_PWPROMPT 998
#define OPT_IGNORE_EXTERNAL_CONFIG_FILES 999
/* values for --copy-external-config-files */
#define CONFIG_FILE_SAMEPATH 1
#define CONFIG_FILE_PGDATA 2
/* Run time options type */
typedef struct
{
/* general repmgr options */
char config_file[MAXPGPATH];
bool verbose;
bool terse;
bool force;
char pg_bindir[MAXLEN]; /* overrides setting in repmgr.conf */
/* logging parameters */
char loglevel[MAXLEN]; /* overrides setting in repmgr.conf */
bool log_to_file;
/* connection parameters */
char dbname[MAXLEN];
char host[MAXLEN];
char username[MAXLEN];
char dest_dir[MAXFILENAME];
char config_file[MAXFILENAME];
char dest_dir[MAXPGPATH];
char remote_user[MAXLEN];
char superuser[MAXLEN];
char masterport[MAXLEN];
bool conninfo_provided;
bool connection_param_provided;
bool host_param_provided;
/* standby clone parameters */
bool wal_keep_segments_used;
char wal_keep_segments[MAXLEN];
bool verbose;
bool force;
bool wait_for_master;
bool ignore_rsync_warn;
bool initdb_no_pwprompt;
bool rsync_only;
bool fast_checkpoint;
bool ignore_external_config_files;
char masterport[MAXLEN];
char localport[MAXLEN];
/* parameter used by CLUSTER CLEANUP */
int keep_history;
char pg_bindir[MAXLEN];
bool without_barman;
bool no_upstream_connection;
bool no_conninfo_password;
bool copy_external_config_files;
int copy_external_config_files_destination;
char upstream_conninfo[MAXLEN];
char replication_user[MAXLEN];
char recovery_min_apply_delay[MAXLEN];
/* standby register parameters */
bool wait_register_sync;
int wait_register_sync_seconds;
/* witness create parameters */
bool witness_pwprompt;
/* standby follow parameters */
bool wait_for_master;
/* cluster {show|matrix|crosscheck} parameters */
bool csv_mode;
/* cluster cleanup parameters */
int keep_history;
/* standby switchover parameters */
char remote_config_file[MAXLEN];
bool pg_rewind_supplied;
char pg_rewind[MAXPGPATH];
char pg_ctl_mode[MAXLEN];
/* standby {archive_config | restore_config} parameters */
char config_archive_dir[MAXLEN];
/* {standby|witness} unregister parameters */
int node;
} t_runtime_options;
#define T_RUNTIME_OPTIONS_INITIALIZER { "", "", "", "", "", "", "", DEFAULT_WAL_KEEP_SEGMENTS, false, false, false, false, false, false, false, false, "", "", 0, "", "" }
#define T_RUNTIME_OPTIONS_INITIALIZER { \
/* general repmgr options */ \
"", false, false, false, "", \
/* logging parameters */ \
"", false, \
/* connection parameters */ \
"", "", "", "", "", "", "", \
false, false, false, \
/* standby clone parameters */ \
false, DEFAULT_WAL_KEEP_SEGMENTS, false, false, false, false, false, false, \
false, CONFIG_FILE_SAMEPATH, "", "", "", \
/* standby register paarameters */ \
false, 0, \
/* witness create parameters */ \
false, \
/* standby follow parameters */ \
false, \
/* cluster {show|matrix|crosscheck} parameters */ \
false, \
/* cluster cleanup parameters */ \
0, \
/* standby switchover parameters */ \
"", false, "", "fast", \
/* standby {archive_config | restore_config} parameters */ \
"", \
/* {standby|witness} unregister parameters */ \
UNKNOWN_NODE_ID }
extern char repmgr_schema[MAXLEN];
typedef struct ErrorListCell
struct BackupLabel
{
struct ErrorListCell *next;
char *error_message;
} ErrorListCell;
XLogRecPtr start_wal_location;
char start_wal_file[MAXLEN];
XLogRecPtr checkpoint_location;
char backup_from[MAXLEN];
char backup_method[MAXLEN];
char start_time[MAXLEN];
char label[MAXLEN];
XLogRecPtr min_failover_slot_lsn;
};
typedef struct ErrorList
typedef struct
{
ErrorListCell *head;
ErrorListCell *tail;
} ErrorList;
char slot[MAXLEN];
char xlog_method[MAXLEN];
bool no_slot; /* from PostgreSQL 10 */
} t_basebackup_options;
#define T_BASEBACKUP_OPTIONS_INITIALIZER { "", "", false }
typedef struct
{
int size;
char **keywords;
char **values;
} t_conninfo_param_list;
typedef struct
{
char filepath[MAXPGPATH];
char filename[MAXPGPATH];
bool in_data_directory;
} t_configfile_info;
typedef struct
{
int size;
int entries;
t_configfile_info **files;
} t_configfile_list;
#define T_CONFIGFILE_LIST_INITIALIZER { 0, 0, NULL }
typedef struct
{
int node_id;
int node_status;
} t_node_status_rec;
typedef struct
{
int node_id;
char node_name[MAXLEN];
t_node_status_rec **node_status_list;
} t_node_matrix_rec;
typedef struct
{
int node_id;
char node_name[MAXLEN];
t_node_matrix_rec **matrix_list_rec;
} t_node_status_cube;
#endif

View File

@@ -1,7 +1,7 @@
/*
* repmgr.sql
*
* Copyright (C) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
*/
@@ -59,3 +59,12 @@ WHERE (standby_node, last_monitor_time) IN (SELECT standby_node, MAX(last_monito
ALTER VIEW repl_status OWNER TO repmgr;
CREATE INDEX idx_repl_status_sort ON repl_monitor(last_monitor_time, standby_node);
/*
* This view shows the list of nodes with the information of which one is the upstream
* in each case (when appliable)
*/
CREATE VIEW repl_show_nodes AS
SELECT rn.id, rn.conninfo, rn.type, rn.name, rn.cluster,
rn.priority, rn.active, sq.name AS upstream_node_name
FROM repl_nodes as rn LEFT JOIN repl_nodes AS sq ON sq.id=rn.upstream_node_id;

1262
repmgrd.c

File diff suppressed because it is too large Load Diff

View File

@@ -1,7 +1,7 @@
#
# Makefile
#
# Copyright (c) 2ndQuadrant, 2010-2015
# Copyright (c) 2ndQuadrant, 2010-2017
#
MODULE_big = repmgr_funcs

View File

@@ -63,6 +63,15 @@ UPDATE repl_nodes SET type = 'master' WHERE id = $master_id;
-- UPDATE repl_nodes SET active = FALSE WHERE id IN (...);
/* There's also an event table which we need to create */
CREATE TABLE repl_events (
node_id INTEGER NOT NULL,
event TEXT NOT NULL,
successful BOOLEAN NOT NULL DEFAULT TRUE,
event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
details TEXT NULL
);
/* When you're sure of your changes, commit them */
-- COMMIT;

View File

@@ -0,0 +1,35 @@
/*
* Update a repmgr 3.0 installation to repmgr 3.1
* ----------------------------------------------
*
* The new repmgr package should be installed first. Then
* carry out these steps:
*
* 1. (If repmgrd is used) stop any running repmgrd instances
* 2. On the master node, execute the SQL statements listed below
* 3. (If repmgrd is used) restart repmgrd
*/
/*
* If your repmgr installation is not included in your repmgr
* user's search path, please set the search path to the name
* of the repmgr schema to ensure objects are installed in
* the correct location.
*
* The repmgr schema is "repmgr_" + the cluster name defined in
* 'repmgr.conf'.
*/
-- SET search_path TO 'name_of_repmgr_schema';
BEGIN;
-- New view "repl_show_nodes" which also displays the server's
-- upstream node
CREATE VIEW repl_show_nodes AS
SELECT rn.id, rn.conninfo, rn.type, rn.name, rn.cluster,
rn.priority, rn.active, sq.name AS upstream_node_name
FROM repl_nodes as rn LEFT JOIN repl_nodes AS sq ON sq.id=rn.upstream_node_id;
COMMIT;

View File

@@ -0,0 +1,32 @@
/*
* Update a repmgr 3.1.1 installation to repmgr 3.1.2
* --------------------------------------------------
*
* This update is only required if repmgrd is being used in conjunction
* with a witness server.
*
* The new repmgr package should be installed first. Then
* carry out these steps:
*
* 1. (If repmgrd is used) stop any running repmgrd instances
* 2. On the master node, execute the SQL statement listed below
* 3. (If repmgrd is used) restart repmgrd
*/
/*
* If your repmgr installation is not included in your repmgr
* user's search path, please set the search path to the name
* of the repmgr schema to ensure objects are installed in
* the correct location.
*
* The repmgr schema is "repmgr_" + the cluster name defined in
* 'repmgr.conf'.
*/
-- SET search_path TO 'name_of_repmgr_schema';
BEGIN;
ALTER TABLE repl_nodes DROP CONSTRAINT repl_nodes_upstream_node_id_fkey,
ADD CONSTRAINT repl_nodes_upstream_node_id_fkey FOREIGN KEY (upstream_node_id) REFERENCES repl_nodes(id) DEFERRABLE;
COMMIT;

View File

@@ -83,7 +83,12 @@ _PG_init(void)
* resources in repmgr_shmem_startup().
*/
RequestAddinShmemSpace(repmgr_memsize());
#if (PG_VERSION_NUM >= 90600)
RequestNamedLWLockTranche("repmgr", 1);
#else
RequestAddinLWLocks(1);
#endif
/*
* Install hooks.
@@ -128,7 +133,11 @@ repmgr_shmem_startup(void)
if (!found)
{
/* First time through ... */
#if (PG_VERSION_NUM >= 90600)
shared_state->lock = &(GetNamedLWLockTranche("repmgr"))->lock;
#else
shared_state->lock = LWLockAssign();
#endif
snprintf(shared_state->location,
sizeof(shared_state->location), "%X/%X", 0, 0);
}

View File

@@ -1,6 +1,6 @@
/*
* repmgr_function.sql
* Copyright (c) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
*/

4
sql/repmgr_test.sql Normal file
View File

@@ -0,0 +1,4 @@
select * from repmgr_update_standby_location('');
select * from repmgr_get_last_standby_location();
select * from repmgr_update_last_updated();
select * from repmgr_get_last_updated();

View File

@@ -1,6 +1,6 @@
/*
* uninstall_repmgr_funcs.sql
* Copyright (c) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
*/

View File

@@ -1,7 +1,7 @@
/*
* strutil.c
*
* Copyright (C) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -87,3 +87,21 @@ maxlen_snprintf(char *str, const char *format,...)
return retval;
}
/*
* Escape a string for use as a parameter in recovery.conf
* Caller must free returned value
*/
char *
escape_recovery_conf_value(const char *src)
{
char *result = escape_single_quotes_ascii(src);
if (!result)
{
fprintf(stderr, _("%s: out of memory\n"), progname());
exit(ERR_INTERNAL);
}
return result;
}

View File

@@ -1,6 +1,6 @@
/*
* strutil.h
* Copyright (C) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
*
* This program is free software: you can redistribute it and/or modify
@@ -22,14 +22,20 @@
#define _STRUTIL_H_
#include <stdlib.h>
#include "pqexpbuffer.h"
#include "errcode.h"
#define QUERY_STR_LEN 8192
#define MAXLEN 1024
#define MAXLINELENGTH 4096
#define MAXVERSIONSTR 16
#define MAXCONNINFO 1024
/* Why? http://stackoverflow.com/a/5459929/398670 */
#define STR(x) CppAsString(x)
#define MAXLEN_STR STR(MAXLEN)
extern int
xsnprintf(char *str, size_t size, const char *format,...)
@@ -43,4 +49,6 @@ extern int
maxlen_snprintf(char *str, const char *format,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3)));
extern char *
escape_recovery_conf_value(const char *src);
#endif /* _STRUTIL_H_ */

View File

@@ -1,7 +1,7 @@
/*
* uninstall_repmgr.sql
*
* Copyright (C) 2ndQuadrant, 2010-2015
* Copyright (c) 2ndQuadrant, 2010-2017
*
*/

View File

@@ -1,6 +1,6 @@
#ifndef _VERSION_H_
#define _VERSION_H_
#define REPMGR_VERSION "3.0.2dev"
#define REPMGR_VERSION "3.4.0"
#endif