Compare commits

89 Commits

Author SHA1 Message Date
Ian Barwick
a8232337d8 Catch various corner cases when restarting a PostgreSQL instance 2018-02-14 11:28:38 +09:00
Ian Barwick
c9eb1bfcc0 Always initialise t_conninfo_param_list structures 2018-02-13 10:48:18 +09:00
Ian Barwick
db552dfbc7 Bump version
4.0.3
2018-02-12 15:03:29 +09:00
Ian Barwick
9732f78565 repmgrd: check "repmgr" extension is installed before starting
Implements GitHub #361.
2018-02-12 11:31:59 +09:00
Ian Barwick
eb7dca2919 "node status": add warning about missing replication slots
Implements GitHub #364.
2018-02-12 10:53:31 +09:00
Ian Barwick
c113102926 Update repmgr.conf.sample
Add missing parameter "monitor_interval_secs"
2018-02-12 09:35:57 +09:00
Ian Barwick
ed6a167915 Execute a CHECKPOINT immediately after promoting the server
This ensures "pg_control" is updated with the latest timeline, mainly
to ensure that if "pg_rewind" is executed as part of a switchover
that it sees the latest timeline.

Per suggestion from GitHub user "superflav" in GitHub #378.

See also:

  https://www.postgresql.org/message-id/flat/20150428180253.GU30322%40tamriel.snowman.net
2018-02-09 12:09:16 +09:00
Ian Barwick
fbbe7afd61 doc: update HISTORY and release notes 2018-02-09 11:42:16 +09:00
Ian Barwick
ae1fc93e48 Ensure correct server version number used for replication stats query 2018-02-09 11:06:15 +09:00
Ian Barwick
7b4ee80af2 "standby switchover": check demotion candidate can make replication connection
Check it's actually possible for the demotion candidate to attach to
the promotion candidate before executing the switchover.

As with other checks of this nature, there's a faint possibility the
situation could change between the time the check is carried out and
the demotion candidate is restarted to connect to the promotion candidate,
but there's not a lot we can do about that. The main purpose is to
be able to catch existing misconfigurations before anything gets changed.

Implements GitHub #370.
2018-02-09 10:01:29 +09:00
Ian Barwick
0b8755e278 "witness register": fix primary node check
Addresses GitHub #377, based on report by user yonj1e in #373.
2018-02-08 16:28:50 +09:00
Ian Barwick
d3e1937808 "standby switchover": additional sanity checks
Check that sufficient walsenders will be available on the promotion
candidate, and if replication slots are in use check if enough of
those will be available.

Note these checks can't guarantee that the walsenders/slots will
be available at the appropriate points during the switchover process,
but do ensure that existing configuration problems will be caught.

Implements GitHub #371.
2018-02-08 15:23:10 +09:00
Ian Barwick
871d6fdee3 "standby clone": cowardly refuse to clone into an active data directory
By checking the PID file in the same way pg_ctl does, we can be pretty
much certain whether the target data directory contains an active
PostgreSQL instance.
2018-02-08 11:43:24 +09:00
Ian Barwick
c7dfe9e040 Fix "standby clone" in Barman mode with --no-upstream-connection
"--upstream-node-id", if provided, was not being passed through to
the SQL query executed via the Barman server.

Also modified the query to select the primary node if "--upstream-node-id"
is not provided.

Note: this is a very niche use case.
2018-02-07 16:36:44 +09:00
Ian Barwick
5c92a9e057 repmgr: simplify data directory checks when cloning
Attempting to use the contents of pg_control to tell whether the directory
is in use by PostgreSQL can result in false positives; we should use
a check based on the pidfile.

Also change the HINT to indicate a data directory can be overwritten
if -F/--force is provided.
2018-02-07 14:37:57 +09:00
Ian Barwick
aa5f025738 "standby clone": ensure "pg_subtrans" directory is created in Barman mode 2018-02-07 10:56:18 +09:00
Ian Barwick
5b91a2d409 Update HISTORY and release notes 2018-02-07 09:55:36 +09:00
Ian Barwick
596a19ee37 Move parse_output_to_argv() to configfile.c
So it can be used by parse_pg_basebackup_options().

Addresses GitHub #376.
2018-02-07 09:43:06 +09:00
Ian Barwick
23ff83b3b4 Fix typo in HINT 2018-02-07 08:55:51 +09:00
Ian Barwick
ba1f6bee0d doc: fix GitHub reference in release notes 2018-02-07 08:53:23 +09:00
Ian Barwick
da9c8f2491 Update HISTORY and release notes 2018-02-06 10:38:13 +09:00
Ian Barwick
64035ef701 "standby register/follow": provide primary node details for event notifications
For events generated by these commands, it may be useful to know details
of the primary node. This makes the following additional parameters available
to event notification scripts:

- %p: node ID of the primary
- %a: node name of the primary
- %c: conninfo string for the primary

Implements GitHub #375
2018-02-06 09:36:46 +09:00
Ian Barwick
da3a5ab1dc doc: fix descriptions of %p event notification script parameter 2018-02-05 15:54:06 +09:00
Ian Barwick
9d301b4789 "standby register": add event notification "standby_register_sync"
Implements GitHub #374.
2018-02-05 15:21:38 +09:00
Ian Barwick
c070c649f7 doc: minor fixes to BDR docs
Also remove duplicate file.
2018-02-05 15:21:34 +09:00
Ian Barwick
3b823396eb doc: improve BDR failover documentation 2018-02-05 15:21:28 +09:00
Ian Barwick
c19e7f1025 "cluster show": output any connection error messages in list of warnings
This ensures any connection errors are displayed by default in a
comprehensible, easily reportable way, and saves having to request/filter
DEBUG output.

Implements GitHub #369.
2018-02-05 10:32:20 +09:00
Ian Barwick
e4b5a1e19f "cluster show": minor code cleanup 2018-02-05 10:25:05 +09:00
Ian Barwick
f96cc3b906 "cluster show": improve handling of database errors
In particular, if running "repmgr cluster show" against a database
without the repmgr metadata, showing the error (rather than just
"no records found" etc.) will provide some clues about the problem.
2018-02-05 10:15:48 +09:00
Tony Finch
a481ca7ce2 "repmgr node status": correct upstream node info (#363)
repmgr was printing the name and ID of this node instead of its upstream

Signed-off-by: Tony Finch <dot@dotat.at>
2018-02-05 09:54:00 +09:00
Ian Barwick
32dc450a09 doc: add note about replication slots and PostgreSQL upgrades 2018-02-02 18:33:43 +09:00
Ian Barwick
34dbf64f50 Ensure an inactive PostgreSQL data directory can be deleted.
Addresses GitHub #366.
2018-02-02 17:12:25 +09:00
Ian Barwick
ea653a8dbc "standby follow": finalize implementation of --dry-run option 2018-02-02 15:42:08 +09:00
Ian Barwick
50894b6124 "standby follow": check for replication slot availability on target node 2018-02-02 15:01:23 +09:00
Ian Barwick
94e187c476 Improve "repmgr primary unregister" documentation and --help output
Per observations in GitHub #373
2018-02-02 14:12:15 +09:00
Ian Barwick
de6284ae79 doc: note password SSH requirements for "standby switchover" 2018-02-02 14:01:58 +09:00
Ian Barwick
c54045bcd8 "standby follow": initial implementation of --dry-run option
GitHub #363.
2018-02-01 14:18:40 +09:00
Ian Barwick
c0a53471e1 "standby switchover": improve log messages and add new exit code
Previously, if an issue was encountered with the old primary, but user
provided -F/--force to have repmgr promote the standby anyway, repmgr
would exit with the log message "STANDBY SWITCHOVER is complete"
and exit code 0 (SUCCESS).

To better report this partial completion, repmgr will now emit the message
"STANDBY SWITCHOVER has completed with issues" (and a HINT to check preceding
log messages) and new exit code 22 (ERR_SWITCHOVER_INCOMPLETE).
2018-01-31 10:25:15 +09:00
Ian Barwick
2eec8b5d79 Have do_standby_follow_internal() not abort on error
Pass the error code back to the caller instead, mainly so
"repmgr node rejoin" can better report errors.
2018-01-30 16:53:04 +09:00
Ian Barwick
c11e92cf2a repmgr: improve switchover handling when "pg_ctl" used
If logging output is not explicitly redirected with "-l" in the pg_ctl
options, repmgr would hang waiting for pg_ctl output.

Note that we recommend using the OS-level service commands where
available.
2018-01-30 13:43:37 +09:00
Ian Barwick
f294d09034 "repmgr standby register": improve error output when standby not running
Add explicit HINT
2018-01-26 22:13:11 +09:00
Ian Barwick
26c597ef5a doc: expand upgrade documentation
Include section about using pg_upgrade
2018-01-23 10:57:19 +09:00
Vlad
b8efbb7a15 doc: add missing word in overview
GitHub pull request #362
2018-01-19 09:11:54 +09:00
Ian Barwick
3044696c05 doc: update 4.0.2 release notes
Add details about upgrading.
2018-01-19 09:09:59 +09:00
Ian Barwick
6dc1969ad5 Remove --bdr-only configuration option
This was required for a specific use case during pre-release
development and is no longer needed now the physical streaming
replication handling is implemented.
2018-01-18 13:30:47 +09:00
Ian Barwick
cb41ef1733 doc: update list of event notifications 2018-01-18 11:48:10 +09:00
Ian Barwick
d10f1f289e Bump version in configure.in
4.0.2
2018-01-16 13:55:58 +09:00
Ian Barwick
5731ba6043 Update version and release date 2018-01-16 12:58:11 +09:00
Ian Barwick
3d6437c8f8 repmgr: assume node is actually shutting down if pingable and that's the reported status 2018-01-16 11:17:06 +09:00
Ian Barwick
54b5c8ad94 repmgrd: log execution error in "repmgrd_get_local_node_id()"
That shouldn't happen, but if it does it will make it easier to
identify the issue.
2018-01-16 11:14:04 +09:00
Ian Barwick
0eca08ffaf doc: improve switchover documentation
Emphasize need to set the "service_*_command" options when repmgr is
installed from a package.
2018-01-16 11:06:39 +09:00
Ian Barwick
05c1dc2b92 doc: add 4.0.2 release notes 2018-01-11 16:39:58 +09:00
Ian Barwick
2bd300073d doc: minor readability fix 2018-01-11 15:49:56 +09:00
Ian Barwick
01e020df8e doc: note change of shared library name from "repmgr_funcs" to "repmgr" 2018-01-11 15:47:35 +09:00
Ian Barwick
ae7963dc64 repmgr: automatically create slot name if missing
It's possible that a node was registered with "use_replication_slots=false"
but that was later changed to "use_replication_slots=true". If the node
was not subsequently re-registered, the node record will contain an empty
slot name, which will cause any slot creation operation during
"standby follow" or "node rejoin" to fail.

To prevent this happening, check for an empty slot name and automatically
set before proceeding.

Addresses GitHub #343.
2018-01-11 11:13:41 +09:00
Ian Barwick
faffb2a6e7 repmgr: catch possible corner case when checking node shutdown status
It's conceivable that PQping is returning "no response" but the
shutdown hasn't quite completed.
2018-01-10 14:56:00 +09:00
Ian Barwick
5d57044118 repmgr: during switchover, correctly detect unclean shutdown status 2018-01-10 12:21:04 +09:00
Ian Barwick
07a88c78a5 repmgr standby switchover: add "%p" event notification parameter
This will contain the node ID of the former primary.
2018-01-10 11:01:00 +09:00
Ian Barwick
f7df8b9c80 doc: document command line options for "standby switchover" 2018-01-10 10:19:36 +09:00
Ian Barwick
20920b3da1 repmgr standby switchover: add event details 2018-01-10 09:55:24 +09:00
Ian Barwick
683f4de182 Bump version
4.0.2
2018-01-09 13:43:58 +09:00
Ian Barwick
0c62821ffb Consolidate parsing of output from executing repmgr on a remote server
This should also fix the issue reported in GitHub #349.
2018-01-09 13:33:38 +09:00
Ian Barwick
6b70e8bbe6 doc: list repmgr.conf parameters relevant during switchover 2018-01-08 11:13:39 +09:00
Ian Barwick
6b223698c9 Fix call to is_active_bdr_node() in BDR repmgrd
Following the fix to "is_active_bdr_node()" in 841f03ae, it turns out
the call in repmgrd-bdr.c was only accidentally working; explicitly
test for a false return value.
2018-01-04 21:06:45 +09:00
Ian Barwick
aee12dc2c7 "repmgr bdr register": create missing connection replication set if needed
Previously the assumption was that the "repmgr" replication set would be
set up when the nodes are created, however no checks were implemented
and this was not well-documented.

Addresses GitHub #347.
2018-01-04 17:12:52 +09:00
Ian Barwick
c5c86e1ada "repmgr bdr register": improve node name check
We'll use "bdr.bdr_get_local_node_name()" to check the local BDR node
name and the repmgr one match.
2018-01-04 16:07:06 +09:00
Ian Barwick
7476dc84f2 doc: link event notification page from relevant command reference pages 2018-01-04 14:54:14 +09:00
Ian Barwick
f6d63f5216 doc: update package documentation 2018-01-04 13:11:44 +09:00
Ian Barwick
a608b0bc18 "repmgr standby register": add --wait-start option
Implements GitHub #356.
2018-01-04 12:48:12 +09:00
Ian Barwick
469ebba656 doc: fix typos in "repmgr primary unregister" command reference 2018-01-04 12:31:29 +09:00
Ian Barwick
647c21ad0e doc: add link to event notifications page from "repmgr cluster event" 2018-01-04 10:57:54 +09:00
Ian Barwick
3d2530d6f9 Fix query in is_active_bdr_node()
Boolean column was not being checked correctly.

Also add detail output in "repmgr node role --check", where the function
is called.
2018-01-04 10:48:31 +09:00
Ian Barwick
b26e400199 "repmgr cluster event": move query to dbutils.c 2018-01-04 10:06:54 +09:00
Ian Barwick
152e9545a4 docs: document "repmgr cluster event --terse" 2018-01-04 09:53:54 +09:00
Ian Barwick
83b8f05221 "repmgr cluster events": optionally omit "Details" column with --terse
Implements GitHub #360.
2018-01-04 09:48:00 +09:00
Ian Barwick
486f8e5a2c repmgrd: document standby_[failure|recovery] event notifications
Also clean up the relevant code section.

Addresses GitHub #359.
2018-01-04 09:34:49 +09:00
Ian Barwick
e517cc74d1 repmgr node rejoin: handle missing node record correctly
If a connection was provided for a database other than the "repmgr"
database, error was logged but execution continued, resulting in
the connection being finished twice.

Addresses GitHub #358.
2018-01-03 15:20:10 +09:00
Ian Barwick
26285b470f doc: add appendix with details about packages
work-in-progress
2018-01-02 17:24:51 +09:00
Ian Barwick
1521657965 Update copyright notices to 2018 2018-01-02 10:20:09 +09:00
Ian Barwick
041604e303 doc: Fix event notification placeholder typo
Per report from Carlos.
2018-01-01 10:29:34 +09:00
Ian Barwick
0be0100a7c docs: update HISTORY 2017-12-27 10:24:56 +09:00
Ian Barwick
2133834dda doc: update documentation build instructions
Describe how to build documentation as a single file, and also note
requirement to build against 9.6 or earlier.
2017-12-27 10:24:22 +09:00
Ian Barwick
d5fd93c350 repmgr.conf.sample: fix command line argument
"repmgr node check --archive-ready" is correct, however abbreviated
versions will be accepted by getopt_long() if they don't match
or partially match any other options.

Per report by "chaintng" in GitHub #355.
2017-12-27 10:24:17 +09:00
Tony Finch
5804778b58 doc: an optional all-in-one-file manual 2017-12-27 10:24:10 +09:00
Ian Barwick
407a7ea2f4 repmgr: add missing -W option to getopt_long() invocation
Addresses GitHub #350.
2017-12-20 10:28:31 +09:00
Martín Marqués
4d2eca0978 Switch spaces for tabs in repmgr.conf sample file.
This makes comments stay aligned in most cases the conf file is
modified, and when indentation changes, it's easy to re-align
(by removing or adding a tab)

Signed-off-by: Martín Marqués <martin.marques@2ndquadrant.com>
2017-12-20 09:27:06 +09:00
Martín Marqués
9d25544ab5 Add more information to the setting up sudo without requiretty in
the documentation

Signed-off-by: Martín Marqués <martin.marques@2ndquadrant.com>
2017-12-20 09:27:02 +09:00
Daymel Bonne Solís
8506607388 Fix package name 2017-12-20 09:26:57 +09:00
Ian Barwick
e8e059c26d docs: update 4.0.1 release date 2017-12-13 15:15:13 +09:00
76 changed files with 3020 additions and 1133 deletions
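
The note in commit d5fd93c350 about getopt_long() accepting abbreviated long options can be illustrated with a small sketch. The function name `check_archive_ready_flag` and the three-entry option table are illustrative assumptions, not repmgr's actual option handling; the `optind = 0` reset relies on glibc behaviour (other platforms may need a different reset).

```c
#include <getopt.h>
#include <stddef.h>

/*
 * Returns 1 if "--archive-ready" (or an unambiguous abbreviation of it)
 * appears in argv, 0 if it does not, and -1 if getopt_long() reports an
 * unknown or ambiguous option.
 *
 * Hypothetical helper for illustration only; the option table below is an
 * assumption and not copied from repmgr.
 */
int
check_archive_ready_flag(int argc, char **argv)
{
    static const struct option long_options[] = {
        {"archive-ready", no_argument, NULL, 'A'},
        {"replication-lag", no_argument, NULL, 'R'},
        {"role", no_argument, NULL, 'O'},
        {NULL, 0, NULL, 0}
    };
    int c;
    int found = 0;

    /* glibc: setting optind to 0 forces getopt to reinitialise itself */
    optind = 0;
    /* suppress getopt's own error messages on stderr */
    opterr = 0;

    while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
    {
        if (c == 'A')
            found = 1;
        else if (c == '?')
            return -1;      /* unknown or ambiguous option */
    }

    return found;
}
```

With this table, `--archive` resolves to `--archive-ready` because no other option shares that prefix, whereas `--r` is rejected because it could mean either `--replication-lag` or `--role`. Spelling the option out in full in sample configuration avoids depending on that behaviour.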

@@ -2,7 +2,7 @@ License and Contributions
 =========================
 `repmgr` is licensed under the GPL v3. All of its code and documentation is
-Copyright 2010-2017, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
+Copyright 2010-2018, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
 details.
 The development of repmgr has primarily been sponsored by 2ndQuadrant customers.

@@ -1,4 +1,4 @@
-Copyright (c) 2010-2017, 2ndQuadrant Limited
+Copyright (c) 2010-2018, 2ndQuadrant Limited
 All rights reserved.
 This program is free software: you can redistribute it and/or modify

41
HISTORY

@@ -1,4 +1,42 @@
-4.0.1 2017-12-04
+4.0.3 2018-02-
+repmgr: improve switchover handling when "pg_ctl" used to control the
+server and logging output is not explicitly redirected (Ian)
+repmgr: improve switchover log messages and exit code when old primary could
+not be shut down cleanly (Ian)
+repmgr: check demotion candidate can make a replication connection to the
+promotion candidate before executing a switchover; GitHub #370 (Ian)
+repmgr: add check for sufficient walsenders/replication slots before executing
+a switchover; GitHub #371 (Ian)
+repmgr: add --dry-run mode to "repmgr standby follow"; GitHub #368 (Ian)
+repmgr: provide information about the primary node for "standby_register" and
+"standby_follow" event notifications; GitHub #375 (Ian)
+repmgr: add "standby_register_sync" event notification; GitHub #374 (Ian)
+repmgr: output any connection error messages in "cluster show"'s list of
+warnings; GitHub #369 (Ian)
+repmgr: ensure an inactive data directory can be deleted; GitHub #366 (Ian)
+repmgr: fix upstream node display in "repmgr node status"; GitHub #363 (fanf2)
+repmgr: improve/clarify documentation and update --help output for
+"primary unregister"; GitHub #373 (Ian)
+repmgr: fix parsing of "pg_basebackup_options"; GitHub #376 (Ian)
+repmgr: ensure "pg_subtrans" directory is created when cloning a standby in
+Barman mode (Ian)
+repmgr: fix primary node check in "witness register"; GitHub #377 (Ian)
+4.0.2 2018-01-18
+repmgr: add missing -W option to getopt_long() invocation; GitHub #350 (Ian)
+repmgr: automatically create slot name if missing; GitHub #343 (Ian)
+repmgr: fixes to parsing output of remote repmgr invocations; GitHub #349 (Ian)
+repmgr: BDR support - create missing connection replication set
+if required; GitHub #347 (Ian)
+repmgr: handle missing node record in "repmgr node rejoin"; GitHub #358 (Ian)
+repmgr: enable documentation to be build as single HTML file; GitHub #353 (fanf2)
+repmgr: recognize "--terse" option for "repmgr cluster event"; GitHub #360 (Ian)
+repmgr: add "--wait-start" option for "repmgr standby register"; GitHub #356 (Ian)
+repmgr: add "%p" event notification parameter for "repmgr standby switchover"
+containing the node ID of the demoted primary (Ian)
+docs: various fixes and updates (Ian, Daymel, Martín, ams)
+4.0.1 2017-12-13
 repmgr: ensure "repmgr node check --action=" returns appropriate return
 code; GitHub #340 (Ian)
 repmgr: add missing schema qualification in get_all_node_records_with_upstream()
@@ -7,7 +45,6 @@
 GitHub #344 (Ian)
 repmgr: delete any replication slots copied by pg_rewind; GitHub #334 (Ian)
 repmgr: fix configuration file sanity check; GitHub #342 (Ian)
-Improve event notification documentation (Ian)
 4.0.0 2017-11-21
 Complete rewrite with many changes; for details see the repmgr 4.0.0 release

@@ -6,7 +6,7 @@
 * supported PostgreSQL versions. They're unlikely to change but
 * it would be worth keeping an eye on them for any fixes/improvements.
 *
-* Copyright (c) 2ndQuadrant, 2010-2017
+* Copyright (c) 2ndQuadrant, 2010-2018
 *
 * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California

@@ -1,6 +1,6 @@
 /*
 * compat.h
-* Copyright (c) 2ndQuadrant, 2010-2017
+* Copyright (c) 2ndQuadrant, 2010-2018
 *
 * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California

@@ -1,4 +1,2 @@
 /* config.h.in. Generated from configure.in by autoheader. */
-/* Only build repmgr for BDR */
-#undef BDR_ONLY

@@ -1,7 +1,7 @@
 /*
 * config.c - parse repmgr.conf and other configuration-related functionality
 *
-* Copyright (c) 2ndQuadrant, 2010-2017
+* Copyright (c) 2ndQuadrant, 2010-2018
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
@@ -671,7 +671,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 * Raise an error if a known parameter is provided with an empty
 * value. Currently there's no reason why empty parameters are needed;
 * if we want to accept those, we'd need to add stricter default
-* checking, as currently e.g. an empty `node` value will be converted
+* checking, as currently e.g. an empty `node_id` value will be converted
 * to '0'.
 */
 if (known_parameter == true && !strlen(value))
@@ -1600,31 +1600,109 @@ clear_event_notification_list(t_configuration_options *options)
 }
-bool
-parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_options *backup_options, int server_version_num, ItemList *error_list)
+int
+parse_output_to_argv(const char *string, char ***argv_array)
 {
     int options_len = 0;
     char *options_string = NULL;
     char *options_string_ptr = NULL;
+    int c = 1,
+        argc_item = 1;
+    char *argv_item = NULL;
+    char **local_argv_array = NULL;
+    ItemListCell *cell;
     /*
      * Add parsed options to this list, then copy to an array to pass to
      * getopt
      */
-    static ItemList option_argv = {NULL, NULL};
-    char *argv_item = NULL;
-    int c,
-        argc_item = 1;
+    ItemList option_argv = {NULL, NULL};
+    options_len = strlen(string) + 1;
+    options_string = pg_malloc0(options_len);
+    options_string_ptr = options_string;
+    /* Copy the string before operating on it with strtok() */
+    strncpy(options_string, string, options_len);
+    /* Extract arguments into a list and keep a count of the total */
+    while ((argv_item = strtok(options_string_ptr, " ")) != NULL)
+    {
+        item_list_append(&option_argv, trim(argv_item));
+        argc_item++;
+        if (options_string_ptr != NULL)
+            options_string_ptr = NULL;
+    }
+    pfree(options_string);
+    /*
+     * Array of argument values to pass to getopt_long - this will need to
+     * include an empty string as the first value (normally this would be the
+     * program name)
+     */
+    local_argv_array = pg_malloc0(sizeof(char *) * (argc_item + 2));
+    /* Insert a blank dummy program name at the start of the array */
+    local_argv_array[0] = pg_malloc0(1);
+    /*
+     * Copy the previously extracted arguments from our list to the array
+     */
+    for (cell = option_argv.head; cell; cell = cell->next)
+    {
+        int argv_len = strlen(cell->string) + 1;
+        local_argv_array[c] = (char *)pg_malloc0(argv_len);
+        strncpy(local_argv_array[c], cell->string, argv_len);
+        c++;
+    }
+    local_argv_array[c] = NULL;
+    item_list_free(&option_argv);
+    *argv_array = local_argv_array;
+    return argc_item;
+}
+void
+free_parsed_argv(char ***argv_array)
+{
+    char **local_argv_array = *argv_array;
+    int i = 0;
+    while (local_argv_array[i] != NULL)
+    {
+        pfree((char *)local_argv_array[i]);
+        i++;
+    }
+    pfree((char **)local_argv_array);
+    *argv_array = NULL;
+}
+bool
+parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_options *backup_options, int server_version_num, ItemList *error_list)
+{
+    bool backup_options_ok = true;
+    int c = 0,
+        argc_item = 0;
     char **argv_array = NULL;
-    ItemListCell *cell = NULL;
     int optindex = 0;
     struct option *long_options = NULL;
-    bool backup_options_ok = true;
     /* We're only interested in these options */
     static struct option long_options_9[] =
@@ -1650,56 +1728,12 @@ parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_opti
     if (!strlen(pg_basebackup_options))
         return backup_options_ok;
-    options_len = strlen(pg_basebackup_options) + 1;
-    options_string = pg_malloc(options_len);
-    options_string_ptr = options_string;
     if (server_version_num >= 100000)
         long_options = long_options_10;
     else
         long_options = long_options_9;
-    /* Copy the string before operating on it with strtok() */
-    strncpy(options_string, pg_basebackup_options, options_len);
-    /* Extract arguments into a list and keep a count of the total */
-    while ((argv_item = strtok(options_string_ptr, " ")) != NULL)
-    {
-        item_list_append(&option_argv, argv_item);
-        argc_item++;
-        if (options_string_ptr != NULL)
-            options_string_ptr = NULL;
-    }
-    /*
-     * Array of argument values to pass to getopt_long - this will need to
-     * include an empty string as the first value (normally this would be the
-     * program name)
-     */
-    argv_array = pg_malloc0(sizeof(char *) * (argc_item + 2));
-    /* Insert a blank dummy program name at the start of the array */
-    argv_array[0] = pg_malloc0(1);
-    c = 1;
-    /*
-     * Copy the previously extracted arguments from our list to the array
-     */
-    for (cell = option_argv.head; cell; cell = cell->next)
-    {
-        int argv_len = strlen(cell->string) + 1;
-        argv_array[c] = pg_malloc0(argv_len);
-        strncpy(argv_array[c], cell->string, argv_len);
-        c++;
-    }
-    argv_array[c] = NULL;
+    argc_item = parse_output_to_argv(pg_basebackup_options, &argv_array);
     /* Reset getopt's optind variable */
     optind = 0;
@@ -1743,15 +1777,7 @@ parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_opti
         backup_options_ok = false;
     }
-    pfree(options_string);
-    {
-        int i;
-        for (i = 0; i < argc_item + 2; i++)
-            pfree(argv_array[i]);
-    }
-    pfree(argv_array);
+    free_parsed_argv(&argv_array);
     return backup_options_ok;
 }
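
The technique factored out into parse_output_to_argv() and free_parsed_argv() above (split a string on spaces, place the pieces after an empty dummy program name, terminate the array with NULL, and free it by walking to that NULL) can be sketched with standard libc calls only. The names `split_to_argv` and `free_argv_array` are hypothetical, and this version uses malloc/calloc rather than repmgr's pg_malloc0, ItemList and trim() helpers, so it is an approximation of the idea rather than the code in the diff.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Free a NULL-terminated array produced by split_to_argv() and reset the
 * caller's pointer. Hypothetical helper for illustration.
 */
void
free_argv_array(char ***argv_ptr)
{
    char **argv = *argv_ptr;
    int i;

    if (argv == NULL)
        return;

    for (i = 0; argv[i] != NULL; i++)
        free(argv[i]);

    free(argv);
    *argv_ptr = NULL;
}

/*
 * Split "input" on spaces into a NULL-terminated, argv-style array whose
 * first element is an empty dummy program name. Returns the element count
 * including the dummy entry, or -1 if memory allocation fails.
 * Hypothetical helper for illustration.
 */
int
split_to_argv(const char *input, char ***argv_out)
{
    size_t len = strlen(input);
    /* strtok() modifies its argument, so operate on a private copy */
    char *copy = malloc(len + 1);
    /* worst case is one token per two characters, plus dummy and NULL */
    char **argv = calloc(len / 2 + 3, sizeof(char *));
    char *token;
    int argc = 0;

    if (copy == NULL || argv == NULL)
    {
        free(copy);
        free(argv);
        return -1;
    }
    memcpy(copy, input, len + 1);

    /* empty dummy program name in slot 0 */
    argv[0] = calloc(1, 1);
    if (argv[0] == NULL)
    {
        free(copy);
        free(argv);
        return -1;
    }
    argc = 1;

    for (token = strtok(copy, " "); token != NULL; token = strtok(NULL, " "))
    {
        size_t token_len = strlen(token) + 1;

        argv[argc] = malloc(token_len);
        if (argv[argc] == NULL)
        {
            free(copy);
            free_argv_array(&argv);
            return -1;
        }
        memcpy(argv[argc], token, token_len);
        argc++;
    }

    /* slot argv[argc] is already NULL because calloc() zeroed the array */
    free(copy);
    *argv_out = argv;
    return argc;
}
```

Freeing by walking to the NULL terminator, as free_parsed_argv() does, means the caller does not have to remember how many slots were allocated, which the removed loop over `argc_item + 2` depended on.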

@@ -1,7 +1,7 @@
 /*
 * configfile.h
 *
-* Copyright (c) 2ndQuadrant, 2010-2017
+* Copyright (c) 2ndQuadrant, 2010-2018
 *
 *
 * This program is free software: you can redistribute it and/or modify
@@ -248,7 +248,6 @@ typedef struct
 }
 void set_progname(const char *argv0);
 const char *progname(void);
@@ -263,12 +262,15 @@ int repmgr_atoi(const char *s,
 ItemList *error_list,
 int minval);
 bool parse_pg_basebackup_options(const char *pg_basebackup_options,
 t_basebackup_options *backup_options,
 int server_version_num,
 ItemList *error_list);
+int parse_output_to_argv(const char *string, char ***argv_array);
+void free_parsed_argv(char ***argv_array);
 /* called by repmgr-client and repmgrd */
 void exit_with_cli_errors(ItemList *error_list);
 void print_item_list(ItemList *item_list);

42
configure vendored

@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.69 for repmgr 4.0.1.
+# Generated by GNU Autoconf 2.69 for repmgr 4.0.3.
 #
 # Report bugs to <pgsql-bugs@postgresql.org>.
 #
@@ -11,7 +11,7 @@
 # This configure script is free software; the Free Software Foundation
 # gives unlimited permission to copy, distribute and modify it.
 #
-# Copyright (c) 2010-2017, 2ndQuadrant Ltd.
+# Copyright (c) 2010-2018, 2ndQuadrant Ltd.
 ## -------------------- ##
 ## M4sh Initialization. ##
 ## -------------------- ##
@@ -582,8 +582,8 @@ MAKEFLAGS=
 # Identity of this package.
 PACKAGE_NAME='repmgr'
 PACKAGE_TARNAME='repmgr'
-PACKAGE_VERSION='4.0.1'
-PACKAGE_STRING='repmgr 4.0.1'
+PACKAGE_VERSION='4.0.3'
+PACKAGE_STRING='repmgr 4.0.3'
 PACKAGE_BUGREPORT='pgsql-bugs@postgresql.org'
 PACKAGE_URL='https://2ndquadrant.com/en/resources/repmgr/'
@@ -633,7 +633,6 @@ SHELL'
 ac_subst_files=''
 ac_user_opts='
 enable_option_checking
-with_bdr_only
 '
 ac_precious_vars='build_alias
 host_alias
@@ -1179,7 +1178,7 @@ if test "$ac_init_help" = "long"; then
 # Omit some internal or obsolete options to make the list less imposing.
 # This message is too long to be a string in the A/UX 3.1 sh.
 cat <<_ACEOF
-\`configure' configures repmgr 4.0.1 to adapt to many kinds of systems.
+\`configure' configures repmgr 4.0.3 to adapt to many kinds of systems.
 Usage: $0 [OPTION]... [VAR=VALUE]...
@@ -1240,15 +1239,10 @@ fi
 if test -n "$ac_init_help"; then
 case $ac_init_help in
-short | recursive ) echo "Configuration of repmgr 4.0.1:";;
+short | recursive ) echo "Configuration of repmgr 4.0.3:";;
 esac
 cat <<\_ACEOF
-Optional Packages:
---with-PACKAGE[=ARG] use PACKAGE [ARG=yes]
---without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no)
---with-bdr-only BDR-only build
 Some influential environment variables:
 PG_CONFIG Location to find pg_config for target PostgreSQL (default PATH)
@@ -1319,14 +1313,14 @@ fi
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
 cat <<\_ACEOF
-repmgr configure 4.0.1
+repmgr configure 4.0.3
 generated by GNU Autoconf 2.69
 Copyright (C) 2012 Free Software Foundation, Inc.
 This configure script is free software; the Free Software Foundation
 gives unlimited permission to copy, distribute and modify it.
-Copyright (c) 2010-2017, 2ndQuadrant Ltd.
+Copyright (c) 2010-2018, 2ndQuadrant Ltd.
 _ACEOF
 exit
 fi
@@ -1338,7 +1332,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake. running configure, to aid debugging if configure makes a mistake.
It was created by repmgr $as_me 4.0.1, which was It was created by repmgr $as_me 4.0.3, which was
generated by GNU Autoconf 2.69. Invocation command line was generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@ $ $0 $@
@@ -1694,20 +1688,6 @@ ac_config_headers="$ac_config_headers config.h"
# Check whether --with-bdr_only was given.
if test "${with_bdr_only+set}" = set; then :
withval=$with_bdr_only;
fi
if test "x$with_bdr_only" != "x"; then :
$as_echo "#define BDR_ONLY \"1\"" >>confdefs.h
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for a sed that does not truncate output" >&5 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for a sed that does not truncate output" >&5
$as_echo_n "checking for a sed that does not truncate output... " >&6; } $as_echo_n "checking for a sed that does not truncate output... " >&6; }
if ${ac_cv_path_SED+:} false; then : if ${ac_cv_path_SED+:} false; then :
@@ -2379,7 +2359,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their # report actual input values of CONFIG_FILES etc. instead of their
# values after options handling. # values after options handling.
ac_log=" ac_log="
This file was extended by repmgr $as_me 4.0.1, which was This file was extended by repmgr $as_me 4.0.3, which was
generated by GNU Autoconf 2.69. Invocation command line was generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES CONFIG_FILES = $CONFIG_FILES
@@ -2442,7 +2422,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`" ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\ ac_cs_version="\\
repmgr config.status 4.0.1 repmgr config.status 4.0.3
configured by $0, generated by GNU Autoconf 2.69, configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\" with options \\"\$ac_cs_config\\"


@@ -1,17 +1,11 @@
AC_INIT([repmgr], [4.0.1], [pgsql-bugs@postgresql.org], [repmgr], [https://2ndquadrant.com/en/resources/repmgr/]) AC_INIT([repmgr], [4.0.3], [pgsql-bugs@postgresql.org], [repmgr], [https://2ndquadrant.com/en/resources/repmgr/])
AC_COPYRIGHT([Copyright (c) 2010-2017, 2ndQuadrant Ltd.]) AC_COPYRIGHT([Copyright (c) 2010-2018, 2ndQuadrant Ltd.])
AC_CONFIG_HEADER(config.h) AC_CONFIG_HEADER(config.h)
AC_ARG_VAR([PG_CONFIG], [Location to find pg_config for target PostgreSQL (default PATH)]) AC_ARG_VAR([PG_CONFIG], [Location to find pg_config for target PostgreSQL (default PATH)])
AC_ARG_WITH([bdr_only], [AS_HELP_STRING([--with-bdr-only], [BDR-only build])])
AS_IF([test "x$with_bdr_only" != "x"],
[AC_DEFINE([BDR_ONLY], ["1"], [Only build repmgr for BDR])]
)
AC_PROG_SED AC_PROG_SED
if test -z "$PG_CONFIG"; then if test -z "$PG_CONFIG"; then


@@ -1,6 +1,6 @@
/* /*
* controldata.c * controldata.c
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California


@@ -1,6 +1,6 @@
/* /*
* controldata.h * controldata.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California

443
dbutils.c

@@ -1,7 +1,7 @@
/* /*
* dbutils.c - Database connection/management functions * dbutils.c - Database connection/management functions
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
@@ -437,15 +437,18 @@ free_conninfo_params(t_conninfo_param_list *param_list)
for (c = 0; c < param_list->size; c++) for (c = 0; c < param_list->size; c++)
{ {
if (param_list->keywords[c] != NULL) if (param_list->keywords != NULL && param_list->keywords[c] != NULL)
pfree(param_list->keywords[c]); pfree(param_list->keywords[c]);
if (param_list->values[c] != NULL) if (param_list->values != NULL && param_list->values[c] != NULL)
pfree(param_list->values[c]); pfree(param_list->values[c]);
} }
pfree(param_list->keywords); if (param_list->keywords != NULL)
pfree(param_list->values); pfree(param_list->keywords);
if (param_list->values != NULL)
pfree(param_list->values);
} }
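Note on the hunk above: the new side of `free_conninfo_params()` checks that the `keywords` and `values` arrays exist before touching their elements, so a list that was zero-initialised but never populated can be freed safely. A minimal standalone sketch of that pattern follows. It is an illustration, not repmgr's code: `param_list_t`, `param_list_init()`, `copy_string()` and `param_list_free()` are hypothetical names, and standard `malloc`/`free` stand in for `pfree`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for t_conninfo_param_list; field names follow the diff. */
typedef struct
{
	int		size;
	char  **keywords;
	char  **values;
} param_list_t;

/* Allocate both arrays with every slot set to NULL. */
static void
param_list_init(param_list_t *list, int size)
{
	assert(list != NULL && size >= 0);

	list->size = size;
	list->keywords = calloc((size_t) size, sizeof(char *));
	list->values = calloc((size_t) size, sizeof(char *));
}

/* Heap-allocated copy of a string (NULL if allocation fails). */
static char *
copy_string(const char *s)
{
	size_t	len = strlen(s) + 1;
	char   *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);

	return copy;
}

/*
 * Free every element, then the arrays themselves. The array pointers are
 * checked before indexing, mirroring the guards added in the diff; free()
 * itself accepts NULL, so individual elements need no extra check here.
 */
static void
param_list_free(param_list_t *list)
{
	int		c;

	assert(list != NULL);

	for (c = 0; c < list->size; c++)
	{
		if (list->keywords != NULL)
			free(list->keywords[c]);
		if (list->values != NULL)
			free(list->values[c]);
	}

	free(list->keywords);
	free(list->values);

	/* Reset so a second call is harmless. */
	list->keywords = NULL;
	list->values = NULL;
	list->size = 0;
}
```

Resetting the struct after freeing is a choice made for this sketch only; the diff does not show whether repmgr does the same.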
@@ -1255,7 +1258,7 @@ get_primary_node_id(PGconn *conn)
initPQExpBuffer(&query); initPQExpBuffer(&query);
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
"SELECT node_id " "SELECT node_id "
" FROM repmgr.nodes " " FROM repmgr.nodes "
" WHERE type = 'primary' " " WHERE type = 'primary' "
" AND active IS TRUE "); " AND active IS TRUE ");
@@ -1608,7 +1611,12 @@ repmgrd_get_local_node_id(PGconn *conn)
res = PQexec(conn, "SELECT repmgr.get_local_node_id()"); res = PQexec(conn, "SELECT repmgr.get_local_node_id()");
if (!PQgetisnull(res, 0, 0)) if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
log_error(_("unable to execute \"SELECT repmgr.get_local_node_id()\""));
log_detail("%s", PQerrorMessage(conn));
}
else if (!PQgetisnull(res, 0, 0))
{ {
local_node_id = atoi(PQgetvalue(res, 0, 0)); local_node_id = atoi(PQgetvalue(res, 0, 0));
} }
@@ -1861,8 +1869,8 @@ get_node_record(PGconn *conn, int node_id, t_node_info *node_info)
initPQExpBuffer(&query); initPQExpBuffer(&query);
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
"SELECT " REPMGR_NODES_COLUMNS "SELECT " REPMGR_NODES_COLUMNS
" FROM repmgr.nodes " " FROM repmgr.nodes n "
" WHERE node_id = %i", " WHERE n.node_id = %i",
node_id); node_id);
log_verbose(LOG_DEBUG, "get_node_record():\n %s", query.data); log_verbose(LOG_DEBUG, "get_node_record():\n %s", query.data);
@@ -1889,8 +1897,8 @@ get_node_record_by_name(PGconn *conn, const char *node_name, t_node_info *node_i
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
"SELECT " REPMGR_NODES_COLUMNS "SELECT " REPMGR_NODES_COLUMNS
" FROM repmgr.nodes " " FROM repmgr.nodes n "
" WHERE node_name = '%s' ", " WHERE n.node_name = '%s' ",
node_name); node_name);
log_verbose(LOG_DEBUG, "get_node_record_by_name():\n %s", query.data); log_verbose(LOG_DEBUG, "get_node_record_by_name():\n %s", query.data);
@@ -2015,8 +2023,8 @@ get_all_node_records(PGconn *conn, NodeInfoList *node_list)
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
" SELECT " REPMGR_NODES_COLUMNS " SELECT " REPMGR_NODES_COLUMNS
" FROM repmgr.nodes " " FROM repmgr.nodes n "
"ORDER BY node_id "); "ORDER BY n.node_id ");
log_verbose(LOG_DEBUG, "get_all_node_records():\n%s", query.data); log_verbose(LOG_DEBUG, "get_all_node_records():\n%s", query.data);
@@ -2041,9 +2049,9 @@ get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *node_list)
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
" SELECT " REPMGR_NODES_COLUMNS " SELECT " REPMGR_NODES_COLUMNS
" FROM repmgr.nodes " " FROM repmgr.nodes n "
" WHERE upstream_node_id = %i " " WHERE n.upstream_node_id = %i "
"ORDER BY node_id ", "ORDER BY n.node_id ",
node_id); node_id);
log_verbose(LOG_DEBUG, "get_downstream_node_records():\n%s", query.data); log_verbose(LOG_DEBUG, "get_downstream_node_records():\n%s", query.data);
@@ -2070,11 +2078,11 @@ get_active_sibling_node_records(PGconn *conn, int node_id, int upstream_node_id,
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
" SELECT " REPMGR_NODES_COLUMNS " SELECT " REPMGR_NODES_COLUMNS
" FROM repmgr.nodes " " FROM repmgr.nodes n "
" WHERE upstream_node_id = %i " " WHERE n.upstream_node_id = %i "
" AND node_id != %i " " AND n.node_id != %i "
" AND active IS TRUE " " AND n.active IS TRUE "
"ORDER BY node_id ", "ORDER BY n.node_id ",
upstream_node_id, upstream_node_id,
node_id); node_id);
@@ -2102,8 +2110,8 @@ get_node_records_by_priority(PGconn *conn, NodeInfoList *node_list)
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
" SELECT " REPMGR_NODES_COLUMNS " SELECT " REPMGR_NODES_COLUMNS
" FROM repmgr.nodes " " FROM repmgr.nodes n "
"ORDER BY priority DESC, node_name "); "ORDER BY n.priority DESC, n.node_name ");
log_verbose(LOG_DEBUG, "get_node_records_by_priority():\n%s", query.data); log_verbose(LOG_DEBUG, "get_node_records_by_priority():\n%s", query.data);
@@ -2118,7 +2126,11 @@ get_node_records_by_priority(PGconn *conn, NodeInfoList *node_list)
return; return;
} }
void /*
* return all node records together with their upstream's node name,
* if available.
*/
bool
get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list) get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list)
{ {
PQExpBufferData query; PQExpBufferData query;
@@ -2140,15 +2152,61 @@ get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list)
termPQExpBuffer(&query); termPQExpBuffer(&query);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
log_error(_("unable to retrieve node records"));
log_detail("%s", PQerrorMessage(conn));
PQclear(res);
return false;
}
_populate_node_records(res, node_list); _populate_node_records(res, node_list);
PQclear(res); PQclear(res);
return; return true;
} }
bool
get_downsteam_nodes_with_missing_slot(PGconn *conn, int this_node_id, NodeInfoList *node_list)
{
PQExpBufferData query;
PGresult *res = NULL;
initPQExpBuffer(&query);
appendPQExpBuffer(&query,
" SELECT " REPMGR_NODES_COLUMNS
" FROM repmgr.nodes n "
"LEFT JOIN pg_catalog.pg_replication_slots rs "
" ON rs.slot_name = n.node_name "
" WHERE rs.slot_name IS NULL "
" AND n.node_id != %i ",
this_node_id);
log_verbose(LOG_DEBUG, "get_all_node_records_with_missing_slot():\n%s", query.data);
res = PQexec(conn, query.data);
termPQExpBuffer(&query);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
log_error(_("unable to retrieve node records"));
log_detail("%s", PQerrorMessage(conn));
PQclear(res);
return false;
}
_populate_node_records(res, node_list);
PQclear(res);
return true;
}
bool bool
create_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info) create_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info)
{ {
@@ -2266,9 +2324,11 @@ _create_update_node_record(PGconn *conn, char *action, t_node_info *node_info)
if (PQresultStatus(res) != PGRES_COMMAND_OK) if (PQresultStatus(res) != PGRES_COMMAND_OK)
{ {
log_error(_("unable to %s node record:\n %s"), log_error(_("unable to %s node record for node \"%s\" (ID: %i)"),
action, action,
PQerrorMessage(conn)); node_info->node_name,
node_info->node_id);
log_detail("%s", PQerrorMessage(conn));
PQclear(res); PQclear(res);
return false; return false;
} }
@@ -2592,12 +2652,47 @@ truncate_node_records(PGconn *conn)
return true; return true;
} }
bool
update_node_record_slot_name(PGconn *primary_conn, int node_id, char *slot_name)
{
PQExpBufferData query;
PGresult *res = NULL;
initPQExpBuffer(&query);
appendPQExpBuffer(&query,
" UPDATE repmgr.nodes "
" SET slot_name = '%s' "
" WHERE node_id = %i ",
slot_name,
node_id);
res = PQexec(primary_conn, query.data);
termPQExpBuffer(&query);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
{
log_error(_("unable to set node record slot name:\n %s"),
PQerrorMessage(primary_conn));
PQclear(res);
return false;
}
PQclear(res);
return true;
}
void void
get_node_replication_stats(PGconn *conn, int server_version_num, t_node_info *node_info) get_node_replication_stats(PGconn *conn, int server_version_num, t_node_info *node_info)
{ {
PQExpBufferData query; PQExpBufferData query;
PGresult *res = NULL; PGresult *res = NULL;
if (server_version_num == UNKNOWN_SERVER_VERSION_NUM)
server_version_num = get_server_version(conn, NULL);
Assert(server_version_num != UNKNOWN_SERVER_VERSION_NUM);
initPQExpBuffer(&query); initPQExpBuffer(&query);
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
@@ -2618,8 +2713,8 @@ get_node_replication_stats(PGconn *conn, int server_version_num, t_node_info *no
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
" current_setting('max_replication_slots')::INT AS max_replication_slots, " " current_setting('max_replication_slots')::INT AS max_replication_slots, "
" (SELECT COUNT(*) FROM pg_catalog.pg_replication_slots) AS total_replication_slots, " " (SELECT COUNT(*) FROM pg_catalog.pg_replication_slots) AS total_replication_slots, "
" (SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE active = TRUE) AS active_replication_slots, " " (SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE active IS TRUE) AS active_replication_slots, "
" (SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE active = FALSE) AS inactive_replication_slots, "); " (SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE active IS FALSE) AS inactive_replication_slots, ");
} }
@@ -2962,7 +3057,6 @@ create_event_record(PGconn *conn, t_configuration_options *options, int node_id,
} }
/* /*
* create_event_notification() * create_event_notification()
* *
@@ -3063,7 +3157,7 @@ _create_event(PGconn *conn, t_configuration_options *options, int node_id, char
if (PQresultStatus(res) != PGRES_TUPLES_OK) if (PQresultStatus(res) != PGRES_TUPLES_OK)
{ {
/* we don't treat this as an error */ /* we don't treat this as a fatal error */
log_warning(_("unable to create event record:\n %s"), log_warning(_("unable to create event record:\n %s"),
PQerrorMessage(conn)); PQerrorMessage(conn));
@@ -3216,6 +3310,20 @@ _create_event(PGconn *conn, t_configuration_options *options, int node_id, char
dst_ptr += strlen(dst_ptr); dst_ptr += strlen(dst_ptr);
} }
break; break;
case 'p':
/* %p: primary id ("standby_switchover": former primary id) */
src_ptr++;
if (event_info->node_id != UNKNOWN_NODE_ID)
{
PQExpBufferData node_id;
initPQExpBuffer(&node_id);
appendPQExpBuffer(&node_id,
"%i", event_info->node_id);
strlcpy(dst_ptr, node_id.data, end_ptr - dst_ptr);
dst_ptr += strlen(dst_ptr);
termPQExpBuffer(&node_id);
}
break;
default: default:
/* otherwise treat the % as not special */ /* otherwise treat the % as not special */
if (dst_ptr < end_ptr) if (dst_ptr < end_ptr)
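Note on the hunk above: the new `%p` case writes the node ID from `event_info` into the output buffer, and writes nothing when the ID is unknown. The sketch below shows the same bounded-substitution idea in isolation. It is a simplified illustration under stated assumptions: `expand_event_template()` is a hypothetical helper, it handles only `%p`, and `SKETCH_UNKNOWN_NODE_ID` is an assumed sentinel value standing in for `UNKNOWN_NODE_ID`, whose definition is not part of this diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_UNKNOWN_NODE_ID (-1)		/* assumed sentinel, for illustration */

/*
 * Copy tmpl into dst, replacing each "%p" with node_id. Any other
 * character, including a '%' not followed by 'p', is copied unchanged.
 * Output is always NUL-terminated and never exceeds dst_size bytes.
 */
static void
expand_event_template(const char *tmpl, int node_id, char *dst, size_t dst_size)
{
	const char *src = tmpl;
	char	   *out = dst;
	char	   *end;

	assert(tmpl != NULL && dst != NULL && dst_size > 0);

	end = dst + dst_size - 1;	/* last position, reserved for the terminator */

	while (*src != '\0' && out < end)
	{
		if (src[0] == '%' && src[1] == 'p')
		{
			src += 2;

			if (node_id != SKETCH_UNKNOWN_NODE_ID)
			{
				/* snprintf truncates if needed; advance by what was written */
				snprintf(out, (size_t) (end - out) + 1, "%i", node_id);
				out += strlen(out);
			}
		}
		else
		{
			*out++ = *src++;
		}
	}

	*out = '\0';
}
```

For example, expanding the template `"promote %p now"` with node ID 3 yields `promote 3 now`; with the sentinel ID the placeholder expands to an empty string.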
@@ -3249,10 +3357,102 @@ _create_event(PGconn *conn, t_configuration_options *options, int node_id, char
} }
PGresult *
get_event_records(PGconn *conn, int node_id, const char *node_name, const char *event, bool all, int limit)
{
PGresult *res;
PQExpBufferData query;
PQExpBufferData where_clause;
initPQExpBuffer(&query);
initPQExpBuffer(&where_clause);
/* LEFT JOIN used here as a node record may have been removed */
appendPQExpBuffer(&query,
" SELECT e.node_id, n.node_name, e.event, e.successful, "
" TO_CHAR(e.event_timestamp, 'YYYY-MM-DD HH24:MI:SS') AS timestamp, "
" e.details "
" FROM repmgr.events e "
"LEFT JOIN repmgr.nodes n ON e.node_id = n.node_id ");
if (node_id != UNKNOWN_NODE_ID)
{
append_where_clause(&where_clause,
"n.node_id=%i", node_id);
}
else if (node_name[0] != '\0')
{
char *escaped = escape_string(conn, node_name);
if (escaped == NULL)
{
log_error(_("unable to escape value provided for node name"));
log_detail(_("node name is: \"%s\""), node_name);
}
else
{
append_where_clause(&where_clause,
"n.node_name='%s'",
escaped);
pfree(escaped);
}
}
if (event[0] != '\0')
{
char *escaped = escape_string(conn, event);
if (escaped == NULL)
{
log_error(_("unable to escape value provided for event"));
log_detail(_("event is: \"%s\""), event);
}
else
{
append_where_clause(&where_clause,
"e.event='%s'",
escaped);
pfree(escaped);
}
}
appendPQExpBuffer(&query, "\n%s\n",
where_clause.data);
appendPQExpBuffer(&query,
" ORDER BY e.event_timestamp DESC");
if (all == false && limit > 0)
{
appendPQExpBuffer(&query, " LIMIT %i",
limit);
}
log_debug("do_cluster_event():\n%s", query.data);
res = PQexec(conn, query.data);
termPQExpBuffer(&query);
termPQExpBuffer(&where_clause);
return res;
}
/* ========================== */ /* ========================== */
/* replication slot functions */ /* replication slot functions */
/* ========================== */ /* ========================== */
void
create_slot_name(char *slot_name, int node_id)
{
maxlen_snprintf(slot_name, "repmgr_slot_%i", node_id);
}
bool bool
create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg) create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg)
{ {
@@ -3410,6 +3610,45 @@ get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record)
return RECORD_FOUND; return RECORD_FOUND;
} }
int
get_free_replication_slots(PGconn *conn)
{
PQExpBufferData query;
PGresult *res = NULL;
int free_slots = 0;
initPQExpBuffer(&query);
appendPQExpBuffer(&query,
" SELECT pg_catalog.current_setting('max_replication_slots')::INT - "
" COUNT(*) AS free_slots"
" FROM pg_catalog.pg_replication_slots");
res = PQexec(conn, query.data);
termPQExpBuffer(&query);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
log_error(_("unable to execute replication slot query"));
log_detail("%s", PQerrorMessage(conn));
PQclear(res);
return -1;
}
if (PQntuples(res) == 0)
{
PQclear(res);
return -1;
}
free_slots = atoi(PQgetvalue(res, 0, 0));
PQclear(res);
return free_slots;
}
/* ==================== */ /* ==================== */
/* tablespace functions */ /* tablespace functions */
/* ==================== */ /* ==================== */
@@ -4079,20 +4318,23 @@ is_active_bdr_node(PGconn *conn, const char *node_name)
" SELECT COALESCE(s.active, TRUE) AS active" " SELECT COALESCE(s.active, TRUE) AS active"
" FROM bdr.bdr_nodes n " " FROM bdr.bdr_nodes n "
" LEFT JOIN pg_catalog.pg_replication_slots s " " LEFT JOIN pg_catalog.pg_replication_slots s "
" ON slot_name=bdr.bdr_format_slot_name(n.node_sysid, n.node_timeline, n.node_dboid, (SELECT oid FROM pg_database WHERE datname = current_database())) " " ON s.slot_name=bdr.bdr_format_slot_name(n.node_sysid, n.node_timeline, n.node_dboid, (SELECT oid FROM pg_catalog.pg_database WHERE datname = pg_catalog.current_database())) "
" WHERE node_name='%s' ", " WHERE n.node_name='%s' ",
node_name); node_name);
log_verbose(LOG_DEBUG, "is_active_bdr_node():\n %s", query.data);
res = PQexec(conn, query.data); res = PQexec(conn, query.data);
termPQExpBuffer(&query); termPQExpBuffer(&query);
/* we don't care if the query fails */
if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) == 0) if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) == 0)
{ {
is_active_bdr_node = false; is_active_bdr_node = false;
} }
else else
{ {
is_active_bdr_node = atoi(PQgetvalue(res, 0, 0)) == 1 ? true : false; is_active_bdr_node = atobool(PQgetvalue(res, 0, 0));
} }
PQclear(res); PQclear(res);
@@ -4112,8 +4354,8 @@ is_bdr_repmgr(PGconn *conn)
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
"SELECT COUNT(*)" "SELECT COUNT(*)"
" FROM repmgr.nodes" " FROM repmgr.nodes n"
" WHERE type != 'bdr' "); " WHERE n.type != 'bdr' ");
res = PQexec(conn, query.data); res = PQexec(conn, query.data);
termPQExpBuffer(&query); termPQExpBuffer(&query);
@@ -4202,7 +4444,7 @@ add_table_to_bdr_replication_set(PGconn *conn, const char *tablename, const char
bool bool
bdr_node_exists(PGconn *conn, const char *node_name) bdr_node_name_matches(PGconn *conn, const char *node_name, PQExpBufferData *bdr_local_node_name)
{ {
PQExpBufferData query; PQExpBufferData query;
PGresult *res = NULL; PGresult *res = NULL;
@@ -4211,10 +4453,7 @@ bdr_node_exists(PGconn *conn, const char *node_name)
initPQExpBuffer(&query); initPQExpBuffer(&query);
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
"SELECT COUNT(*)" "SELECT bdr.bdr_get_local_node_name() AS node_name");
" FROM bdr.bdr_nodes"
" WHERE node_name = '%s'",
node_name);
res = PQexec(conn, query.data); res = PQexec(conn, query.data);
termPQExpBuffer(&query); termPQExpBuffer(&query);
@@ -4225,7 +4464,9 @@ bdr_node_exists(PGconn *conn, const char *node_name)
} }
else else
{ {
node_exists = atoi(PQgetvalue(res, 0, 0)) == 1 ? true : false; node_exists = true;
appendPQExpBuffer(bdr_local_node_name,
"%s", PQgetvalue(res, 0, 0));
} }
PQclear(res); PQclear(res);
@@ -4283,9 +4524,9 @@ get_bdr_other_node_name(PGconn *conn, int node_id, char *node_name)
initPQExpBuffer(&query); initPQExpBuffer(&query);
appendPQExpBuffer(&query, appendPQExpBuffer(&query,
" SELECT node_name " " SELECT n.node_name "
" FROM repmgr.nodes " " FROM repmgr.nodes n "
" WHERE node_id != %i", " WHERE n.node_id != %i",
node_id); node_id);
log_verbose(LOG_DEBUG, "get_bdr_other_node_name():\n %s", query.data); log_verbose(LOG_DEBUG, "get_bdr_other_node_name():\n %s", query.data);
@@ -4510,3 +4751,115 @@ unset_bdr_failover_handler(PGconn *conn)
PQclear(res); PQclear(res);
return; return;
} }
bool
bdr_node_has_repmgr_set(PGconn *conn, const char *node_name)
{
PQExpBufferData query;
PGresult *res = NULL;
bool has_repmgr_set = false;
initPQExpBuffer(&query);
appendPQExpBuffer(&query,
" SELECT COUNT(*) "
" FROM UNNEST(bdr.connection_get_replication_sets('%s') AS repset "
" WHERE repset = 'repmgr'",
node_name);
res = PQexec(conn, query.data);
termPQExpBuffer(&query);
if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) == 0)
{
has_repmgr_set = false;
}
else
{
has_repmgr_set = atoi(PQgetvalue(res, 0, 0)) == 1 ? true : false;
}
PQclear(res);
return has_repmgr_set;
}
bool
bdr_node_set_repmgr_set(PGconn *conn, const char *node_name)
{
PQExpBufferData query;
PGresult *res = NULL;
bool success = true;
initPQExpBuffer(&query);
appendPQExpBuffer(&query,
" SELECT bdr.connection_set_replication_sets( "
" ARRAY( "
" SELECT repset::TEXT "
" FROM UNNEST(bdr.connection_get_replication_sets('%s')) AS repset "
" UNION "
" SELECT 'repmgr'::TEXT "
" ), "
" '%s' "
" ) ",
node_name,
node_name);
res = PQexec(conn, query.data);
termPQExpBuffer(&query);
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
success = false;
}
PQclear(res);
return success;
}
/* miscellaneous debugging functions */
const char *
print_node_status(NodeStatus node_status)
{
switch (node_status)
{
case NODE_STATUS_UNKNOWN:
return "UNKNOWN";
case NODE_STATUS_UP:
return "UP";
case NODE_STATUS_SHUTTING_DOWN:
return "SHUTTING_DOWN";
case NODE_STATUS_DOWN:
return "DOWN";
case NODE_STATUS_UNCLEAN_SHUTDOWN:
return "UNCLEAN_SHUTDOWN";
}
return "UNIDENTIFIED_STATUS";
}
const char *
print_pqping_status(PGPing ping_status)
{
switch (ping_status)
{
case PQPING_OK:
return "PQPING_OK";
case PQPING_REJECT:
return "PQPING_REJECT";
case PQPING_NO_RESPONSE:
return "PQPING_NO_RESPONSE";
case PQPING_NO_ATTEMPT:
return "PQPING_NO_ATTEMPT";
}
return "PQPING_UNKNOWN_STATUS";
}
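Note on the `create_slot_name()` addition earlier in this file's diff: the function derives a replication slot name from the node ID using the pattern `repmgr_slot_%i`. A standalone sketch of that formatting step, with an explicit buffer length and a truncation check, might look like this. `format_slot_name()` and `SKETCH_SLOT_NAME_LEN` are hypothetical names for illustration; repmgr's own version uses its `maxlen_snprintf` macro, whose definition is not shown here.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_SLOT_NAME_LEN 64		/* hypothetical buffer size */

/*
 * Write "repmgr_slot_<node_id>" into slot_name.
 * Returns 0 on success, -1 if the name did not fit or formatting failed.
 */
static int
format_slot_name(char *slot_name, size_t len, int node_id)
{
	int		written;

	assert(slot_name != NULL && len > 0);

	written = snprintf(slot_name, len, "repmgr_slot_%i", node_id);

	return (written >= 0 && (size_t) written < len) ? 0 : -1;
}
```

Reporting truncation to the caller is a design choice of this sketch; a truncated slot name could otherwise collide with another node's slot without any visible error.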


@@ -1,7 +1,7 @@
/* /*
* dbutils.h * dbutils.h
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -28,7 +28,7 @@
#include "strutil.h" #include "strutil.h"
#include "voting.h" #include "voting.h"
#define REPMGR_NODES_COLUMNS "node_id, type, upstream_node_id, node_name, conninfo, repluser, slot_name, location, priority, active, config_file, '' AS upstream_node_name " #define REPMGR_NODES_COLUMNS "n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo, n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file, '' AS upstream_node_name "
#define BDR_NODES_COLUMNS "node_sysid, node_timeline, node_dboid, node_status, node_name, node_local_dsn, node_init_from_dsn, node_read_only, node_seq_id" #define BDR_NODES_COLUMNS "node_sysid, node_timeline, node_dboid, node_status, node_name, node_local_dsn, node_init_from_dsn, node_read_only, node_seq_id"
#define ERRBUFF_SIZE 512 #define ERRBUFF_SIZE 512
@@ -74,10 +74,19 @@ typedef enum
{ {
NODE_STATUS_UNKNOWN = -1, NODE_STATUS_UNKNOWN = -1,
NODE_STATUS_UP, NODE_STATUS_UP,
NODE_STATUS_SHUTTING_DOWN,
NODE_STATUS_DOWN, NODE_STATUS_DOWN,
NODE_STATUS_UNCLEAN_SHUTDOWN NODE_STATUS_UNCLEAN_SHUTDOWN
} NodeStatus; } NodeStatus;
typedef enum
{
CONN_UNKNOWN = -1,
CONN_OK,
CONN_BAD,
CONN_ERROR
} ConnectionStatus;
typedef enum typedef enum
{ {
SLOT_UNKNOWN = -1, SLOT_UNKNOWN = -1,
@@ -174,11 +183,13 @@ typedef struct s_event_info
{ {
char *node_name; char *node_name;
char *conninfo_str; char *conninfo_str;
int node_id;
} t_event_info; } t_event_info;
#define T_EVENT_INFO_INITIALIZER { \ #define T_EVENT_INFO_INITIALIZER { \
NULL, \ NULL, \
NULL \ NULL, \
UNKNOWN_NODE_ID \
} }
@@ -369,10 +380,8 @@ bool check_cluster_schema(PGconn *conn);
/* GUC manipulation functions */ /* GUC manipulation functions */
bool set_config(PGconn *conn, const char *config_param, const char *config_value); bool set_config(PGconn *conn, const char *config_param, const char *config_value);
bool set_config_bool(PGconn *conn, const char *config_param, bool state); bool set_config_bool(PGconn *conn, const char *config_param, bool state);
int guc_set(PGconn *conn, const char *parameter, const char *op, int guc_set(PGconn *conn, const char *parameter, const char *op, const char *value);
const char *value); int guc_set_typed(PGconn *conn, const char *parameter, const char *op, const char *value, const char *datatype);
int guc_set_typed(PGconn *conn, const char *parameter, const char *op,
const char *value, const char *datatype);
bool get_pg_setting(PGconn *conn, const char *setting, char *output); bool get_pg_setting(PGconn *conn, const char *setting, char *output);
/* server information functions */ /* server information functions */
@@ -409,7 +418,8 @@ void get_all_node_records(PGconn *conn, NodeInfoList *node_list);
void get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *nodes); void get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *nodes);
void get_active_sibling_node_records(PGconn *conn, int node_id, int upstream_node_id, NodeInfoList *node_list); void get_active_sibling_node_records(PGconn *conn, int node_id, int upstream_node_id, NodeInfoList *node_list);
void get_node_records_by_priority(PGconn *conn, NodeInfoList *node_list); void get_node_records_by_priority(PGconn *conn, NodeInfoList *node_list);
void get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list); bool get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list);
bool get_downsteam_nodes_with_missing_slot(PGconn *conn, int this_node_id, NodeInfoList *noede_list);
bool create_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info); bool create_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info);
bool update_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info); bool update_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info);
@@ -421,10 +431,10 @@ bool update_node_record_set_primary(PGconn *conn, int this_node_id);
bool update_node_record_set_upstream(PGconn *conn, int this_node_id, int new_upstream_node_id); bool update_node_record_set_upstream(PGconn *conn, int this_node_id, int new_upstream_node_id);
bool update_node_record_status(PGconn *conn, int this_node_id, char *type, int upstream_node_id, bool active); bool update_node_record_status(PGconn *conn, int this_node_id, char *type, int upstream_node_id, bool active);
bool update_node_record_conn_priority(PGconn *conn, t_configuration_options *options); bool update_node_record_conn_priority(PGconn *conn, t_configuration_options *options);
bool update_node_record_slot_name(PGconn *primary_conn, int node_id, char *slot_name);
bool witness_copy_node_records(PGconn *primary_conn, PGconn *witness_conn); bool witness_copy_node_records(PGconn *primary_conn, PGconn *witness_conn);
void clear_node_info_list(NodeInfoList *nodes); void clear_node_info_list(NodeInfoList *nodes);
/* PostgreSQL configuration file location functions */ /* PostgreSQL configuration file location functions */
@@ -437,11 +447,14 @@ void config_file_list_add(t_configfile_list *list, const char *file, const char
bool create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details); bool create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
bool create_event_notification(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details); bool create_event_notification(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details);
bool create_event_notification_extended(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details, t_event_info *event_info); bool create_event_notification_extended(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details, t_event_info *event_info);
PGresult *get_event_records(PGconn *conn, int node_id, const char *node_name, const char *event, bool all, int limit);
/* replication slot functions */ /* replication slot functions */
void create_slot_name(char *slot_name, int node_id);
bool create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg); bool create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg);
bool drop_replication_slot(PGconn *conn, char *slot_name); bool drop_replication_slot(PGconn *conn, char *slot_name);
RecordStatus get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record); RecordStatus get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record);
int get_free_replication_slots(PGconn *conn);
/* tablespace functions */ /* tablespace functions */
bool get_tablespace_name_by_location(PGconn *conn, const char *location, char *name); bool get_tablespace_name_by_location(PGconn *conn, const char *location, char *name);
@@ -499,12 +512,17 @@ bool is_bdr_repmgr(PGconn *conn);
bool is_table_in_bdr_replication_set(PGconn *conn, const char *tablename, const char *set); bool is_table_in_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
bool add_table_to_bdr_replication_set(PGconn *conn, const char *tablename, const char *set); bool add_table_to_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
void add_extension_tables_to_bdr_replication_set(PGconn *conn); void add_extension_tables_to_bdr_replication_set(PGconn *conn);
bool bdr_node_name_matches(PGconn *conn, const char *node_name, PQExpBufferData *bdr_local_node_name);
bool bdr_node_exists(PGconn *conn, const char *node_name);
ReplSlotStatus get_bdr_node_replication_slot_status(PGconn *conn, const char *node_name); ReplSlotStatus get_bdr_node_replication_slot_status(PGconn *conn, const char *node_name);
void get_bdr_other_node_name(PGconn *conn, int node_id, char *name_buf); void get_bdr_other_node_name(PGconn *conn, int node_id, char *name_buf);
bool am_bdr_failover_handler(PGconn *conn, int node_id); bool am_bdr_failover_handler(PGconn *conn, int node_id);
void unset_bdr_failover_handler(PGconn *conn); void unset_bdr_failover_handler(PGconn *conn);
bool bdr_node_has_repmgr_set(PGconn *conn, const char *node_name);
bool bdr_node_set_repmgr_set(PGconn *conn, const char *node_name);
/* miscellaneous debugging functions */
const char *print_node_status(NodeStatus node_status);
const char *print_pqping_status(PGPing ping_status);
#endif /* _REPMGR_DBUTILS_H_ */ #endif /* _REPMGR_DBUTILS_H_ */

200
dirutil.c

@@ -3,7 +3,7 @@
* dirmod.c * dirmod.c
* directory handling functions * directory handling functions
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -21,6 +21,7 @@
#include <unistd.h> #include <unistd.h>
#include <dirent.h> #include <dirent.h>
#include <signal.h>
#include <sys/stat.h> #include <sys/stat.h>
#include <errno.h> #include <errno.h>
#include <stdio.h> #include <stdio.h>
@@ -34,34 +35,33 @@
#include "dirutil.h" #include "dirutil.h"
#include "strutil.h" #include "strutil.h"
#include "log.h" #include "log.h"
#include "controldata.h"
static int unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf); static int unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf);
/* PID can be negative if backend is standalone */
typedef long pgpid_t;
/* /*
* make sure the directory either doesn't exist or is empty * Check if a directory exists, and if so whether it is empty.
* we use this function to check the new data directory and
* the directories for tablespaces
* *
* This is the same check initdb does on the new PGDATA dir * This function is used for checking both the data directory
* * and tablespace directories.
* Returns 0 if nonexistent, 1 if exists and empty, 2 if not empty,
* or -1 if trouble accessing directory
*/ */
int DataDirState
check_dir(char *path) check_dir(char *path)
{ {
DIR *chkdir; DIR *chkdir = NULL;
struct dirent *file; struct dirent *file = NULL;
int result = 1; int result = DIR_EMPTY;
errno = 0; errno = 0;
chkdir = opendir(path); chkdir = opendir(path);
if (!chkdir) if (!chkdir)
return (errno == ENOENT) ? 0 : -1; return (errno == ENOENT) ? DIR_NOENT : DIR_ERROR;
while ((file = readdir(chkdir)) != NULL) while ((file = readdir(chkdir)) != NULL)
{ {
@@ -73,25 +73,15 @@ check_dir(char *path)
} }
else else
{ {
result = 2; /* not empty */ result = DIR_NOT_EMPTY;
break; break;
} }
} }
#ifdef WIN32
/*
* This fix is in mingw cvs (runtime/mingwex/dirent.c rev 1.4), but not in
* released version
*/
if (GetLastError() == ERROR_NO_MORE_FILES)
errno = 0;
#endif
closedir(chkdir); closedir(chkdir);
if (errno != 0) if (errno != 0)
return -1; /* some kind of I/O error? */ return DIR_ERROR; /* some kind of I/O error? */
return result; return result;
} }
@@ -106,12 +96,13 @@ create_dir(char *path)
if (mkdir_p(path, 0700) == 0) if (mkdir_p(path, 0700) == 0)
return true; return true;
log_error(_("unable to create directory \"%s\": %s"), log_error(_("unable to create directory \"%s\""), path);
path, strerror(errno)); log_detail("%s", strerror(errno));
return false; return false;
} }
bool bool
set_dir_permissions(char *path) set_dir_permissions(char *path)
{ {
@@ -146,26 +137,6 @@ mkdir_p(char *path, mode_t omode)
oumask = 0; oumask = 0;
retval = 0; retval = 0;
#ifdef WIN32
/* skip network and drive specifiers for win32 */
if (strlen(p) >= 2)
{
if (p[0] == '/' && p[1] == '/')
{
/* network drive */
p = strstr(p + 2, "/");
if (p == NULL)
return 1;
}
else if (p[1] == ':' &&
((p[0] >= 'a' && p[0] <= 'z') ||
(p[0] >= 'A' && p[0] <= 'Z')))
{
/* local drive */
p += 2;
}
}
#endif
if (p[0] == '/') /* Skip leading '/'. */ if (p[0] == '/') /* Skip leading '/'. */
++p; ++p;
@@ -242,17 +213,91 @@ is_pg_dir(char *path)
return false; return false;
} }
/*
* Attempt to determine if a PostgreSQL data directory is in use
* by reading the pidfile. This is the same mechanism used by
* "pg_ctl".
*
* This function will abort with appropriate log messages if a file error
* is encountered, as the user will need to address the situation before
* any further useful progress can be made.
*/
PgDirState
is_pg_running(char *path)
{
long pid;
FILE *pidf;
char pid_file[MAXPGPATH];
/* it's reasonable to assume the pidfile name will not change */
snprintf(pid_file, MAXPGPATH, "%s/postmaster.pid", path);
pidf = fopen(pid_file, "r");
if (pidf == NULL)
{
/*
* No PID file - PostgreSQL shouldn't be running. From 9.3 (the
* earliest version we care about) removal of the PID file will
* cause the postmaster to shut down, so it's highly unlikely
* that PostgreSQL will still be running.
*/
if (errno == ENOENT)
{
return PG_DIR_NOT_RUNNING;
}
else
{
log_error(_("unable to open PostgreSQL PID file \"%s\""), pid_file);
log_detail("%s", strerror(errno));
exit(ERR_BAD_CONFIG);
}
}
/*
* In the unlikely event we're unable to extract a PID from the PID file,
* log a warning but assume we're not dealing with a running instance
* as PostgreSQL should have shut itself down in these cases anyway.
*/
if (fscanf(pidf, "%ld", &pid) != 1)
{
/* Is the file empty? */
if (ftell(pidf) == 0 && feof(pidf))
{
log_warning(_("PostgreSQL PID file \"%s\" is empty"), pid_file);
}
else
{
log_warning(_("invalid data in PostgreSQL PID file \"%s\""), pid_file);
}
/* close the file on this early-return path as well */
fclose(pidf);
return PG_DIR_NOT_RUNNING;
}
fclose(pidf);
if (pid == getpid())
return PG_DIR_NOT_RUNNING;
if (pid == getppid())
return PG_DIR_NOT_RUNNING;
if (kill(pid, 0) == 0)
return PG_DIR_RUNNING;
return PG_DIR_NOT_RUNNING;
}
bool bool
create_pg_dir(char *path, bool force) create_pg_dir(char *path, bool force)
{ {
bool pg_dir = false; /* Check this directory can be used as a PGDATA dir */
/* Check this directory could be used as a PGDATA dir */
switch (check_dir(path)) switch (check_dir(path))
{ {
case 0: case DIR_NOENT:
/* dir not there, must create it */ /* directory does not exist, attempt to create it */
log_info(_("creating directory \"%s\"..."), path); log_info(_("creating directory \"%s\"..."), path);
if (!create_dir(path)) if (!create_dir(path))
@@ -262,52 +307,51 @@ create_pg_dir(char *path, bool force)
return false; return false;
} }
break; break;
case 1: case DIR_EMPTY:
/* Present but empty, fix permissions and use it */ /* exists but empty, fix permissions and use it */
log_info(_("checking and correcting permissions on existing directory %s"), log_info(_("checking and correcting permissions on existing directory \"%s\""),
path); path);
if (!set_dir_permissions(path)) if (!set_dir_permissions(path))
{ {
log_error(_("unable to change permissions of directory \"%s\":\n %s"), log_error(_("unable to change permissions of directory \"%s\""), path);
path, strerror(errno)); log_detail("%s", strerror(errno));
return false; return false;
} }
break; break;
case 2: case DIR_NOT_EMPTY:
/* Present and not empty */ /* exists but is not empty */
log_warning(_("directory \"%s\" exists but is not empty"), log_warning(_("directory \"%s\" exists but is not empty"),
path); path);
pg_dir = is_pg_dir(path); if (is_pg_dir(path))
if (pg_dir && force)
{ {
/* TODO: check DB state, if not running overwrite */ if (force == true)
if (false)
{ {
log_notice(_("deleting existing data directory \"%s\""), path); log_notice(_("-F/--force provided - deleting existing data directory \"%s\""), path);
nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS); nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
return true;
} }
/* Let it continue */
break;
}
else if (pg_dir && !force)
{
log_hint(_("This looks like a PostgreSQL directory.\n"
"If you are sure you want to clone here, "
"please check there is no PostgreSQL server "
"running and use the -F/--force option"));
return false; return false;
} }
else
return false; {
default: if (force == true)
{
log_notice(_("deleting existing directory \"%s\""), path);
nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
return true;
}
return false;
}
break;
case DIR_ERROR:
log_error(_("could not access directory \"%s\": %s"), log_error(_("could not access directory \"%s\": %s"),
path, strerror(errno)); path, strerror(errno));
return false; return false;
} }
return true; return true;
} }
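The pidfile probe performed by is_pg_running() above can be illustrated outside repmgr with a minimal standalone sketch. It assumes only POSIX and the C standard library; `ProbeResult` and `probe_pidfile()` are hypothetical names used for illustration and are not part of repmgr. Unlike the function in this diff, the sketch returns an error code instead of calling exit(), closes the file on every path, and treats non-positive PIDs as "not running" (a simplification, since kill() gives those values process-group semantics).

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch (not repmgr's PgDirState). */
typedef enum
{
	PROBE_ERROR = -1,		/* pidfile present but could not be opened */
	PROBE_NOT_RUNNING,		/* no pidfile, unusable pidfile, or no such process */
	PROBE_RUNNING			/* kill(pid, 0) succeeded */
} ProbeResult;

static ProbeResult
probe_pidfile(const char *pid_file)
{
	FILE	   *pidf;
	long		pid = 0;
	int			items;

	assert(pid_file != NULL);

	errno = 0;
	pidf = fopen(pid_file, "r");
	if (pidf == NULL)
		return (errno == ENOENT) ? PROBE_NOT_RUNNING : PROBE_ERROR;

	items = fscanf(pidf, "%ld", &pid);
	fclose(pidf);				/* closed before any return below */

	/* empty file or non-numeric content */
	if (items != 1)
		return PROBE_NOT_RUNNING;

	/* sketch simplification: do not pass 0 or negative values to kill() */
	if (pid <= 0)
		return PROBE_NOT_RUNNING;

	/* our own PID or our parent's PID cannot be the server we are probing */
	if (pid == (long) getpid() || pid == (long) getppid())
		return PROBE_NOT_RUNNING;

	/* signal 0: existence and permission check only, nothing is delivered */
	if (kill((pid_t) pid, 0) == 0)
		return PROBE_RUNNING;

	return PROBE_NOT_RUNNING;
}
```

Note that kill() also fails (with EPERM) when the process exists but belongs to another user, so a probe of this kind is only reliable when run as the user owning the server process.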


@@ -1,6 +1,6 @@
/* /*
* dirutil.h * dirutil.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -19,12 +19,29 @@
#ifndef _DIRUTIL_H_ #ifndef _DIRUTIL_H_
#define _DIRUTIL_H_ #define _DIRUTIL_H_
typedef enum
{
DIR_ERROR = -1,
DIR_NOENT,
DIR_EMPTY,
DIR_NOT_EMPTY
} DataDirState;
typedef enum
{
PG_DIR_ERROR = -1,
PG_DIR_NOT_RUNNING,
PG_DIR_RUNNING
} PgDirState;
extern int mkdir_p(char *path, mode_t omode); extern int mkdir_p(char *path, mode_t omode);
extern bool set_dir_permissions(char *path); extern bool set_dir_permissions(char *path);
extern int check_dir(char *path); extern DataDirState check_dir(char *path);
extern bool create_dir(char *path); extern bool create_dir(char *path);
extern bool is_pg_dir(char *path); extern bool is_pg_dir(char *path);
extern PgDirState is_pg_running(char *path);
extern bool create_pg_dir(char *path, bool force); extern bool create_pg_dir(char *path, bool force);
extern int rmdir_recursive(char *path); extern int rmdir_recursive(char *path);
#endif #endif
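The DataDirState enum above replaces the bare 0/1/2/-1 return values previously used by check_dir(). The following is a minimal standalone sketch of the same classification, assuming only POSIX; `classify_dir()` and `dir_state_name()` are hypothetical names and the enum is redeclared locally so the example compiles on its own. One deliberate difference from the code in this diff: errno is examined immediately after readdir() returns NULL rather than after closedir().

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <string.h>

/* Local copy of the enum from dirutil.h so the sketch is self-contained. */
typedef enum
{
	DIR_ERROR = -1,
	DIR_NOENT,
	DIR_EMPTY,
	DIR_NOT_EMPTY
} DataDirState;

static DataDirState
classify_dir(const char *path)
{
	DIR		   *dir;
	struct dirent *entry;
	DataDirState result = DIR_EMPTY;

	assert(path != NULL);

	errno = 0;
	dir = opendir(path);
	if (dir == NULL)
		return (errno == ENOENT) ? DIR_NOENT : DIR_ERROR;

	for (;;)
	{
		/* readdir() returns NULL at end-of-directory and on error; errno tells them apart */
		errno = 0;
		entry = readdir(dir);
		if (entry == NULL)
		{
			if (errno != 0)
				result = DIR_ERROR;
			break;
		}

		/* "." and ".." do not count as content */
		if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
			continue;

		result = DIR_NOT_EMPTY;
		break;
	}

	closedir(dir);
	return result;
}

/* Callers can switch on the named states instead of magic numbers. */
static const char *
dir_state_name(DataDirState state)
{
	switch (state)
	{
		case DIR_NOENT:
			return "does not exist";
		case DIR_EMPTY:
			return "exists and is empty";
		case DIR_NOT_EMPTY:
			return "exists and is not empty";
		case DIR_ERROR:
			return "could not be accessed";
	}
	return "unknown";
}
```

Because the enum keeps the numeric values of the old return codes (-1, 0, 1, 2), callers that still compare against integers should continue to behave as before.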

2
doc/.gitignore vendored

@@ -2,4 +2,6 @@ HTML.index
bookindex.sgml bookindex.sgml
html-stamp html-stamp
html/ html/
nochunks.dsl
repmgr.html
version.sgml version.sgml


@@ -10,7 +10,7 @@ SGMLINCLUDE = -D . -D ${srcdir}
SPFLAGS += -wall -wno-unused-param -wno-empty -wfully-tagged SPFLAGS += -wall -wno-unused-param -wno-empty -wfully-tagged
JADE.html.call = $(JADE) $(JADEFLAGS) $(SPFLAGS) $(SGMLINCLUDE) $(CATALOG) -d stylesheet.dsl -t sgml -i output-html JADE.html.call = $(JADE) $(JADEFLAGS) $(SPFLAGS) $(SGMLINCLUDE) $(CATALOG) -t sgml -i output-html
ALLSGML := $(wildcard $(srcdir)/*.sgml) ALLSGML := $(wildcard $(srcdir)/*.sgml)
# to build bookindex # to build bookindex
@@ -26,10 +26,15 @@ html: html-stamp
html-stamp: repmgr.sgml $(ALLSGML) $(GENERATED_SGML) stylesheet.dsl website-docs.css html-stamp: repmgr.sgml $(ALLSGML) $(GENERATED_SGML) stylesheet.dsl website-docs.css
$(MKDIR_P) html $(MKDIR_P) html
$(JADE.html.call) -i include-index $< $(JADE.html.call) -d stylesheet.dsl -i include-index $<
cp $(srcdir)/stylesheet.css $(srcdir)/website-docs.css html/ cp $(srcdir)/stylesheet.css $(srcdir)/website-docs.css html/
touch $@ touch $@
repmgr.html: repmgr.sgml $(ALLSGML) $(GENERATED_SGML) stylesheet.dsl website-docs.css
sed '/html-index-filename/a\
(define nochunks #t)' <stylesheet.dsl >nochunks.dsl
$(JADE.html.call) -d nochunks.dsl -i include-index $< >repmgr.html
version.sgml: ${repmgr_top_builddir}/repmgr_version.h version.sgml: ${repmgr_top_builddir}/repmgr_version.h
{ \ { \
echo "<!ENTITY repmgrversion \"$(REPMGR_VERSION)\">"; \ echo "<!ENTITY repmgrversion \"$(REPMGR_VERSION)\">"; \
@@ -37,7 +42,7 @@ version.sgml: ${repmgr_top_builddir}/repmgr_version.h
HTML.index: repmgr.sgml $(ALMOSTALLSGML) stylesheet.dsl HTML.index: repmgr.sgml $(ALMOSTALLSGML) stylesheet.dsl
@$(MKDIR_P) html @$(MKDIR_P) html
$(JADE.html.call) -V html-index $< $(JADE.html.call) -d stylesheet.dsl -V html-index $<
website-docs.css: website-docs.css:
@$(MKDIR_P) html @$(MKDIR_P) html


@@ -1,5 +1,5 @@
<appendix id="appendix-faq" xreflabel="FAQ"> <appendix id="appendix-faq" xreflabel="FAQ">
<indexterm> <indexterm>
<primary>FAQ (Frequently Asked Questions)</primary> <primary>FAQ (Frequently Asked Questions)</primary>
</indexterm> </indexterm>

158
doc/appendix-packages.sgml Normal file

@@ -0,0 +1,158 @@
<appendix id="appendix-packages" xreflabel="Package details">
<indexterm>
<primary>packages</primary>
</indexterm>
<title>&repmgr; package details</title>
<para>
This section provides technical details about various &repmgr; binary
packages, such as location of the installed binaries and
configuration files.
</para>
<sect1 id="packages-centos" xreflabel="CentOS packages">
<title>CentOS, RHEL, Scientific Linux etc.</title>
<para>
Currently packages are provided for versions 6.x and 7.x of CentOS et al.
</para>
<note>
<para>
For PostgreSQL 9.6 and lower, the CentOS packages use a mixture of <literal>9.6</literal>
and <literal>96</literal> in various places to designate the major version;
from PostgreSQL 10, the first part of the version number (e.g. <literal>10</literal>) is
the major version, so there is more consistency in file/path/package naming.
</para>
</note>
<table id="centos-7-packages">
<title>CentOS 7 packages</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://yum.postgresql.org/repopackages.php">https://yum.postgresql.org/repopackages.php</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
<entry><ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink></entry>
</row>
<row>
<entry>Package name example:</entry>
<entry><filename>repmgr10-4.0.0-1.rhel7.x86_64</filename></entry>
</row>
<row>
<entry>Metapackage:</entry>
<entry>(none)</entry>
</row>
<row>
<entry>Installation command:</entry>
<entry><literal>yum install -y repmgr10</literal></entry>
</row>
<row>
<entry>Binary location:</entry>
<entry><filename>/usr/pgsql-10/bin</filename></entry>
</row>
<row>
<entry>In default path:</entry>
<entry>NO</entry>
</row>
<row>
<entry>Configuration file location:</entry>
<entry><filename>/etc/repmgr/10/repmgr.conf</filename></entry>
</row>
<row>
<entry>repmgrd service command:</entry>
<entry><literal>service repmgr10</literal></entry>
</row>
<row>
<entry>repmgrd service file location:</entry>
<entry><filename>/usr/lib/systemd/system/repmgr10.service</filename></entry>
</row>
<row>
<entry>repmgrd log file location:</entry>
<entry>(not specified)</entry>
</row>
</tbody>
</tgroup>
</table>
<table id="centos-6-packages">
<title>CentOS 6 packages</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://yum.postgresql.org/repopackages.php">https://yum.postgresql.org/repopackages.php</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
<entry><ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink></entry>
</row>
<row>
<entry>Package name example:</entry>
<entry><filename>repmgr96-4.0.0-1.rhel6.x86_64</filename></entry>
</row>
<row>
<entry>Metapackage:</entry>
<entry>NO</entry>
</row>
<row>
<entry>Installation command:</entry>
<entry><literal>yum install -y repmgr96</literal></entry>
</row>
<row>
<entry>Binary location:</entry>
<entry><filename>/usr/pgsql-9.6/bin</filename></entry>
</row>
<row>
<entry>In default path:</entry>
<entry>NO</entry>
</row>
<row>
<entry>Configuration file location:</entry>
<entry><filename>/etc/repmgr/9.6/repmgr.conf</filename></entry>
</row>
<row>
<entry>repmgrd service command:</entry>
<entry>service repmgr-9.6</entry>
</row>
<row>
<entry>repmgrd service file location:</entry>
<entry><literal>/etc/init.d/repmgr-9.6</literal></entry>
</row>
<row>
<entry>repmgrd log file location:</entry>
<entry><filename>/var/log/repmgr/repmgrd-9.6.log</filename></entry>
</row>
</tbody>
</tgroup>
</table>
</sect1>
</appendix>


@@ -1,28 +1,259 @@
<appendix id="appendix-release-notes"> <appendix id="appendix-release-notes">
<title>Release notes</title> <title>Release notes</title>
<indexterm> <indexterm>
<primary>Release notes</primary> <primary>Release notes</primary>
</indexterm> </indexterm>
<para> <para>
Changes to each &repmgr; release are documented in the release notes. Changes to each &repmgr; release are documented in the release notes.
Please read the release notes for all versions between Please read the release notes for all versions between
your current version and the version you plan to upgrade to your current version and the version you plan to upgrade to
before performing an upgrade, as there may be version-specific upgrade steps. before performing an upgrade, as there may be version-specific upgrade steps.
</para> </para>
<para>
See also: <xref linkend="upgrading-repmgr">
</para>
<sect1 id="release-4.0.3">
<title>Release 4.0.3</title>
<para><emphasis>??? Feb ??, 2018</emphasis></para>
<para>
&repmgr; 4.0.3 contains some bug fixes and a number of
usability enhancements related to logging/diagnostics,
event notifications and pre-action checks.
</para>
<sect2>
<title>Usability enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
improve <command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
behaviour when <command>pg_ctl</command> is used to control the server and logging output is
not explicitly redirected
</para>
</listitem>
<listitem>
<para>
improve <command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
log messages and provide new exit code <literal>ERR_SWITCHOVER_INCOMPLETE</literal> when old primary could
not be shut down cleanly
</para>
</listitem>
<listitem>
<para>
add check to verify the demotion candidate can make a replication connection to the
promotion candidate before executing a switchover (GitHub #370)
</para>
</listitem>
<listitem>
<para>
add check for sufficient walsenders and replication slots on the promotion candidate before executing
<command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
(GitHub #371)
</para>
</listitem>
<listitem>
<para>
add --dry-run mode to <command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command>
(GitHub #369)
</para>
</listitem>
<listitem>
<para>
add <literal>standby_register_sync</literal> event notification, which is fired when
<command><link linkend="repmgr-standby-register">repmgr standby register</link></command>
is run with the <option>--wait-sync</option> option and the new or updated standby node
record has synchronised to the standby (GitHub #374)
</para>
</listitem>
<listitem>
<para>
when running <command><link linkend="repmgr-cluster-show">repmgr cluster show</link></command>,
if any node is unreachable, output the error message encountered in the list of warnings
(GitHub #369)
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
ensure an inactive data directory can be overwritten when
cloning a standby (GitHub #366)
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-node-status">repmgr node status</link></command>
upstream node display fixed (GitHub #363)
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-primary-unregister">repmgr primary unregister</link></command>:
clarify usage and fix <literal>--help</literal> output (GitHub #373)
</para>
</listitem>
<listitem>
<para>
parsing of <varname>pg_basebackup_options</varname> fixed (GitHub #376)
</para>
</listitem>
<listitem>
<para>
ensure the <filename>pg_subtrans</filename> directory is created when cloning a
standby in Barman mode
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-witness-register">repmgr witness register</link></command>:
fix primary node check (GitHub #377).
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<para> <sect1 id="release-4.0.2">
See also: <xref linkend="upgrading-repmgr"> <title>Release 4.0.2</title>
</para> <para><emphasis>Thu Jan 18, 2018</emphasis></para>
<para>
&repmgr; 4.0.2 contains some bug fixes and small usability enhancements.
</para>
<para>
This release can be installed as a simple package upgrade from &repmgr; 4.0.1 or 4.0;
<application>repmgrd</application> (if running) should be restarted.
</para>
<sect2>
<title>Usability enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
Recognize the <option>-t</option>/<option>--terse</option> option for
<command><link linkend="repmgr-cluster-event">repmgr cluster event</link></command> to hide
the <literal>Details</literal> column (GitHub #360)
</para>
</listitem>
<listitem>
<para>
Add "--wait-start" option for
<command><link linkend="repmgr-standby-register">repmgr standby register</link></command>
(GitHub #356)
</para>
</listitem>
<listitem>
<para>
Add <literal>%p</literal> <link linkend="event-notifications">event notification parameter</link>
for <command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
Add missing -W option to <literal>getopt_long()</literal> invocation (GitHub #350)
</para>
</listitem>
<listitem>
<para>
Automatically create slot name if missing (GitHub #343)
</para>
</listitem>
<listitem>
<para>
Fixes to parsing output of remote repmgr invocations (GitHub #349)
</para>
</listitem>
<listitem>
<para>
When registering BDR nodes, automatically create missing connection replication set (GitHub #347)
</para>
</listitem>
<listitem>
<para>
Handle missing node record in <command><link linkend="repmgr-node-rejoin">repmgr node rejoin</link></command>
(GitHub #358)
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Documentation</title>
<para>
<itemizedlist>
<listitem>
<para>
The documentation can now be built as a single HTML file (GitHub pull request #353)
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-4.0.1"> <sect1 id="release-4.0.1">
<title>Release 4.0.1</title> <title>Release 4.0.1</title>
<para><emphasis>Mon Dec 4, 2017</emphasis></para> <para><emphasis>Wed Dec 13, 2017</emphasis></para>
<para> <para>
repmgr 4.0.1 is a bugfix release. &repmgr; 4.0.1 is a bugfix release.
</para> </para>
<sect2> <sect2>
<title>Bug fixes</title> <title>Bug fixes</title>
@@ -416,7 +647,7 @@
<sect2> <sect2>
<title>repmgrd</title> <title>repmgrd</title>
<para> <para>
The `repmgr` shared library has been renamed from <literal>repmgr_funcs</literal> to The shared library has been renamed from <literal>repmgr_funcs</literal> to
<literal>repmgr</literal>, meaning <varname>shared_preload_libraries</varname> <literal>repmgr</literal>, meaning <varname>shared_preload_libraries</varname>
in <filename>postgresql.conf</filename> needs to be updated to the new name: in <filename>postgresql.conf</filename> needs to be updated to the new name:
<programlisting> <programlisting>


@@ -37,7 +37,7 @@
<filename>repmgr.conf</filename>. <filename>repmgr.conf</filename>.
</para> </para>
<para> <para>
This parameter accepts the following format placeholders: The following format placeholders are provided for all event notifications:
</para> </para>
<variablelist> <variablelist>
@@ -60,10 +60,10 @@
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term><option>%t</option></term> <term><option>%s</option></term>
<listitem> <listitem>
<para> <para>
success (1 or 0) success (1) or failure (0)
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
@@ -85,6 +85,7 @@
</listitem> </listitem>
</varlistentry> </varlistentry>
</variablelist> </variablelist>
<para> <para>
The values provided for <literal>%t</literal> and <literal>%d</literal> The values provided for <literal>%t</literal> and <literal>%d</literal>
will probably contain spaces, so should be quoted in the provided command will probably contain spaces, so should be quoted in the provided command
@@ -93,34 +94,61 @@
event_notification_command='/path/to/some/script %n %e %s "%t" "%d"' event_notification_command='/path/to/some/script %n %e %s "%t" "%d"'
</programlisting> </programlisting>
</para> </para>
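The positional arguments produced by the example command above can be consumed as in the following minimal sketch. It is written in C only for consistency with the rest of this changeset (in practice such a handler would more likely be a shell script). `EventArgs` and `parse_event_args()` are hypothetical names; the sketch assumes the argument order `%n %e %s "%t" "%d"` shown above, with `%n` being the node ID and `%e` the event type.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the five placeholders passed to the script. */
typedef struct
{
	int			node_id;	/* %n */
	const char *event;		/* %e */
	bool		success;	/* %s: "1" is success, "0" is failure */
	const char *timestamp;	/* %t: quoted, may contain spaces */
	const char *details;	/* %d: quoted, may contain spaces */
} EventArgs;

/*
 * Parse argv as received by a handler invoked via
 *   event_notification_command='/path/to/some/script %n %e %s "%t" "%d"'
 * Returns false if the argument count or any value is not as expected.
 */
static bool
parse_event_args(int argc, char **argv, EventArgs *out)
{
	char	   *end = NULL;
	long		node_id;

	assert(out != NULL);

	/* program name plus exactly five arguments */
	if (argc != 6)
		return false;

	node_id = strtol(argv[1], &end, 10);
	if (end == argv[1] || *end != '\0' || node_id < 0 || node_id > INT_MAX)
		return false;

	if (strcmp(argv[3], "1") != 0 && strcmp(argv[3], "0") != 0)
		return false;

	out->node_id = (int) node_id;
	out->event = argv[2];
	out->success = (strcmp(argv[3], "1") == 0);
	out->timestamp = argv[4];
	out->details = argv[5];

	return true;
}
```

If `%t` or `%d` were left unquoted in the command, a value containing spaces would be split into several arguments and the argument-count check above would fail, which is why the quoting matters.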
<para> <para>
Additionally the following format placeholders are available for the event The following parameters are provided for a subset of event notifications:
type <varname>bdr_failover</varname> and optionally <varname>bdr_recovery</varname>:
</para> </para>
<variablelist> <variablelist>
<varlistentry>
<term><option>%p</option></term>
<listitem>
<para>
node ID of the current primary (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
</para>
<para>
node ID of the demoted primary (<xref linkend="repmgr-standby-switchover"> only)
</para>
</listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term><option>%c</option></term> <term><option>%c</option></term>
<listitem> <listitem>
<para> <para>
conninfo string of the next available node <literal>conninfo</literal> string of the primary node
(<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
</para>
<para>
<literal>conninfo</literal> string of the next available node
(<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term><option>%a</option></term> <term><option>%a</option></term>
<listitem> <listitem>
<para> <para>
name of the next available node name of the current primary node (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
</para>
<para>
name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
</variablelist> </variablelist>
<para> <para>
These should always be quoted. The values provided for <literal>%c</literal> and <literal>%a</literal>
will probably contain spaces, so should always be quoted.
</para> </para>
<para> <para>
By default, all notification types will be passed to the designated script; By default, all notification types will be passed to the designated script;
the notification types can be filtered to explicitly named ones: the notification types can be filtered to explicitly named ones using the
<varname>event_notifications</varname> parameter:
<itemizedlist spacing="compact" mark="bullet"> <itemizedlist spacing="compact" mark="bullet">
<listitem> <listitem>
@@ -132,6 +160,9 @@
<listitem> <listitem>
<simpara><literal>standby_register</literal></simpara> <simpara><literal>standby_register</literal></simpara>
</listitem> </listitem>
<listitem>
<simpara><literal>standby_register_sync</literal></simpara>
</listitem>
<listitem> <listitem>
<simpara><literal>standby_unregister</literal></simpara> <simpara><literal>standby_unregister</literal></simpara>
</listitem> </listitem>
@@ -147,6 +178,12 @@
<listitem> <listitem>
<simpara><literal>standby_disconnect_manual</literal></simpara> <simpara><literal>standby_disconnect_manual</literal></simpara>
</listitem> </listitem>
<listitem>
<simpara><literal>standby_failure</literal></simpara>
</listitem>
<listitem>
<simpara><literal>standby_recovery</literal></simpara>
</listitem>
<listitem> <listitem>
<simpara><literal>witness_register</literal></simpara> <simpara><literal>witness_register</literal></simpara>
</listitem> </listitem>
@@ -168,6 +205,18 @@
<listitem> <listitem>
<simpara><literal>repmgrd_failover_follow</literal></simpara> <simpara><literal>repmgrd_failover_follow</literal></simpara>
</listitem> </listitem>
<listitem>
<simpara><literal>repmgrd_upstream_disconnect</literal></simpara>
</listitem>
<listitem>
<simpara><literal>repmgrd_upstream_reconnect</literal></simpara>
</listitem>
<listitem>
<simpara><literal>repmgrd_promote_error</literal></simpara>
</listitem>
<listitem>
<simpara><literal>repmgrd_failover_promote</literal></simpara>
</listitem>
<listitem> <listitem>
<simpara><literal>bdr_failover</literal></simpara> <simpara><literal>bdr_failover</literal></simpara>
</listitem> </listitem>
@@ -186,6 +235,7 @@
</itemizedlist> </itemizedlist>
</para> </para>
<para> <para>
Note that under some circumstances (e.g. when no replication cluster primary Note that under some circumstances (e.g. when no replication cluster primary
could be located), it will not be possible to write an entry into the could be located), it will not be possible to write an entry into the


@@ -80,6 +80,7 @@
<!ENTITY appendix-release-notes SYSTEM "appendix-release-notes.sgml"> <!ENTITY appendix-release-notes SYSTEM "appendix-release-notes.sgml">
<!ENTITY appendix-faq SYSTEM "appendix-faq.sgml"> <!ENTITY appendix-faq SYSTEM "appendix-faq.sgml">
<!ENTITY appendix-signatures SYSTEM "appendix-signatures.sgml"> <!ENTITY appendix-signatures SYSTEM "appendix-signatures.sgml">
<!ENTITY appendix-packages SYSTEM "appendix-packages.sgml">
<!ENTITY bookindex SYSTEM "bookindex.sgml"> <!ENTITY bookindex SYSTEM "bookindex.sgml">


@@ -81,7 +81,7 @@
<para> <para>
Install the repmgr version appropriate for your PostgreSQL version (e.g. <literal>repmgr96</literal>), e.g.: Install the repmgr version appropriate for your PostgreSQL version (e.g. <literal>repmgr96</literal>), e.g.:
<programlisting> <programlisting>
$ yum install repmg96</programlisting> $ yum install repmgr96</programlisting>
</para> </para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>


@@ -155,6 +155,20 @@
The generated HTML files will be placed in the <filename>doc/html</filename> The generated HTML files will be placed in the <filename>doc/html</filename>
subdirectory of your source tree. subdirectory of your source tree.
</para> </para>
<para>
To build the documentation as a single HTML file, execute:
<programlisting>
cd doc/ && make repmgr.html</programlisting>
</para>
<note>
<simpara>
Due to changes in PostgreSQL's documentation build system from PostgreSQL 10,
the documentation can currently only be built against PostgreSQL 9.6 or earlier.
This limitation will be fixed when time and resources permit.
</simpara>
</note>
</sect2> </sect2>


@@ -3,7 +3,7 @@
<date>2017</date> <date>2017</date>
<copyright> <copyright>
<year>2010-2017</year> <year>2010-2018</year>
<holder>2ndQuadrant, Ltd.</holder> <holder>2ndQuadrant, Ltd.</holder>
</copyright> </copyright>
@@ -11,7 +11,7 @@
<title>Legal Notice</title> <title>Legal Notice</title>
<para> <para>
<productname>repmgr</productname> is Copyright &copy; 2010-2017 <productname>repmgr</productname> is Copyright &copy; 2010-2018
by 2ndQuadrant, Ltd. All rights reserved. by 2ndQuadrant, Ltd. All rights reserved.
</para> </para>


@@ -178,8 +178,8 @@
<para> <para>
In order to effectively manage a replication cluster, &repmgr; needs to store In order to effectively manage a replication cluster, &repmgr; needs to store
information about the servers in the cluster in a dedicated database schema. information about the servers in the cluster in a dedicated database schema.
This schema is automatically by the &repmgr; extension, which is installed This schema is automatically created by the &repmgr; extension, which is installed
during the first step in initialising a &repmgr;-administered cluster during the first step in initializing a &repmgr;-administered cluster
(<command><link linkend="repmgr-primary-register">repmgr primary register</link></command>) (<command><link linkend="repmgr-primary-register">repmgr primary register</link></command>)
and contains the following objects: and contains the following objects:
<variablelist> <variablelist>


@@ -1,37 +0,0 @@
<chapter id="repmgrd-bdr">
<indexterm>
<primary>repmgrd</primary>
<secondary>BDR</secondary>
</indexterm>
<indexterm>
<primary>BDR</primary>
</indexterm>
<title>BDR failover with repmgrd</title>
<para>
&repmgr; 4.x provides support for monitoring BDR nodes and taking action in
case one of the nodes fails.
</para>
<note>
<simpara>
Due to the nature of BDR, it's only safe to use this solution for
a two-node scenario. Introducing additional nodes will create an inherent
risk of node desynchronisation if a node goes down without being cleanly
removed from the cluster.
</simpara>
</note>
<para>
In contrast to streaming replication, there's no concept of "promoting" a new
primary node with BDR. Instead, "failover" involves monitoring both nodes
with `repmgrd` and redirecting queries from the failed node to the remaining
active node. This can be done by using an
<link linkend="event-notifications">event notification</link> script
which is called by <application>repmgrd</application> to dynamically
reconfigure a proxy server/connection pooler such as <application>PgBouncer</application>.
</para>
<sect1 id="prerequisites" xreflable="BDR prequisites">
</sect1>
</chapter>

View File

@@ -40,10 +40,13 @@
<simpara><literal>--node-name</literal>: restrict entries to node with this name</simpara> <simpara><literal>--node-name</literal>: restrict entries to node with this name</simpara>
</listitem> </listitem>
<listitem> <listitem>
<simpara><literal>--event</literal>: filter specific event</simpara> <simpara><literal>--event</literal>: filter specific event (see <xref linkend="event-notifications"> for a full list)</simpara>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
</para> </para>
<para>
The "Details" column can be omitted by providing <literal>--terse</literal>.
</para>
</refsect1> </refsect1>
<refsect1> <refsect1>

View File

@@ -48,7 +48,7 @@
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>node_rejoin</literal> event notification will be generated. A <literal>node_rejoin</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para> </para>
</refsect1> </refsect1>

View File

@@ -51,7 +51,7 @@
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>primary_register</literal> event notification will be generated. A <literal>primary_register</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para> </para>
</refsect1> </refsect1>

View File

@@ -13,7 +13,7 @@
<refsect1> <refsect1>
<title>Description</title> <title>Description</title>
<para> <para>
<command>repmgr primary register</command> unregisters an inactive primary node <command>repmgr primary unregister</command> unregisters an inactive primary node
from the &repmgr; metadata. This is typically done when the primary has failed and is from the &repmgr; metadata. This is typically done when the primary has failed and is
being removed from the cluster after a new primary has been promoted. being removed from the cluster after a new primary has been promoted.
</para> </para>
@@ -21,6 +21,10 @@
<refsect1> <refsect1>
<title>Execution</title> <title>Execution</title>
<para>
<command>repmgr primary unregister</command> should be run on the current primary,
with the ID of the node to unregister passed as <option>--node-id</option>.
</para>
<para> <para>
Execute with the <literal>--dry-run</literal> option to check what would happen without Execute with the <literal>--dry-run</literal> option to check what would happen without
actually unregistering the node. actually unregistering the node.
@@ -28,14 +32,42 @@
<para> <para>
<command>repmgr master unregister</command> can be used as an alias for <command>repmgr master unregister</command> can be used as an alias for
<command>repmgr primary unregister</command>/ <command>repmgr primary unregister</command>.
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually unregister the primary.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--node-id</option></term>
<listitem>
<para>
ID of the inactive primary to be unregistered.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>primary_unregister</literal> event notification will be generated. A <literal>primary_unregister</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para> </para>
</refsect1> </refsect1>

View File

@@ -103,7 +103,7 @@
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>standby_clone</literal> event notification will be generated. A <literal>standby_clone</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para> </para>
</refsect1> </refsect1>

View File

@@ -48,14 +48,53 @@
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually follow a new primary.
</para>
<important>
<para>
This does not guarantee the standby can follow the primary; in
particular, whether the primary and standby timelines have diverged
can currently only be determined by actually attempting to
attach the standby to the primary.
</para>
</important>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-W</option></term>
<term><option>--wait</option></term>
<listitem>
<para>
Wait for a primary to appear.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>standby_follow</literal> event notification will be generated. A <literal>standby_follow</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para>
<para>
If provided, &repmgr; will substitute the placeholders <literal>%p</literal> with the node ID of the primary
being followed, <literal>%c</literal> with its <literal>conninfo</literal> string, and
<literal>%a</literal> with its node name.
</para> </para>
</refsect1> </refsect1>
<refsect1> <refsect1>
<title>See also</title> <title>See also</title>
<para> <para>
<xref linkend="repmgr-node-rejoin"> <xref linkend="repmgr-node-rejoin">

View File

@@ -45,7 +45,7 @@
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>standby_promote</literal> event notification will be generated. A <literal>standby_promote</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para> </para>
</refsect1> </refsect1>

View File

@@ -37,19 +37,36 @@
</note> </note>
</refsect1> </refsect1>
<refsect1 id="repmgr-standby-register-wait" xreflabel="repmgr standby register --wait"> <refsect1 id="repmgr-standby-register-wait-start" xreflabel="repmgr standby register --wait-start">
<title>Waiting for the standby to start</title>
<para>
By default, &repmgr; will wait 30 seconds for the standby to become available before
aborting with a connection error. This is useful when setting up a standby from a script,
as the standby may not have fully started up by the time <command>repmgr standby register</command>
is executed.
</para>
<para>
To change the timeout, pass the desired value with the <literal>--wait-start</literal> option.
A value of <literal>0</literal> will disable the timeout.
</para>
<para>
The timeout will be ignored if <literal>-F/--force</literal> was provided.
</para>
</refsect1>
<refsect1 id="repmgr-standby-register-wait-sync" xreflabel="repmgr standby register --wait-sync">
<title>Waiting for the registration to propagate to the standby</title> <title>Waiting for the registration to propagate to the standby</title>
<para> <para>
Depending on your environment and workload, it may take some time for Depending on your environment and workload, it may take some time for the standby's node record
the standby's node record to propagate from the primary to the standby. Some to propagate from the primary to the standby. Some actions (such as starting
actions (such as starting <application>repmgrd</application>) require that the standby's node record <application>repmgrd</application>) require that the standby's node record
is present and up-to-date to function correctly. is present and up-to-date to function correctly.
</para> </para>
<para> <para>
By providing the option <literal>--wait-sync</literal> to the By providing the option <option>--wait-sync</option> to the
<command>repmgr standby register</command> command, &repmgr; will wait <command>repmgr standby register</command> command, &repmgr; will wait
until the record is synchronised before exiting. An optional timeout (in until the record is synchronised before exiting. An optional timeout (in
seconds) can be added to this option (e.g. <literal>--wait-sync=60</literal>). seconds) can be added to this option (e.g. <option>--wait-sync=60</option>).
</para> </para>
</refsect1> </refsect1>
@@ -58,29 +75,42 @@
<para> <para>
Under some circumstances you may wish to register a standby which is not Under some circumstances you may wish to register a standby which is not
yet running; this can be the case when using provisioning tools to create yet running; this can be the case when using provisioning tools to create
a complex replication cluster. In this case, by using the <literal>-F/--force</literal> a complex replication cluster. In this case, by using the <option>-F/--force</option>
option and providing the connection parameters to the primary server, option and providing the connection parameters to the primary server,
the standby can be registered. the standby can be registered.
</para> </para>
<para> <para>
Similarly, with cascading replication it may be necessary to register Similarly, with cascading replication it may be necessary to register
a standby whose upstream node has not yet been registered - in this case, a standby whose upstream node has not yet been registered - in this case,
using <literal>-F/--force</literal> will result in the creation of an inactive placeholder using <option>-F/--force</option> will result in the creation of an inactive placeholder
record for the upstream node, which will however later need to be registered record for the upstream node, which will however later need to be registered
with the <literal>-F/--force</literal> option too. with the <option>-F/--force</option> option too.
</para> </para>
<para> <para>
When used with <command>repmgr standby register</command>, care should be taken that use of the When used with <command>repmgr standby register</command>, care should be taken that use of the
<literal>-F/--force</literal> option does not result in an incorrectly configured cluster. <option>-F/--force</option> option does not result in an incorrectly configured cluster.
</para> </para>
</refsect1> </refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>standby_register</literal> event notification will be generated. A <literal>standby_register</literal> <link linkend="event-notifications">event notification</link>
will be generated immediately after the node record is updated on the primary.
</para> </para>
<para>
If the <option>--wait-sync</option> option is provided, a <literal>standby_register_sync</literal>
event notification will be generated immediately after the node record has synchronised to the
standby.
</para>
<para>
If provided, &repmgr; will substitute the placeholders <literal>%p</literal> with the node ID of the
primary node, <literal>%c</literal> with its <literal>conninfo</literal> string, and
<literal>%a</literal> with its node name.
</para>
</refsect1> </refsect1>
</refentry> </refentry>
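
As a rough illustration of the two wait options documented in this file, the shell sketch below assembles a `repmgr standby register` invocation with both timeouts set. The helper name `build_register_cmd`, the configuration file path and the timeout values are assumptions made for illustration only; the `--wait-start=N` spelling is assumed to take the same form as the `--wait-sync=60` example in the text.

```shell
#!/bin/sh
# Hypothetical helper: prints the register command instead of running it,
# so it can be reviewed (or logged) before execution.
#   $1: path to repmgr.conf (assumed location)
#   $2: seconds to wait for the standby to start (--wait-start)
#   $3: seconds to wait for the node record to reach the standby (--wait-sync)
build_register_cmd() {
    printf 'repmgr -f %s standby register --wait-start=%s --wait-sync=%s\n' "$1" "$2" "$3"
}

build_register_cmd /etc/repmgr.conf 60 60
```

Printing the command rather than executing it keeps the sketch safe to run on any host; in a provisioning script the printed line would be executed on the standby being registered.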

View File

@@ -22,9 +22,97 @@
</para> </para>
<para> <para>
If other standbys are connected to the demotion candidate, &repmgr; can instruct If other standbys are connected to the demotion candidate, &repmgr; can instruct
these to follow the new primary if the option <literal>--siblings-follow</literal> these to follow the new primary if the option <literal>--siblings-follow</literal>
is specified. is specified. This requires a passwordless SSH connection between the promotion
candidate (new primary) and the standbys attached to the demotion candidate
(existing primary).
</para> </para>
<note>
<para>
Performing a switchover is a non-trivial operation. In particular it
relies on the current primary being able to shut down cleanly and quickly.
&repmgr; will attempt to check for potential issues but cannot guarantee
a successful switchover.
</para>
</note>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--always-promote</option></term>
<listitem>
<para>
Promote standby to primary, even if it is behind original primary
(original primary will be shut down in any case).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually execute a switchover.
</para>
<important>
<para>
Success of <option>--dry-run</option> does not imply the switchover will
complete successfully, only that
the prerequisites for performing the operation are met.
</para>
</important>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option></term>
<term><option>--force</option></term>
<listitem>
<para>
Ignore warnings and continue anyway.
</para>
<para>
Specifically, if a problem is encountered when shutting down the current primary,
using <option>-F/--force</option> will cause &repmgr; to continue by promoting
the standby to be the new primary, and if <option>--siblings-follow</option> is
specified, attach any other standbys to the new primary.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force-rewind</option></term>
<listitem>
<para>
Use <application>pg_rewind</application> to reintegrate the old primary if necessary
(PostgreSQL 9.5 and later).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-R</option></term>
<term><option>--remote-user</option></term>
<listitem>
<para>
System username for remote SSH operations (defaults to local system user).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--siblings-follow</option></term>
<listitem>
<para>
Have standbys attached to the old primary follow the new primary.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1> </refsect1>
<refsect1> <refsect1>
@@ -38,16 +126,65 @@
<application>repmgrd</application> should not be active on any nodes while a switchover is being <application>repmgrd</application> should not be active on any nodes while a switchover is being
executed. This restriction may be lifted in a later version. executed. This restriction may be lifted in a later version.
</para> </para>
<para>
External database connections, e.g. from an application, should not be permitted while
the switchover is taking place. In particular, active transactions on the primary
can potentially disrupt the shutdown process.
</para>
</refsect1> </refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>standby_promote</literal> event notification will be generated on the new primary, <literal>standby_switchover</literal> and <literal>standby_promote</literal>
and a <literal>node_rejoin</literal> event notification on the former primary (new standby). <link linkend="event-notifications">event notifications</link> will be generated for the new primary,
and a <literal>node_rejoin</literal> event notification for the former primary (new standby).
</para>
<para>
If using an event notification script, <literal>standby_switchover</literal>
will populate the placeholder parameter <literal>%p</literal> with the node ID of
the former primary.
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Exit codes</title>
<para>
The following exit codes can be emitted by <literal>repmgr standby switchover</literal>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The switchover completed successfully.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_SWITCHOVER_FAIL (18)</option></term>
<listitem>
<para>
The switchover could not be executed.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_SWITCHOVER_INCOMPLETE (22)</option></term>
<listitem>
<para>
The switchover was executed but a problem was encountered.
Typically this means the former primary could not be reattached
as a standby.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>See also</title> <title>See also</title>
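
The exit codes added in this file lend themselves to scripted handling. The following is a minimal sketch of how a wrapper script might interpret them; the function name `describe_switchover_exit` is hypothetical, and only the three numeric values listed in the "Exit codes" section are taken from the documentation.

```shell
#!/bin/sh
# Hypothetical helper: map a "repmgr standby switchover" exit code to a
# short description. Only 0, 18 and 22 come from the "Exit codes" section;
# any other value is reported as unexpected.
describe_switchover_exit() {
    case "$1" in
        0)  echo "switchover completed successfully" ;;
        18) echo "switchover could not be executed" ;;
        22) echo "switchover executed but incomplete" ;;
        *)  echo "unexpected exit code $1" ;;
    esac
}

# Possible usage (the configuration file path is an assumption):
#   repmgr -f /etc/repmgr.conf standby switchover --siblings-follow
#   describe_switchover_exit $?
```

Treating code 22 separately from 18 matters in practice: per the text above, 22 typically means the former primary could not be reattached as a standby, so the cluster needs follow-up work even though a new primary exists.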

View File

@@ -46,7 +46,7 @@
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>standby_unregister</literal> event notification will be generated. A <literal>standby_unregister</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para> </para>
</refsect1> </refsect1>

View File

@@ -53,7 +53,7 @@
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>witness_register</literal> event notification will be generated. A <literal>witness_register</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para> </para>
</refsect1> </refsect1>

View File

@@ -66,7 +66,7 @@
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>witness_unregister</literal> event notification will be generated. A <literal>witness_unregister</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para> </para>
</refsect1> </refsect1>

View File

@@ -117,6 +117,7 @@
&appendix-release-notes; &appendix-release-notes;
&appendix-signatures; &appendix-signatures;
&appendix-faq; &appendix-faq;
&appendix-packages;
<![%include-index;[&bookindex;]]> <![%include-index;[&bookindex;]]>
<![%include-xslt-index;[<index id="bookindex"></index>]]> <![%include-xslt-index;[<index id="bookindex"></index>]]>

View File

@@ -24,7 +24,7 @@
<para> <para>
In contrast to streaming replication, there's no concept of "promoting" a new In contrast to streaming replication, there's no concept of "promoting" a new
primary node with BDR. Instead, "failover" involves monitoring both nodes primary node with BDR. Instead, "failover" involves monitoring both nodes
with `repmgrd` and redirecting queries from the failed node to the remaining with <application>repmgrd</application> and redirecting queries from the failed node to the remaining
active node. This can be done by using an active node. This can be done by using an
<link linkend="event-notifications">event notification</link> script <link linkend="event-notifications">event notification</link> script
which is called by <application>repmgrd</application> to dynamically which is called by <application>repmgrd</application> to dynamically
@@ -174,17 +174,13 @@
<para> <para>
Key to "failover" execution is the <literal>event_notification_command</literal>, Key to "failover" execution is the <literal>event_notification_command</literal>,
which is a user-definable script specified in <filename>repmgr.conf</filename> which is a user-definable script specified in <filename>repmgr.conf</filename>
and which should reconfigure the proxy server/ connection pooler to point and which can use a &repmgr; <link linkend="event-notifications">event notification</link>
to the other, still-active node. to reconfigure the proxy server / connection pooler so it points to the other, still-active node.
Details of the event will be passed as parameters to the script.
</para> </para>
<para> <para>
Each time &repmgr; (or <application>repmgrd</application>) records an event, The following parameter placeholders are available for the script definition in <filename>repmgr.conf</filename>;
it can optionally execute the script defined in these will be replaced with the appropriate value when the script is executed:
<literal>event_notification_command</literal> to take further action;
details of the event will be passed as parameters.
</para>
<para>
Following placeholders are available to the script:
</para> </para>
<variablelist> <variablelist>
@@ -231,20 +227,37 @@
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry>
<term><option>%c</option></term>
<listitem>
<para>
conninfo string of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%a</option></term>
<listitem>
<para>
name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
</variablelist> </variablelist>
<para> <para>
Note that <literal>%c</literal> and <literal>%a</literal> will only be provided during Note that <literal>%c</literal> and <literal>%a</literal> are only provided with
<varname>bdr_failover</varname> events, which is what is of interest here. particular failover events, in this case <varname>bdr_failover</varname>.
</para> </para>
<para> <para>
The provided sample script (`scripts/bdr-pgbouncer.sh`) is configured like The provided sample script
this: (<literal><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/scripts/bdr-pgbouncer.sh">scripts/bdr-pgbouncer.sh</ulink></literal>)
is configured as follows:
<programlisting> <programlisting>
event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a"'</programlisting> event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a"'</programlisting>
</para> </para>
<para> <para>
and parses the configures parameters like this: and parses the placeholder parameters like this:
<programlisting> <programlisting>
NODE_ID=$1 NODE_ID=$1
EVENT_TYPE=$2 EVENT_TYPE=$2
@@ -252,12 +265,14 @@
NEXT_CONNINFO=$4 NEXT_CONNINFO=$4
NEXT_NODE_NAME=$5</programlisting> NEXT_NODE_NAME=$5</programlisting>
</para> </para>
<para> <note>
The script also contains some hard-coded values about the <application>PgBouncer</application> <para>
configuration for both nodes; these will need to be adjusted for your local environment The sample script also contains some hard-coded values for the <application>PgBouncer</application>
(ideally the scripts would be maintained as templates and generated by some configuration for both nodes; these will need to be adjusted for your local environment
kind of provisioning system). (ideally the scripts would be maintained as templates and generated by some
</para> kind of provisioning system).
</para>
</note>
<para> <para>
The script performs the following steps: The script performs the following steps:

View File

@@ -54,45 +54,84 @@
<secondary>preparation</secondary> <secondary>preparation</secondary>
</indexterm> </indexterm>
<title>Preparing for switchover</title> <title>Preparing for switchover</title>
<para> <para>
As mentioned above, success of the switchover operation depends on &repmgr; As mentioned in the previous section, success of the switchover operation depends on
being able to shut down the current primary server quickly and cleanly. &repmgr; being able to shut down the current primary server quickly and cleanly.
</para> </para>
<para>
Ensure that a passwordless SSH connection is possible from the promotion candidate
(standby) to the demotion candidate (current primary). If <literal>--siblings-follow</literal>
will be used, ensure that passwordless SSH connections are possible from the
promotion candidate to all standbys attached to the demotion candidate.
</para>
<para> <para>
Double-check which commands will be used to stop/start/restart the current Double-check which commands will be used to stop/start/restart the current
primary; on the primary execute: primary; on the primary execute:
<programlisting> <programlisting>
repmgr -f /etc/repmgr.conf node service --list --action=stop repmgr -f /etc/repmgr.conf node service --list --action=stop
repmgr -f /etc/repmgr.conf node service --list --action=start repmgr -f /etc/repmgr.conf node service --list --action=start
repmgr -f /etc/repmgr.conf node service --list --action=restart repmgr -f /etc/repmgr.conf node service --list --action=restart</programlisting>
</programlisting>
</para> </para>
<para>
These commands can be defined in <filename>repmgr.conf</filename> with
<option>service_start_command</option>, <option>service_stop_command</option>
and <option>service_restart_command</option>.
</para>
<important>
<para>
If &repmgr; is installed from a package, you should set these commands
to use the appropriate service commands defined by the package/operating
system as these will ensure PostgreSQL is stopped/started properly
taking into account configuration and log file locations etc.
</para>
<para>
If the <option>service_*_command</option> options aren't defined, &repmgr; will
fall back to using <application>pg_ctl</application> to stop/start/restart
PostgreSQL, which may not work properly.
</para>
</important>
<note> <note>
<simpara> <simpara>
On <literal>systemd</literal> systems we strongly recommend using the appropriate On <literal>systemd</literal> systems we strongly recommend using the appropriate
<command>systemctl</command> commands (typically run via <command>sudo</command>) to ensure <command>systemctl</command> commands (typically run via <command>sudo</command>) to ensure
<literal>systemd</literal> informed about the status of the PostgreSQL service. <literal>systemd</literal> is informed about the status of the PostgreSQL service.
</simpara>
<simpara>
If using <command>sudo</command> for the <command>systemctl</command> calls, make sure the
<command>sudo</command> specification doesn't require a real tty for the user. If not set
this way, <command>repmgr</command> will fail to stop the primary.
</simpara> </simpara>
</note> </note>
<para> <para>
Check that access from applications is minimized or preferably blocked Check that access from applications is minimized or preferably blocked
completely, so applications are not unexpectedly interrupted. completely, so applications are not unexpectedly interrupted.
</para> </para>
<para> <para>
Check there is no significant replication lag on standbys attached to the Check there is no significant replication lag on standbys attached to the
current primary. current primary.
</para> </para>
<para> <para>
If WAL file archiving is set up, check that there is no backlog of files waiting If WAL file archiving is set up, check that there is no backlog of files waiting
to be archived, as PostgreSQL will not finally shut down until all these have been to be archived, as PostgreSQL will not finally shut down until all of these have been
archived. If there is a backlog exceeding <varname>archive_ready_warning</varname> WAL files, archived. If there is a backlog exceeding <varname>archive_ready_warning</varname> WAL files,
&repmgr; will emit a warning before attempting to perform a switchover; you can also check &repmgr; will emit a warning before attempting to perform a switchover; you can also check
manually with <command>repmgr node check --archive-ready</command>. manually with <command>repmgr node check --archive-ready</command>.
</para> </para>
<para> <para>
Ensure that <application>repmgrd</application> is *not* running anywhere to prevent it unintentionally Ensure that <application>repmgrd</application> is *not* running anywhere to prevent it unintentionally
promoting a node. promoting a node.
</para> </para>
<para> <para>
Finally, consider executing <command>repmgr standby switchover</command> with the Finally, consider executing <command>repmgr standby switchover</command> with the
<literal>--dry-run</literal> option; this will perform any necessary checks and inform you about <literal>--dry-run</literal> option; this will perform any necessary checks and inform you about
@@ -110,6 +149,48 @@
"pg_ctl -l /var/log/postgresql/startup.log -D '/var/lib/postgresql/data' -m fast -W stop" "pg_ctl -l /var/log/postgresql/startup.log -D '/var/lib/postgresql/data' -m fast -W stop"
</programlisting> </programlisting>
</para> </para>
<important>
<para>
Be aware that <option>--dry-run</option> checks the prerequisites
for performing the switchover and some basic sanity checks on the
state of the database which might affect the switchover operation
(e.g. replication lag); it cannot however guarantee the switchover
operation will succeed. In particular, if the current primary
does not shut down cleanly, &repmgr; will not be able to reliably
execute the switchover (as there would be a danger of divergence
between the former and new primary nodes).
</para>
</important>
<para>
Note that the following parameters in <filename>repmgr.conf</filename> are relevant to the
switchover operation:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>reconnect_attempts</literal>: number of times to check the original primary
for a clean shutdown after executing the shutdown command, before aborting
</simpara>
</listitem>
<listitem>
<simpara>
<literal>reconnect_interval</literal>: interval (in seconds) to check the original
primary for a clean shutdown after executing the shutdown command (up to a maximum
of <literal>reconnect_attempts</literal> tries)
</simpara>
</listitem>
<listitem>
<simpara>
<literal>replication_lag_critical</literal>:
if replication lag (in seconds) on the standby exceeds this value, the
switchover will be aborted (unless the <literal>-F/--force</literal> option
is provided)
</simpara>
</listitem>
</itemizedlist>
</para>
</sect1> </sect1>
<sect1 id="switchover-execution" xreflabel="Executing the switchover command"> <sect1 id="switchover-execution" xreflabel="Executing the switchover command">
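
The preparation steps in this section can be collected into a short checklist. The sketch below only prints the commands quoted in the text, grouped by the node they are meant for; the function names and the `/etc/repmgr.conf` path are assumptions, as is running the `--dry-run` switchover from the promotion candidate.

```shell
#!/bin/sh
CONF=/etc/repmgr.conf   # assumed configuration file location

# Commands to review on the current primary (demotion candidate).
primary_checks() {
    for action in stop start restart; do
        echo "repmgr -f $CONF node service --list --action=$action"
    done
    echo "repmgr -f $CONF node check --archive-ready"
}

# Command to run on the promotion candidate (standby); assumed to be
# where the switchover itself is started.
standby_checks() {
    echo "repmgr -f $CONF standby switchover --dry-run"
}

primary_checks
standby_checks
```

As the section itself stresses, a clean `--dry-run` only confirms that the prerequisites are met; it does not guarantee that the primary will shut down cleanly when the switchover is actually executed.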

View File

@@ -11,22 +11,86 @@
containing bugfixes and other minor improvements. Any substantial new containing bugfixes and other minor improvements. Any substantial new
functionality will be included in a feature release (e.g. 4.0.x to 4.1.x). functionality will be included in a feature release (e.g. 4.0.x to 4.1.x).
</para> </para>
<para>
&repmgr; is implemented as a PostgreSQL extension; to upgrade it, first
install the updated package (or compile the updated source), then in the
database where the &repmgr; extension is installed, execute
<command>ALTER EXTENSION repmgr UPDATE</command>.
</para>
<para>
If <application>repmgrd</application> is running, it may be necessary to restart
the PostgreSQL server if the upgrade contains changes to the shared object
file used by <application>repmgrd</application>; check the release notes for details.
</para>
<para> <sect1 id="upgrading-repmgr-extension" xreflabel="Upgrading repmgr 4.x and later">
Please check the <link linkend="appendix-release-notes">release notes</link> for every <indexterm>
release as they may contain upgrade instructions particular to individual versions. <primary>upgrading</primary>
</para> <secondary>repmgr 4.x and later</secondary>
</indexterm>
<title>Upgrading repmgr 4.x and later</title>
<para>
&repmgr; 4.x is implemented as a PostgreSQL extension; normally the upgrade consists
of the two following steps:
<orderedlist>
<listitem>
<simpara>
Install the updated package (or compile the updated source)
</simpara>
</listitem>
<listitem>
<simpara>
In the database where the &repmgr; extension is installed, execute
<command>ALTER EXTENSION repmgr UPDATE</command>.
</simpara>
</listitem>
</orderedlist>
</para>
<para>
Always check the <link linkend="appendix-release-notes">release notes</link> for every
release as they may contain upgrade instructions particular to individual versions.
</para>
<para>
If the <application>repmgrd</application> daemon is in use, we recommend stopping it
before upgrading &repmgr;.
</para>
<para>
Note that it may be necessary to restart the PostgreSQL server if the upgrade contains
changes to the shared object file used by <application>repmgrd</application>; check the
release notes for details.
</para>
</sect1>
<sect1 id="upgrading-and-pg-upgrade" xreflabel="pg_upgrade and repmgr">
<indexterm>
<primary>upgrading</primary>
<secondary>pg_upgrade</secondary>
</indexterm>
<indexterm>
<primary>pg_upgrade</primary>
</indexterm>
<title>pg_upgrade and repmgr</title>
<para>
<application>pg_upgrade</application> requires that if any functions are
dependent on a shared library, this library must be present in both
the old and new installations before <application>pg_upgrade</application>
can be executed.
</para>
<para>
To minimize the risk of any upgrade issues (particularly if an upgrade to
a new major &repmgr; version is involved), we recommend upgrading
&repmgr; on the old server <emphasis>before</emphasis> running
<application>pg_upgrade</application> to ensure that old and new
versions are the same.
</para>
<note>
<simpara>
This issue applies to any PostgreSQL extension which has
dependencies on a shared library.
</simpara>
</note>
<para>
For further details please see the <ulink url="https://www.postgresql.org/docs/current/static/pgupgrade.html">pg_upgrade documentation</ulink>.
</para>
<para>
If replication slots are in use, bear in mind these will <emphasis>not</emphasis>
be recreated by <application>pg_upgrade</application>. These will need to
be recreated manually.
</para>
</sect1>
<sect1 id="upgrading-from-repmgr-3" xreflabel="Upgrading from repmgr 3.x"> <sect1 id="upgrading-from-repmgr-3" xreflabel="Upgrading from repmgr 3.x">
<indexterm> <indexterm>
@@ -45,7 +109,7 @@
</listitem> </listitem>
<listitem> <listitem>
<simpara> <simpara>
upgrading the repmgr schema upgrading the repmgr schema using <command>CREATE EXTENSION</command>
</simpara> </simpara>
</listitem> </listitem>
</orderedlist> </orderedlist>
@@ -58,11 +122,19 @@
a packaged PostgreSQL extension) is normally carried out a packaged PostgreSQL extension) is normally carried out
automatically when the &repmgr; extension is created. automatically when the &repmgr; extension is created.
</para> </para>
<para>
The shared library has been renamed from <literal>repmgr_funcs</literal> to
<literal>repmgr</literal> - if it's set in <varname>shared_preload_libraries</varname>
in <filename>postgresql.conf</filename> it will need to be updated to the new name:
<programlisting>
shared_preload_libraries = 'repmgr'</programlisting>
</para>
<sect2 id="converting-repmgr-conf"> <sect2 id="converting-repmgr-conf">
<title>Converting repmgr.conf configuration files</title> <title>Converting repmgr.conf configuration files</title>
<para> <para>
With a completely new repmgr version, we've taken the opportunity With a completely new repmgr version, we've taken the opportunity
to rename some configuration items have had their names changed for to rename some configuration items for
clarity and consistency, both between the configuration file and clarity and consistency, both between the configuration file and
the column names in <structname>repmgr.nodes</structname> the column names in <structname>repmgr.nodes</structname>
(e.g. <varname>node</varname> to <varname>node_id</varname>), and (e.g. <varname>node</varname> to <varname>node_id</varname>), and


@@ -1 +1 @@
<!ENTITY repmgrversion "4.0.1"> <!ENTITY repmgrversion "4.0.3">


@@ -1,6 +1,6 @@
/* /*
* errcode.h * errcode.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -43,5 +43,6 @@
#define ERR_BARMAN 19 #define ERR_BARMAN 19
#define ERR_REGISTRATION_SYNC 20 #define ERR_REGISTRATION_SYNC 20
#define ERR_OUT_OF_MEMORY 21 #define ERR_OUT_OF_MEMORY 21
#define ERR_SWITCHOVER_INCOMPLETE 22
#endif /* _ERRCODE_H_ */ #endif /* _ERRCODE_H_ */
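The new exit code lets a calling script tell a partly completed switchover apart from an outright failure. A minimal sketch of how a wrapper might classify the exit status follows; the value 22 is taken from the hunk above, while `DEMO_SUCCESS`, `switchover_outcome` and `classify_switchover_exit()` are hypothetical names introduced here, and the reading of code 22 as "partly done" is inferred from the constant's name only.

```c
/* Value copied from the errcode.h hunk above. */
#define ERR_SWITCHOVER_INCOMPLETE 22

/* Assumption: 0 means success, as is conventional for process exit codes. */
#define DEMO_SUCCESS 0

/* Hypothetical three-way classification used by a wrapper script or tool. */
typedef enum
{
	OUTCOME_OK,
	OUTCOME_INCOMPLETE,
	OUTCOME_FAILED
} switchover_outcome;

/* Map a raw exit code to one of the three outcomes above. */
switchover_outcome
classify_switchover_exit(int exit_code)
{
	switch (exit_code)
	{
		case DEMO_SUCCESS:
			return OUTCOME_OK;
		case ERR_SWITCHOVER_INCOMPLETE:
			/* inferred from the name: some steps ran, manual follow-up may be needed */
			return OUTCOME_INCOMPLETE;
		default:
			return OUTCOME_FAILED;
	}
}
```

A caller would pass the child's exit status to `classify_switchover_exit()` and treat `OUTCOME_INCOMPLETE` as "inspect the cluster" rather than "retry".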

2
log.c

@@ -1,6 +1,6 @@
/* /*
* log.c - Logging methods * log.c - Logging methods
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by

2
log.h

@@ -1,6 +1,6 @@
/* /*
* log.h * log.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -1,9 +1,9 @@
/* /*
* repmgr-action-standby.c * repmgr-action-bdr.c
* *
* Implements BDR-related actions for the repmgr command line utility * Implements BDR-related actions for the repmgr command line utility
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -92,7 +92,39 @@ do_bdr_register(void)
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
/* check whether repmgr extension exists, and that any other nodes are BDR */ /* check for a matching BDR node */
{
PQExpBufferData bdr_local_node_name;
bool node_match = false;
initPQExpBuffer(&bdr_local_node_name);
node_match = bdr_node_name_matches(conn, config_file_options.node_name, &bdr_local_node_name);
if (node_match == false)
{
if (strlen(bdr_local_node_name.data))
{
log_error(_("local node BDR node name is \"%s\", expected: \"%s\""),
bdr_local_node_name.data,
config_file_options.node_name);
log_hint(_("\"node_name\" in repmgr.conf must match \"node_name\" in bdr.bdr_nodes"));
}
else
{
log_error(_("local node does not report BDR node name"));
log_hint(_("ensure this is an active BDR node"));
}
PQfinish(conn);
pfree(dbname);
termPQExpBuffer(&bdr_local_node_name);
exit(ERR_BAD_CONFIG);
}
termPQExpBuffer(&bdr_local_node_name);
}
/* check whether repmgr extension exists, and there are no non-BDR nodes registered */
extension_status = get_repmgr_extension_status(conn); extension_status = get_repmgr_extension_status(conn);
if (extension_status == REPMGR_UNKNOWN) if (extension_status == REPMGR_UNKNOWN)
@@ -142,17 +174,9 @@ do_bdr_register(void)
pfree(dbname); pfree(dbname);
/* check for a matching BDR node */ if (bdr_node_has_repmgr_set(conn, config_file_options.node_name) == false)
{ {
bool node_exists = bdr_node_exists(conn, config_file_options.node_name); bdr_node_set_repmgr_set(conn, config_file_options.node_name);
if (node_exists == false)
{
log_error(_("no BDR node with node_name \"%s\" found"), config_file_options.node_name);
log_hint(_("\"node_name\" in repmgr.conf must match \"node_name\" in bdr.bdr_nodes"));
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
} }
/* /*


@@ -1,6 +1,6 @@
/* /*
* repmgr-action-bdr.h * repmgr-action-bdr.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -3,7 +3,7 @@
* *
* Implements cluster information actions for the repmgr command line utility * Implements cluster information actions for the repmgr command line utility
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -82,6 +82,7 @@ do_cluster_show(void)
NodeInfoListCell *cell = NULL; NodeInfoListCell *cell = NULL;
int i = 0; int i = 0;
ItemList warnings = {NULL, NULL}; ItemList warnings = {NULL, NULL};
bool success = false;
/* Connect to local database to obtain cluster connection data */ /* Connect to local database to obtain cluster connection data */
log_verbose(LOG_INFO, _("connecting to database")); log_verbose(LOG_INFO, _("connecting to database"));
@@ -91,11 +92,19 @@ do_cluster_show(void)
else else
conn = establish_db_connection_by_params(&source_conninfo, true); conn = establish_db_connection_by_params(&source_conninfo, true);
get_all_node_records_with_upstream(conn, &nodes); success = get_all_node_records_with_upstream(conn, &nodes);
if (success == false)
{
/* get_all_node_records_with_upstream() will print error message */
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
if (nodes.node_count == 0) if (nodes.node_count == 0)
{ {
log_error(_("unable to retrieve any node records")); log_error(_("no node records were found"));
log_hint(_("ensure at least one node is registered"));
PQfinish(conn); PQfinish(conn);
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
@@ -131,8 +140,14 @@ do_cluster_show(void)
} }
else else
{ {
char error[MAXLEN];
strncpy(error, PQerrorMessage(cell->node_info->conn), MAXLEN);
cell->node_info->node_status = NODE_STATUS_DOWN; cell->node_info->node_status = NODE_STATUS_DOWN;
cell->node_info->recovery_type = RECTYPE_UNKNOWN; cell->node_info->recovery_type = RECTYPE_UNKNOWN;
item_list_append_format(&warnings,
"when attempting to connect to node \"%s\" (ID: %i), following error encountered :\n\"%s\"",
cell->node_info->node_name, cell->node_info->node_id, trim(error));
} }
initPQExpBuffer(&details); initPQExpBuffer(&details);
@@ -158,15 +173,13 @@ do_cluster_show(void)
break; break;
case RECTYPE_STANDBY: case RECTYPE_STANDBY:
appendPQExpBuffer(&details, "! running as standby"); appendPQExpBuffer(&details, "! running as standby");
item_list_append_format( item_list_append_format(&warnings,
&warnings,
"node \"%s\" (ID: %i) is registered as primary but running as standby", "node \"%s\" (ID: %i) is registered as primary but running as standby",
cell->node_info->node_name, cell->node_info->node_id); cell->node_info->node_name, cell->node_info->node_id);
break; break;
case RECTYPE_UNKNOWN: case RECTYPE_UNKNOWN:
appendPQExpBuffer(&details, "! unknown"); appendPQExpBuffer(&details, "! unknown");
item_list_append_format( item_list_append_format(&warnings,
&warnings,
"node \"%s\" (ID: %i) has unknown replication status", "node \"%s\" (ID: %i) has unknown replication status",
cell->node_info->node_name, cell->node_info->node_id); cell->node_info->node_name, cell->node_info->node_id);
break; break;
@@ -177,16 +190,14 @@ do_cluster_show(void)
if (cell->node_info->recovery_type == RECTYPE_PRIMARY) if (cell->node_info->recovery_type == RECTYPE_PRIMARY)
{ {
appendPQExpBuffer(&details, "! running"); appendPQExpBuffer(&details, "! running");
item_list_append_format( item_list_append_format(&warnings,
&warnings,
"node \"%s\" (ID: %i) is running but the repmgr node record is inactive", "node \"%s\" (ID: %i) is running but the repmgr node record is inactive",
cell->node_info->node_name, cell->node_info->node_id); cell->node_info->node_name, cell->node_info->node_id);
} }
else else
{ {
appendPQExpBuffer(&details, "! running as standby"); appendPQExpBuffer(&details, "! running as standby");
item_list_append_format( item_list_append_format(&warnings,
&warnings,
"node \"%s\" (ID: %i) is registered as an inactive primary but running as standby", "node \"%s\" (ID: %i) is registered as an inactive primary but running as standby",
cell->node_info->node_name, cell->node_info->node_id); cell->node_info->node_name, cell->node_info->node_id);
} }
@@ -199,8 +210,7 @@ do_cluster_show(void)
if (cell->node_info->active == true) if (cell->node_info->active == true)
{ {
appendPQExpBuffer(&details, "? unreachable"); appendPQExpBuffer(&details, "? unreachable");
item_list_append_format( item_list_append_format(&warnings,
&warnings,
"node \"%s\" (ID: %i) is registered as an active primary but is unreachable", "node \"%s\" (ID: %i) is registered as an active primary but is unreachable",
cell->node_info->node_name, cell->node_info->node_id); cell->node_info->node_name, cell->node_info->node_id);
} }
@@ -226,8 +236,7 @@ do_cluster_show(void)
break; break;
case RECTYPE_PRIMARY: case RECTYPE_PRIMARY:
appendPQExpBuffer(&details, "! running as primary"); appendPQExpBuffer(&details, "! running as primary");
item_list_append_format( item_list_append_format(&warnings,
&warnings,
"node \"%s\" (ID: %i) is registered as standby but running as primary", "node \"%s\" (ID: %i) is registered as standby but running as primary",
cell->node_info->node_name, cell->node_info->node_id); cell->node_info->node_name, cell->node_info->node_id);
break; break;
@@ -245,16 +254,14 @@ do_cluster_show(void)
if (cell->node_info->recovery_type == RECTYPE_STANDBY) if (cell->node_info->recovery_type == RECTYPE_STANDBY)
{ {
appendPQExpBuffer(&details, "! running"); appendPQExpBuffer(&details, "! running");
item_list_append_format( item_list_append_format(&warnings,
&warnings,
"node \"%s\" (ID: %i) is running but the repmgr node record is inactive", "node \"%s\" (ID: %i) is running but the repmgr node record is inactive",
cell->node_info->node_name, cell->node_info->node_id); cell->node_info->node_name, cell->node_info->node_id);
} }
else else
{ {
appendPQExpBuffer(&details, "! running as primary"); appendPQExpBuffer(&details, "! running as primary");
item_list_append_format( item_list_append_format(&warnings,
&warnings,
"node \"%s\" (ID: %i) is running as primary but the repmgr node record is inactive", "node \"%s\" (ID: %i) is running as primary but the repmgr node record is inactive",
cell->node_info->node_name, cell->node_info->node_id); cell->node_info->node_name, cell->node_info->node_id);
} }
@@ -267,8 +274,7 @@ do_cluster_show(void)
if (cell->node_info->active == true) if (cell->node_info->active == true)
{ {
appendPQExpBuffer(&details, "? unreachable"); appendPQExpBuffer(&details, "? unreachable");
item_list_append_format( item_list_append_format(&warnings,
&warnings,
"node \"%s\" (ID: %i) is registered as an active standby but is unreachable", "node \"%s\" (ID: %i) is registered as an active standby but is unreachable",
cell->node_info->node_name, cell->node_info->node_id); cell->node_info->node_name, cell->node_info->node_id);
} }
@@ -416,7 +422,7 @@ do_cluster_show(void)
printf(_("\nWARNING: following issues were detected\n")); printf(_("\nWARNING: following issues were detected\n"));
for (cell = warnings.head; cell; cell = cell->next) for (cell = warnings.head; cell; cell = cell->next)
{ {
printf(_(" %s\n"), cell->string); printf(_(" - %s\n"), cell->string);
} }
} }
} }
@@ -436,82 +442,18 @@ void
do_cluster_event(void) do_cluster_event(void)
{ {
PGconn *conn = NULL; PGconn *conn = NULL;
PQExpBufferData query;
PQExpBufferData where_clause;
PGresult *res; PGresult *res;
int i = 0; int i = 0;
int column_count = EVENT_HEADER_COUNT;
conn = establish_db_connection(config_file_options.conninfo, true); conn = establish_db_connection(config_file_options.conninfo, true);
initPQExpBuffer(&query); res = get_event_records(conn,
initPQExpBuffer(&where_clause); runtime_options.node_id,
runtime_options.node_name,
/* LEFT JOIN used here as a node record may have been removed */ runtime_options.event,
appendPQExpBuffer( runtime_options.all,
&query, runtime_options.limit);
" SELECT e.node_id, n.node_name, e.event, e.successful, \n"
" TO_CHAR(e.event_timestamp, 'YYYY-MM-DD HH24:MI:SS') AS timestamp, \n"
" e.details \n"
" FROM repmgr.events e \n"
"LEFT JOIN repmgr.nodes n ON e.node_id = n.node_id ");
if (runtime_options.node_id != UNKNOWN_NODE_ID)
{
append_where_clause(&where_clause,
"n.node_id=%i", runtime_options.node_id);
}
else if (runtime_options.node_name[0] != '\0')
{
char *escaped = escape_string(conn, runtime_options.node_name);
if (escaped == NULL)
{
log_error(_("unable to escape value provided for node name"));
}
else
{
append_where_clause(&where_clause,
"n.node_name='%s'",
escaped);
pfree(escaped);
}
}
if (runtime_options.event[0] != '\0')
{
char *escaped = escape_string(conn, runtime_options.event);
if (escaped == NULL)
{
log_error(_("unable to escape value provided for event"));
}
else
{
append_where_clause(&where_clause,
"e.event='%s'",
escaped);
pfree(escaped);
}
}
appendPQExpBuffer(&query, "\n%s\n",
where_clause.data);
appendPQExpBuffer(&query,
" ORDER BY e.event_timestamp DESC");
if (runtime_options.all == false && runtime_options.limit > 0)
{
appendPQExpBuffer(&query, " LIMIT %i",
runtime_options.limit);
}
log_debug("do_cluster_event():\n%s", query.data);
res = PQexec(conn, query.data);
termPQExpBuffer(&query);
termPQExpBuffer(&where_clause);
if (PQresultStatus(res) != PGRES_TUPLES_OK) if (PQresultStatus(res) != PGRES_TUPLES_OK)
{ {
@@ -538,7 +480,11 @@ do_cluster_event(void)
strncpy(headers_event[EV_TIMESTAMP].title, _("Timestamp"), MAXLEN); strncpy(headers_event[EV_TIMESTAMP].title, _("Timestamp"), MAXLEN);
strncpy(headers_event[EV_DETAILS].title, _("Details"), MAXLEN); strncpy(headers_event[EV_DETAILS].title, _("Details"), MAXLEN);
for (i = 0; i < EVENT_HEADER_COUNT; i++) /* if --terse provided, simply omit the "Details" column */
if (runtime_options.terse == true)
column_count --;
for (i = 0; i < column_count; i++)
{ {
headers_event[i].max_length = strlen(headers_event[i].title); headers_event[i].max_length = strlen(headers_event[i].title);
} }
@@ -547,7 +493,7 @@ do_cluster_event(void)
{ {
int j; int j;
for (j = 0; j < EVENT_HEADER_COUNT; j++) for (j = 0; j < column_count; j++)
{ {
headers_event[j].cur_length = strlen(PQgetvalue(res, i, j)); headers_event[j].cur_length = strlen(PQgetvalue(res, i, j));
if (headers_event[j].cur_length > headers_event[j].max_length) if (headers_event[j].cur_length > headers_event[j].max_length)
@@ -558,7 +504,7 @@ do_cluster_event(void)
} }
for (i = 0; i < EVENT_HEADER_COUNT; i++) for (i = 0; i < column_count; i++)
{ {
if (i == 0) if (i == 0)
printf(" "); printf(" ");
@@ -571,14 +517,14 @@ do_cluster_event(void)
} }
printf("\n"); printf("\n");
printf("-"); printf("-");
for (i = 0; i < EVENT_HEADER_COUNT; i++) for (i = 0; i < column_count; i++)
{ {
int j; int j;
for (j = 0; j < headers_event[i].max_length; j++) for (j = 0; j < headers_event[i].max_length; j++)
printf("-"); printf("-");
if (i < (EVENT_HEADER_COUNT - 1)) if (i < (column_count - 1))
printf("-+-"); printf("-+-");
else else
printf("-"); printf("-");
@@ -591,13 +537,13 @@ do_cluster_event(void)
int j; int j;
printf(" "); printf(" ");
for (j = 0; j < EVENT_HEADER_COUNT; j++) for (j = 0; j < column_count; j++)
{ {
printf("%-*s", printf("%-*s",
headers_event[j].max_length, headers_event[j].max_length,
PQgetvalue(res, i, j)); PQgetvalue(res, i, j));
if (j < (EVENT_HEADER_COUNT - 1)) if (j < (column_count - 1))
printf(" | "); printf(" | ");
} }
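The hunks above replace every use of `EVENT_HEADER_COUNT` in the output loops with a local `column_count`, which is decremented when `--terse` is given so that the trailing "Details" column is skipped. The idea can be sketched in isolation as below. This is an illustration only: the value 6 for `EVENT_HEADER_COUNT` is an assumption based on the six columns in the removed query, and `event_column_count()` and `format_event_row()` are hypothetical helpers, not repmgr functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: six columns, matching the six columns selected by the removed query. */
#define EVENT_HEADER_COUNT 6

/* Number of columns to print; "Details" is last, so omitting it is a decrement. */
int
event_column_count(bool terse)
{
	int			column_count = EVENT_HEADER_COUNT;

	if (terse == true)
		column_count--;

	return column_count;
}

/*
 * Format one row into "buf", padding each value to its column width and
 * joining columns with " | " as in the diff. Returns the number of
 * characters written, or -1 if the buffer is too small.
 */
int
format_event_row(char *buf, size_t buflen, const char *const *values,
				 const int *widths, int column_count)
{
	size_t		written = 0;
	int			j;

	if (buflen == 0)
		return -1;
	buf[0] = '\0';

	for (j = 0; j < column_count; j++)
	{
		int			n = snprintf(buf + written, buflen - written,
								 "%-*s", widths[j], values[j]);

		if (n < 0 || (size_t) n >= buflen - written)
			return -1;
		written += (size_t) n;

		/* separator between columns, but not after the last one */
		if (j < (column_count - 1))
		{
			n = snprintf(buf + written, buflen - written, " | ");
			if (n < 0 || (size_t) n >= buflen - written)
				return -1;
			written += (size_t) n;
		}
	}

	return (int) written;
}
```

Because every loop bound and the "last column" test use the same `column_count`, the separator logic stays correct whether or not the final column is printed.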
@@ -1204,7 +1150,7 @@ build_cluster_crosscheck(t_node_status_cube ***dest_cube, int *name_length)
} }
else else
{ {
t_conninfo_param_list remote_conninfo; t_conninfo_param_list remote_conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER;
char *host = NULL; char *host = NULL;
PQExpBufferData quoted_command; PQExpBufferData quoted_command;
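The last hunk gives `remote_conninfo` an initializer at declaration. The motivation can be illustrated with a stand-in structure: if a struct holding pointers is left uninitialised, any early-exit path that frees it reads indeterminate values. The sketch below uses a hypothetical `param_list` type and `PARAM_LIST_INITIALIZER` macro; the real layout of `t_conninfo_param_list` is not shown in this diff, so treat the field names as assumptions.

```c
#include <stdlib.h>

/* Hypothetical stand-in for t_conninfo_param_list. */
typedef struct
{
	int			size;
	char	  **keywords;
	char	  **values;
} param_list;

/* Hypothetical counterpart of T_CONNINFO_PARAM_LIST_INITIALIZER. */
#define PARAM_LIST_INITIALIZER { 0, NULL, NULL }

/*
 * Release everything owned by the list. Safe to call on a list that was
 * initialised with PARAM_LIST_INITIALIZER but never populated, because
 * size is 0 and free(NULL) is a no-op.
 */
void
free_param_list(param_list *list)
{
	int			i;

	for (i = 0; i < list->size; i++)
	{
		free(list->keywords[i]);
		free(list->values[i]);
	}

	free(list->keywords);
	free(list->values);

	list->size = 0;
	list->keywords = NULL;
	list->values = NULL;
}
```

With the initializer in place, cleanup code can run unconditionally on every exit path, which is presumably what the "Always initialise" commit is after.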


@@ -1,6 +1,6 @@
/* /*
* repmgr-action-cluster.h * repmgr-action-cluster.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -3,7 +3,7 @@
* *
* Implements actions available for any kind of node * Implements actions available for any kind of node
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -41,6 +41,7 @@ static void _do_node_status_is_shutdown_cleanly(void);
static void _do_node_archive_config(void); static void _do_node_archive_config(void);
static void _do_node_restore_config(void); static void _do_node_restore_config(void);
static void do_node_check_replication_connection(void);
static CheckStatus do_node_check_archive_ready(PGconn *conn, OutputMode mode, CheckStatusList *list_output); static CheckStatus do_node_check_archive_ready(PGconn *conn, OutputMode mode, CheckStatusList *list_output);
static CheckStatus do_node_check_downstream(PGconn *conn, OutputMode mode, CheckStatusList *list_output); static CheckStatus do_node_check_downstream(PGconn *conn, OutputMode mode, CheckStatusList *list_output);
static CheckStatus do_node_check_replication_lag(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output); static CheckStatus do_node_check_replication_lag(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckStatusList *list_output);
@@ -249,8 +250,7 @@ do_node_status(void)
if (node_info.max_wal_senders >= 0) if (node_info.max_wal_senders >= 0)
{ {
/* In CSV mode, raw values supplied as well */ /* In CSV mode, raw values supplied as well */
key_value_list_set_format( key_value_list_set_format(&node_status,
&node_status,
"Replication connections", "Replication connections",
"%i (of maximal %i)", "%i (of maximal %i)",
node_info.attached_wal_receivers, node_info.attached_wal_receivers,
@@ -258,8 +258,7 @@ do_node_status(void)
} }
else if (node_info.max_wal_senders == 0) else if (node_info.max_wal_senders == 0)
{ {
key_value_list_set_format( key_value_list_set_format(&node_status,
&node_status,
"Replication connections", "Replication connections",
"disabled"); "disabled");
} }
@@ -276,8 +275,7 @@ do_node_status(void)
initPQExpBuffer(&slotinfo); initPQExpBuffer(&slotinfo);
appendPQExpBuffer( appendPQExpBuffer(&slotinfo,
&slotinfo,
"%i (of maximal %i)", "%i (of maximal %i)",
node_info.active_replication_slots + node_info.inactive_replication_slots, node_info.active_replication_slots + node_info.inactive_replication_slots,
node_info.max_replication_slots); node_info.max_replication_slots);
@@ -289,8 +287,7 @@ do_node_status(void)
"; %i inactive", "; %i inactive",
node_info.inactive_replication_slots); node_info.inactive_replication_slots);
item_list_append_format( item_list_append_format(&warnings,
&warnings,
_("- node has %i inactive replication slots"), _("- node has %i inactive replication slots"),
node_info.inactive_replication_slots); node_info.inactive_replication_slots);
} }
@@ -309,13 +306,44 @@ do_node_status(void)
} }
/*
* check for missing replication slots - we do this regardless of
* what "max_replication_slots" is set to
*/
{
NodeInfoList missing_slots = T_NODE_INFO_LIST_INITIALIZER;
get_downsteam_nodes_with_missing_slot(conn,
config_file_options.node_id,
&missing_slots);
if (missing_slots.node_count > 0)
{
NodeInfoListCell *missing_slot_cell = NULL;
item_list_append_format(&warnings,
_("- replication slots missing for following %i node(s):"),
missing_slots.node_count);
for (missing_slot_cell = missing_slots.head; missing_slot_cell; missing_slot_cell = missing_slot_cell->next)
{
item_list_append_format(&warnings,
_(" - %s (ID: %i, slot name: \"%s\")"),
missing_slot_cell->node_info->node_name,
missing_slot_cell->node_info->node_id,
missing_slot_cell->node_info->slot_name);
}
}
}
if (node_info.type == STANDBY) if (node_info.type == STANDBY)
{ {
key_value_list_set_format(&node_status, key_value_list_set_format(&node_status,
"Upstream node", "Upstream node",
"%s (ID: %i)", "%s (ID: %i)",
node_info.node_name, node_info.upstream_node_name,
node_info.node_id); node_info.upstream_node_id);
get_replication_info(conn, &replication_info); get_replication_info(conn, &replication_info);
@@ -463,8 +491,7 @@ _do_node_status_is_shutdown_cleanly(void)
initPQExpBuffer(&output); initPQExpBuffer(&output);
appendPQExpBuffer( appendPQExpBuffer(&output,
&output,
"--state="); "--state=");
/* sanity-check we're dealing with a PostgreSQL directory */ /* sanity-check we're dealing with a PostgreSQL directory */
@@ -496,16 +523,19 @@ _do_node_status_is_shutdown_cleanly(void)
db_state = get_db_state(config_file_options.data_directory); db_state = get_db_state(config_file_options.data_directory);
log_verbose(LOG_DEBUG, "db state now: %s", describe_db_state(db_state));
if (db_state != DB_SHUTDOWNED && db_state != DB_SHUTDOWNED_IN_RECOVERY) if (db_state != DB_SHUTDOWNED && db_state != DB_SHUTDOWNED_IN_RECOVERY)
{ {
/*
* node is not running, but pg_controldata says it is - unclean
* shutdown
*/
if (node_status != NODE_STATUS_UP) if (node_status != NODE_STATUS_UP)
{ {
node_status = NODE_STATUS_UNCLEAN_SHUTDOWN; node_status = NODE_STATUS_UNCLEAN_SHUTDOWN;
} }
/* server is still responding but shutting down */
else if (db_state == DB_SHUTDOWNING)
{
node_status = NODE_STATUS_SHUTTING_DOWN;
}
} }
checkPoint = get_latest_checkpoint_location(config_file_options.data_directory); checkPoint = get_latest_checkpoint_location(config_file_options.data_directory);
@@ -525,19 +555,24 @@ _do_node_status_is_shutdown_cleanly(void)
node_status = NODE_STATUS_DOWN; node_status = NODE_STATUS_DOWN;
} }
log_verbose(LOG_DEBUG, "node status determined as: %s", print_node_status(node_status));
switch (node_status) switch (node_status)
{ {
case NODE_STATUS_UP: case NODE_STATUS_UP:
appendPQExpBuffer(&output, "RUNNING"); appendPQExpBuffer(&output, "RUNNING");
break; break;
case NODE_STATUS_UNCLEAN_SHUTDOWN: case NODE_STATUS_SHUTTING_DOWN:
appendPQExpBuffer(&output, "UNCLEAN_SHUTDOWN"); appendPQExpBuffer(&output, "SHUTTING_DOWN");
break; break;
case NODE_STATUS_DOWN: case NODE_STATUS_DOWN:
appendPQExpBuffer(&output, appendPQExpBuffer(&output,
"SHUTDOWN --last-checkpoint-lsn=%X/%X", "SHUTDOWN --last-checkpoint-lsn=%X/%X",
format_lsn(checkPoint)); format_lsn(checkPoint));
break; break;
case NODE_STATUS_UNCLEAN_SHUTDOWN:
appendPQExpBuffer(&output, "UNCLEAN_SHUTDOWN");
break;
case NODE_STATUS_UNKNOWN: case NODE_STATUS_UNKNOWN:
appendPQExpBuffer(&output, "UNKNOWN"); appendPQExpBuffer(&output, "UNKNOWN");
break; break;
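The two hunks above add a `SHUTTING_DOWN` state: a server that still responds but whose control file reports `DB_SHUTDOWNING` is no longer reported as plain `RUNNING`. A simplified sketch of that decision is given below. The enum names mirror identifiers in the diff, but their definitions here are local stand-ins, and the sketch leaves out the checkpoint-based step between the two hunks that is not visible in this diff.

```c
/* Local stand-ins; only the names are taken from the diff. */
typedef enum
{
	DB_SHUTDOWNED,
	DB_SHUTDOWNED_IN_RECOVERY,
	DB_SHUTDOWNING,
	DB_IN_PRODUCTION
} db_state_t;

typedef enum
{
	NODE_STATUS_UNKNOWN,
	NODE_STATUS_UP,
	NODE_STATUS_SHUTTING_DOWN,
	NODE_STATUS_DOWN,
	NODE_STATUS_UNCLEAN_SHUTDOWN
} node_status_t;

/*
 * Combine the result of pinging the server with the state recorded in the
 * control file, following the branch structure of the first hunk.
 */
node_status_t
refine_node_status(node_status_t node_status, db_state_t db_state)
{
	if (db_state != DB_SHUTDOWNED && db_state != DB_SHUTDOWNED_IN_RECOVERY)
	{
		/* not running, but the control file says it is: unclean shutdown */
		if (node_status != NODE_STATUS_UP)
			return NODE_STATUS_UNCLEAN_SHUTDOWN;
		/* still responding, but in the process of shutting down */
		else if (db_state == DB_SHUTDOWNING)
			return NODE_STATUS_SHUTTING_DOWN;
	}

	return node_status;
}

/* Label emitted after "--state=" (the LSN suffix for SHUTDOWN is omitted). */
const char *
state_label(node_status_t node_status)
{
	switch (node_status)
	{
		case NODE_STATUS_UP:
			return "RUNNING";
		case NODE_STATUS_SHUTTING_DOWN:
			return "SHUTTING_DOWN";
		case NODE_STATUS_DOWN:
			return "SHUTDOWN";
		case NODE_STATUS_UNCLEAN_SHUTDOWN:
			return "UNCLEAN_SHUTDOWN";
		default:
			return "UNKNOWN";
	}
}
```

A caller polling this output can keep waiting while it sees `SHUTTING_DOWN`, instead of treating the node as either running or crashed.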
@@ -572,6 +607,11 @@ do_node_check(void)
exit(return_code); exit(return_code);
} }
if (runtime_options.replication_connection == true)
{
do_node_check_replication_connection();
exit(SUCCESS);
}
if (strlen(config_file_options.conninfo)) if (strlen(config_file_options.conninfo))
conn = establish_db_connection(config_file_options.conninfo, true); conn = establish_db_connection(config_file_options.conninfo, true);
@@ -728,8 +768,7 @@ do_node_check_role(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckS
} }
else else
{ {
appendPQExpBuffer( appendPQExpBuffer(&details,
&details,
_("node is primary")); _("node is primary"));
} }
break; break;
@@ -754,8 +793,7 @@ do_node_check_role(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckS
if (is_bdr_db(conn, &output) == false) if (is_bdr_db(conn, &output) == false)
{ {
status = CHECK_STATUS_CRITICAL; status = CHECK_STATUS_CRITICAL;
appendPQExpBuffer( appendPQExpBuffer(&details,
&details,
"%s", output.data); "%s", output.data);
} }
termPQExpBuffer(&output); termPQExpBuffer(&output);
@@ -768,6 +806,11 @@ do_node_check_role(PGconn *conn, OutputMode mode, t_node_info *node_info, CheckS
appendPQExpBuffer(&details, appendPQExpBuffer(&details,
_("node is not an active BDR node")); _("node is not an active BDR node"));
} }
else
{
appendPQExpBuffer(&details,
_("node is an active BDR node"));
}
} }
} }
default: default:
@@ -872,6 +915,67 @@ do_node_check_slots(PGconn *conn, OutputMode mode, t_node_info *node_info, Check
} }
static void
do_node_check_replication_connection(void)
{
PGconn *local_conn = NULL;
PGconn *repl_conn = NULL;
t_node_info node_record = T_NODE_INFO_INITIALIZER;
RecordStatus record_status = RECORD_NOT_FOUND;
t_conninfo_param_list remote_conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER;
PQExpBufferData output;
initPQExpBuffer(&output);
appendPQExpBuffer(&output,
"--connection=");
if (runtime_options.remote_node_id == UNKNOWN_NODE_ID)
{
appendPQExpBuffer(&output, "UNKNOWN");
printf("%s\n", output.data);
termPQExpBuffer(&output);
return;
}
local_conn = establish_db_connection(config_file_options.conninfo, true);
record_status = get_node_record(local_conn, runtime_options.remote_node_id, &node_record);
PQfinish(local_conn);
if (record_status != RECORD_FOUND)
{
appendPQExpBuffer(&output, "UNKNOWN");
printf("%s\n", output.data);
termPQExpBuffer(&output);
return;
}
initialize_conninfo_params(&remote_conninfo, false);
parse_conninfo_string(node_record.conninfo, &remote_conninfo, NULL, false);
param_set(&remote_conninfo, "replication", "1");
param_set(&remote_conninfo, "user", node_record.repluser);
repl_conn = establish_db_connection_by_params(&remote_conninfo, false);
if (PQstatus(repl_conn) != CONNECTION_OK)
{
appendPQExpBuffer(&output, "BAD");
printf("%s\n", output.data);
termPQExpBuffer(&output);
return;
}
PQfinish(repl_conn);
appendPQExpBuffer(&output, "OK");
printf("%s\n", output.data);
termPQExpBuffer(&output);
return;
}
static CheckStatus static CheckStatus
do_node_check_archive_ready(PGconn *conn, OutputMode mode, CheckStatusList *list_output) do_node_check_archive_ready(PGconn *conn, OutputMode mode, CheckStatusList *list_output)
{ {
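The new `do_node_check_replication_connection()` prints a single line of the form `--connection=OK`, `--connection=BAD` or `--connection=UNKNOWN`. A caller that runs the check on a remote node has to parse that line back into a status. The sketch below shows one way to do so; `connection_check_t` and `parse_connection_output()` are hypothetical names, and how repmgr itself consumes this output is not shown in this part of the diff.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for the parsed check output. */
typedef enum
{
	CONN_UNKNOWN,
	CONN_OK,
	CONN_BAD
} connection_check_t;

/*
 * Parse one line of output such as "--connection=OK\n".
 * Anything unexpected is reported as CONN_UNKNOWN.
 */
connection_check_t
parse_connection_output(const char *line)
{
	static const char prefix[] = "--connection=";
	const size_t prefix_len = sizeof(prefix) - 1;
	const char *value;
	size_t		len;

	if (line == NULL || strncmp(line, prefix, prefix_len) != 0)
		return CONN_UNKNOWN;

	value = line + prefix_len;
	/* length of the value, ignoring any trailing newline */
	len = strcspn(value, "\r\n");

	if (len == 2 && strncmp(value, "OK", 2) == 0)
		return CONN_OK;
	if (len == 3 && strncmp(value, "BAD", 3) == 0)
		return CONN_BAD;

	return CONN_UNKNOWN;
}
```

Treating every unrecognised line as `CONN_UNKNOWN` keeps the caller conservative: only an explicit `OK` counts as a working replication connection.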
@@ -1556,13 +1660,11 @@ parse_server_action(const char *action_name)
/* /*
* Intended mainly for "internal" use by "standby switchover", which * Rejoin a dormant (shut down) node to the replication cluster; this
* calls this on the target server to excute pg_rewind on a demoted * is typically a former primary which needs to be demoted to a standby.
* primary with a forked (sic) timeline. This function does not
* currently check whether this is a useful thing to do (however
* "standby switchover" will perform a check before calling it).
* *
* TODO: make this into a more generally useful function. * Note that "repmgr node rejoin" is also executed by
* "repmgr standby switchover" after promoting the new primary.
*/ */
void void
do_node_rejoin(void) do_node_rejoin(void)
@@ -1581,6 +1683,7 @@ do_node_rejoin(void)
bool success = true; bool success = true;
int server_version_num = UNKNOWN_SERVER_VERSION_NUM; int server_version_num = UNKNOWN_SERVER_VERSION_NUM;
int follow_error_code = SUCCESS;
/* check node is not actually running */ /* check node is not actually running */
@@ -1615,21 +1718,28 @@ do_node_rejoin(void)
/* check if cleanly shut down */ /* check if cleanly shut down */
if (db_state != DB_SHUTDOWNED && db_state != DB_SHUTDOWNED_IN_RECOVERY) if (db_state != DB_SHUTDOWNED && db_state != DB_SHUTDOWNED_IN_RECOVERY)
{ {
log_error(_("database is not shut down cleanly")); if (db_state == DB_SHUTDOWNING)
if (runtime_options.force_rewind == true)
{ {
log_detail(_("pg_rewind will not be able to run")); log_error(_("database is still shutting down"));
}
else
{
log_error(_("database is not shut down cleanly"));
if (runtime_options.force_rewind == true)
{
log_detail(_("pg_rewind will not be able to run"));
}
log_hint(_("database should be restarted then shut down cleanly after crash recovery completes"));
exit(ERR_BAD_CONFIG);
} }
log_hint(_("database should be restarted and shut down cleanly after crash recovery completes"));
exit(ERR_BAD_CONFIG);
} }
/* check provided upstream connection */ /* check provided upstream connection */
upstream_conn = establish_db_connection_by_params(&source_conninfo, true); upstream_conn = establish_db_connection_by_params(&source_conninfo, true);
/* sanity-checks for 9.3 */ /* sanity checks for 9.3 */
server_version_num = get_server_version(upstream_conn, NULL); server_version_num = get_server_version(upstream_conn, NULL);
if (server_version_num < 90400) if (server_version_num < 90400)
@@ -1638,7 +1748,9 @@ do_node_rejoin(void)
if (get_primary_node_record(upstream_conn, &primary_node_record) == false) if (get_primary_node_record(upstream_conn, &primary_node_record) == false)
{ {
log_error(_("unable to retrieve primary node record")); log_error(_("unable to retrieve primary node record"));
log_hint(_("check the provided database connection string is for a \"repmgr\" database"));
PQfinish(upstream_conn); PQfinish(upstream_conn);
exit(ERR_BAD_CONFIG);
} }
PQfinish(upstream_conn); PQfinish(upstream_conn);
@@ -1841,7 +1953,31 @@ do_node_rejoin(void)
success = do_standby_follow_internal(upstream_conn, success = do_standby_follow_internal(upstream_conn,
&primary_node_record, &primary_node_record,
&follow_output); &follow_output,
&follow_error_code);
if (success == false)
{
log_notice(_("NODE REJOIN failed"));
log_detail("%s", follow_output.data);
create_event_notification(upstream_conn,
&config_file_options,
config_file_options.node_id,
"node_rejoin",
success,
follow_output.data);
PQfinish(upstream_conn);
termPQExpBuffer(&follow_output);
exit(follow_error_code);
}
/*
* XXX add checks that node actually started and connected to primary,
* if not exit with ERR_REJOIN_FAIL
*/
create_event_notification(upstream_conn, create_event_notification(upstream_conn,
&config_file_options, &config_file_options,
@@ -1852,19 +1988,12 @@ do_node_rejoin(void)
PQfinish(upstream_conn); PQfinish(upstream_conn);
if (success == false)
{
log_notice(_("NODE REJOIN failed"));
log_detail("%s", follow_output.data);
termPQExpBuffer(&follow_output);
exit(ERR_DB_QUERY);
}
log_notice(_("NODE REJOIN successful")); log_notice(_("NODE REJOIN successful"));
log_detail("%s", follow_output.data); log_detail("%s", follow_output.data);
termPQExpBuffer(&follow_output); termPQExpBuffer(&follow_output);
return;
} }
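
Note on the hunks above: do_standby_follow_internal() gains an extra "int *error_code" argument, and do_node_rejoin() now exits with that code instead of a fixed ERR_DB_QUERY. The following is a minimal sketch of that out-parameter pattern only. The function names and the SKETCH_* exit codes are hypothetical and are not taken from repmgr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical exit codes for illustration; not repmgr's actual values. */
enum
{
    SKETCH_SUCCESS = 0,
    SKETCH_ERR_BAD_CONFIG = 1,
    SKETCH_ERR_NO_RESTART = 2
};

/*
 * Returns true on success. On failure, stores the reason in *error_code
 * so the caller can decide how to exit.
 */
bool
follow_internal(bool config_ok, bool restart_ok, int *error_code)
{
    assert(error_code != NULL);

    *error_code = SKETCH_SUCCESS;

    if (!config_ok)
    {
        *error_code = SKETCH_ERR_BAD_CONFIG;
        return false;
    }

    if (!restart_ok)
    {
        *error_code = SKETCH_ERR_NO_RESTART;
        return false;
    }

    return true;
}

/* Caller: turn the outcome into a process exit code. */
int
rejoin_exit_code(bool config_ok, bool restart_ok)
{
    int error_code = SKETCH_SUCCESS;

    if (follow_internal(config_ok, restart_ok, &error_code) == false)
        return error_code;

    return SKETCH_SUCCESS;
}
```

Keeping the boolean return value lets existing "if (success == false)" checks stay as they are, while the pointer carries the more specific reason.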


@@ -1,6 +1,6 @@
/* /*
* repmgr-action-node.h * repmgr-action-node.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -3,7 +3,7 @@
* *
* Implements primary actions for the repmgr command line utility * Implements primary actions for the repmgr command line utility
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -548,7 +548,8 @@ do_primary_help(void)
printf(_(" \"primary unregister\" unregisters an inactive primary node.\n")); printf(_(" \"primary unregister\" unregisters an inactive primary node.\n"));
puts(""); puts("");
printf(_(" --dry-run check what would happen, but don't actually unregister the primary\n")); printf(_(" --dry-run check what would happen, but don't actually unregister the primary\n"));
printf(_(" -F, --force force removal of the record\n")); printf(_(" --node-id ID of the inactive primary node to unregister.\n"));
printf(_(" -F, --force force removal of an active record\n"));
puts(""); puts("");


@@ -1,6 +1,6 @@
/* /*
* repmgr-action-primary.h * repmgr-action-primary.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by

File diff suppressed because it is too large


@@ -1,6 +1,6 @@
/* /*
* repmgr-action-standby.h * repmgr-action-standby.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -28,7 +28,7 @@ extern void do_standby_switchover(void);
extern void do_standby_help(void); extern void do_standby_help(void);
extern bool do_standby_follow_internal(PGconn *primary_conn, t_node_info *primary_node_record, PQExpBufferData *output); extern bool do_standby_follow_internal(PGconn *primary_conn, t_node_info *primary_node_record, PQExpBufferData *output, int *error_code);


@@ -3,7 +3,7 @@
* *
* Implements witness actions for the repmgr command line utility * Implements witness actions for the repmgr command line utility
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -110,12 +110,12 @@ do_witness_register(void)
} }
/* check primary node's recovery type */ /* check primary node's recovery type */
recovery_type = get_recovery_type(witness_conn); recovery_type = get_recovery_type(primary_conn);
if (recovery_type == RECTYPE_STANDBY) if (recovery_type == RECTYPE_STANDBY)
{ {
log_error(_("provided primary node is a standby")); log_error(_("provided primary node is a standby"));
log_error(_("provide the connection details of the cluster's primary server")); log_hint(_("provide the connection details of the cluster's primary server"));
PQfinish(witness_conn); PQfinish(witness_conn);
PQfinish(primary_conn); PQfinish(primary_conn);


@@ -1,6 +1,6 @@
/* /*
* repmgr-action-witness.h * repmgr-action-witness.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -1,6 +1,6 @@
/* /*
* repmgr-client-global.h * repmgr-client-global.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -68,6 +68,7 @@ typedef struct
int node_id; int node_id;
char node_name[MAXLEN]; char node_name[MAXLEN];
char data_dir[MAXPGPATH]; char data_dir[MAXPGPATH];
int remote_node_id;
/* "standby clone" options */ /* "standby clone" options */
bool copy_external_config_files; bool copy_external_config_files;
@@ -86,6 +87,7 @@ typedef struct
/* "standby register" options */ /* "standby register" options */
bool wait_register_sync; bool wait_register_sync;
int wait_register_sync_seconds; int wait_register_sync_seconds;
int wait_start;
/* "standby switchover" options */ /* "standby switchover" options */
bool always_promote; bool always_promote;
@@ -102,6 +104,7 @@ typedef struct
bool role; bool role;
bool slots; bool slots;
bool has_passfile; bool has_passfile;
bool replication_connection;
/* "node join" options */ /* "node join" options */
char config_files[MAXLEN]; char config_files[MAXLEN];
@@ -138,21 +141,21 @@ typedef struct
"", "", "", "", \ "", "", "", "", \
/* other connection options */ \ /* other connection options */ \
"", "", \ "", "", \
/* node options */ \ /* general node options */ \
UNKNOWN_NODE_ID, "", "", \ UNKNOWN_NODE_ID, "", "", UNKNOWN_NODE_ID, \
/* "standby clone" options */ \ /* "standby clone" options */ \
false, CONFIG_FILE_SAMEPATH, false, false, false, "", "", "", \ false, CONFIG_FILE_SAMEPATH, false, false, false, "", "", "", \
false, \ false, \
/* "standby clone"/"standby follow" options */ \ /* "standby clone"/"standby follow" options */ \
NO_UPSTREAM_NODE, \ NO_UPSTREAM_NODE, \
/* "standby register" options */ \ /* "standby register" options */ \
false, 0, \ false, 0, DEFAULT_WAIT_START, \
/* "standby switchover" options */ \ /* "standby switchover" options */ \
false, false, false, \ false, false, false, \
/* "node status" options */ \ /* "node status" options */ \
false, \ false, \
/* "node check" options */ \ /* "node check" options */ \
false, false, false, false, false, false, \ false, false, false, false, false, false, false, \
/* "node join" options */ \ /* "node join" options */ \
"", \ "", \
/* "node service" options */ \ /* "node service" options */ \
@@ -178,6 +181,7 @@ typedef enum
ACTION_NONE, ACTION_NONE,
ACTION_START, ACTION_START,
ACTION_STOP, ACTION_STOP,
ACTION_STOP_WAIT,
ACTION_RESTART, ACTION_RESTART,
ACTION_RELOAD, ACTION_RELOAD,
ACTION_PROMOTE ACTION_PROMOTE
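
Note on the initializer hunk above: T_RUNTIME_OPTIONS_INITIALIZER is positional, so each member added to the struct (remote_node_id, wait_start, replication_connection) needs a value added at the matching position in the macro. A cut-down sketch of the idea follows. The struct, the macro and the SKETCH_* values are hypothetical placeholders; in particular the value behind DEFAULT_WAIT_START is not shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder values for illustration; repmgr defines its own. */
#define SKETCH_UNKNOWN_NODE_ID    (-1)
#define SKETCH_DEFAULT_WAIT_START 30

typedef struct
{
    /* general node options */
    int  node_id;
    int  remote_node_id;
    /* "standby register" options */
    bool wait_register_sync;
    int  wait_register_sync_seconds;
    int  wait_start;
    /* "node check" options */
    bool replication_connection;
} sketch_runtime_options;

/* One value per member, in declaration order. */
#define SKETCH_RUNTIME_OPTIONS_INITIALIZER { \
    /* general node options */ \
    SKETCH_UNKNOWN_NODE_ID, SKETCH_UNKNOWN_NODE_ID, \
    /* "standby register" options */ \
    false, 0, SKETCH_DEFAULT_WAIT_START, \
    /* "node check" options */ \
    false \
}

/* Returns a fully initialised options struct. */
sketch_runtime_options
sketch_default_options(void)
{
    sketch_runtime_options opts = SKETCH_RUNTIME_OPTIONS_INITIALIZER;

    return opts;
}
```

If a member is added without a matching value, every later value silently shifts by one position, which is why the struct and macro hunks change together.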


@@ -1,7 +1,7 @@
/* /*
* repmgr-client.c - Command interpreter for the repmgr package * repmgr-client.c - Command interpreter for the repmgr package
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This module is a command-line utility to easily setup a cluster of * This module is a command-line utility to easily setup a cluster of
* hot standby servers for an HA environment * hot standby servers for an HA environment
@@ -60,7 +60,6 @@
#include "repmgr-action-witness.h" #include "repmgr-action-witness.h"
#include "repmgr-action-bdr.h" #include "repmgr-action-bdr.h"
#include "repmgr-action-node.h" #include "repmgr-action-node.h"
#include "repmgr-action-cluster.h" #include "repmgr-action-cluster.h"
#include <storage/fd.h> /* for PG_TEMP_FILE_PREFIX */ #include <storage/fd.h> /* for PG_TEMP_FILE_PREFIX */
@@ -73,7 +72,7 @@ t_runtime_options runtime_options = T_RUNTIME_OPTIONS_INITIALIZER;
t_configuration_options config_file_options = T_CONFIGURATION_OPTIONS_INITIALIZER; t_configuration_options config_file_options = T_CONFIGURATION_OPTIONS_INITIALIZER;
/* conninfo params for the node we're operating on */ /* conninfo params for the node we're operating on */
t_conninfo_param_list source_conninfo; t_conninfo_param_list source_conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER;
bool config_file_required = true; bool config_file_required = true;
char pg_bindir[MAXLEN] = ""; char pg_bindir[MAXLEN] = "";
@@ -95,7 +94,7 @@ static ItemList cli_warnings = {NULL, NULL};
int int
main(int argc, char **argv) main(int argc, char **argv)
{ {
t_conninfo_param_list default_conninfo; t_conninfo_param_list default_conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER;
int optindex; int optindex;
int c; int c;
@@ -177,7 +176,7 @@ main(int argc, char **argv)
strncpy(runtime_options.username, pw->pw_name, MAXLEN); strncpy(runtime_options.username, pw->pw_name, MAXLEN);
} }
while ((c = getopt_long(argc, argv, "?Vb:f:Fd:h:p:U:R:S:L:vtD:crC:", long_options, while ((c = getopt_long(argc, argv, "?Vb:f:FWd:h:p:U:R:S:D:ckL:tvC:", long_options,
&optindex)) != -1) &optindex)) != -1)
{ {
/* /*
@@ -329,6 +328,11 @@ main(int argc, char **argv)
strncpy(runtime_options.node_name, optarg, MAXLEN); strncpy(runtime_options.node_name, optarg, MAXLEN);
break; break;
/* --remote-node-id */
case OPT_REMOTE_NODE_ID:
runtime_options.remote_node_id = repmgr_atoi(optarg, "--remote-node-id", &cli_errors, false);
break;
/* /*
* standby options * --------------- * standby options * ---------------
*/ */
@@ -389,7 +393,11 @@ main(int argc, char **argv)
*--------------------------- *---------------------------
*/ */
case OPT_REGISTER_WAIT: case OPT_WAIT_START:
runtime_options.wait_start = repmgr_atoi(optarg, "--wait-start", &cli_errors, false);
break;
case OPT_WAIT_SYNC:
runtime_options.wait_register_sync = true; runtime_options.wait_register_sync = true;
if (optarg != NULL) if (optarg != NULL)
{ {
@@ -451,6 +459,10 @@ main(int argc, char **argv)
runtime_options.has_passfile = true; runtime_options.has_passfile = true;
break; break;
case OPT_REPL_CONN:
runtime_options.replication_connection = true;
break;
/*-------------------- /*--------------------
* "node rejoin" options * "node rejoin" options
*-------------------- *--------------------
@@ -733,7 +745,6 @@ main(int argc, char **argv)
if (repmgr_command != NULL) if (repmgr_command != NULL)
{ {
#ifndef BDR_ONLY
if (strcasecmp(repmgr_command, "PRIMARY") == 0 || strcasecmp(repmgr_command, "MASTER") == 0) if (strcasecmp(repmgr_command, "PRIMARY") == 0 || strcasecmp(repmgr_command, "MASTER") == 0)
{ {
if (help_option == true) if (help_option == true)
@@ -790,9 +801,6 @@ main(int argc, char **argv)
action = WITNESS_UNREGISTER; action = WITNESS_UNREGISTER;
} }
else if (strcasecmp(repmgr_command, "BDR") == 0) else if (strcasecmp(repmgr_command, "BDR") == 0)
#else
if (strcasecmp(repmgr_command, "BDR") == 0)
#endif
{ {
if (help_option == true) if (help_option == true)
{ {
@@ -997,7 +1005,7 @@ main(int argc, char **argv)
&& config_file_options.use_replication_slots == true) && config_file_options.use_replication_slots == true)
{ {
log_error(_("STANDBY CLONE in Barman mode is incompatible with configuration option \"use_replication_slots\"")); log_error(_("STANDBY CLONE in Barman mode is incompatible with configuration option \"use_replication_slots\""));
log_hint(_("set \"use_replication_slots\" to \"no\" in repmgr.conf, or use --without-barman fo clone directly from the upstream server")); log_hint(_("set \"use_replication_slots\" to \"no\" in repmgr.conf, or use --without-barman to clone directly from the upstream server"));
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
} }
@@ -1153,7 +1161,6 @@ main(int argc, char **argv)
switch (action) switch (action)
{ {
#ifndef BDR_ONLY
/* PRIMARY */ /* PRIMARY */
case PRIMARY_REGISTER: case PRIMARY_REGISTER:
do_primary_register(); do_primary_register();
@@ -1189,21 +1196,6 @@ main(int argc, char **argv)
case WITNESS_UNREGISTER: case WITNESS_UNREGISTER:
do_witness_unregister(); do_witness_unregister();
break; break;
#else
/* we won't ever reach here, but stop the compiler complaining */
case PRIMARY_REGISTER:
case PRIMARY_UNREGISTER:
case STANDBY_CLONE:
case STANDBY_REGISTER:
case STANDBY_UNREGISTER:
case STANDBY_PROMOTE:
case STANDBY_FOLLOW:
case STANDBY_SWITCHOVER:
case WITNESS_REGISTER:
case WITNESS_UNREGISTER:
break;
#endif
/* BDR */ /* BDR */
case BDR_REGISTER: case BDR_REGISTER:
do_bdr_register(); do_bdr_register();
@@ -1595,8 +1587,7 @@ check_cli_parameters(const int action)
case NODE_STATUS: case NODE_STATUS:
break; break;
default: default:
item_list_append_format( item_list_append_format(&cli_warnings,
&cli_warnings,
_("--is-shutdown-cleanly will be ignored when executing %s"), _("--is-shutdown-cleanly will be ignored when executing %s"),
action_name(action)); action_name(action));
} }
@@ -1609,8 +1600,7 @@ check_cli_parameters(const int action)
case STANDBY_SWITCHOVER: case STANDBY_SWITCHOVER:
break; break;
default: default:
item_list_append_format( item_list_append_format(&cli_warnings,
&cli_warnings,
_("--always-promote will be ignored when executing %s"), _("--always-promote will be ignored when executing %s"),
action_name(action)); action_name(action));
} }
@@ -1624,8 +1614,7 @@ check_cli_parameters(const int action)
case NODE_REJOIN: case NODE_REJOIN:
break; break;
default: default:
item_list_append_format( item_list_append_format(&cli_warnings,
&cli_warnings,
_("--force-rewind will be ignored when executing %s"), _("--force-rewind will be ignored when executing %s"),
action_name(action)); action_name(action));
} }
@@ -1639,8 +1628,7 @@ check_cli_parameters(const int action)
case NODE_REJOIN: case NODE_REJOIN:
break; break;
default: default:
item_list_append_format( item_list_append_format(&cli_warnings,
&cli_warnings,
_("--config-files will be ignored when executing %s"), _("--config-files will be ignored when executing %s"),
action_name(action)); action_name(action));
} }
@@ -1654,6 +1642,7 @@ check_cli_parameters(const int action)
case PRIMARY_UNREGISTER: case PRIMARY_UNREGISTER:
case STANDBY_CLONE: case STANDBY_CLONE:
case STANDBY_REGISTER: case STANDBY_REGISTER:
case STANDBY_FOLLOW:
case STANDBY_SWITCHOVER: case STANDBY_SWITCHOVER:
case WITNESS_REGISTER: case WITNESS_REGISTER:
case WITNESS_UNREGISTER: case WITNESS_UNREGISTER:
@@ -1661,8 +1650,7 @@ check_cli_parameters(const int action)
case NODE_SERVICE: case NODE_SERVICE:
break; break;
default: default:
item_list_append_format( item_list_append_format(&cli_warnings,
&cli_warnings,
_("--dry-run is not effective when executing %s"), _("--dry-run is not effective when executing %s"),
action_name(action)); action_name(action));
} }
@@ -1684,8 +1672,7 @@ check_cli_parameters(const int action)
if (used_options > 1) if (used_options > 1)
{ {
/* TODO: list which options were used */ /* TODO: list which options were used */
item_list_append( item_list_append(&cli_errors,
&cli_errors,
"only one of --csv, --nagios and --optformat can be used"); "only one of --csv, --nagios and --optformat can be used");
} }
} }
@@ -1789,10 +1776,8 @@ do_help(void)
print_help_header(); print_help_header();
printf(_("Usage:\n")); printf(_("Usage:\n"));
#ifndef BDR_ONLY
printf(_(" %s [OPTIONS] primary {register|unregister}\n"), progname()); printf(_(" %s [OPTIONS] primary {register|unregister}\n"), progname());
printf(_(" %s [OPTIONS] standby {register|unregister|clone|promote|follow}\n"), progname()); printf(_(" %s [OPTIONS] standby {register|unregister|clone|promote|follow}\n"), progname());
#endif
printf(_(" %s [OPTIONS] bdr {register|unregister}\n"), progname()); printf(_(" %s [OPTIONS] bdr {register|unregister}\n"), progname());
printf(_(" %s [OPTIONS] node status\n"), progname()); printf(_(" %s [OPTIONS] node status\n"), progname());
printf(_(" %s [OPTIONS] cluster {show|event|matrix|crosscheck}\n"), progname()); printf(_(" %s [OPTIONS] cluster {show|event|matrix|crosscheck}\n"), progname());
@@ -2119,9 +2104,12 @@ test_ssh_connection(char *host, char *remote_user)
bool bool
local_command(const char *command, PQExpBufferData *outputbuf) local_command(const char *command, PQExpBufferData *outputbuf)
{ {
FILE *fp; FILE *fp = NULL;
char output[MAXLEN]; char output[MAXLEN];
int retval = 0; int retval = 0;
bool success;
log_verbose(LOG_DEBUG, "executing:\n %s", command);
if (outputbuf == NULL) if (outputbuf == NULL)
{ {
@@ -2137,20 +2125,29 @@ local_command(const char *command, PQExpBufferData *outputbuf)
return false; return false;
} }
/* TODO: better error handling */
while (fgets(output, MAXLEN, fp) != NULL) while (fgets(output, MAXLEN, fp) != NULL)
{ {
appendPQExpBuffer(outputbuf, "%s", output); appendPQExpBuffer(outputbuf, "%s", output);
if (!feof(fp))
{
break;
}
} }
pclose(fp); retval = pclose(fp);
/* */
success = (WEXITSTATUS(retval) == 0 || WEXITSTATUS(retval) == 141) ? true : false;
log_verbose(LOG_DEBUG, "result of command was %i (%i)", WEXITSTATUS(retval), retval);
if (outputbuf->data != NULL) if (outputbuf->data != NULL)
log_verbose(LOG_DEBUG, "local_command(): output returned was:\n%s", outputbuf->data); log_verbose(LOG_DEBUG, "local_command(): output returned was:\n%s", outputbuf->data);
else else
log_verbose(LOG_DEBUG, "local_command(): no output returned"); log_verbose(LOG_DEBUG, "local_command(): no output returned");
return true; return success;
} }
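
Note on the local_command() hunk above: the function now keeps the value returned by pclose() and treats exit status 0 or 141 as success. 141 is the status a shell conventionally reports for a command terminated by SIGPIPE (128 + 13 on Linux), which is presumably why it is tolerated here; the diff itself does not say. The sketch below shows the popen()/pclose()/WEXITSTATUS() pattern in isolation. It is not repmgr's code: run_local_command() and SKETCH_MAXLEN are hypothetical, output goes into a plain char buffer rather than a PQExpBuffer, and it reads all output instead of stopping early.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

#define SKETCH_MAXLEN 1024

/*
 * Run "command" through the shell, copy its stdout into "out" and
 * report whether it exited with status 0 or 141.
 */
bool
run_local_command(const char *command, char *out, size_t out_size)
{
    FILE   *fp = NULL;
    char    line[SKETCH_MAXLEN];
    size_t  used = 0;
    int     retval = 0;

    assert(command != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    fp = popen(command, "r");
    if (fp == NULL)
        return false;

    /* Collect output, truncating silently once "out" is full. */
    while (fgets(line, sizeof(line), fp) != NULL)
    {
        size_t len = strlen(line);

        if (len > out_size - used - 1)
            len = out_size - used - 1;

        memcpy(out + used, line, len);
        used += len;
        out[used] = '\0';
    }

    /* pclose() returns the wait status, or -1 on error. */
    retval = pclose(fp);
    if (retval == -1 || !WIFEXITED(retval))
        return false;

    return (WEXITSTATUS(retval) == 0 || WEXITSTATUS(retval) == 141);
}
```

Checking WIFEXITED() before WEXITSTATUS() is an addition in this sketch; the hunk above applies WEXITSTATUS() directly.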
@@ -2412,7 +2409,12 @@ remote_command(const char *host, const char *user, const char *command, PQExpBuf
pclose(fp); pclose(fp);
if (outputbuf != NULL) if (outputbuf != NULL)
log_verbose(LOG_DEBUG, "remote_command(): output returned was:\n %s", outputbuf->data); {
if (strlen(outputbuf->data))
log_verbose(LOG_DEBUG, "remote_command(): output returned was:\n %s", outputbuf->data);
else
log_verbose(LOG_DEBUG, "remote_command(): no output returned");
}
return true; return true;
} }
@@ -2458,18 +2460,15 @@ get_server_action(t_server_action action, char *script, char *data_dir)
{ {
initPQExpBuffer(&command); initPQExpBuffer(&command);
appendPQExpBuffer( appendPQExpBuffer(&command,
&command,
"%s %s -w -D ", "%s %s -w -D ",
make_pg_path("pg_ctl"), make_pg_path("pg_ctl"),
config_file_options.pg_ctl_options); config_file_options.pg_ctl_options);
appendShellString( appendShellString(&command,
&command,
data_dir); data_dir);
appendPQExpBuffer( appendPQExpBuffer(&command,
&command,
" start"); " start");
strncpy(script, command.data, MAXLEN); strncpy(script, command.data, MAXLEN);
@@ -2481,6 +2480,7 @@ get_server_action(t_server_action action, char *script, char *data_dir)
} }
case ACTION_STOP: case ACTION_STOP:
case ACTION_STOP_WAIT:
{ {
if (config_file_options.service_stop_command[0] != '\0') if (config_file_options.service_stop_command[0] != '\0')
{ {
@@ -2490,19 +2490,23 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else else
{ {
initPQExpBuffer(&command); initPQExpBuffer(&command);
appendPQExpBuffer( appendPQExpBuffer(&command,
&command,
"%s %s -D ", "%s %s -D ",
make_pg_path("pg_ctl"), make_pg_path("pg_ctl"),
config_file_options.pg_ctl_options); config_file_options.pg_ctl_options);
appendShellString( appendShellString(&command,
&command,
data_dir); data_dir);
appendPQExpBuffer( if (action == ACTION_STOP_WAIT)
&command, appendPQExpBuffer(&command,
" -m fast -W stop"); " -w");
else
appendPQExpBuffer(&command,
" -W");
appendPQExpBuffer(&command,
" -m fast stop");
strncpy(script, command.data, MAXLEN); strncpy(script, command.data, MAXLEN);
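
Note on the ACTION_STOP / ACTION_STOP_WAIT hunk above: the only difference between the two actions is whether pg_ctl is told to wait for shutdown to complete (-w) or to return immediately (-W); both use "-m fast stop". A minimal sketch of that command construction follows. build_stop_command() is a hypothetical helper, it leaves out pg_ctl_options, and it quotes the data directory naively with single quotes (assuming the path contains none), whereas the real code uses appendShellString().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a "pg_ctl ... stop" command into "script".
 * Returns the value of snprintf(), i.e. the untruncated length.
 */
int
build_stop_command(char *script, size_t size,
                   const char *pg_ctl, const char *data_dir, bool wait)
{
    assert(script != NULL && pg_ctl != NULL && data_dir != NULL);

    return snprintf(script, size, "%s -D '%s' %s -m fast stop",
                    pg_ctl, data_dir, wait ? "-w" : "-W");
}
```

With wait set to true and a data directory of /var/lib/pgsql/data, the resulting command is: pg_ctl -D '/var/lib/pgsql/data' -w -m fast stop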
@@ -2521,18 +2525,15 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else else
{ {
initPQExpBuffer(&command); initPQExpBuffer(&command);
appendPQExpBuffer( appendPQExpBuffer(&command,
&command,
"%s %s -w -D ", "%s %s -w -D ",
make_pg_path("pg_ctl"), make_pg_path("pg_ctl"),
config_file_options.pg_ctl_options); config_file_options.pg_ctl_options);
appendShellString( appendShellString(&command,
&command,
data_dir); data_dir);
appendPQExpBuffer( appendPQExpBuffer(&command,
&command,
" restart"); " restart");
strncpy(script, command.data, MAXLEN); strncpy(script, command.data, MAXLEN);
@@ -2552,18 +2553,15 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else else
{ {
initPQExpBuffer(&command); initPQExpBuffer(&command);
appendPQExpBuffer( appendPQExpBuffer(&command,
&command,
"%s %s -w -D ", "%s %s -w -D ",
make_pg_path("pg_ctl"), make_pg_path("pg_ctl"),
config_file_options.pg_ctl_options); config_file_options.pg_ctl_options);
appendShellString( appendShellString(&command,
&command,
data_dir); data_dir);
appendPQExpBuffer( appendPQExpBuffer(&command,
&command,
" reload"); " reload");
strncpy(script, command.data, MAXLEN); strncpy(script, command.data, MAXLEN);
@@ -2584,18 +2582,15 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else else
{ {
initPQExpBuffer(&command); initPQExpBuffer(&command);
appendPQExpBuffer( appendPQExpBuffer(&command,
&command,
"%s %s -w -D ", "%s %s -w -D ",
make_pg_path("pg_ctl"), make_pg_path("pg_ctl"),
config_file_options.pg_ctl_options); config_file_options.pg_ctl_options);
appendShellString( appendShellString(&command,
&command,
data_dir); data_dir);
appendPQExpBuffer( appendPQExpBuffer(&command,
&command,
" promote"); " promote");
strncpy(script, command.data, MAXLEN); strncpy(script, command.data, MAXLEN);
@@ -2629,6 +2624,7 @@ data_dir_required_for_action(t_server_action action)
return true; return true;
case ACTION_STOP: case ACTION_STOP:
case ACTION_STOP_WAIT:
if (config_file_options.service_stop_command[0] != '\0') if (config_file_options.service_stop_command[0] != '\0')
{ {
return false; return false;
@@ -2725,6 +2721,6 @@ init_node_record(t_node_info *node_record)
if (config_file_options.use_replication_slots == true) if (config_file_options.use_replication_slots == true)
{ {
maxlen_snprintf(node_record->slot_name, "repmgr_slot_%i", config_file_options.node_id); create_slot_name(node_record->slot_name, config_file_options.node_id);
} }
} }


@@ -1,6 +1,6 @@
/* /*
* repmgr-client.h * repmgr-client.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -56,7 +56,7 @@
#define OPT_NODE_NAME 1007 #define OPT_NODE_NAME 1007
#define OPT_WITHOUT_BARMAN 1008 #define OPT_WITHOUT_BARMAN 1008
#define OPT_NO_UPSTREAM_CONNECTION 1009 #define OPT_NO_UPSTREAM_CONNECTION 1009
#define OPT_REGISTER_WAIT 1010 #define OPT_WAIT_SYNC 1010
#define OPT_LOG_TO_FILE 1011 #define OPT_LOG_TO_FILE 1011
#define OPT_UPSTREAM_CONNINFO 1012 #define OPT_UPSTREAM_CONNINFO 1012
#define OPT_REPLICATION_USER 1013 #define OPT_REPLICATION_USER 1013
@@ -82,6 +82,10 @@
#define OPT_SLOTS 1033 #define OPT_SLOTS 1033
#define OPT_CONFIG_ARCHIVE_DIR 1034 #define OPT_CONFIG_ARCHIVE_DIR 1034
#define OPT_HAS_PASSFILE 1035 #define OPT_HAS_PASSFILE 1035
#define OPT_WAIT_START 1036
#define OPT_REPL_CONN 1037
#define OPT_REMOTE_NODE_ID 1038
/* deprecated since 3.3 */ /* deprecated since 3.3 */
#define OPT_DATA_DIR 999 #define OPT_DATA_DIR 999
#define OPT_NO_CONNINFO_PASSWORD 998 #define OPT_NO_CONNINFO_PASSWORD 998
@@ -113,6 +117,7 @@ static struct option long_options[] =
{"pgdata", required_argument, NULL, 'D'}, {"pgdata", required_argument, NULL, 'D'},
{"node-id", required_argument, NULL, OPT_NODE_ID}, {"node-id", required_argument, NULL, OPT_NODE_ID},
{"node-name", required_argument, NULL, OPT_NODE_NAME}, {"node-name", required_argument, NULL, OPT_NODE_NAME},
{"remote-node-id", required_argument, NULL, OPT_REMOTE_NODE_ID},
/* logging options */ /* logging options */
{"log-level", required_argument, NULL, 'L'}, {"log-level", required_argument, NULL, 'L'},
@@ -136,7 +141,8 @@ static struct option long_options[] =
{"without-barman", no_argument, NULL, OPT_WITHOUT_BARMAN}, {"without-barman", no_argument, NULL, OPT_WITHOUT_BARMAN},
/* "standby register" options */ /* "standby register" options */
{"wait-sync", optional_argument, NULL, OPT_REGISTER_WAIT}, {"wait-start", required_argument, NULL, OPT_WAIT_START},
{"wait-sync", optional_argument, NULL, OPT_WAIT_SYNC},
/* "standby switchover" options /* "standby switchover" options
* *
@@ -155,6 +161,7 @@ static struct option long_options[] =
{"role", no_argument, NULL, OPT_ROLE}, {"role", no_argument, NULL, OPT_ROLE},
{"slots", no_argument, NULL, OPT_SLOTS}, {"slots", no_argument, NULL, OPT_SLOTS},
{"has-passfile", no_argument, NULL, OPT_HAS_PASSFILE}, {"has-passfile", no_argument, NULL, OPT_HAS_PASSFILE},
{"replication-connection", no_argument, NULL, OPT_REPL_CONN},
/* "node rejoin" options */ /* "node rejoin" options */
{"config-files", required_argument, NULL, OPT_CONFIG_FILES}, {"config-files", required_argument, NULL, OPT_CONFIG_FILES},
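
Note on the option table above: "--wait-start" takes a required argument while "--wait-sync" takes an optional one, and both are long-only options identified by integer codes (1036 and 1010 in this diff). The sketch below shows how getopt_long() treats the two argument types. Everything prefixed "sketch_" and the function parse_wait_options() are hypothetical, and atoi() stands in for repmgr_atoi(), which also records parse errors.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_OPT_WAIT_SYNC  1010
#define SKETCH_OPT_WAIT_START 1036

typedef struct
{
    int  wait_start;
    bool wait_sync;
    int  wait_sync_seconds;
} sketch_wait_options;

static struct option sketch_long_options[] =
{
    {"wait-start", required_argument, NULL, SKETCH_OPT_WAIT_START},
    {"wait-sync", optional_argument, NULL, SKETCH_OPT_WAIT_SYNC},
    {NULL, 0, NULL, 0}
};

/* Returns false if an unknown or malformed option is encountered. */
bool
parse_wait_options(int argc, char **argv, sketch_wait_options *opts)
{
    int c;
    int optindex = 0;

    assert(argv != NULL && opts != NULL);

    optind = 1;   /* restart scanning so the function can be called again */
    opterr = 0;   /* suppress getopt's own error messages */

    while ((c = getopt_long(argc, argv, "", sketch_long_options, &optindex)) != -1)
    {
        switch (c)
        {
            case SKETCH_OPT_WAIT_START:
                /* required_argument: optarg is always set here */
                opts->wait_start = atoi(optarg);
                break;

            case SKETCH_OPT_WAIT_SYNC:
                /* optional_argument: optarg is NULL unless "--wait-sync=N" was used */
                opts->wait_sync = true;
                if (optarg != NULL)
                    opts->wait_sync_seconds = atoi(optarg);
                break;

            default:
                return false;
        }
    }

    return true;
}
```

An optional argument has to be attached with "=", as in "--wait-sync=30"; written as "--wait-sync 30", the 30 is treated as a separate non-option argument.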


@@ -1,7 +1,7 @@
/* /*
* repmgr.c - repmgr extension * repmgr.c - repmgr extension
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This is the actual extension code; see repmgr-client.c for the code which * This is the actual extension code; see repmgr-client.c for the code which
* generates the repmgr binary * generates the repmgr binary
@@ -288,7 +288,6 @@ standby_get_last_updated(PG_FUNCTION_ARGS)
Datum Datum
notify_follow_primary(PG_FUNCTION_ARGS) notify_follow_primary(PG_FUNCTION_ARGS)
{ {
#ifndef BDR_ONLY
int primary_node_id = UNKNOWN_NODE_ID; int primary_node_id = UNKNOWN_NODE_ID;
if (!shared_state) if (!shared_state)
@@ -316,7 +315,7 @@ notify_follow_primary(PG_FUNCTION_ARGS)
} }
LWLockRelease(shared_state->lock); LWLockRelease(shared_state->lock);
#endif
PG_RETURN_VOID(); PG_RETURN_VOID();
} }
@@ -329,14 +328,12 @@ get_new_primary(PG_FUNCTION_ARGS)
if (!shared_state) if (!shared_state)
PG_RETURN_NULL(); PG_RETURN_NULL();
#ifndef BDR_ONLY
LWLockAcquire(shared_state->lock, LW_SHARED); LWLockAcquire(shared_state->lock, LW_SHARED);
if (shared_state->follow_new_primary == true) if (shared_state->follow_new_primary == true)
new_primary_node_id = shared_state->candidate_node_id; new_primary_node_id = shared_state->candidate_node_id;
LWLockRelease(shared_state->lock); LWLockRelease(shared_state->lock);
#endif
if (new_primary_node_id == UNKNOWN_NODE_ID) if (new_primary_node_id == UNKNOWN_NODE_ID)
PG_RETURN_NULL(); PG_RETURN_NULL();
@@ -348,7 +345,6 @@ get_new_primary(PG_FUNCTION_ARGS)
Datum Datum
reset_voting_status(PG_FUNCTION_ARGS) reset_voting_status(PG_FUNCTION_ARGS)
{ {
#ifndef BDR_ONLY
if (!shared_state) if (!shared_state)
PG_RETURN_NULL(); PG_RETURN_NULL();
@@ -366,7 +362,7 @@ reset_voting_status(PG_FUNCTION_ARGS)
} }
LWLockRelease(shared_state->lock); LWLockRelease(shared_state->lock);
#endif
PG_RETURN_VOID(); PG_RETURN_VOID();
} }


@@ -13,35 +13,35 @@
# repmgr and repmgrd require the following items to be explicitly configured. # repmgr and repmgrd require the following items to be explicitly configured.
#node_id= # A unique integer greater than zero #node_id= # A unique integer greater than zero
#node_name='' # An arbitrary (but unique) string; we recommend #node_name='' # An arbitrary (but unique) string; we recommend
# using the server's hostname or another identifier # using the server's hostname or another identifier
# unambiguously associated with the server to avoid # unambiguously associated with the server to avoid
# confusion. Avoid choosing names which reflect the # confusion. Avoid choosing names which reflect the
# node's current role, e.g. "primary" or "standby1", # node's current role, e.g. "primary" or "standby1",
# as roles can change and it will be confusing if # as roles can change and it will be confusing if
# the current primary is called "standby1". # the current primary is called "standby1".
#conninfo='' # Database connection information as a conninfo string. #conninfo='' # Database connection information as a conninfo string.
# All servers in the cluster must be able to connect to # All servers in the cluster must be able to connect to
# the local node using this string. # the local node using this string.
# #
# For details on conninfo strings, see: # For details on conninfo strings, see:
# https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING # https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING
# #
# If repmgrd is in use, consider explicitly setting # If repmgrd is in use, consider explicitly setting
# "connect_timeout" in the conninfo string to determine # "connect_timeout" in the conninfo string to determine
# the length of time which elapses before a network # the length of time which elapses before a network
# connection attempt is abandoned; for details see: # connection attempt is abandoned; for details see:
# https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNECT-CONNECT-TIMEOUT # https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNECT-CONNECT-TIMEOUT
#data_directory # The node's data directory. This is needed by repmgr #data_directory='' # The node's data directory. This is needed by repmgr
# when performing operations when the PostgreSQL instance # when performing operations when the PostgreSQL instance
# is not running and there's no other way of determining # is not running and there's no other way of determining
# the data directory. # the data directory.
#replication_user # User to make replication connections with, if not set defaults #replication_user='repmgr' # User to make replication connections with, if not set defaults
# to the user defined in "conninfo". # to the user defined in "conninfo".
# ============================================================================= # =============================================================================
@@ -52,28 +52,28 @@
# Replication settings # Replication settings
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
#replication_type=physical # Must be one of 'physical' or 'bdr'. #replication_type=physical # Must be one of 'physical' or 'bdr'.
#location=default # arbitrary string defining the location of the node; this #location=default # arbitrary string defining the location of the node; this
# is used during failover to check visibility of the # is used during failover to check visibility of the
# current primary node. See the 'repmgrd' documentation # current primary node. See the 'repmgrd' documentation
# in README.md for further details. # in README.md for further details.
#use_replication_slots=no # whether to use physical replication slots #use_replication_slots=no # whether to use physical replication slots
# NOTE: when using replication slots, # NOTE: when using replication slots,
# 'max_replication_slots' should be configured for # 'max_replication_slots' should be configured for
# at least the number of standbys which will connect # at least the number of standbys which will connect
# to the primary. # to the primary.
#recovery_min_apply_delay= # If provided, "recovery_min_apply_delay" in recovery.conf #recovery_min_apply_delay= # If provided, "recovery_min_apply_delay" in recovery.conf
# will be set to this value. # will be set to this value.
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# Witness server settings # Witness server settings
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
#witness_sync_interval=15 # interval (in seconds) to synchronise node records #witness_sync_interval=15 # interval (in seconds) to synchronise node records
# to the witness server # to the witness server
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# Logging settings # Logging settings
@@ -85,14 +85,14 @@
# This is mainly intended for those cases when `repmgr` is executed directly # This is mainly intended for those cases when `repmgr` is executed directly
# by `repmgrd`. # by `repmgrd`.
#log_level=INFO # Log level: possible values are DEBUG, INFO, NOTICE, #log_level=INFO # Log level: possible values are DEBUG, INFO, NOTICE,
# WARNING, ERROR, ALERT, CRIT or EMERG # WARNING, ERROR, ALERT, CRIT or EMERG
#log_facility=STDERR # Logging facility: possible values are STDERR, or for #log_facility=STDERR # Logging facility: possible values are STDERR, or for
# syslog integration, one of LOCAL0, LOCAL1, ..., LOCAL7, USER # syslog integration, one of LOCAL0, LOCAL1, ..., LOCAL7, USER
#log_file='' # stderr can be redirected to an arbitrary file: #log_file='' # stderr can be redirected to an arbitrary file:
#log_status_interval=300 # interval (in seconds) for repmgrd to log a status message #log_status_interval=300 # interval (in seconds) for repmgrd to log a status message
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
@@ -118,28 +118,28 @@
# #
# event_notifications=primary_register,standby_register # event_notifications=primary_register,standby_register
#event_notification_command='' # An external program or script which #event_notification_command='' # An external program or script which
# can be executed by the user under which # can be executed by the user under which
# repmgr/repmgrd are run. # repmgr/repmgrd are run.
#event_notifications='' # A comma-separated list of notification #event_notifications='' # A comma-separated list of notification
# types # types
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# Environment/command settings # Environment/command settings
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
#pg_bindir='' # Path to PostgreSQL binary directory (location #pg_bindir='' # Path to PostgreSQL binary directory (location
# of pg_ctl, pg_basebackup etc.). Only needed # of pg_ctl, pg_basebackup etc.). Only needed
# if these files are not in the system $PATH. # if these files are not in the system $PATH.
# #
# Debian/Ubuntu users: you will probably need to # Debian/Ubuntu users: you will probably need to
# set this to the directory where `pg_ctl` is located, # set this to the directory where `pg_ctl` is located,
# e.g. /usr/lib/postgresql/9.6/bin/ # e.g. /usr/lib/postgresql/9.6/bin/
#use_primary_conninfo_password=false # explicitly set "password" in recovery.conf's #use_primary_conninfo_password=false # explicitly set "password" in recovery.conf's
# "primary_conninfo" parameter using the value contained # "primary_conninfo" parameter using the value contained
# in the environment variable PGPASSWORD # in the environment variable PGPASSWORD
#passfile='' # path to .pgpass file to include in "primary_conninfo" #passfile='' # path to .pgpass file to include in "primary_conninfo"
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# external command options # external command options
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
@@ -153,11 +153,10 @@
# rsync_options=--archive --checksum --compress --progress --rsh="ssh -o \"StrictHostKeyChecking no\"" # rsync_options=--archive --checksum --compress --progress --rsh="ssh -o \"StrictHostKeyChecking no\""
# ssh_options=-o "StrictHostKeyChecking no" # ssh_options=-o "StrictHostKeyChecking no"
#pg_ctl_options='' # Options to append to "pg_ctl" #pg_ctl_options='' # Options to append to "pg_ctl"
#pg_basebackup_options='' # Options to append to "pg_basebackup" #pg_basebackup_options='' # Options to append to "pg_basebackup"
#rsync_options='' # Options to append to "rsync" #rsync_options='' # Options to append to "rsync"
ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh" ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
@@ -172,12 +171,12 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# tablespace_mapping=/path/to/original/tablespace=/path/to/new/tablespace # tablespace_mapping=/path/to/original/tablespace=/path/to/new/tablespace
# restore_command = 'cp /path/to/archived/wals/%f %p' # restore_command = 'cp /path/to/archived/wals/%f %p'
#tablespace_mapping='' # Tablespaces can be remapped from one #tablespace_mapping='' # Tablespaces can be remapped from one
# file system location to another. This # file system location to another. This
# parameter can be provided multiple times. # parameter can be provided multiple times.
#restore_command='' # This will be placed in the recovery.conf #restore_command='' # This will be placed in the recovery.conf
# file generated by repmgr # file generated by repmgr
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# Standby follow settings # Standby follow settings
@@ -186,19 +185,19 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# These settings apply when instructing a standby to follow the new primary # These settings apply when instructing a standby to follow the new primary
# ("repmgr standby follow"). # ("repmgr standby follow").
#primary_follow_timeout=60 # The length of time (in seconds) to wait #primary_follow_timeout=60 # The length of time (in seconds) to wait
# for the new primary to become available # for the new primary to become available
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# Barman options # Barman options
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
#barman_server='' # The barman configuration section #barman_server='' # The barman configuration section
#barman_host='' # The host name of the barman server #barman_host='' # The host name of the barman server
#barman_config='' # The Barman configuration file on the #barman_config='' # The Barman configuration file on the
# Barman server (needed if the file is # Barman server (needed if the file is
# in a non-standard location) # in a non-standard location)
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# Failover and monitoring settings (repmgrd) # Failover and monitoring settings (repmgrd)
@@ -207,42 +206,43 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# These settings are only applied when repmgrd is running. Values shown # These settings are only applied when repmgrd is running. Values shown
# are defaults. # are defaults.
#failover=manual # one of 'automatic', 'manual'. #failover=manual # one of 'automatic', 'manual'.
# determines what action to take in the event of upstream failure # determines what action to take in the event of upstream failure
# #
# 'automatic': repmgrd will automatically attempt to promote the # 'automatic': repmgrd will automatically attempt to promote the
# node or follow the new upstream node # node or follow the new upstream node
# 'manual': repmgrd will take no action and the node will require # 'manual': repmgrd will take no action and the node will require
# manual attention to reattach it to replication # manual attention to reattach it to replication
# (does not apply to BDR mode) # (does not apply to BDR mode)
#priority=100 # indicate a preferred priority for promoting nodes; #priority=100 # indicate a preferred priority for promoting nodes;
# a value of zero prevents the node being promoted to primary # a value of zero prevents the node being promoted to primary
# (default: 100) # (default: 100)
#reconnect_attempts=6 # Number of attempts which will be made to reconnect to an unreachable #reconnect_attempts=6 # Number of attempts which will be made to reconnect to an unreachable
# primary (or other upstream node) # primary (or other upstream node)
#reconnect_interval=10 # Interval between attempts to reconnect to an unreachable #reconnect_interval=10 # Interval between attempts to reconnect to an unreachable
# primary (or other upstream node) # primary (or other upstream node)
#promote_command= # command to execute when promoting a new primary; use something like: #promote_command= # command to execute when promoting a new primary; use something like:
# #
# repmgr standby promote -f /etc/repmgr.conf # repmgr standby promote -f /etc/repmgr.conf
# #
#follow_command= # command to execute when instructing a standby to follow a new primary; #follow_command= # command to execute when instructing a standby to follow a new primary;
# use something like: # use something like:
# #
# repmgr standby follow -f /etc/repmgr.conf -W --upstream-node-id=%n # repmgr standby follow -f /etc/repmgr.conf -W --upstream-node-id=%n
# #
#primary_notification_timeout=60 # Interval (in seconds) which repmgrd on a standby #primary_notification_timeout=60 # Interval (in seconds) which repmgrd on a standby
# will wait for a notification from the new primary, # will wait for a notification from the new primary,
# before falling back to degraded monitoring # before falling back to degraded monitoring
#monitoring_history=no
#degraded_monitoring_timeout=-1 # Interval (in seconds) after which repmgrd will terminate if the #monitoring_history=no # Whether to write monitoring data to the "monitoring_history" table
# server being monitored is no longer available. -1 (default) #monitor_interval_secs=2 # Interval (in seconds) at which to write monitoring data
# disables the timeout completely. #degraded_monitoring_timeout=-1 # Interval (in seconds) after which repmgrd will terminate if the
#async_query_timeout=60 # Interval (in seconds) which repmgrd will wait before # server being monitored is no longer available. -1 (default)
# cancelling an asynchronous query. # disables the timeout completely.
#async_query_timeout=60 # Interval (in seconds) which repmgrd will wait before
# cancelling an asynchronous query.
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# service control commands # service control commands
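Note on the "reconnect_attempts" and "reconnect_interval" settings in the hunk above: taken together they bound how long repmgrd keeps retrying an unreachable upstream node. The following is a minimal sketch of that arithmetic, assuming one wait of "reconnect_interval" seconds per attempt; the helper name is hypothetical and is not part of repmgr.

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only (not a repmgr function).
 *
 * Approximates the number of seconds spent retrying an unreachable
 * upstream node, assuming each attempt is followed by one wait of
 * "reconnect_interval" seconds. Connection timeouts are ignored.
 */
int
sketch_reconnect_window_secs(int reconnect_attempts, int reconnect_interval)
{
    assert(reconnect_attempts >= 0);
    assert(reconnect_interval >= 0);

    return reconnect_attempts * reconnect_interval;
}
```

With the sample defaults shown above (6 attempts, 10 second interval) this gives roughly 60 seconds; time spent waiting for each connection attempt to time out would come on top of that.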
@@ -275,10 +275,10 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
#service_stop_command = '' #service_stop_command = ''
#service_restart_command = '' #service_restart_command = ''
#service_reload_command = '' #service_reload_command = ''
#service_promote_command = '' # Note: this overrides any value contained in the setting #service_promote_command = '' # Note: this overrides any value contained in the setting
# "promote_command". This is intended for systems which # "promote_command". This is intended for systems which
# provide a package-level promote command, such as Debian's # provide a package-level promote command, such as Debian's
# "pg_ctlcluster" # "pg_ctlcluster"
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
@@ -287,25 +287,25 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# Various warning/critical thresholds used by "repmgr node check". # Various warning/critical thresholds used by "repmgr node check".
#archive_ready_warning=16 # repmgr node check --archiver #archive_ready_warning=16 # repmgr node check --archive-ready
#archive_ready_critical=128 # #archive_ready_critical=128 #
# Numbers of files pending archiving via PostgreSQL's # Numbers of files pending archiving via PostgreSQL's
# "archive_command" configuration parameter. If # "archive_command" configuration parameter. If
# files can't be archived fast enough, or the archive # files can't be archived fast enough, or the archive
# command is failing, the buildup of files can # command is failing, the buildup of files can
# cause various issues, such as server shutdown being # cause various issues, such as server shutdown being
# delayed until all files are archived, or excessive # delayed until all files are archived, or excessive
# space being occupied by unarchived files. # space being occupied by unarchived files.
# #
# Note that these values will be checked when executing # Note that these values will be checked when executing
# "repmgr standby switchover" to warn about potential # "repmgr standby switchover" to warn about potential
# issues with shutting down the demotion candidate. # issues with shutting down the demotion candidate.
#replication_lag_warning=300 # repmgr node check --replication-lag #replication_lag_warning=300 # repmgr node check --replication-lag
#replication_lag_critical=600 # #replication_lag_critical=600 #
# Note that these values will be checked when executing # Note that these values will be checked when executing
# "repmgr standby switchover" to warn about potential # "repmgr standby switchover" to warn about potential
# issues with shutting down the demotion candidate. # issues with shutting down the demotion candidate.
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
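Note on the warning/critical pairs above ("archive_ready_*", "replication_lag_*"): each pair defines two thresholds against which a measured value is compared. The following is a minimal sketch of such a comparison. The type and function names are hypothetical, and whether a value exactly equal to a threshold triggers it is an assumption of this sketch, not something the sample file states.

```c
#include <assert.h>

/* Hypothetical status values, for illustration only */
typedef enum
{
    SKETCH_CHECK_OK,
    SKETCH_CHECK_WARNING,
    SKETCH_CHECK_CRITICAL
} SketchCheckStatus;

/*
 * Classify a measured value (e.g. number of files pending archiving,
 * or replication lag in seconds) against a warning/critical pair.
 *
 * Assumption: a value equal to a threshold counts as reaching it.
 */
SketchCheckStatus
sketch_classify(int value, int warning_threshold, int critical_threshold)
{
    assert(warning_threshold <= critical_threshold);

    if (value >= critical_threshold)
        return SKETCH_CHECK_CRITICAL;

    if (value >= warning_threshold)
        return SKETCH_CHECK_WARNING;

    return SKETCH_CHECK_OK;
}
```

For example, with the sample defaults for replication lag (300 and 600 seconds), a lag of 450 seconds would be classified as a warning under this sketch.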


@@ -1,6 +1,6 @@
/* /*
* repmgr.h * repmgr.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -76,6 +76,7 @@
#define DEFAULT_REPLICATION_LAG_WARNING 300 /* seconds */ #define DEFAULT_REPLICATION_LAG_WARNING 300 /* seconds */
#define DEFAULT_REPLICATION_LAG_CRITICAL 600 /* seconds */ #define DEFAULT_REPLICATION_LAG_CRITICAL 600 /* seconds */
#define DEFAULT_WITNESS_SYNC_INTERVAL 15 /* seconds */ #define DEFAULT_WITNESS_SYNC_INTERVAL 15 /* seconds */
#define DEFAULT_WAIT_START 30 /* seconds */
#ifndef RECOVERY_COMMAND_FILE #ifndef RECOVERY_COMMAND_FILE
#define RECOVERY_COMMAND_FILE "recovery.conf" #define RECOVERY_COMMAND_FILE "recovery.conf"


@@ -1,3 +1,3 @@
#define REPMGR_VERSION_DATE "" #define REPMGR_VERSION_DATE ""
#define REPMGR_VERSION "4.0.1" #define REPMGR_VERSION "4.0.3"


@@ -1,7 +1,7 @@
/* /*
* repmgrd-bdr.c - BDR functionality for repmgrd * repmgrd-bdr.c - BDR functionality for repmgrd
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -124,9 +124,9 @@ monitor_bdr(void)
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
if (is_active_bdr_node(local_conn, local_node_info.node_name)) if (is_active_bdr_node(local_conn, local_node_info.node_name) == false)
{ {
log_error(_("BDR node %s is not active, terminating"), log_error(_("BDR node \"%s\" is not active, terminating"),
local_node_info.node_name); local_node_info.node_name);
PQfinish(local_conn); PQfinish(local_conn);
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);


@@ -1,6 +1,6 @@
/* /*
* repmgrd-bdr.h * repmgrd-bdr.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -1,7 +1,7 @@
/* /*
* repmgrd-physical.c - physical replication functionality for repmgrd * repmgrd-physical.c - physical (streaming) replication functionality for repmgrd
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -54,7 +54,6 @@ typedef enum
static PGconn *upstream_conn = NULL; static PGconn *upstream_conn = NULL;
static PGconn *primary_conn = NULL; static PGconn *primary_conn = NULL;
#ifndef BDR_ONLY
static FailoverState failover_state = FAILOVER_STATE_UNKNOWN; static FailoverState failover_state = FAILOVER_STATE_UNKNOWN;
static int primary_node_id = UNKNOWN_NODE_ID; static int primary_node_id = UNKNOWN_NODE_ID;
@@ -85,15 +84,12 @@ static void update_monitoring_history(void);
static const char * format_failover_state(FailoverState failover_state); static const char * format_failover_state(FailoverState failover_state);
#endif
/* perform some sanity checks on the node's configuration */ /* perform some sanity checks on the node's configuration */
void void
do_physical_node_check(void) do_physical_node_check(void)
{ {
#ifndef BDR_ONLY
/* /*
* Check if node record is active - if not, and `failover=automatic`, the * Check if node record is active - if not, and `failover=automatic`, the
* node won't be considered as a promotion candidate; this often happens * node won't be considered as a promotion candidate; this often happens
@@ -107,11 +103,11 @@ do_physical_node_check(void)
if (local_node_info.active == false) if (local_node_info.active == false)
{ {
char *hint = "Check that 'repmgr (primary|standby) register' was executed for this node"; char *hint = "Check that \"repmgr (primary|standby) register\" was executed for this node";
switch (config_file_options.failover) switch (config_file_options.failover)
{ {
/* "failover" is an enum, all values should be covered here */ /* "failover" is an enum, all values should be covered here */
case FAILOVER_AUTOMATIC: case FAILOVER_AUTOMATIC:
log_error(_("this node is marked as inactive and cannot be used as a failover target")); log_error(_("this node is marked as inactive and cannot be used as a failover target"));
@@ -163,7 +159,6 @@ do_physical_node_check(void)
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
} }
#endif
} }
@@ -174,7 +169,6 @@ do_physical_node_check(void)
void void
monitor_streaming_primary(void) monitor_streaming_primary(void)
{ {
#ifndef BDR_ONLY
instr_time log_status_interval_start; instr_time log_status_interval_start;
PQExpBufferData event_details; PQExpBufferData event_details;
@@ -485,14 +479,12 @@ loop:
sleep(config_file_options.monitor_interval_secs); sleep(config_file_options.monitor_interval_secs);
} }
#endif
} }
void void
monitor_streaming_standby(void) monitor_streaming_standby(void)
{ {
#ifndef BDR_ONLY
RecordStatus record_status; RecordStatus record_status;
instr_time log_status_interval_start; instr_time log_status_interval_start;
PQExpBufferData event_details; PQExpBufferData event_details;
@@ -935,22 +927,19 @@ loop:
local_node_info.active = false; local_node_info.active = false;
appendPQExpBuffer( appendPQExpBuffer(&event_details,
&event_details,
_("unable to connect to local node \"%s\" (ID: %i), marking inactive"), _("unable to connect to local node \"%s\" (ID: %i), marking inactive"),
local_node_info.node_name, local_node_info.node_name,
local_node_info.node_id); local_node_info.node_id);
log_warning("%s", event_details.data) log_warning("%s", event_details.data);
create_event_notification(primary_conn,
create_event_notification( &config_file_options,
primary_conn, local_node_info.node_id,
&config_file_options, "standby_failure",
local_node_info.node_id, false,
"standby_failure", event_details.data);
false,
event_details.data);
termPQExpBuffer(&event_details); termPQExpBuffer(&event_details);
} }
@@ -971,8 +960,7 @@ loop:
local_node_info.active = true; local_node_info.active = true;
appendPQExpBuffer( appendPQExpBuffer(&event_details,
&event_details,
_("reconnected to local node \"%s\" (ID: %i), marking active"), _("reconnected to local node \"%s\" (ID: %i), marking active"),
local_node_info.node_name, local_node_info.node_name,
local_node_info.node_id); local_node_info.node_id);
@@ -1023,14 +1011,12 @@ loop:
sleep(config_file_options.monitor_interval_secs); sleep(config_file_options.monitor_interval_secs);
} }
#endif
} }
void void
monitor_streaming_witness(void) monitor_streaming_witness(void)
{ {
#ifndef BDR_ONLY
instr_time log_status_interval_start; instr_time log_status_interval_start;
instr_time witness_sync_interval_start; instr_time witness_sync_interval_start;
@@ -1355,13 +1341,12 @@ loop:
sleep(config_file_options.monitor_interval_secs); sleep(config_file_options.monitor_interval_secs);
} }
#endif
return; return;
} }
#ifndef BDR_ONLY
static bool static bool
do_primary_failover(void) do_primary_failover(void)
{ {
@@ -2726,7 +2711,6 @@ format_failover_state(FailoverState failover_state)
return "UNKNOWN_FAILOVER_STATE"; return "UNKNOWN_FAILOVER_STATE";
} }
#endif /* #ifndef BDR_ONLY */
void void
close_connections_physical() close_connections_physical()


@@ -1,6 +1,6 @@
/* /*
* repmgrd-physical.h * repmgrd-physical.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -1,7 +1,7 @@
/* /*
* repmgrd.c - Replication manager daemon * repmgrd.c - Replication manager daemon
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by
@@ -89,6 +89,7 @@ main(int argc, char **argv)
bool cli_monitoring_history = false; bool cli_monitoring_history = false;
RecordStatus record_status; RecordStatus record_status;
ExtensionStatus extension_status = REPMGR_UNKNOWN;
FILE *fd; FILE *fd;
@@ -318,6 +319,37 @@ main(int argc, char **argv)
* repmgr has not been properly configured. * repmgr has not been properly configured.
*/ */
/* Check the "repmgr" extension is installed */
extension_status = get_repmgr_extension_status(local_conn);
if (extension_status != REPMGR_INSTALLED)
{
/* this is unlikely to happen */
if (extension_status == REPMGR_UNKNOWN)
{
log_error(_("unable to determine status of \"repmgr\" extension"));
log_detail("%s", PQerrorMessage(local_conn));
PQfinish(local_conn);
exit(ERR_DB_QUERY);
}
log_error(_("repmgr extension not found on this node"));
if (extension_status == REPMGR_AVAILABLE)
{
log_detail(_("repmgr extension is available but not installed in database \"%s\""),
PQdb(local_conn));
}
else if (extension_status == REPMGR_UNAVAILABLE)
{
log_detail(_("repmgr extension is not available on this node"));
}
log_hint(_("check that this node is part of a repmgr cluster"));
PQfinish(local_conn);
exit(ERR_BAD_CONFIG);
}
/* Retrieve record for this node from the local database */ /* Retrieve record for this node from the local database */
record_status = get_node_record(local_conn, config_file_options.node_id, &local_node_info); record_status = get_node_record(local_conn, config_file_options.node_id, &local_node_info);
@@ -400,7 +432,6 @@ start_monitoring(void)
{ {
switch (local_node_info.type) switch (local_node_info.type)
{ {
#ifndef BDR_ONLY
case PRIMARY: case PRIMARY:
monitor_streaming_primary(); monitor_streaming_primary();
break; break;
@@ -409,12 +440,6 @@ start_monitoring(void)
break; break;
case WITNESS: case WITNESS:
monitor_streaming_witness(); monitor_streaming_witness();
break;
#else
case PRIMARY:
case STANDBY:
return;
#endif
case BDR: case BDR:
monitor_bdr(); monitor_bdr();
return; return;
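Note on the extension check added to main() above: the new block maps each extension status to either "continue startup" or a specific exit path. The following is a minimal standalone sketch of that decision table. All names prefixed "sketch_" or "SKETCH_" are hypothetical stand-ins; the real code uses get_repmgr_extension_status(), ExtensionStatus and the ERR_* exit codes, whose definitions are not shown in this diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the ExtensionStatus values used in the diff */
typedef enum
{
    SKETCH_EXT_UNKNOWN,
    SKETCH_EXT_UNAVAILABLE,
    SKETCH_EXT_AVAILABLE,
    SKETCH_EXT_INSTALLED
} SketchExtensionStatus;

/* Hypothetical outcome of the startup check */
typedef enum
{
    SKETCH_PROCEED,             /* extension installed, carry on */
    SKETCH_EXIT_DB_QUERY,       /* corresponds to exit(ERR_DB_QUERY) */
    SKETCH_EXIT_BAD_CONFIG      /* corresponds to exit(ERR_BAD_CONFIG) */
} SketchStartupAction;

/*
 * Decide what to do for a given extension status, and set *detail to
 * a short explanatory message (or NULL if startup can proceed).
 */
SketchStartupAction
sketch_startup_action(SketchExtensionStatus status, const char **detail)
{
    assert(detail != NULL);

    switch (status)
    {
        case SKETCH_EXT_INSTALLED:
            *detail = NULL;
            return SKETCH_PROCEED;
        case SKETCH_EXT_UNKNOWN:
            /* status query itself failed; treated as a query error */
            *detail = "unable to determine status of \"repmgr\" extension";
            return SKETCH_EXIT_DB_QUERY;
        case SKETCH_EXT_AVAILABLE:
            *detail = "repmgr extension is available but not installed";
            return SKETCH_EXIT_BAD_CONFIG;
        case SKETCH_EXT_UNAVAILABLE:
            *detail = "repmgr extension is not available on this node";
            return SKETCH_EXIT_BAD_CONFIG;
    }

    /* not reached for valid enum values; fail safe */
    *detail = NULL;
    return SKETCH_EXIT_BAD_CONFIG;
}
```

Keeping the "unknown" case separate mirrors the diff, which distinguishes a failed status query from a node where the extension is simply not set up.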


@@ -1,6 +1,6 @@
/* /*
* repmgrd.h * repmgrd.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
*/ */


@@ -1,7 +1,7 @@
/* /*
* strutil.c * strutil.c
* *
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -1,6 +1,6 @@
/* /*
* strutil.h * strutil.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by


@@ -1,6 +1,6 @@
/* /*
* voting.h * voting.h
* Copyright (c) 2ndQuadrant, 2010-2017 * Copyright (c) 2ndQuadrant, 2010-2018
* *
* This program is free software: you can redistribute it and/or modify * This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by * it under the terms of the GNU General Public License as published by