Compare commits

153 Commits

Author SHA1 Message Date
Ian Barwick
d10f1f289e Bump version in configure.in
4.0.2
2018-01-16 13:55:58 +09:00
Ian Barwick
5731ba6043 Update version and release date 2018-01-16 12:58:11 +09:00
Ian Barwick
3d6437c8f8 repmgr: assume node is actually shutting down if pingable and that's the reported status 2018-01-16 11:17:06 +09:00
Ian Barwick
54b5c8ad94 repmgrd: log execution error in "repmgrd_get_local_node_id()"
That shouldn't happen, but if it does it will make it easier to
identify the issue.
2018-01-16 11:14:04 +09:00
Ian Barwick
0eca08ffaf doc: improve switchover documentation
Emphasize need to set the "service_*_command" options when repmgr is
installed from a package.
2018-01-16 11:06:39 +09:00
Ian Barwick
05c1dc2b92 doc: add 4.0.2 release notes 2018-01-11 16:39:58 +09:00
Ian Barwick
2bd300073d doc: minor readability fix 2018-01-11 15:49:56 +09:00
Ian Barwick
01e020df8e doc: note change of shared library name from "repmgr_funcs" to "repmgr" 2018-01-11 15:47:35 +09:00
Ian Barwick
ae7963dc64 repmgr: automatically create slot name if missing
It's possible that a node was registered with "use_replication_slots=false"
but that was later changed to "use_replication_slots=true". If the node
was not subsequently re-registered, the node record will contain an empty
slot name, which will cause any slot creation operation during
"standby follow" or "node rejoin" to fail.

To prevent this happening, check for an empty slot name and automatically
set before proceeding.

Addresses GitHub #343.
2018-01-11 11:13:41 +09:00
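The check described in this commit can be sketched roughly as follows. This is a minimal illustration only: the `node_record` structure, the `ensure_slot_name()` helper, the buffer size and the `repmgr_slot_<id>` naming pattern are all assumptions made for the example, not code taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SLOT_NAME_LEN 64    /* hypothetical buffer size */

/* Hypothetical, cut-down node record; a real record would hold more fields. */
typedef struct
{
    int  node_id;
    char slot_name[SLOT_NAME_LEN];
} node_record;

/*
 * If the record has no slot name, derive one from the node ID.
 * Returns true if a name was generated, false if one was already present.
 */
bool
ensure_slot_name(node_record *node)
{
    assert(node != NULL);

    if (strlen(node->slot_name) > 0)
        return false;

    /* The "repmgr_slot_<id>" pattern is an assumption for illustration. */
    snprintf(node->slot_name, sizeof(node->slot_name),
             "repmgr_slot_%d", node->node_id);
    return true;
}
```

A caller would run such a check before any slot creation step, so that "standby follow" or "node rejoin" never sees an empty slot name.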
Ian Barwick
faffb2a6e7 repmgr: catch possible corner case when checking node shutdown status
It's conceivable that PQping is returning "no response" but the
shutdown hasn't quite completed.
2018-01-10 14:56:00 +09:00
Ian Barwick
5d57044118 repmgr: during switchover, correctly detect unclean shutdown status 2018-01-10 12:21:04 +09:00
Ian Barwick
07a88c78a5 repmgr standby switchover: add "%p" event notification parameter
This will contain the node ID of the former primary.
2018-01-10 11:01:00 +09:00
Ian Barwick
f7df8b9c80 doc: document command line options for "standby switchover" 2018-01-10 10:19:36 +09:00
Ian Barwick
20920b3da1 repmgr standby switchover: add event details 2018-01-10 09:55:24 +09:00
Ian Barwick
683f4de182 Bump version
4.0.2
2018-01-09 13:43:58 +09:00
Ian Barwick
0c62821ffb Consolidate parsing of output from executing repmgr on a remote server
This should also fix the issue reported in GitHub #349.
2018-01-09 13:33:38 +09:00
Ian Barwick
6b70e8bbe6 doc: list repmgr.conf parameters relevant during switchover 2018-01-08 11:13:39 +09:00
Ian Barwick
6b223698c9 Fix call to is_active_bdr_node() in BDR repmgrd
Following the fix to "is_active_bdr_node()" in 841f03ae, it turns out
the call in repmgrd-bdr.c was only accidentally working; explicitly
test for a false return value.
2018-01-04 21:06:45 +09:00
Ian Barwick
aee12dc2c7 "repmgr bdr register": create missing connection replication set if needed
Previously the assumption was that the "repmgr" replication set would be
set up when the nodes are created, however no checks were implemented
and this was not well-documented.

Addresses GitHub #347.
2018-01-04 17:12:52 +09:00
Ian Barwick
c5c86e1ada "repmgr bdr register": improve node name check
We'll use "bdr.bdr_get_local_node_name()" to check the local BDR node
name and the repmgr one match.
2018-01-04 16:07:06 +09:00
Ian Barwick
7476dc84f2 doc: link event notification page from relevant command reference pages 2018-01-04 14:54:14 +09:00
Ian Barwick
f6d63f5216 doc: update package documentation 2018-01-04 13:11:44 +09:00
Ian Barwick
a608b0bc18 "repmgr standby register": add --wait-start option
Implements GitHub #356.
2018-01-04 12:48:12 +09:00
Ian Barwick
469ebba656 doc: fix typos in "repmgr primary unregister" command reference 2018-01-04 12:31:29 +09:00
Ian Barwick
647c21ad0e doc: add link to event notifications page from "repmgr cluster event" 2018-01-04 10:57:54 +09:00
Ian Barwick
3d2530d6f9 Fix query in is_active_bdr_node()
Boolean column was not being checked correctly.

Also add detail output in "repmgr node role --check", where the function
is called.
2018-01-04 10:48:31 +09:00
Ian Barwick
b26e400199 "repmgr cluster event": move query to dbutils.c 2018-01-04 10:06:54 +09:00
Ian Barwick
152e9545a4 docs: document "repmgr cluster event --terse" 2018-01-04 09:53:54 +09:00
Ian Barwick
83b8f05221 "repmgr cluster events": optionally omit "Details" column with --terse
Implements GitHub #360.
2018-01-04 09:48:00 +09:00
Ian Barwick
486f8e5a2c repmgrd: document standby_[failure|recovery] event notifications
Also clean up the relevant code section.

Addresses GitHub #359.
2018-01-04 09:34:49 +09:00
Ian Barwick
e517cc74d1 repmgr node rejoin: handle missing node record correctly
If a connection was provided for a database other than the "repmgr"
database, error was logged but execution continued, resulting in
the connection being finished twice.

Addresses GitHub #358.
2018-01-03 15:20:10 +09:00
Ian Barwick
26285b470f doc: add appendix with details about packages
work-in-progress
2018-01-02 17:24:51 +09:00
Ian Barwick
1521657965 Update copyright notices to 2018 2018-01-02 10:20:09 +09:00
Ian Barwick
041604e303 doc: Fix event notification placeholder typo
Per report from Carlos.
2018-01-01 10:29:34 +09:00
Ian Barwick
0be0100a7c docs: update HISTORY 2017-12-27 10:24:56 +09:00
Ian Barwick
2133834dda doc: update documentation build instructions
Describe how to build documentation as a single file, and also note
requirement to build against 9.6 or earlier.
2017-12-27 10:24:22 +09:00
Ian Barwick
d5fd93c350 repmgr.conf.sample: fix command line argument
"repmgr node check --archive-ready" is correct, however abbreviated
versions will be accepted by getopt_long() if they don't match
or partially match any other options.

Per report by "chaintng" in GitHub #355.
2017-12-27 10:24:17 +09:00
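The abbreviation behaviour mentioned in this commit can be reproduced with a small sketch. The option table below is hypothetical (loosely modelled on "repmgr node check") and `first_long_option()` is a helper invented for the example; only `getopt_long()` itself and the `optind`/`opterr` globals are real. It assumes a GNU-style `getopt_long()` as found on Linux.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option table, for illustration only. */
static const struct option long_options[] = {
    {"archive-ready",   no_argument, NULL, 'A'},
    {"replication-lag", no_argument, NULL, 'L'},
    {"role",            no_argument, NULL, 'R'},
    {NULL, 0, NULL, 0}
};

/*
 * Parse the first option in argv and return its code, '?' if it is
 * unknown or ambiguous, or -1 if there are no options at all.
 */
int
first_long_option(int argc, char **argv)
{
    assert(argc >= 1 && argv != NULL);

    optind = 1;     /* restart scanning so the helper can be called repeatedly */
    opterr = 0;     /* suppress getopt's own error messages */

    return getopt_long(argc, argv, "", long_options, NULL);
}
```

With this table, "--archive" is a unique prefix of "--archive-ready" and is accepted, whereas "--r" is a prefix of two options and is rejected as ambiguous. That is why a shortened option in a sample file can appear to work even though only the full spelling is documented.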
Tony Finch
5804778b58 doc: an optional all-in-one-file manual 2017-12-27 10:24:10 +09:00
Ian Barwick
407a7ea2f4 repmgr: add missing -W option to getopt_long() invocation
Addresses GitHub #350.
2017-12-20 10:28:31 +09:00
Martín Marqués
4d2eca0978 Switch spaces for tabs in repmgr.conf sample file.
This makes comments stay aligned in most cases the conf file is
modified, and when indentation changes, it's easy to re-align
(by removing or adding a tab)

Signed-off-by: Martín Marqués <martin.marques@2ndquadrant.com>
2017-12-20 09:27:06 +09:00
Martín Marqués
9d25544ab5 Add more information on setting up sudo without requiretty to
the documentation

Signed-off-by: Martín Marqués <martin.marques@2ndquadrant.com>
2017-12-20 09:27:02 +09:00
Daymel Bonne Solís
8506607388 Fix package name 2017-12-20 09:26:57 +09:00
Ian Barwick
e8e059c26d docs: update 4.0.1 release date 2017-12-13 15:15:13 +09:00
Abhijit Menon-Sen
38d293694d Fix typo: upstream_node_id → upstream_node 2017-12-11 09:30:37 +09:00
Ian Barwick
54a10a0c3f Add diagnostic option "repmgr node check --has-passfile"
This checks if the active libpq version (9.6 and later) has the
"passfile" option, and returns 0 if present, 1 if not.
2017-12-05 12:53:04 +09:00
Ian Barwick
a8016f602f Fix unpackaged upgrade SQL for PostgreSQL 9.3 2017-12-04 17:46:52 +09:00
Ian Barwick
de57ecdad1 Finalize 4.0.1 release files 2017-11-29 17:02:47 +09:00
Ian Barwick
1fde81cf3f docs: improve event notification documentation 2017-11-29 14:44:07 +09:00
Ian Barwick
146c412061 docs: minor fixes to various examples 2017-11-29 11:30:38 +09:00
Ian Barwick
e9cb61ae7a docs: add additional note about setting "wal_log_hints"
Useful to reference this when discussing PostgreSQL configuration in
general.
2017-11-29 11:25:14 +09:00
Ian Barwick
50e9460b3e Update release notes 2017-11-28 13:42:28 +09:00
Ian Barwick
47e7cbe147 Update HISTORY 2017-11-28 13:00:31 +09:00
Ian Barwick
bf0be3eb43 Bump version
4.0.1
2017-11-28 12:36:22 +09:00
Ian Barwick
270da1294c repmgr: initialise "voting_term" in "repmgr primary register"
This previously happened in the extension SQL code, which could
potentially cause replay problems if installing on a BDR cluster.

As this table is only required for streaming replication failover,
move the initialisation to "repmgr primary register".

Addresses GitHub #344.
2017-11-28 12:26:33 +09:00
Ian Barwick
d3c47f450f docs: add 2ndQ yum repository installation instructions
These replace the HTML document at https://repmgr.org/yum-repository.html
2017-11-24 14:14:36 +09:00
Ian Barwick
c20475f94a Delete any replication slots copied by pg_rewind
If --force-rewind is used in conjunction with "repmgr node rejoin",
any replication slots present on the source node will be copied too;
it's essential to remove these to prevent stale slots being extant
when the node starts up.

We do this at file system level *before* the server starts to minimize
the risk of any problems.

Addresses GitHub #334
2017-11-24 11:15:14 +09:00
Ian Barwick
e0560c3e70 docs: fix configuration file example
Per report from Carlos Chapi.
2017-11-24 09:27:39 +09:00
Ian Barwick
3fa2bef6f4 repmgr: fix configuration file sanity check
The check was being carried out regardless of whether --copy-external-config-files
was specified, which means cloning will fail if no SSH connection is available.

Addresses GitHub #342
2017-11-23 22:50:28 +09:00
Ian Barwick
f8a0b051c8 repmgr: fix return code output for repmgr node check --action=...
Addresses GitHub #340
2017-11-23 10:35:41 +09:00
Martín Marqués
3e4a5e6ff5 Fix missing FQN for the nodes table.
This bug was not detected before because most users work with the repmgr
user. For that reason, the repmgr schema is already in the search_path
by default.

Add the repmgr schema to the nodes table in the LEFT JOIN used for
cluster show (and in other places)

Signed-off-by: Martín Marqués <martin.marques@2ndquadrant.com>
2017-11-23 10:35:38 +09:00
Ian Barwick
020b5b6982 docs: update 4.0.0 release notes 2017-11-21 16:27:18 +09:00
Ian Barwick
932326e4a0 Bump version in configure.in 2017-11-20 17:55:22 +09:00
Ian Barwick
019cd081e8 Bump version
4.0.0
2017-11-20 15:45:48 +09:00
Ian Barwick
3ace908126 docs: miscellaneous updates 2017-11-20 15:44:31 +09:00
Ian Barwick
2ad174489c docs: improve documentation of pg_basebackup_options 2017-11-20 15:30:31 +09:00
Ian Barwick
9124e0f0a2 docs: expand witness documentation 2017-11-20 15:29:31 +09:00
Ian Barwick
060b746743 docs: miscellaneous cleanup 2017-11-20 15:29:28 +09:00
Ian Barwick
bdb82d3aba docs: add initial witness server documentation 2017-11-20 15:29:24 +09:00
Ian Barwick
f6a6df3600 repmgrd: re-enable monitoring data recording when in archive recovery.
The warning emitted gives the impression that monitoring data shouldn't
be written if there's no streaming replication, but we can and should
do this as long as we have a primary connection.

Explicitly document this in the code.

Also remove an unused variable warning.
2017-11-20 15:29:21 +09:00
Ian Barwick
67e27f9ecd Remove unneeded functions 2017-11-20 15:26:32 +09:00
Ian Barwick
454c0b7bd9 docs: add note about "service_promote_command" in repmgr.conf.sample
It must never contain "repmgr standby promote", as it is intended
to enable use of package-level promote commands such as Debian's
"pg_ctlcluster promote".

Addresses GitHub #336.
2017-11-20 12:31:24 +09:00
Ian Barwick
faf297b07f Remove spurious "/base" path element in Barman tablespace cloning code.
Addresses GitHub #339
2017-11-20 11:10:30 +09:00
Ian Barwick
0dae8c9f0b repmgr: don't add empty "passfile" parameter in recovery.conf 2017-11-20 10:28:16 +09:00
Ian Barwick
3f872cde0c "repmgr node ...": fixes for 9.3
Mainly to account for the lack of replication slots.
2017-11-16 11:26:39 +09:00
Ian Barwick
e331069f53 Escape double-quotes in strings passed to an event notification script
The string in question will be generated internally by repmgr as a simple
one-line string with no control characters etc., so all that needs to be
escaped at the moment are any double quotes.
2017-11-16 10:38:55 +09:00
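A minimal sketch of the escaping described in this commit, assuming the caller supplies a fixed-size output buffer. The function name `escape_double_quotes()` and its signature are invented for the example and are not taken from the repmgr source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src to dst, writing \" for every double quote.
 * Returns false (leaving dst as an empty string) if dst is too small.
 */
bool
escape_double_quotes(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL && dst_size > 0);

    for (; *src != '\0'; src++)
    {
        /* A quote needs two bytes; one byte is always kept for the terminator. */
        size_t needed = (*src == '"') ? 2 : 1;

        if (out + needed >= dst_size)
        {
            dst[0] = '\0';
            return false;
        }

        if (*src == '"')
            dst[out++] = '\\';
        dst[out++] = *src;
    }

    dst[out] = '\0';
    return true;
}
```

Because the input is a single-line string generated internally, nothing other than the double quote is handled here; a string containing backslashes or control characters would need a fuller treatment.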
Ian Barwick
53ebde8f33 repmgrd: don't fail over unless more than 50% of active nodes are visible. 2017-11-15 14:04:41 +09:00
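The "more than 50%" rule in this commit amounts to a strict majority test. The sketch below shows one way to express it with integer arithmetic; `majority_visible()` is a hypothetical helper and how repmgrd actually counts visible and active nodes is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True only if strictly more than half of the active nodes are visible.
 */
bool
majority_visible(int visible_nodes, int active_nodes)
{
    assert(visible_nodes >= 0 && visible_nodes <= active_nodes);

    /* Integer form of visible / active > 0.5, avoiding floating point. */
    return visible_nodes * 2 > active_nodes;
}
```

Note that exactly half is not enough: with two of four nodes visible the test fails, which is what prevents both halves of a partitioned cluster from proceeding with a failover.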
Ian Barwick
5e9d50f8ca repmgrd: finalize witness failover handling 2017-11-15 14:04:37 +09:00
Ian Barwick
347e753c27 repmgrd: synchronise repmgr.nodes table on witness server 2017-11-15 14:04:34 +09:00
Ian Barwick
2f978847b1 repmgrd: handle witness server 2017-11-15 14:04:30 +09:00
Ian Barwick
3014f72fda "witness register": set upstream_node_id to that of the primary 2017-11-15 14:04:26 +09:00
Ian Barwick
e02ddd0f37 repmgrd: basic witness node monitoring 2017-11-15 14:04:23 +09:00
Ian Barwick
29fcee2209 docs: add witness command reference files to file list 2017-11-15 14:04:19 +09:00
Ian Barwick
f61f7f82eb docs: add command reference for "witness (un)register" 2017-11-15 14:04:14 +09:00
Ian Barwick
efe28cbbeb witness (un)register: add --dry-run mode 2017-11-15 14:04:09 +09:00
Ian Barwick
6131c1d8ce witness unregister: enable execution when witness server is down
Also add help output for "repmgr witness --help".
2017-11-15 14:04:06 +09:00
Ian Barwick
c907b7b33d repmgr: minor fix to "repmgr standby --help" output 2017-11-15 14:04:01 +09:00
Ian Barwick
e6644305d3 Add "witness unregister" functionality 2017-11-15 14:03:57 +09:00
Ian Barwick
31b856dd9f Add "witness register" functionality 2017-11-15 14:03:54 +09:00
Ian Barwick
dff2bcc5de witness: initial code framework 2017-11-15 14:03:50 +09:00
Ian Barwick
688e609169 docs: add some more index entries 2017-11-15 14:03:44 +09:00
Ian Barwick
3e68c9fcc6 docs: document "passfile" configuration file parameter 2017-11-15 14:03:40 +09:00
Ian Barwick
d459b92186 Add configuration file "passfile"
This will enable a custom .pgpass to be included in "primary_conninfo"
(provided it's supported by the libpq version on the standby).
2017-11-15 14:03:37 +09:00
Ian Barwick
2a898721c0 docs: update release notes
Add note about changes to password handling.
2017-11-15 14:03:34 +09:00
Ian Barwick
35782d83c0 Update extension SQL 2017-11-15 14:03:30 +09:00
Ian Barwick
e16eb42693 repmgrd: detect role change from primary to standby
If repmgrd is monitoring a primary which is taken off-line, then later
restored as a standby, detect this change and resume monitoring
in standby node.

Addresses GitHub #338.
2017-11-15 14:03:26 +09:00
Ian Barwick
4d6dc57589 repmgrd: check shared library is loaded
If this isn't the case, "repmgrd" will appear to run but not handle
failover correctly.

Addresses GitHub #337.
2017-11-15 14:03:18 +09:00
Ian Barwick
cbc97d84ac repmgrd: updates related to node_id handling 2017-11-15 14:03:15 +09:00
Ian Barwick
96fe7dd2d6 repmgrd: catch corner cases where monitoring data is not available 2017-11-15 14:03:12 +09:00
Ian Barwick
13935a88c9 repmgrd: ensure shmem is reinitialised after a restart 2017-11-09 19:51:31 +09:00
Ian Barwick
5275890467 repmgrd: misc fixes 2017-11-09 19:51:26 +09:00
Ian Barwick
7f865fdaf3 repmgrd: fix priority/node_id tie-break check 2017-11-09 19:51:22 +09:00
Ian Barwick
9e2fb7ea13 repmgrd: remove unneeded functions 2017-11-09 19:51:18 +09:00
Ian Barwick
a3428e4d8a repmgrd: simplify the candidate selection logic
All disconnected nodes will be in a static, known state, so as long as
each node has the same meta-information (repmgr.nodes) and is able
to retrieve the last receive LSN of the other nodes, it is possible
for each node to independently determine the best promotion candidate,
thereby reaching consensus without an explicit "voting" process.
2017-11-09 19:51:13 +09:00
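The idea in this commit is that every node applies the same deterministic rule to the same data and therefore arrives at the same answer. A rough sketch under stated assumptions: the `candidate_info` structure and both functions are hypothetical, and the ordering used here (highest last receive LSN, then higher priority, then lower node ID) is an assumption for illustration rather than the rule confirmed by the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down view of what each node knows about a sibling. */
typedef struct
{
    int      node_id;
    int      priority;
    uint64_t last_receive_lsn;
} candidate_info;

/* Returns nonzero if a is a better promotion candidate than b. */
static int
is_better_candidate(const candidate_info *a, const candidate_info *b)
{
    if (a->last_receive_lsn != b->last_receive_lsn)
        return a->last_receive_lsn > b->last_receive_lsn;
    if (a->priority != b->priority)
        return a->priority > b->priority;
    return a->node_id < b->node_id;     /* assumed final tie-break */
}

/* Returns the best candidate, or NULL if the list is empty. */
const candidate_info *
choose_candidate(const candidate_info *nodes, size_t count)
{
    const candidate_info *best = NULL;
    size_t      i;

    assert(nodes != NULL || count == 0);

    for (i = 0; i < count; i++)
    {
        if (best == NULL || is_better_candidate(&nodes[i], best))
            best = &nodes[i];
    }
    return best;
}
```

Since the comparison is a total order over the node data, the result does not depend on the order in which a node happens to list its siblings, which is what makes an explicit voting round unnecessary.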
Ian Barwick
03b9475755 repmgrd: fixes to failover handling
get_new_primary() returns NULL if no notification for the new primary has
been received, but the code was expecting it to return UNKNOWN_NODE_ID,
which was causing repmgrd to prematurely drop out of the new primary
detection loop if no notification had been received by the time the loop
started.

Also store the electoral term as a single row, single column table,
to ensure that all repmgrds see the same term. It is then bumped
by the winning node after it gets promoted.

Various logging improvements.
2017-11-09 19:51:09 +09:00
Ian Barwick
de1eb3c459 Ensure shared memory functions handle NULL parameters correctly 2017-11-09 19:51:02 +09:00
Ian Barwick
a13eccccc5 Update .gitignore
Ignore output from "make installcheck"
2017-11-09 19:50:57 +09:00
Ian Barwick
158f132bc0 README: update links to https versions 2017-11-09 19:50:53 +09:00
Ian Barwick
cdf54d217a Fix lock acquisition in shared memory functions 2017-11-09 19:50:48 +09:00
Ian Barwick
1a8a82f207 Update repmgr.conf.sample 2017-11-09 19:50:42 +09:00
Ian Barwick
60e877ca39 docs: fix example in BDR section 2017-11-02 11:24:10 +09:00
Ian Barwick
91531bffe4 docs: tweak Markdown URL formatting 2017-11-01 10:59:10 +09:00
Ian Barwick
fc5f46ca5a docs: update links to repmgr 4.0 documentation 2017-11-01 10:49:58 +09:00
Ian Barwick
b76952e136 docs: update copyright info 2017-11-01 09:36:16 +09:00
Ian Barwick
c3a1969f55 docs: convert command reference sections to <refentry> format
Note that most entries still need a bit more tidying up, consistent structuring,
provision of more examples etc.
2017-10-31 11:29:49 +09:00
Ian Barwick
11d856a1ec "standby follow": get upstream record before server restart, if required
The standby may not always be available for connections right after it's
restarted, so attempting to connect and get the node's upstream record
after the restart may fail. Record is now retrieved before the restart.

Addresses GitHub #333.
2017-10-27 16:30:25 +09:00
Ian Barwick
fbf357947d docs: add sample output to "standby follow" and "standby promote" 2017-10-27 15:05:46 +09:00
Ian Barwick
47eaa99537 docs: add note about building docs 2017-10-27 10:46:58 +09:00
Ian Barwick
aeee11d1b7 docs: finalize conversion of existing BDR repmgr documentation 2017-10-26 18:57:34 +09:00
Ian Barwick
e4713c5eca docs: update configuration documentation 2017-10-26 18:57:29 +09:00
Ian Barwick
e55e5a0581 Initial conversion of existing BDR repmgr documentation 2017-10-26 18:56:58 +09:00
Ian Barwick
fb0aae183d Docs: update "repmgr cluster show" 2017-10-26 09:42:36 +09:00
Ian Barwick
52655e9cd5 Improve trim() function
Did not cope well with trailing spaces or entirely blank strings.
2017-10-26 09:42:26 +09:00
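The two cases named in this commit, trailing spaces and entirely blank strings, are the usual places where a hand-written trim goes wrong. The sketch below is an illustrative in-place version, not the repmgr implementation; `trim_in_place()` is a name chosen for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Trim leading and trailing whitespace in place and return s.
 */
char *
trim_in_place(char *s)
{
    size_t      start = 0;
    size_t      end;

    assert(s != NULL);

    end = strlen(s);

    /* Drop trailing whitespace first; for a blank string this reaches 0. */
    while (end > 0 && isspace((unsigned char) s[end - 1]))
        end--;

    /* Then skip leading whitespace, never running past the new end. */
    while (start < end && isspace((unsigned char) s[start]))
        start++;

    memmove(s, s + start, end - start);
    s[end - start] = '\0';
    return s;
}
```

Bounding both loops by `end` is what keeps the blank-string case safe: the index never moves before the start of the buffer or past its terminator.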
Ian Barwick
c5d91ca88c repmgr node rejoin: add --dry-run option 2017-10-26 09:42:12 +09:00
Ian Barwick
9f5edd07ad Fix typo 2017-10-26 09:35:25 +09:00
Ian Barwick
f58b102d51 Standardize terminology on "primary" (in place of "master") 2017-10-24 13:44:03 +09:00
Ian Barwick
90733aecf7 --dry-run available for "node rejoin" 2017-10-23 10:40:43 +09:00
Ian Barwick
e0be228c89 docs: fix formatting 2017-10-23 10:00:00 +09:00
Ian Barwick
a9759cf6ca Add --help output for "repmgr node service"
Addresses GitHub #329.
2017-10-20 16:49:29 +09:00
Ian Barwick
6852ac82c6 Add --help output for "repmgr node rejoin"
Addresses GitHub #329.
2017-10-20 16:49:19 +09:00
Ian Barwick
c27bd2a135 docs: fix typo 2017-10-20 16:06:46 +09:00
Ian Barwick
5045e2eb9d node rewind: add check for pg_rewind and --dry-run mode
Addresses GitHub #330
2017-10-20 14:16:56 +09:00
Ian Barwick
23f7af17a2 Note Barman configuration file parameter changes 2017-10-20 11:31:31 +09:00
Ian Barwick
93936c090d Fix error message typo 2017-10-20 11:19:12 +09:00
Ian Barwick
564c951f0c Prevent relative configuration file path being stored in the repmgr metadata
The configuration file path is stored to make remote execution of repmgr
(e.g. during "repmgr standby switchover") simpler, so relative paths
make no sense.

Addresses GitHub #332
2017-10-20 10:59:54 +09:00
Ian Barwick
3f5e8f6aec Update README
Main body of documentation moved to DocBook format and hosted at:

    https://repmgr.org/docs/index.html

as the existing README and sundry additional files were becoming
unmanageable. Conversion to DocBook format enables all documentation
to be managed in a single structured system, with cross-references,
indexes, linkable URLs etc.
2017-10-19 16:39:33 +09:00
Ian Barwick
a6a97cda86 docs: update "repmgr cluster show" page 2017-10-19 16:39:27 +09:00
Ian Barwick
18c8e4c529 Add placeholder FAQ.md
This replaces the original FAQ maintained for repmgr 3.x; repmgr 4
documentation is now available in DocBook format.
2017-10-19 16:22:28 +09:00
Ian Barwick
6984fe7029 docs: expand release notes and redirect "changes-in-repmgr4.md" 2017-10-19 14:11:17 +09:00
Ian Barwick
5ecc3a0a8f Add 4.0 release notes 2017-10-19 13:59:03 +09:00
Ian Barwick
febde097be doc: add missing entry for "priority" in repmgr.conf.sample
Per report from Shaun Thomas.
2017-10-19 13:16:36 +09:00
Ian Barwick
19ea248226 docs: add more index references 2017-10-19 12:22:58 +09:00
Ian Barwick
acdbd1110a docs: note way of forcing recovery then quitting in single user mode 2017-10-19 12:22:54 +09:00
Ian Barwick
946683182c Documentation: update markup 2017-10-18 11:12:37 +09:00
Ian Barwick
c9fbb7febf Update package signature documentation 2017-10-18 10:51:35 +09:00
Ian Barwick
ff966fe533 Document "upgrading-from-repmgr3.md" moved to main repmgr documentation 2017-10-18 10:51:29 +09:00
Ian Barwick
7001960cc1 Update "repmgr node rejoin" documentation 2017-10-17 17:41:36 +09:00
Ian Barwick
1cfba44799 Add FAQ to documentation 2017-10-17 16:16:40 +09:00
Ian Barwick
d1f9ca4b43 Move deprecated command line option
Not required in repmgr4, we're keeping it around for backwards compatibility;
a warning will be issued if used.
2017-10-17 16:16:06 +09:00
Ian Barwick
f6c253f8a6 Various documentation fixes 2017-10-17 11:02:33 +09:00
Ian Barwick
95ec8d8b21 Bump doc version 2017-10-17 09:46:23 +09:00
Ian Barwick
041f1b7667 Merge commit '0b2a6fe2fb958f10f211f0656fd91cae980fd08d' into REL4_0_STABLE 2017-10-16 11:22:48 +09:00
Ian Barwick
104279016a Update HISTORY 2017-10-04 13:33:37 +09:00
Ian Barwick
901a7603b1 Stamp 4.0beta1 2017-10-04 13:01:49 +09:00
80 changed files with 2448 additions and 9095 deletions

4
FAQ.md

@@ -1,7 +1,9 @@
 FAQ - Frequently Asked Questions about repmgr
 =============================================
-The repmgr 4 FAQ is located here: [repmgr FAQ (Frequently Asked Questions)](https://repmgr.org/docs/4.0/appendix-faq.html "repmgr FAQ")
+The repmgr 4 FAQ is located here:
+https://repmgr.org/docs/appendix-faq.html
 The repmgr 3.x FAQ can be found here:

113
HISTORY

@@ -1,115 +1,3 @@
-4.1.0 2018-??-??
-repmgr: change default log_level to INFO, add documentation; GitHub #470 (Ian)
-repmgr: add "--missing-slots" check to "repmgr node check" (Ian)
-repmgr: improve command line error handling; GitHub #464 (Ian)
-repmgr: fix "standby register --wait-sync" when no timeout provided (Ian)
-repmgr: "cluster show" returns non-zero value if an issue encountered;
-GitHub #456 (Ian)
-repmgr: "node check" and "node status" returns non-zero value if an issue
-encountered (Ian)
-repmgr: add CSV output mode to "cluster event"; GitHub #471 (Ian)
-repmgr: add -q/--quiet option to suppress non-error output; GitHub #468 (Ian)
-repmgr: "node status" returns non-zero value if an issue encountered (Ian)
-repmgr: enable "recovery_min_apply_delay" to be 0; GitHub #448 (Ian)
-repmgr: "cluster cleanup" - add missing help options; GitHub #461/#462 (gclough)
-repmgr: ensure witness node follows new primary after switchover;
-GitHub #453 (Ian)
-repmgr: fix witness node handling in "node check"/"node status";
-GitHub #451 (Ian)
-repmgr: fix "primary_slot_name" when using "standby clone" with --recovery-conf-only;
-GitHub #474 (Ian)
-repmgr: don't perform a switchover if an exclusive backup is running;
-GitHub #476 (Martín)
-repmgr: enable "witness unregister" to be run on any node; GitHub #472 (Ian)
-repmgrd: create a PID file by default; GitHub #457 (Ian)
-repmgrd: daemonize process by default; GitHub #458 (Ian)
-4.0.6 2018-06-14
-repmgr: (witness register) prevent registration of a witness server with the
-same name as an existing node (Ian)
-repmgr: (standby follow) check node has actually connected to new primary
-before reporting success; GitHub #444 (Ian)
-repmgr: (standby clone) improve handling of external configuration file copying,
-including consideration in --dry-run check; GitHub #443 (Ian)
-repmgr: (standby clone) don't require presence of "user" parameter in
-conninfo string; GitHub #437 (Ian)
-repmgr: (standby clone) improve documentation of --recovery-conf-only
-mode; GitHub #438 (Ian)
-repmgr: (node rejoin) fix bug when parsing --config-files parameter;
-GitHub #442 (Ian)
-repmgr: when using --dry-run, force log level to INFO to ensure output
-will always be displayed; GitHub #441 (Ian)
-repmgr: (cluster matrix/crosscheck) return non-zero exit code if node
-connection issues detected; GitHub #447 (Ian)
-repmgrd: ensure local node is counted as quorum member; GitHub #439 (Ian)
-4.0.5 2018-05-02
-repmgr: poll demoted primary after restart as a standby during a
-switchover operation; GitHub #408 (Ian)
-repmgr: add configuration parameter "config_directory"; GitHub #424 (Ian)
-repmgr: add "dbname=replication" to all replication connection strings;
-GitHub #421 (Ian)
-repmgr: add sanity check if --upstream-node-id not supplied when executing
-"standby register"; GitHub #395 (Ian)
-repmgr: enable provision of "archive_cleanup_command" in recovery.conf;
-GitHub #416 (Ian)
-repmgr: actively check for node to rejoin cluster; GitHub #415 (Ian)
-repmgr: enable pg_rewind to be used with PostgreSQL 9.3/9.4; GitHub #413 (Ian)
-repmgr: fix minimum accepted value for "degraded_monitoring_timeout";
-GitHub #411 (Ian)
-repmgr: fix superuser password handling; GitHub #400 (Ian)
-repmgr: fix parsing of "archive_ready_critical" configuration file
-parameter; GitHub #426 (Ian)
-repmgr: fix display of conninfo parsing error messages (Ian)
-repmgr: fix "repmgr cluster crosscheck" output; GitHub #389 (Ian)
-repmgrd: prevent standby connection handle from going stale (Ian)
-repmgrd: fix memory leaks in witness code; GitHub #402 (AndrzejNowicki, Martín)
-repmgrd: handle "pg_ctl promote" timeout; GitHub #425 (Ian)
-repmgrd: handle failover situation with only two nodes in the primary
-location, and at least one node in another location; GitHub #407 (Ian)
-repmgrd: set "connect_timeout=2" when pinging a server (Ian)
-4.0.4 2018-03-09
-repmgr: add "standby clone --recovery-conf-only" option; GitHub #382 (Ian)
-repmgr: make "standby promote" timeout values configurable; GitHub #387 (Ian)
-repmgr: improve replication slot warnings generated by "node status";
-GitHub #385 (Ian)
-repmgr: remove restriction on replication slots when cloning from
-a Barman server; GitHub #379 (Ian)
-repmgr: ensure "node rejoin" honours "--dry-run" option; GitHub #383 (Ian)
-repmgr: fix --superuser handling when cloning a standby; GitHub #380 (Ian)
-repmgr: update various help options; GitHub #391, #392 (hasegeli)
-repmgrd: add event "repmgrd_shutdown"; GitHub #393 (Ian)
-repmgrd: improve detection of status change from primary to standby (Ian)
-repmgrd: improve log output in various situations (Ian)
-repmgrd: improve reconnection to the local node after a failover (Ian)
-repmgrd: ensure witness server connects to new primary after a failover (Ian)
-4.0.3 2018-02-15
-repmgr: improve switchover handling when "pg_ctl" used to control the
-server and logging output is not explicitly redirected (Ian)
-repmgr: improve switchover log messages and exit code when old primary could
-not be shut down cleanly (Ian)
-repmgr: check demotion candidate can make a replication connection to the
-promotion candidate before executing a switchover; GitHub #370 (Ian)
-repmgr: add check for sufficient walsenders/replication slots before executing
-a switchover; GitHub #371 (Ian)
-repmgr: add --dry-run mode to "repmgr standby follow"; GitHub #368 (Ian)
-repmgr: provide information about the primary node for "standby_register" and
-"standby_follow" event notifications; GitHub #375 (Ian)
-repmgr: add "standby_register_sync" event notification; GitHub #374 (Ian)
-repmgr: output any connection error messages in "cluster show"'s list of
-warnings; GitHub #369 (Ian)
-repmgr: ensure an inactive data directory can be deleted; GitHub #366 (Ian)
-repmgr: fix upstream node display in "repmgr node status"; GitHub #363 (fanf2)
-repmgr: improve/clarify documentation and update --help output for
-"primary unregister"; GitHub #373 (Ian)
-repmgr: allow replication slots when Barman is configured; GitHub #379 (Ian)
-repmgr: fix parsing of "pg_basebackup_options"; GitHub #376 (Ian)
-repmgr: ensure "pg_subtrans" directory is created when cloning a standby in
-Barman mode (Ian)
-repmgr: fix primary node check in "witness register"; GitHub #377 (Ian)
 4.0.2 2018-01-18
 repmgr: add missing -W option to getopt_long() invocation; GitHub #350 (Ian)
 repmgr: automatically create slot name if missing; GitHub #343 (Ian)
@@ -133,6 +21,7 @@
 GitHub #344 (Ian)
 repmgr: delete any replication slots copied by pg_rewind; GitHub #334 (Ian)
 repmgr: fix configuration file sanity check; GitHub #342 (Ian)
+Improve event notification documentation (Ian)
 4.0.0 2017-11-21
 Complete rewrite with many changes; for details see the repmgr 4.0.0 release

View File

@@ -11,10 +11,7 @@ EXTENSION = repmgr
 DATA = \
 repmgr--unpackaged--4.0.sql \
-repmgr--4.0.sql \
-repmgr--4.0--4.1.sql \
-repmgr--4.1.sql
+repmgr--4.0.sql
 REGRESS = repmgr_extension

20
TODO.md

@@ -1,20 +0,0 @@
-TODO
-====
-This file contains a list of improvements which are desireable and/or have
-been requested, and which we aim to address/implement when time and resources
-permit.
-It is *not* a roadmap and there's no guarantee of any item being implemented
-within any given timeframe.
-Enable suspension of repmgrd failover
--------------------------------------
-When performing maintenance, e.g. a switchover, it's necessary to stop all
-repmgrd nodes to prevent unintended failover; this is obviously inconvenient.
-We'll need to implement some way of notifying each repmgrd to suspend automatic
-failover until further notice.
-Requested in GitHub #410 ( https://github.com/2ndQuadrant/repmgr/issues/410 )

View File

@@ -1,2 +1,4 @@
 /* config.h.in. Generated from configure.in by autoheader. */
+/* Only build repmgr for BDR */
+#undef BDR_ONLY

View File

@@ -29,6 +29,9 @@ static bool config_file_provided = false;
 bool config_file_found = false;
 static void _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *warning_list);
+static bool parse_bool(const char *s,
+const char *config_item,
+ItemList *error_list);
 static void _parse_line(char *buf, char *name, char *value);
 static void parse_event_notifications_list(t_configuration_options *options, const char *arg);
@@ -285,7 +288,6 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 memset(options->node_name, 0, sizeof(options->node_name));
 memset(options->conninfo, 0, sizeof(options->conninfo));
 memset(options->data_directory, 0, sizeof(options->data_directory));
-memset(options->config_directory, 0, sizeof(options->data_directory));
 memset(options->pg_bindir, 0, sizeof(options->pg_bindir));
 options->replication_type = REPLICATION_TYPE_PHYSICAL;
@@ -301,7 +303,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 options->log_status_interval = DEFAULT_LOG_STATUS_INTERVAL;
 /*-----------------------
-* standby clone settings
+* standby action settings
 *------------------------
 */
 options->use_replication_slots = false;
@@ -312,30 +314,9 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 options->tablespace_mapping.tail = NULL;
 memset(options->recovery_min_apply_delay, 0, sizeof(options->recovery_min_apply_delay));
 options->recovery_min_apply_delay_provided = false;
-memset(options->archive_cleanup_command, 0, sizeof(options->archive_cleanup_command));
 options->use_primary_conninfo_password = false;
 memset(options->passfile, 0, sizeof(options->passfile));
-/*-------------------------
-* standby promote settings
-*-------------------------
-*/
-options->promote_check_timeout = DEFAULT_PROMOTE_CHECK_TIMEOUT;
-options->promote_check_interval = DEFAULT_PROMOTE_CHECK_INTERVAL;
-/*------------------------
-* standby follow settings
-*------------------------
-*/
-options->primary_follow_timeout = DEFAULT_PRIMARY_FOLLOW_TIMEOUT;
-options->standby_follow_timeout = DEFAULT_STANDBY_FOLLOW_TIMEOUT;
-/*------------------------
-* standby switchover settings
-*------------------------
-*/
-options->standby_reconnect_timeout = DEFAULT_STANDBY_RECONNECT_TIMEOUT;
 /*-----------------
 * repmgrd settings
 *-----------------
@@ -355,8 +336,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 options->degraded_monitoring_timeout = -1;
 options->async_query_timeout = DEFAULT_ASYNC_QUERY_TIMEOUT;
 options->primary_notification_timeout = DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT;
-options->repmgrd_standby_startup_timeout = -1; /* defaults to "standby_reconnect_timeout" if not set */
-memset(options->repmgrd_pid_file, 0, sizeof(options->repmgrd_pid_file));
+options->primary_follow_timeout = DEFAULT_PRIMARY_FOLLOW_TIMEOUT;
 /*-------------
 * witness settings
@@ -475,9 +455,6 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 strncpy(options->conninfo, value, MAXLEN);
 else if (strcmp(name, "data_directory") == 0)
 strncpy(options->data_directory, value, MAXPGPATH);
-else if (strcmp(name, "config_directory") == 0)
-strncpy(options->config_directory, value, MAXPGPATH);
 else if (strcmp(name, "replication_user") == 0)
 {
 if (strlen(value) < NAMEDATALEN)
@@ -523,38 +500,15 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 parse_time_unit_parameter(name, value, options->recovery_min_apply_delay, error_list);
 options->recovery_min_apply_delay_provided = true;
 }
-else if (strcmp(name, "archive_cleanup_command") == 0)
-strncpy(options->archive_cleanup_command, value, MAXLEN);
 else if (strcmp(name, "use_primary_conninfo_password") == 0)
 options->use_primary_conninfo_password = parse_bool(value, name, error_list);
 else if (strcmp(name, "passfile") == 0)
 strncpy(options->passfile, value, sizeof(options->passfile));
-/* standby promote settings */
-else if (strcmp(name, "promote_check_timeout") == 0)
-options->promote_check_timeout = repmgr_atoi(value, name, error_list, 1);
-else if (strcmp(name, "promote_check_interval") == 0)
-options->promote_check_interval = repmgr_atoi(value, name, error_list, 1);
-/* standby follow settings */
-else if (strcmp(name, "primary_follow_timeout") == 0)
-options->primary_follow_timeout = repmgr_atoi(value, name, error_list, 0);
-else if (strcmp(name, "standby_follow_timeout") == 0)
-options->standby_follow_timeout = repmgr_atoi(value, name, error_list, 0);
-/* standby switchover settings */
-else if (strcmp(name, "standby_reconnect_timeout") == 0)
-options->standby_reconnect_timeout = repmgr_atoi(value, name, error_list, 0);
-/* node rejoin settings */
-else if (strcmp(name, "node_rejoin_timeout") == 0)
-options->node_rejoin_timeout = repmgr_atoi(value, name, error_list, 0);
 /* node check settings */
 else if (strcmp(name, "archive_ready_warning") == 0)
 options->archive_ready_warning = repmgr_atoi(value, name, error_list, 1);
-else if (strcmp(name, "archive_ready_critical") == 0)
+else if (strcmp(name, "archive_ready_critcial") == 0)
 options->archive_ready_critical = repmgr_atoi(value, name, error_list, 1);
 else if (strcmp(name, "replication_lag_warning") == 0)
 options->replication_lag_warning = repmgr_atoi(value, name, error_list, 1);
@@ -595,15 +549,13 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 else if (strcmp(name, "monitoring_history") == 0)
 options->monitoring_history = parse_bool(value, name, error_list);
 else if (strcmp(name, "degraded_monitoring_timeout") == 0)
-options->degraded_monitoring_timeout = repmgr_atoi(value, name, error_list, -1);
+options->degraded_monitoring_timeout = repmgr_atoi(value, name, error_list, 1);
 else if (strcmp(name, "async_query_timeout") == 0)
 options->async_query_timeout = repmgr_atoi(value, name, error_list, 0);
 else if (strcmp(name, "primary_notification_timeout") == 0)
 options->primary_notification_timeout = repmgr_atoi(value, name, error_list, 0);
-else if (strcmp(name, "repmgrd_standby_startup_timeout") == 0)
-options->repmgrd_standby_startup_timeout = repmgr_atoi(value, name, error_list, 0);
-else if (strcmp(name, "repmgrd_pid_file") == 0)
-strncpy(options->repmgrd_pid_file, value, MAXPGPATH);
+else if (strcmp(name, "primary_follow_timeout") == 0)
+options->primary_follow_timeout = repmgr_atoi(value, name, error_list, 0);
 /* witness settings */
 else if (strcmp(name, "witness_sync_interval") == 0)
@@ -719,7 +671,7 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 * Raise an error if a known parameter is provided with an empty
 * value. Currently there's no reason why empty parameters are needed;
 * if we want to accept those, we'd need to add stricter default
-* checking, as currently e.g. an empty `node_id` value will be converted
+* checking, as currently e.g. an empty `node` value will be converted
 * to '0'.
 */
 if (known_parameter == true && !strlen(value))
@@ -785,18 +737,6 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 PQconninfoFree(conninfo_options);
 }
-/* set values for parameters which default to other parameters */
-/*
-* From 4.1, "repmgrd_standby_startup_timeout" replaces "standby_reconnect_timeout"
-* in repmgrd; fall back to "standby_reconnect_timeout" if no value explicitly provided
-*/
-if (options->repmgrd_standby_startup_timeout == -1)
-{
-options->repmgrd_standby_startup_timeout = options->standby_reconnect_timeout;
-}
 /* add warning about changed "barman_" parameter meanings */
 if ((options->barman_host[0] == '\0' && options->barman_server[0] != '\0') ||
 (options->barman_host[0] != '\0' && options->barman_server[0] == '\0'))
@@ -821,12 +761,6 @@ _parse_config(t_configuration_options *options, ItemList *error_list, ItemList *
 item_list_append(error_list,
 _("\replication_lag_critical\" must be greater than \"replication_lag_warning\""));
 }
-if (options->standby_reconnect_timeout < options->node_rejoin_timeout)
-{
-item_list_append(error_list,
-_("\"standby_reconnect_timeout\" must be equal to or greater than \"node_rejoin_timeout\""));
-}
 }
@@ -991,11 +925,12 @@ parse_time_unit_parameter(const char *name, const char *value, char *dest, ItemL
 char *ptr = NULL;
 int targ = strtol(value, &ptr, 10);
-if (targ < 0)
+if (targ < 1)
 {
 if (errors != NULL)
 {
-item_list_append_format(errors,
+item_list_append_format(
+errors,
 _("invalid value provided for \"%s\""),
 name);
 }
@@ -1049,7 +984,6 @@ parse_time_unit_parameter(const char *name, const char *value, char *dest, ItemL
 * - promote_delay
 * - reconnect_attempts
 * - reconnect_interval
-* - repmgrd_standby_startup_timeout
 * - retry_promote_interval_secs
 *
 * non-changeable options
@@ -1075,36 +1009,17 @@ reload_config(t_configuration_options *orig_options)
 static ItemList config_errors = {NULL, NULL};
 static ItemList config_warnings = {NULL, NULL};
-PQExpBufferData errors;
 log_info(_("reloading configuration file"));
 _parse_config(&new_options, &config_errors, &config_warnings);
 if (config_errors.head != NULL)
 {
-ItemListCell *cell = NULL;
+/* XXX dump errors to log */
 log_warning(_("unable to parse new configuration, retaining current configuration"));
-initPQExpBuffer(&errors);
-appendPQExpBuffer(&errors,
-"following errors were detected:\n");
-for (cell = config_errors.head; cell; cell = cell->next)
-{
-appendPQExpBuffer(&errors,
-" %s\n", cell->string);
-}
-log_detail("%s", errors.data);
-termPQExpBuffer(&errors);
 return false;
 }
 /* The following options cannot be changed */
 if (new_options.node_id != orig_options->node_id)
@@ -1113,7 +1028,7 @@ reload_config(t_configuration_options *orig_options)
 return false;
 }
-if (strncmp(new_options.node_name, orig_options->node_name, MAXLEN) != 0)
+if (strcmp(new_options.node_name, orig_options->node_name) != 0)
 {
 log_warning(_("\"node_name\" cannot be changed, keeping current configuration"));
 return false;
@@ -1157,7 +1072,7 @@ reload_config(t_configuration_options *orig_options)
 }
 /* conninfo */
-if (strncmp(orig_options->conninfo, new_options.conninfo, MAXLEN) != 0)
+if (strcmp(orig_options->conninfo, new_options.conninfo) != 0)
 {
 /* Test conninfo string works */
 conn = establish_db_connection(new_options.conninfo, false);
@@ -1184,7 +1099,7 @@ reload_config(t_configuration_options *orig_options)
 }
 /* event_notification_command */
-if (strncmp(orig_options->event_notification_command, new_options.event_notification_command, MAXLEN) != 0)
+if (strcmp(orig_options->event_notification_command, new_options.event_notification_command) != 0)
 {
 strncpy(orig_options->event_notification_command, new_options.event_notification_command, MAXLEN);
 log_info(_("\"event_notification_command\" is now \"%s\""), new_options.event_notification_command);
@@ -1193,7 +1108,7 @@ reload_config(t_configuration_options *orig_options)
 }
 /* event_notifications */
-if (strncmp(orig_options->event_notifications_orig, new_options.event_notifications_orig, MAXLEN) != 0)
+if (strcmp(orig_options->event_notifications_orig, new_options.event_notifications_orig) != 0)
 {
 strncpy(orig_options->event_notifications_orig, new_options.event_notifications_orig, MAXLEN);
 log_info(_("\"event_notifications\" is now \"%s\""), new_options.event_notifications_orig);
@@ -1213,7 +1128,7 @@ reload_config(t_configuration_options *orig_options)
 }
 /* follow_command */
-if (strncmp(orig_options->follow_command, new_options.follow_command, MAXLEN) != 0)
+if (strcmp(orig_options->follow_command, new_options.follow_command) != 0)
 {
 strncpy(orig_options->follow_command, new_options.follow_command, MAXLEN);
 log_info(_("\"follow_command\" is now \"%s\""), new_options.follow_command);
@@ -1250,7 +1165,7 @@ reload_config(t_configuration_options *orig_options)
 /* promote_command */
-if (strncmp(orig_options->promote_command, new_options.promote_command, MAXLEN) != 0)
+if (strcmp(orig_options->promote_command, new_options.promote_command) != 0)
 {
 strncpy(orig_options->promote_command, new_options.promote_command, MAXLEN);
 log_info(_("\"promote_command\" is now \"%s\""), new_options.promote_command);
@@ -1285,32 +1200,23 @@ reload_config(t_configuration_options *orig_options)
 config_changed = true;
 }
-/* repmgrd_standby_startup_timeout */
-if (orig_options->repmgrd_standby_startup_timeout != new_options.repmgrd_standby_startup_timeout)
-{
-orig_options->repmgrd_standby_startup_timeout = new_options.repmgrd_standby_startup_timeout;
-log_info(_("\"repmgrd_standby_startup_timeout\" is now \"%i\""), new_options.repmgrd_standby_startup_timeout);
-config_changed = true;
-}
 /*
 * Handle changes to logging configuration
 */
 /* log_facility */
-if (strncmp(orig_options->log_facility, new_options.log_facility, MAXLEN) != 0)
+if (strcmp(orig_options->log_facility, new_options.log_facility) != 0)
 {
-strncpy(orig_options->log_facility, new_options.log_facility, MAXLEN);
+strcpy(orig_options->log_facility, new_options.log_facility);
 log_info(_("\"log_facility\" is now \"%s\""), new_options.log_facility);
 log_config_changed = true;
 }
 /* log_file */
-if (strncmp(orig_options->log_file, new_options.log_file, MAXLEN) != 0)
+if (strcmp(orig_options->log_file, new_options.log_file) != 0)
 {
-strncpy(orig_options->log_file, new_options.log_file, MAXLEN);
+strcpy(orig_options->log_file, new_options.log_file);
 log_info(_("\"log_file\" is now \"%s\""), new_options.log_file);
 log_config_changed = true;
@@ -1318,9 +1224,9 @@ reload_config(t_configuration_options *orig_options)
 /* log_level */
-if (strncmp(orig_options->log_level, new_options.log_level, MAXLEN) != 0)
+if (strcmp(orig_options->log_level, new_options.log_level) != 0)
 {
-strncpy(orig_options->log_level, new_options.log_level, MAXLEN);
+strcpy(orig_options->log_level, new_options.log_level);
 log_info(_("\"log_level\" is now \"%s\""), new_options.log_level);
 log_config_changed = true;
@@ -1386,23 +1292,13 @@ exit_with_config_file_errors(ItemList *config_errors, ItemList *config_warnings,
 void
-exit_with_cli_errors(ItemList *error_list, const char *repmgr_command)
+exit_with_cli_errors(ItemList *error_list)
 {
 fprintf(stderr, _("The following command line errors were encountered:\n"));
 print_item_list(error_list);
-if (repmgr_command != NULL)
-{
-fprintf(stderr, _("Try \"%s --help\" or \"%s %s --help\" for more information.\n"),
-progname(),
-progname(),
-repmgr_command);
-}
-else
-{
-fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname());
-}
+fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname());
 exit(ERR_BAD_CONFIG);
 }
@@ -1507,7 +1403,7 @@ repmgr_atoi(const char *value, const char *config_item, ItemList *error_list, in
 *
 * https://www.postgresql.org/docs/current/static/config-setting.html
 */
-bool
+static bool
 parse_bool(const char *s, const char *config_item, ItemList *error_list)
 {
 PQExpBufferData errors;
@@ -1704,112 +1600,31 @@ clear_event_notification_list(t_configuration_options *options)
 }
-int
-parse_output_to_argv(const char *string, char ***argv_array)
+bool
+parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_options *backup_options, int server_version_num, ItemList *error_list)
 {
 int options_len = 0;
 char *options_string = NULL;
 char *options_string_ptr = NULL;
-int c = 1,
-argc_item = 1;
-char *argv_item = NULL;
-char **local_argv_array = NULL;
-ItemListCell *cell;
 /*
 * Add parsed options to this list, then copy to an array to pass to
 * getopt
 */
-ItemList option_argv = {NULL, NULL};
-options_len = strlen(string) + 1;
-options_string = pg_malloc0(options_len);
-options_string_ptr = options_string;
-/* Copy the string before operating on it with strtok() */
-strncpy(options_string, string, options_len);
-/* Extract arguments into a list and keep a count of the total */
-while ((argv_item = strtok(options_string_ptr, " ")) != NULL)
-{
-item_list_append(&option_argv, trim(argv_item));
-argc_item++;
-if (options_string_ptr != NULL)
-options_string_ptr = NULL;
-}
-pfree(options_string);
-/*
-* Array of argument values to pass to getopt_long - this will need to
-* include an empty string as the first value (normally this would be the
-* program name)
-*/
-local_argv_array = pg_malloc0(sizeof(char *) * (argc_item + 2));
-/* Insert a blank dummy program name at the start of the array */
-local_argv_array[0] = pg_malloc0(1);
-/*
-* Copy the previously extracted arguments from our list to the array
-*/
-for (cell = option_argv.head; cell; cell = cell->next)
-{
-int argv_len = strlen(cell->string) + 1;
-local_argv_array[c] = (char *)pg_malloc0(argv_len);
-strncpy(local_argv_array[c], cell->string, argv_len);
-c++;
-}
-local_argv_array[c] = NULL;
-item_list_free(&option_argv);
-*argv_array = local_argv_array;
-return argc_item;
-}
-void
-free_parsed_argv(char ***argv_array)
-{
-char **local_argv_array = *argv_array;
-int i = 0;
-while (local_argv_array[i] != NULL)
-{
-pfree((char *)local_argv_array[i]);
-i++;
-}
-pfree((char **)local_argv_array);
-*argv_array = NULL;
-}
-bool
-parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_options *backup_options, int server_version_num, ItemList *error_list)
-{
-bool backup_options_ok = true;
-int c = 0,
-argc_item = 0;
+static ItemList option_argv = {NULL, NULL};
+char *argv_item = NULL;
+int c,
+argc_item = 1;
 char **argv_array = NULL;
+ItemListCell *cell = NULL;
 int optindex = 0;
 struct option *long_options = NULL;
+bool backup_options_ok = true;
 /* We're only interested in these options */
 static struct option long_options_9[] =
@@ -1835,12 +1650,56 @@ parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_opti
 if (!strlen(pg_basebackup_options))
 return backup_options_ok;
+options_len = strlen(pg_basebackup_options) + 1;
+options_string = pg_malloc(options_len);
+options_string_ptr = options_string;
 if (server_version_num >= 100000)
 long_options = long_options_10;
 else
 long_options = long_options_9;
-argc_item = parse_output_to_argv(pg_basebackup_options, &argv_array);
+/* Copy the string before operating on it with strtok() */
+strncpy(options_string, pg_basebackup_options, options_len);
+/* Extract arguments into a list and keep a count of the total */
+while ((argv_item = strtok(options_string_ptr, " ")) != NULL)
+{
+item_list_append(&option_argv, argv_item);
+argc_item++;
+if (options_string_ptr != NULL)
+options_string_ptr = NULL;
+}
+/*
+* Array of argument values to pass to getopt_long - this will need to
+* include an empty string as the first value (normally this would be the
+* program name)
+*/
+argv_array = pg_malloc0(sizeof(char *) * (argc_item + 2));
+/* Insert a blank dummy program name at the start of the array */
+argv_array[0] = pg_malloc0(1);
+c = 1;
+/*
+* Copy the previously extracted arguments from our list to the array
+*/
+for (cell = option_argv.head; cell; cell = cell->next)
+{
+int argv_len = strlen(cell->string) + 1;
+argv_array[c] = pg_malloc0(argv_len);
+strncpy(argv_array[c], cell->string, argv_len);
+c++;
+}
+argv_array[c] = NULL;
 /* Reset getopt's optind variable */
 optind = 0;
@@ -1884,7 +1743,15 @@ parse_pg_basebackup_options(const char *pg_basebackup_options, t_basebackup_opti
 backup_options_ok = false;
 }
-free_parsed_argv(&argv_array);
+pfree(options_string);
+{
+int i;
+for (i = 0; i < argc_item + 2; i++)
+pfree(argv_array[i]);
+}
+pfree(argv_array);
 return backup_options_ok;
 }


@@ -73,7 +73,6 @@ typedef struct
 char conninfo[MAXLEN];
 char replication_user[NAMEDATALEN];
 char data_directory[MAXPGPATH];
-char config_directory[MAXPGPATH];
 char pg_bindir[MAXPGPATH];
 int replication_type;
@@ -83,31 +82,16 @@ typedef struct
 char log_file[MAXLEN];
 int log_status_interval;
-/* standby clone settings */
+/* standby action settings */
 bool use_replication_slots;
 char pg_basebackup_options[MAXLEN];
 char restore_command[MAXLEN];
 TablespaceList tablespace_mapping;
 char recovery_min_apply_delay[MAXLEN];
 bool recovery_min_apply_delay_provided;
-char archive_cleanup_command[MAXLEN];
 bool use_primary_conninfo_password;
 char passfile[MAXPGPATH];
-/* standby promote settings */
-int promote_check_timeout;
-int promote_check_interval;
-/* standby follow settings */
-int primary_follow_timeout;
-int standby_follow_timeout;
-/* standby switchover settings */
-int standby_reconnect_timeout;
-/* node rejoin settings */
-int node_rejoin_timeout;
 /* node check settings */
 int archive_ready_warning;
 int archive_ready_critical;
@@ -130,8 +114,7 @@ typedef struct
 int degraded_monitoring_timeout;
 int async_query_timeout;
 int primary_notification_timeout;
-int repmgrd_standby_startup_timeout;
-char repmgrd_pid_file[MAXPGPATH];
+int primary_follow_timeout;
 /* BDR settings */
 bool bdr_local_monitoring_only;
@@ -170,20 +153,11 @@ typedef struct
 #define T_CONFIGURATION_OPTIONS_INITIALIZER { \
 /* node information */ \
-UNKNOWN_NODE_ID, "", "", "", "", "", "", REPLICATION_TYPE_PHYSICAL, \
+UNKNOWN_NODE_ID, "", "", "", "", "", REPLICATION_TYPE_PHYSICAL, \
 /* log settings */ \
 "", "", "", DEFAULT_LOG_STATUS_INTERVAL, \
-/* standby clone settings */ \
-false, "", "", { NULL, NULL }, "", false, "", false, "", \
-/* standby promote settings */ \
-DEFAULT_PROMOTE_CHECK_TIMEOUT, DEFAULT_PROMOTE_CHECK_INTERVAL, \
-/* standby follow settings */ \
-DEFAULT_PRIMARY_FOLLOW_TIMEOUT, \
-DEFAULT_STANDBY_FOLLOW_TIMEOUT, \
-/* standby switchover settings */ \
-DEFAULT_STANDBY_RECONNECT_TIMEOUT, \
-/* node rejoin settings */ \
-DEFAULT_NODE_REJOIN_TIMEOUT, \
+/* standby action settings */ \
+false, "", "", { NULL, NULL }, "", false, false, "", \
 /* node check settings */ \
 DEFAULT_ARCHIVE_READY_WARNING, DEFAULT_ARCHIVE_READY_CRITICAL, \
 DEFAULT_REPLICATION_LAG_WARNING, DEFAULT_REPLICATION_LAG_CRITICAL, \
@@ -197,7 +171,7 @@ typedef struct
 false, -1, \
 DEFAULT_ASYNC_QUERY_TIMEOUT, \
 DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT, \
--1, "", \
+DEFAULT_PRIMARY_FOLLOW_TIMEOUT, \
 /* BDR settings */ \
 false, DEFAULT_BDR_RECOVERY_TIMEOUT, \
 /* service settings */ \
@@ -274,6 +248,7 @@ typedef struct
 }
 void set_progname(const char *argv0);
 const char *progname(void);
@@ -283,26 +258,19 @@ bool reload_config(t_configuration_options *orig_options);
 bool parse_recovery_conf(const char *data_dir, t_recovery_conf *conf);
-bool parse_bool(const char *s,
-const char *config_item,
-ItemList *error_list);
 int repmgr_atoi(const char *s,
 const char *config_item,
 ItemList *error_list,
 int minval);
 bool parse_pg_basebackup_options(const char *pg_basebackup_options,
 t_basebackup_options *backup_options,
 int server_version_num,
 ItemList *error_list);
-int parse_output_to_argv(const char *string, char ***argv_array);
-void free_parsed_argv(char ***argv_array);
 /* called by repmgr-client and repmgrd */
-void exit_with_cli_errors(ItemList *error_list, const char *repmgr_command);
+void exit_with_cli_errors(ItemList *error_list);
 void print_item_list(ItemList *item_list);
 #endif /* _REPMGR_CONFIGFILE_H_ */
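
The `minval` argument in the `repmgr_atoi()` prototype suggests a lower-bound check applied while parsing; the call sites earlier in this comparison pass `0`, `1` or `-1`. Below is a minimal sketch of that kind of check using `strtol()`. It is not repmgr's implementation: `parse_int_min` is a hypothetical name, and it reports failure through its return value rather than through an `ItemList`.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse "s" as a base-10 integer and accept it only
 * if it is a complete number, fits in an int, and is >= minval.
 * On success the value is stored in *result and true is returned.
 */
bool
parse_int_min(const char *s, int minval, int *result)
{
	char	   *end = NULL;
	long		value;

	assert(s != NULL && result != NULL);

	errno = 0;
	value = strtol(s, &end, 10);

	/* no digits consumed, or trailing characters after the number */
	if (end == s || *end != '\0')
		return false;

	/* out of range for long or for int */
	if (errno == ERANGE || value > INT_MAX || value < INT_MIN)
		return false;

	/* below the permitted minimum */
	if (value < minval)
		return false;

	*result = (int) value;
	return true;
}
```

With a check like this, the difference between a minimum of `-1` and a minimum of `1` decides whether a sentinel such as `-1` can be set explicitly in the configuration file.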

configure

@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.69 for repmgr 4.1.
+# Generated by GNU Autoconf 2.69 for repmgr 4.0.2.
 #
 # Report bugs to <pgsql-bugs@postgresql.org>.
 #
@@ -582,8 +582,8 @@ MAKEFLAGS=
 # Identity of this package.
 PACKAGE_NAME='repmgr'
 PACKAGE_TARNAME='repmgr'
-PACKAGE_VERSION='4.1'
-PACKAGE_STRING='repmgr 4.1'
+PACKAGE_VERSION='4.0.2'
+PACKAGE_STRING='repmgr 4.0.2'
 PACKAGE_BUGREPORT='pgsql-bugs@postgresql.org'
 PACKAGE_URL='https://2ndquadrant.com/en/resources/repmgr/'
@@ -633,6 +633,7 @@ SHELL'
 ac_subst_files=''
 ac_user_opts='
 enable_option_checking
+with_bdr_only
 '
 ac_precious_vars='build_alias
 host_alias
@@ -1178,7 +1179,7 @@ if test "$ac_init_help" = "long"; then
 # Omit some internal or obsolete options to make the list less imposing.
 # This message is too long to be a string in the A/UX 3.1 sh.
 cat <<_ACEOF
-\`configure' configures repmgr 4.1 to adapt to many kinds of systems.
+\`configure' configures repmgr 4.0.2 to adapt to many kinds of systems.
 Usage: $0 [OPTION]... [VAR=VALUE]...
@@ -1239,10 +1240,15 @@ fi
 if test -n "$ac_init_help"; then
 case $ac_init_help in
-short | recursive ) echo "Configuration of repmgr 4.1:";;
+short | recursive ) echo "Configuration of repmgr 4.0.2:";;
 esac
 cat <<\_ACEOF
+Optional Packages:
+--with-PACKAGE[=ARG] use PACKAGE [ARG=yes]
+--without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no)
+--with-bdr-only BDR-only build
 Some influential environment variables:
 PG_CONFIG Location to find pg_config for target PostgreSQL (default PATH)
@@ -1313,7 +1319,7 @@ fi
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
 cat <<\_ACEOF
-repmgr configure 4.1
+repmgr configure 4.0.2
 generated by GNU Autoconf 2.69
 Copyright (C) 2012 Free Software Foundation, Inc.
@@ -1332,7 +1338,7 @@ cat >config.log <<_ACEOF
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.
-It was created by repmgr $as_me 4.1, which was
+It was created by repmgr $as_me 4.0.2, which was
 generated by GNU Autoconf 2.69. Invocation command line was
 $ $0 $@
@@ -1688,6 +1694,20 @@ ac_config_headers="$ac_config_headers config.h"
+# Check whether --with-bdr_only was given.
+if test "${with_bdr_only+set}" = set; then :
+withval=$with_bdr_only;
+fi
+if test "x$with_bdr_only" != "x"; then :
+$as_echo "#define BDR_ONLY \"1\"" >>confdefs.h
+fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for a sed that does not truncate output" >&5
 $as_echo_n "checking for a sed that does not truncate output... " >&6; }
 if ${ac_cv_path_SED+:} false; then :
@@ -2359,7 +2379,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by repmgr $as_me 4.1, which was
+This file was extended by repmgr $as_me 4.0.2, which was
 generated by GNU Autoconf 2.69. Invocation command line was
 CONFIG_FILES = $CONFIG_FILES
@@ -2422,7 +2442,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`" ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\ ac_cs_version="\\
repmgr config.status 4.1 repmgr config.status 4.0.2
configured by $0, generated by GNU Autoconf 2.69, configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\" with options \\"\$ac_cs_config\\"


@@ -1,4 +1,4 @@
AC_INIT([repmgr], [4.1], [pgsql-bugs@postgresql.org], [repmgr], [https://2ndquadrant.com/en/resources/repmgr/]) AC_INIT([repmgr], [4.0.2], [pgsql-bugs@postgresql.org], [repmgr], [https://2ndquadrant.com/en/resources/repmgr/])
AC_COPYRIGHT([Copyright (c) 2010-2018, 2ndQuadrant Ltd.]) AC_COPYRIGHT([Copyright (c) 2010-2018, 2ndQuadrant Ltd.])
@@ -6,6 +6,12 @@ AC_CONFIG_HEADER(config.h)
AC_ARG_VAR([PG_CONFIG], [Location to find pg_config for target PostgreSQL (default PATH)]) AC_ARG_VAR([PG_CONFIG], [Location to find pg_config for target PostgreSQL (default PATH)])
AC_ARG_WITH([bdr_only], [AS_HELP_STRING([--with-bdr-only], [BDR-only build])])
AS_IF([test "x$with_bdr_only" != "x"],
[AC_DEFINE([BDR_ONLY], ["1"], [Only build repmgr for BDR])]
)
AC_PROG_SED AC_PROG_SED
if test -z "$PG_CONFIG"; then if test -z "$PG_CONFIG"; then


@@ -37,8 +37,13 @@ get_system_identifier(const char *data_directory)
uint64 system_identifier = UNKNOWN_SYSTEM_IDENTIFIER; uint64 system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
control_file_info = get_controlfile(data_directory); control_file_info = get_controlfile(data_directory);
system_identifier = control_file_info->system_identifier;
if (control_file_info->control_file_processed == true)
system_identifier = control_file_info->control_file->system_identifier;
else
system_identifier = UNKNOWN_SYSTEM_IDENTIFIER;
pfree(control_file_info->control_file);
pfree(control_file_info); pfree(control_file_info);
return system_identifier; return system_identifier;
@@ -52,8 +57,13 @@ get_db_state(const char *data_directory)
control_file_info = get_controlfile(data_directory); control_file_info = get_controlfile(data_directory);
state = control_file_info->state; if (control_file_info->control_file_processed == true)
state = control_file_info->control_file->state;
else
/* if we were unable to parse the control file, assume DB is shut down */
state = DB_SHUTDOWNED;
pfree(control_file_info->control_file);
pfree(control_file_info); pfree(control_file_info);
return state; return state;
@@ -68,8 +78,12 @@ get_latest_checkpoint_location(const char *data_directory)
control_file_info = get_controlfile(data_directory); control_file_info = get_controlfile(data_directory);
checkPoint = control_file_info->checkPoint; if (control_file_info->control_file_processed == false)
return InvalidXLogRecPtr;
checkPoint = control_file_info->control_file->checkPoint;
pfree(control_file_info->control_file);
pfree(control_file_info); pfree(control_file_info);
return checkPoint; return checkPoint;
@@ -84,8 +98,16 @@ get_data_checksum_version(const char *data_directory)
control_file_info = get_controlfile(data_directory); control_file_info = get_controlfile(data_directory);
data_checksum_version = (int) control_file_info->data_checksum_version; if (control_file_info->control_file_processed == false)
{
data_checksum_version = -1;
}
else
{
data_checksum_version = (int) control_file_info->control_file->data_checksum_version;
}
pfree(control_file_info->control_file);
pfree(control_file_info); pfree(control_file_info);
return data_checksum_version; return data_checksum_version;
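
The accessors above share one pattern: read the control file into a struct, record whether the read succeeded in a flag, and fall back to a sentinel value when it did not. The following is a minimal sketch of that pattern only, not repmgr's implementation. `DemoControlData`, `DemoControlInfo`, `demo_read_control()`, `demo_get_system_identifier()` and `DEMO_UNKNOWN_SYSTEM_IDENTIFIER` are hypothetical names invented for illustration, and the two-field struct is a stand-in for PostgreSQL's real `ControlFileData` layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for ControlFileData; the real layout is PostgreSQL's. */
typedef struct
{
	uint64_t	system_identifier;
	uint32_t	state;
} DemoControlData;

typedef struct
{
	bool		processed;	/* true only if the whole struct was read */
	DemoControlData data;
} DemoControlInfo;

/* Assumed sentinel for "could not determine the identifier" */
#define DEMO_UNKNOWN_SYSTEM_IDENTIFIER 0

/* Read the file at "path"; never fails hard, just leaves "processed" false */
static DemoControlInfo
demo_read_control(const char *path)
{
	DemoControlInfo info = {false, {DEMO_UNKNOWN_SYSTEM_IDENTIFIER, 0}};
	FILE	   *fp;

	assert(path != NULL);

	fp = fopen(path, "rb");
	if (fp == NULL)
		return info;

	/* a short read leaves "processed" false, so callers ignore "data" */
	if (fread(&info.data, sizeof(info.data), 1, fp) == 1)
		info.processed = true;

	fclose(fp);
	return info;
}

/* Accessor: return the stored value, or the sentinel if the read failed */
static uint64_t
demo_get_system_identifier(const char *path)
{
	DemoControlInfo info = demo_read_control(path);

	return info.processed ? info.data.system_identifier
		: DEMO_UNKNOWN_SYSTEM_IDENTIFIER;
}
```

Returning the info struct by value keeps the sketch free of allocation; the code in the diff instead allocates with `palloc0()` and releases with `pfree()` in each accessor.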
@@ -117,109 +139,33 @@ describe_db_state(DBState state)
/* /*
* We maintain our own version of get_controlfile() as we need cross-version * we maintain our own version of get_controlfile() as we need cross-version
* compatibility, and also don't care if the file isn't readable. * compatibility, and also don't care if the file isn't readable.
*/ */
static ControlFileInfo * static ControlFileInfo *
get_controlfile(const char *DataDir) get_controlfile(const char *DataDir)
{ {
ControlFileInfo *control_file_info; ControlFileInfo *control_file_info;
FILE *fp = NULL; int fd;
int fd, ret, version_num;
char PgVersionPath[MAXPGPATH] = "";
char ControlFilePath[MAXPGPATH] = ""; char ControlFilePath[MAXPGPATH] = "";
char file_version_string[64] = "";
long file_major, file_minor;
char *endptr = NULL;
void *ControlFileDataPtr = NULL;
int expected_size = 0;
control_file_info = palloc0(sizeof(ControlFileInfo)); control_file_info = palloc0(sizeof(ControlFileInfo));
/* set default values */
control_file_info->control_file_processed = false; control_file_info->control_file_processed = false;
control_file_info->system_identifier = UNKNOWN_SYSTEM_IDENTIFIER; control_file_info->control_file = palloc0(sizeof(ControlFileData));
control_file_info->state = DB_SHUTDOWNED;
control_file_info->checkPoint = InvalidXLogRecPtr;
control_file_info->data_checksum_version = -1;
/*
* Read PG_VERSION, as we'll need to determine which struct to read
* the control file contents into
*/
snprintf(PgVersionPath, MAXPGPATH, "%s/PG_VERSION", DataDir);
fp = fopen(PgVersionPath, "r");
if (fp == NULL)
{
log_warning(_("could not open file \"%s\" for reading"),
PgVersionPath);
log_detail("%s", strerror(errno));
return control_file_info;
}
file_version_string[0] = '\0';
ret = fscanf(fp, "%63s", file_version_string);
fclose(fp);
if (ret != 1 || endptr == file_version_string)
{
log_warning(_("unable to determine major version number from PG_VERSION"));
return control_file_info;
}
file_major = strtol(file_version_string, &endptr, 10);
file_minor = 0;
if (*endptr == '.')
file_minor = strtol(endptr + 1, NULL, 10);
version_num = ((int) file_major * 10000) + ((int) file_minor * 100);
if (version_num < 90300)
{
log_warning(_("Data directory appears to be initialised for %s"), file_version_string);
return control_file_info;
}
snprintf(ControlFilePath, MAXPGPATH, "%s/global/pg_control", DataDir); snprintf(ControlFilePath, MAXPGPATH, "%s/global/pg_control", DataDir);
if ((fd = open(ControlFilePath, O_RDONLY | PG_BINARY, 0)) == -1) if ((fd = open(ControlFilePath, O_RDONLY | PG_BINARY, 0)) == -1)
{ {
log_warning(_("could not open file \"%s\" for reading"), log_debug("could not open file \"%s\" for reading: %s",
ControlFilePath); ControlFilePath, strerror(errno));
log_detail("%s", strerror(errno));
return control_file_info; return control_file_info;
} }
if (read(fd, control_file_info->control_file, sizeof(ControlFileData)) != sizeof(ControlFileData))
if (version_num >= 90500)
{ {
expected_size = sizeof(ControlFileData95); log_debug("could not read file \"%s\": %s",
ControlFileDataPtr = palloc0(expected_size); ControlFilePath, strerror(errno));
}
else if (version_num >= 90400)
{
expected_size = sizeof(ControlFileData94);
ControlFileDataPtr = palloc0(expected_size);
}
else if (version_num >= 90300)
{
expected_size = sizeof(ControlFileData93);
ControlFileDataPtr = palloc0(expected_size);
}
if (read(fd, ControlFileDataPtr, expected_size) != expected_size)
{
log_warning(_("could not read file \"%s\""),
ControlFilePath);
log_detail("%s", strerror(errno));
return control_file_info; return control_file_info;
} }
@@ -227,33 +173,6 @@ get_controlfile(const char *DataDir)
control_file_info->control_file_processed = true; control_file_info->control_file_processed = true;
if (version_num >= 90500)
{
ControlFileData95 *ptr = (struct ControlFileData95 *)ControlFileDataPtr;
control_file_info->system_identifier = ptr->system_identifier;
control_file_info->state = ptr->state;
control_file_info->checkPoint = ptr->checkPoint;
control_file_info->data_checksum_version = ptr->data_checksum_version;
}
else if (version_num >= 90400)
{
ControlFileData94 *ptr = (struct ControlFileData94 *)ControlFileDataPtr;
control_file_info->system_identifier = ptr->system_identifier;
control_file_info->state = ptr->state;
control_file_info->checkPoint = ptr->checkPoint;
control_file_info->data_checksum_version = ptr->data_checksum_version;
}
else if (version_num >= 90300)
{
ControlFileData93 *ptr = (struct ControlFileData93 *)ControlFileDataPtr;
control_file_info->system_identifier = ptr->system_identifier;
control_file_info->state = ptr->state;
control_file_info->checkPoint = ptr->checkPoint;
control_file_info->data_checksum_version = ptr->data_checksum_version;
}
pfree(ControlFileDataPtr);
/* /*
* We don't check the CRC here as we're potentially checking a pg_control * We don't check the CRC here as we're potentially checking a pg_control
* file from a different PostgreSQL version to the one repmgr was compiled * file from a different PostgreSQL version to the one repmgr was compiled


@@ -12,261 +12,12 @@
#include "postgres_fe.h" #include "postgres_fe.h"
#include "catalog/pg_control.h" #include "catalog/pg_control.h"
/*
* A simplified representation of pg_control containing only those fields
* required by repmgr.
*/
typedef struct typedef struct
{ {
bool control_file_processed; bool control_file_processed;
uint64 system_identifier; ControlFileData *control_file;
DBState state;
XLogRecPtr checkPoint;
uint32 data_checksum_version;
} ControlFileInfo; } ControlFileInfo;
/* Same for 9.3, 9.4 */
typedef struct CheckPoint93
{
XLogRecPtr redo; /* next RecPtr available when we began to
* create CheckPoint (i.e. REDO start point) */
TimeLineID ThisTimeLineID; /* current TLI */
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
uint32 nextXidEpoch; /* higher-order bits of nextXid */
TransactionId nextXid; /* next free XID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
Oid oldestXidDB; /* database with minimum datfrozenxid */
MultiXactId oldestMulti; /* cluster-wide minimum datminmxid */
Oid oldestMultiDB; /* database with minimum datminmxid */
pg_time_t time; /* time stamp of checkpoint */
TransactionId oldestActiveXid;
} CheckPoint93;
/* Same for 9.5, 9.6, 10, HEAD */
typedef struct CheckPoint95
{
XLogRecPtr redo; /* next RecPtr available when we began to
* create CheckPoint (i.e. REDO start point) */
TimeLineID ThisTimeLineID; /* current TLI */
TimeLineID PrevTimeLineID; /* previous TLI, if this record begins a new
* timeline (equals ThisTimeLineID otherwise) */
bool fullPageWrites; /* current full_page_writes */
uint32 nextXidEpoch; /* higher-order bits of nextXid */
TransactionId nextXid; /* next free XID */
Oid nextOid; /* next free OID */
MultiXactId nextMulti; /* next free MultiXactId */
MultiXactOffset nextMultiOffset; /* next free MultiXact offset */
TransactionId oldestXid; /* cluster-wide minimum datfrozenxid */
Oid oldestXidDB; /* database with minimum datfrozenxid */
MultiXactId oldestMulti; /* cluster-wide minimum datminmxid */
Oid oldestMultiDB; /* database with minimum datminmxid */
pg_time_t time; /* time stamp of checkpoint */
TransactionId oldestCommitTsXid; /* oldest Xid with valid commit
* timestamp */
TransactionId newestCommitTsXid; /* newest Xid with valid commit
* timestamp */
TransactionId oldestActiveXid;
} CheckPoint95;
typedef struct ControlFileData93
{
uint64 system_identifier;
uint32 pg_control_version; /* PG_CONTROL_VERSION */
uint32 catalog_version_no; /* see catversion.h */
DBState state; /* see enum above */
pg_time_t time; /* time stamp of last pg_control update */
XLogRecPtr checkPoint; /* last check point record ptr */
XLogRecPtr prevCheckPoint; /* previous check point record ptr */
CheckPoint93 checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
XLogRecPtr minRecoveryPoint;
TimeLineID minRecoveryPointTLI;
XLogRecPtr backupStartPoint;
XLogRecPtr backupEndPoint;
bool backupEndRequired;
int wal_level;
int MaxConnections;
int max_prepared_xacts;
int max_locks_per_xact;
uint32 maxAlign; /* alignment requirement for tuples */
double floatFormat; /* constant 1234567.0 */
uint32 blcksz; /* data block size for this DB */
uint32 relseg_size; /* blocks per segment of large relation */
uint32 xlog_blcksz; /* block size within WAL files */
uint32 xlog_seg_size; /* size of each WAL segment */
uint32 nameDataLen; /* catalog name field width */
uint32 indexMaxKeys; /* max number of columns in an index */
uint32 toast_max_chunk_size; /* chunk size in TOAST tables */
/* flag indicating internal format of timestamp, interval, time */
bool enableIntTimes; /* int64 storage enabled? */
/* flags indicating pass-by-value status of various types */
bool float4ByVal; /* float4 pass-by-value? */
bool float8ByVal; /* float8, int8, etc pass-by-value? */
/* Are data pages protected by checksums? Zero if no checksum version */
uint32 data_checksum_version;
} ControlFileData93;
/*
* Following fields added since 9.3:
*
* int max_worker_processes;
* int max_prepared_xacts;
* int max_locks_per_xact;
*
*/
typedef struct ControlFileData94
{
uint64 system_identifier;
uint32 pg_control_version; /* PG_CONTROL_VERSION */
uint32 catalog_version_no; /* see catversion.h */
DBState state; /* see enum above */
pg_time_t time; /* time stamp of last pg_control update */
XLogRecPtr checkPoint; /* last check point record ptr */
XLogRecPtr prevCheckPoint; /* previous check point record ptr */
CheckPoint93 checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
XLogRecPtr minRecoveryPoint;
TimeLineID minRecoveryPointTLI;
XLogRecPtr backupStartPoint;
XLogRecPtr backupEndPoint;
bool backupEndRequired;
int wal_level;
bool wal_log_hints;
int MaxConnections;
int max_worker_processes;
int max_prepared_xacts;
int max_locks_per_xact;
uint32 maxAlign; /* alignment requirement for tuples */
double floatFormat; /* constant 1234567.0 */
uint32 blcksz; /* data block size for this DB */
uint32 relseg_size; /* blocks per segment of large relation */
uint32 xlog_blcksz; /* block size within WAL files */
uint32 xlog_seg_size; /* size of each WAL segment */
uint32 nameDataLen; /* catalog name field width */
uint32 indexMaxKeys; /* max number of columns in an index */
uint32 toast_max_chunk_size; /* chunk size in TOAST tables */
uint32 loblksize; /* chunk size in pg_largeobject */
bool enableIntTimes; /* int64 storage enabled? */
bool float4ByVal; /* float4 pass-by-value? */
bool float8ByVal; /* float8, int8, etc pass-by-value? */
/* Are data pages protected by checksums? Zero if no checksum version */
uint32 data_checksum_version;
} ControlFileData94;
/*
* Following field added since 9.4:
*
* bool track_commit_timestamp;
*
* Unchanged in 9.6
*
* In 10, following field appended *after* "data_checksum_version":
*
* char mock_authentication_nonce[MOCK_AUTH_NONCE_LEN];
*
* (but we don't care about that)
*/
typedef struct ControlFileData95
{
uint64 system_identifier;
uint32 pg_control_version; /* PG_CONTROL_VERSION */
uint32 catalog_version_no; /* see catversion.h */
DBState state; /* see enum above */
pg_time_t time; /* time stamp of last pg_control update */
XLogRecPtr checkPoint; /* last check point record ptr */
XLogRecPtr prevCheckPoint; /* previous check point record ptr */
CheckPoint95 checkPointCopy; /* copy of last check point record */
XLogRecPtr unloggedLSN; /* current fake LSN value, for unlogged rels */
XLogRecPtr minRecoveryPoint;
TimeLineID minRecoveryPointTLI;
XLogRecPtr backupStartPoint;
XLogRecPtr backupEndPoint;
bool backupEndRequired;
int wal_level;
bool wal_log_hints;
int MaxConnections;
int max_worker_processes;
int max_prepared_xacts;
int max_locks_per_xact;
bool track_commit_timestamp;
uint32 maxAlign; /* alignment requirement for tuples */
double floatFormat; /* constant 1234567.0 */
uint32 blcksz; /* data block size for this DB */
uint32 relseg_size; /* blocks per segment of large relation */
uint32 xlog_blcksz; /* block size within WAL files */
uint32 xlog_seg_size; /* size of each WAL segment */
uint32 nameDataLen; /* catalog name field width */
uint32 indexMaxKeys; /* max number of columns in an index */
uint32 toast_max_chunk_size; /* chunk size in TOAST tables */
uint32 loblksize; /* chunk size in pg_largeobject */
bool enableIntTimes; /* int64 storage enabled? */
bool float4ByVal; /* float4 pass-by-value? */
bool float8ByVal; /* float8, int8, etc pass-by-value? */
uint32 data_checksum_version;
} ControlFileData95;
extern DBState get_db_state(const char *data_directory); extern DBState get_db_state(const char *data_directory);
extern const char *describe_db_state(DBState state); extern const char *describe_db_state(DBState state);
extern int get_data_checksum_version(const char *data_directory); extern int get_data_checksum_version(const char *data_directory);

990
dbutils.c

File diff suppressed because it is too large


@@ -28,10 +28,8 @@
#include "strutil.h" #include "strutil.h"
#include "voting.h" #include "voting.h"
#define REPMGR_NODES_COLUMNS "n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo, n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file, '' AS upstream_node_name " #define REPMGR_NODES_COLUMNS "node_id, type, upstream_node_id, node_name, conninfo, repluser, slot_name, location, priority, active, config_file, '' AS upstream_node_name "
#define BDR2_NODES_COLUMNS "node_sysid, node_timeline, node_dboid, node_name, node_local_dsn, ''" #define BDR_NODES_COLUMNS "node_sysid, node_timeline, node_dboid, node_status, node_name, node_local_dsn, node_init_from_dsn, node_read_only, node_seq_id"
#define BDR3_NODES_COLUMNS "ns.node_id, 0, 0, ns.node_name, ns.interface_connstr, ns.peer_state_name"
#define ERRBUFF_SIZE 512 #define ERRBUFF_SIZE 512
@@ -81,14 +79,6 @@ typedef enum
NODE_STATUS_UNCLEAN_SHUTDOWN NODE_STATUS_UNCLEAN_SHUTDOWN
} NodeStatus; } NodeStatus;
typedef enum
{
CONN_UNKNOWN = -1,
CONN_OK,
CONN_BAD,
CONN_ERROR
} ConnectionStatus;
typedef enum typedef enum
{ {
SLOT_UNKNOWN = -1, SLOT_UNKNOWN = -1,
@@ -96,14 +86,6 @@ typedef enum
SLOT_ACTIVE SLOT_ACTIVE
} ReplSlotStatus; } ReplSlotStatus;
typedef enum
{
BACKUP_STATE_UNKNOWN = -1,
BACKUP_STATE_IN_BACKUP,
BACKUP_STATE_NO_BACKUP
} BackupState;
/* /*
* Struct to store node information * Struct to store node information
*/ */
@@ -193,7 +175,7 @@ typedef struct s_event_info
{ {
char *node_name; char *node_name;
char *conninfo_str; char *conninfo_str;
int node_id; int former_primary_id;
} t_event_info; } t_event_info;
#define T_EVENT_INFO_INITIALIZER { \ #define T_EVENT_INFO_INITIALIZER { \
@@ -247,14 +229,18 @@ typedef struct s_bdr_node_info
char node_sysid[MAXLEN]; char node_sysid[MAXLEN];
uint32 node_timeline; uint32 node_timeline;
uint32 node_dboid; uint32 node_dboid;
char node_status;
char node_name[MAXLEN]; char node_name[MAXLEN];
char node_local_dsn[MAXLEN]; char node_local_dsn[MAXLEN];
char peer_state_name[MAXLEN]; char node_init_from_dsn[MAXLEN];
bool read_only;
uint32 node_seq_id;
} t_bdr_node_info; } t_bdr_node_info;
#define T_BDR_NODE_INFO_INITIALIZER { \ #define T_BDR_NODE_INFO_INITIALIZER { \
"", InvalidOid, InvalidOid, \ "", InvalidOid, InvalidOid, \
"", "", "" \ '?', "", "", "", \
false, -1 \
} }
@@ -349,6 +335,9 @@ bool atobool(const char *value);
PGconn *establish_db_connection(const char *conninfo, PGconn *establish_db_connection(const char *conninfo,
const bool exit_on_error); const bool exit_on_error);
PGconn *establish_db_connection_quiet(const char *conninfo); PGconn *establish_db_connection_quiet(const char *conninfo);
PGconn *establish_db_connection_as_user(const char *conninfo,
const char *user,
const bool exit_on_error);
PGconn *establish_db_connection_by_params(t_conninfo_param_list *param_list, PGconn *establish_db_connection_by_params(t_conninfo_param_list *param_list,
const bool exit_on_error); const bool exit_on_error);
@@ -359,11 +348,10 @@ PGconn *get_primary_connection(PGconn *standby_conn, int *primary_id, char *p
PGconn *get_primary_connection_quiet(PGconn *standby_conn, int *primary_id, char *primary_conninfo_out); PGconn *get_primary_connection_quiet(PGconn *standby_conn, int *primary_id, char *primary_conninfo_out);
bool is_superuser_connection(PGconn *conn, t_connection_user *userinfo); bool is_superuser_connection(PGconn *conn, t_connection_user *userinfo);
void close_connection(PGconn **conn);
/* conninfo manipulation functions */ /* conninfo manipulation functions */
bool get_conninfo_value(const char *conninfo, const char *keyword, char *output); bool get_conninfo_value(const char *conninfo, const char *keyword, char *output);
bool get_conninfo_default_value(const char *param, char *output, int maxlen);
void initialize_conninfo_params(t_conninfo_param_list *param_list, bool set_defaults); void initialize_conninfo_params(t_conninfo_param_list *param_list, bool set_defaults);
void free_conninfo_params(t_conninfo_param_list *param_list); void free_conninfo_params(t_conninfo_param_list *param_list);
void copy_conninfo_params(t_conninfo_param_list *dest_list, t_conninfo_param_list *source_list); void copy_conninfo_params(t_conninfo_param_list *dest_list, t_conninfo_param_list *source_list);
@@ -371,11 +359,10 @@ void conn_to_param_list(PGconn *conn, t_conninfo_param_list *param_list);
void param_set(t_conninfo_param_list *param_list, const char *param, const char *value); void param_set(t_conninfo_param_list *param_list, const char *param, const char *value);
void param_set_ine(t_conninfo_param_list *param_list, const char *param, const char *value); void param_set_ine(t_conninfo_param_list *param_list, const char *param, const char *value);
char *param_get(t_conninfo_param_list *param_list, const char *param); char *param_get(t_conninfo_param_list *param_list, const char *param);
bool parse_conninfo_string(const char *conninfo_str, t_conninfo_param_list *param_list, char **errmsg, bool ignore_local_params); bool parse_conninfo_string(const char *conninfo_str, t_conninfo_param_list *param_list, char *errmsg, bool ignore_local_params);
char *param_list_to_string(t_conninfo_param_list *param_list); char *param_list_to_string(t_conninfo_param_list *param_list);
bool has_passfile(void); bool has_passfile(void);
/* transaction functions */ /* transaction functions */
bool begin_transaction(PGconn *conn); bool begin_transaction(PGconn *conn);
bool commit_transaction(PGconn *conn); bool commit_transaction(PGconn *conn);
@@ -394,11 +381,11 @@ bool get_cluster_size(PGconn *conn, char *size);
int get_server_version(PGconn *conn, char *server_version); int get_server_version(PGconn *conn, char *server_version);
RecoveryType get_recovery_type(PGconn *conn); RecoveryType get_recovery_type(PGconn *conn);
int get_primary_node_id(PGconn *conn); int get_primary_node_id(PGconn *conn);
bool can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *reason);
int get_ready_archive_files(PGconn *conn, const char *data_directory); int get_ready_archive_files(PGconn *conn, const char *data_directory);
bool identify_system(PGconn *repl_conn, t_system_identification *identification); bool identify_system(PGconn *repl_conn, t_system_identification *identification);
bool repmgrd_set_local_node_id(PGconn *conn, int local_node_id); bool repmgrd_set_local_node_id(PGconn *conn, int local_node_id);
int repmgrd_get_local_node_id(PGconn *conn); int repmgrd_get_local_node_id(PGconn *conn);
BackupState server_in_exclusive_backup_mode(PGconn *conn);
/* extension functions */ /* extension functions */
ExtensionStatus get_repmgr_extension_status(PGconn *conn); ExtensionStatus get_repmgr_extension_status(PGconn *conn);
@@ -413,8 +400,6 @@ t_server_type parse_node_type(const char *type);
const char *get_node_type_string(t_server_type type); const char *get_node_type_string(t_server_type type);
RecordStatus get_node_record(PGconn *conn, int node_id, t_node_info *node_info); RecordStatus get_node_record(PGconn *conn, int node_id, t_node_info *node_info);
RecordStatus get_node_record_with_upstream(PGconn *conn, int node_id, t_node_info *node_info);
RecordStatus get_node_record_by_name(PGconn *conn, const char *node_name, t_node_info *node_info); RecordStatus get_node_record_by_name(PGconn *conn, const char *node_name, t_node_info *node_info);
t_node_info *get_node_record_pointer(PGconn *conn, int node_id); t_node_info *get_node_record_pointer(PGconn *conn, int node_id);
@@ -425,8 +410,7 @@ void get_all_node_records(PGconn *conn, NodeInfoList *node_list);
void get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *nodes); void get_downstream_node_records(PGconn *conn, int node_id, NodeInfoList *nodes);
void get_active_sibling_node_records(PGconn *conn, int node_id, int upstream_node_id, NodeInfoList *node_list); void get_active_sibling_node_records(PGconn *conn, int node_id, int upstream_node_id, NodeInfoList *node_list);
void get_node_records_by_priority(PGconn *conn, NodeInfoList *node_list); void get_node_records_by_priority(PGconn *conn, NodeInfoList *node_list);
bool get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list); void get_all_node_records_with_upstream(PGconn *conn, NodeInfoList *node_list);
bool get_downstream_nodes_with_missing_slot(PGconn *conn, int this_node_id, NodeInfoList *noede_list);
bool create_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info); bool create_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info);
bool update_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info); bool update_node_record(PGconn *conn, char *repmgr_action, t_node_info *node_info);
@@ -435,7 +419,6 @@ bool truncate_node_records(PGconn *conn);
bool update_node_record_set_active(PGconn *conn, int this_node_id, bool active); bool update_node_record_set_active(PGconn *conn, int this_node_id, bool active);
bool update_node_record_set_primary(PGconn *conn, int this_node_id); bool update_node_record_set_primary(PGconn *conn, int this_node_id);
bool update_node_record_set_active_standby(PGconn *conn, int this_node_id);
bool update_node_record_set_upstream(PGconn *conn, int this_node_id, int new_upstream_node_id); bool update_node_record_set_upstream(PGconn *conn, int this_node_id, int new_upstream_node_id);
bool update_node_record_status(PGconn *conn, int this_node_id, char *type, int upstream_node_id, bool active); bool update_node_record_status(PGconn *conn, int this_node_id, char *type, int upstream_node_id, bool active);
bool update_node_record_conn_priority(PGconn *conn, t_configuration_options *options); bool update_node_record_conn_priority(PGconn *conn, t_configuration_options *options);
@@ -462,8 +445,6 @@ void create_slot_name(char *slot_name, int node_id);
bool create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg); bool create_replication_slot(PGconn *conn, char *slot_name, int server_version_num, PQExpBufferData *error_msg);
bool drop_replication_slot(PGconn *conn, char *slot_name); bool drop_replication_slot(PGconn *conn, char *slot_name);
RecordStatus get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record); RecordStatus get_slot_record(PGconn *conn, char *slot_name, t_replication_slot *record);
int get_free_replication_slot_count(PGconn *conn);
int get_inactive_replication_slots(PGconn *conn, KeyValueList *list);
/* tablespace functions */ /* tablespace functions */
bool get_tablespace_name_by_location(PGconn *conn, const char *location, char *name); bool get_tablespace_name_by_location(PGconn *conn, const char *location, char *name);
@@ -474,8 +455,6 @@ int wait_connection_availability(PGconn *conn, long long timeout);
/* node availability functions */ /* node availability functions */
bool is_server_available(const char *conninfo); bool is_server_available(const char *conninfo);
bool is_server_available_params(t_conninfo_param_list *param_list);
void connection_ping(PGconn *conn);
/* monitoring functions */ /* monitoring functions */
void void
@@ -514,14 +493,12 @@ void get_node_replication_stats(PGconn *conn, int server_version_num, t_node_in
bool is_downstream_node_attached(PGconn *conn, char *node_name); bool is_downstream_node_attached(PGconn *conn, char *node_name);
/* BDR functions */ /* BDR functions */
int get_bdr_version_num(void);
void get_all_bdr_node_records(PGconn *conn, BdrNodeInfoList *node_list); void get_all_bdr_node_records(PGconn *conn, BdrNodeInfoList *node_list);
RecordStatus get_bdr_node_record_by_name(PGconn *conn, const char *node_name, t_bdr_node_info *node_info); RecordStatus get_bdr_node_record_by_name(PGconn *conn, const char *node_name, t_bdr_node_info *node_info);
bool is_bdr_db(PGconn *conn, PQExpBufferData *output); bool is_bdr_db(PGconn *conn, PQExpBufferData *output);
bool is_bdr_db_quiet(PGconn *conn); bool is_bdr_db_quiet(PGconn *conn);
bool is_active_bdr_node(PGconn *conn, const char *node_name); bool is_active_bdr_node(PGconn *conn, const char *node_name);
bool is_bdr_repmgr(PGconn *conn); bool is_bdr_repmgr(PGconn *conn);
char *get_default_bdr_replication_set(PGconn *conn);
bool is_table_in_bdr_replication_set(PGconn *conn, const char *tablename, const char *set); bool is_table_in_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
bool add_table_to_bdr_replication_set(PGconn *conn, const char *tablename, const char *set); bool add_table_to_bdr_replication_set(PGconn *conn, const char *tablename, const char *set);
void add_extension_tables_to_bdr_replication_set(PGconn *conn); void add_extension_tables_to_bdr_replication_set(PGconn *conn);

196
dirutil.c

@@ -21,7 +21,6 @@
#include <unistd.h> #include <unistd.h>
#include <dirent.h> #include <dirent.h>
#include <signal.h>
#include <sys/stat.h> #include <sys/stat.h>
#include <errno.h> #include <errno.h>
#include <stdio.h> #include <stdio.h>
@@ -35,33 +34,34 @@
#include "dirutil.h" #include "dirutil.h"
#include "strutil.h" #include "strutil.h"
#include "log.h" #include "log.h"
#include "controldata.h"
static int unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf); static int unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf);
/* PID can be negative if backend is standalone */
typedef long pgpid_t;
/* /*
* Check if a directory exists, and if so whether it is empty. * make sure the directory either doesn't exist or is empty
* we use this function to check the new data directory and
* the directories for tablespaces
* *
* This function is used for checking both the data directory * This is the same check initdb does on the new PGDATA dir
* and tablespace directories. *
* Returns 0 if nonexistent, 1 if exists and empty, 2 if not empty,
* or -1 if trouble accessing directory
*/ */
DataDirState int
check_dir(char *path) check_dir(char *path)
{ {
DIR *chkdir = NULL; DIR *chkdir;
struct dirent *file = NULL; struct dirent *file;
int result = DIR_EMPTY; int result = 1;
errno = 0; errno = 0;
chkdir = opendir(path); chkdir = opendir(path);
if (!chkdir) if (!chkdir)
return (errno == ENOENT) ? DIR_NOENT : DIR_ERROR; return (errno == ENOENT) ? 0 : -1;
while ((file = readdir(chkdir)) != NULL) while ((file = readdir(chkdir)) != NULL)
{ {
@@ -73,15 +73,25 @@ check_dir(char *path)
} }
else else
{ {
result = DIR_NOT_EMPTY; result = 2; /* not empty */
break; break;
} }
} }
#ifdef WIN32
/*
* This fix is in mingw cvs (runtime/mingwex/dirent.c rev 1.4), but not in
* released version
*/
if (GetLastError() == ERROR_NO_MORE_FILES)
errno = 0;
#endif
closedir(chkdir); closedir(chkdir);
if (errno != 0) if (errno != 0)
return DIR_ERROR; /* some kind of I/O error? */ return -1; /* some kind of I/O error? */
return result; return result;
} }
@@ -96,13 +106,12 @@ create_dir(char *path)
if (mkdir_p(path, 0700) == 0) if (mkdir_p(path, 0700) == 0)
return true; return true;
log_error(_("unable to create directory \"%s\""), path); log_error(_("unable to create directory \"%s\": %s"),
log_detail("%s", strerror(errno)); path, strerror(errno));
return false; return false;
} }
bool bool
set_dir_permissions(char *path) set_dir_permissions(char *path)
{ {
@@ -137,6 +146,26 @@ mkdir_p(char *path, mode_t omode)
oumask = 0; oumask = 0;
retval = 0; retval = 0;
#ifdef WIN32
/* skip network and drive specifiers for win32 */
if (strlen(p) >= 2)
{
if (p[0] == '/' && p[1] == '/')
{
/* network drive */
p = strstr(p + 2, "/");
if (p == NULL)
return 1;
}
else if (p[1] == ':' &&
((p[0] >= 'a' && p[0] <= 'z') ||
(p[0] >= 'A' && p[0] <= 'Z')))
{
/* local drive */
p += 2;
}
}
#endif
if (p[0] == '/') /* Skip leading '/'. */ if (p[0] == '/') /* Skip leading '/'. */
++p; ++p;
@@ -213,91 +242,17 @@ is_pg_dir(char *path)
return false; return false;
} }
/*
* Attempt to determine if a PostgreSQL data directory is in use
* by reading the pidfile. This is the same mechanism used by
* "pg_ctl".
*
* This function will abort with appropriate log messages if a file error
* is encountered, as the user will need to address the situation before
* any further useful progress can be made.
*/
PgDirState
is_pg_running(char *path)
{
long pid;
FILE *pidf;
char pid_file[MAXPGPATH];
/* it's reasonable to assume the pidfile name will not change */
snprintf(pid_file, MAXPGPATH, "%s/postmaster.pid", path);
pidf = fopen(pid_file, "r");
if (pidf == NULL)
{
/*
* No PID file - PostgreSQL shouldn't be running. From 9.3 (the
* earliest version we care about) removal of the PID file will
* cause the postmaster to shut down, so it's highly unlikely
* that PostgreSQL will still be running.
*/
if (errno == ENOENT)
{
return PG_DIR_NOT_RUNNING;
}
else
{
log_error(_("unable to open PostgreSQL PID file \"%s\""), pid_file);
log_detail("%s", strerror(errno));
exit(ERR_BAD_CONFIG);
}
}
/*
* In the unlikely event we're unable to extract a PID from the PID file,
* log a warning but assume we're not dealing with a running instance
* as PostgreSQL should have shut itself down in these cases anyway.
*/
if (fscanf(pidf, "%ld", &pid) != 1)
{
/* Is the file empty? */
if (ftell(pidf) == 0 && feof(pidf))
{
log_warning(_("PostgreSQL PID file \"%s\" is empty"), pid_file);
}
else
{
log_warning(_("invalid data in PostgreSQL PID file \"%s\""), pid_file);
}
/* close the PID file before returning early */
fclose(pidf);
return PG_DIR_NOT_RUNNING;
}
fclose(pidf);
if (pid == getpid())
return PG_DIR_NOT_RUNNING;
if (pid == getppid())
return PG_DIR_NOT_RUNNING;
if (kill(pid, 0) == 0)
return PG_DIR_RUNNING;
return PG_DIR_NOT_RUNNING;
}
bool bool
create_pg_dir(char *path, bool force) create_pg_dir(char *path, bool force)
{ {
/* Check this directory can be used as a PGDATA dir */ bool pg_dir = false;
/* Check this directory could be used as a PGDATA dir */
switch (check_dir(path)) switch (check_dir(path))
{ {
case DIR_NOENT: case 0:
/* directory does not exist, attempt to create it */ /* dir not there, must create it */
log_info(_("creating directory \"%s\"..."), path); log_info(_("creating directory \"%s\"..."), path);
if (!create_dir(path)) if (!create_dir(path))
@@ -307,51 +262,52 @@ create_pg_dir(char *path, bool force)
return false; return false;
} }
break; break;
case DIR_EMPTY: case 1:
/* exists but empty, fix permissions and use it */ /* Present but empty, fix permissions and use it */
log_info(_("checking and correcting permissions on existing directory \"%s\""), log_info(_("checking and correcting permissions on existing directory %s"),
path); path);
if (!set_dir_permissions(path)) if (!set_dir_permissions(path))
{ {
log_error(_("unable to change permissions of directory \"%s\""), path); log_error(_("unable to change permissions of directory \"%s\":\n %s"),
log_detail("%s", strerror(errno)); path, strerror(errno));
return false; return false;
} }
break; break;
case DIR_NOT_EMPTY: case 2:
/* exists but is not empty */ /* Present and not empty */
log_warning(_("directory \"%s\" exists but is not empty"), log_warning(_("directory \"%s\" exists but is not empty"),
path); path);
if (is_pg_dir(path)) pg_dir = is_pg_dir(path);
{
if (force == true)
{
log_notice(_("-F/--force provided - deleting existing data directory \"%s\""), path);
nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
return true;
}
return false; if (pg_dir && force)
}
else
{ {
if (force == true) /* TODO: check DB state, if not running overwrite */
if (false)
{ {
log_notice(_("deleting existing directory \"%s\""), path); log_notice(_("deleting existing data directory \"%s\""), path);
nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS); nftw(path, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS);
return true;
} }
/* Let it continue */
break;
}
else if (pg_dir && !force)
{
log_hint(_("This looks like a PostgreSQL directory.\n"
"If you are sure you want to clone here, "
"please check there is no PostgreSQL server "
"running and use the -F/--force option"));
return false; return false;
} }
break;
case DIR_ERROR: return false;
default:
log_error(_("could not access directory \"%s\": %s"), log_error(_("could not access directory \"%s\": %s"),
path, strerror(errno)); path, strerror(errno));
return false; return false;
} }
return true; return true;
} }
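
A minimal standalone sketch of the directory check that both sides of this diff implement, assuming a POSIX system. The `SketchDirState` type and the `sketch_check_dir` name are hypothetical and not part of repmgr; the enum values simply mirror the `DataDirState` values shown in the dirutil.h hunk. Unlike the code in the diff, this version resets `errno` before each `readdir()` call and inspects it only when `readdir()` returns NULL, so a stale value cannot be mistaken for an I/O error.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <string.h>

/* Hypothetical type; values mirror DataDirState from the diff */
typedef enum
{
	SKETCH_DIR_ERROR = -1,
	SKETCH_DIR_NOENT,
	SKETCH_DIR_EMPTY,
	SKETCH_DIR_NOT_EMPTY
} SketchDirState;

/* Hypothetical helper: classify "path" as missing, empty, non-empty or inaccessible */
SketchDirState
sketch_check_dir(const char *path)
{
	DIR		   *dir;
	struct dirent *entry;
	SketchDirState result = SKETCH_DIR_EMPTY;

	assert(path != NULL);

	errno = 0;
	dir = opendir(path);
	if (dir == NULL)
		return (errno == ENOENT) ? SKETCH_DIR_NOENT : SKETCH_DIR_ERROR;

	for (;;)
	{
		/* readdir() returns NULL at end-of-directory and on error; errno tells them apart */
		errno = 0;
		entry = readdir(dir);
		if (entry == NULL)
		{
			if (errno != 0)
				result = SKETCH_DIR_ERROR;
			break;
		}

		/* "." and ".." do not count as content */
		if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
			continue;

		result = SKETCH_DIR_NOT_EMPTY;
		break;
	}

	closedir(dir);
	return result;
}
```

A caller would switch on the returned value in the same way `create_pg_dir()` does in the hunk above.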


@@ -19,29 +19,12 @@
#ifndef _DIRUTIL_H_ #ifndef _DIRUTIL_H_
#define _DIRUTIL_H_ #define _DIRUTIL_H_
typedef enum
{
DIR_ERROR = -1,
DIR_NOENT,
DIR_EMPTY,
DIR_NOT_EMPTY
} DataDirState;
typedef enum
{
PG_DIR_ERROR = -1,
PG_DIR_NOT_RUNNING,
PG_DIR_RUNNING
} PgDirState;
extern int mkdir_p(char *path, mode_t omode); extern int mkdir_p(char *path, mode_t omode);
extern bool set_dir_permissions(char *path); extern bool set_dir_permissions(char *path);
extern DataDirState check_dir(char *path); extern int check_dir(char *path);
extern bool create_dir(char *path); extern bool create_dir(char *path);
extern bool is_pg_dir(char *path); extern bool is_pg_dir(char *path);
extern PgDirState is_pg_running(char *path);
extern bool create_pg_dir(char *path, bool force); extern bool create_pg_dir(char *path, bool force);
extern int rmdir_recursive(char *path); extern int rmdir_recursive(char *path);
#endif #endif
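
The `is_pg_running()` declaration that appears on only one side of this hunk corresponds to the pidfile probe in the dirutil.c diff. A minimal sketch of that probe, assuming a POSIX system, might look like the following. Both function names are hypothetical and not part of repmgr, and treating any PID of zero or below as "not running" is a simplification made here (the diff notes that a standalone backend can record a negative PID).

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read the leading PID from a pidfile; false if unreadable */
bool
sketch_read_pidfile(const char *pid_file, long *pid_out)
{
	FILE	   *fp;
	bool		ok;

	assert(pid_file != NULL && pid_out != NULL);

	fp = fopen(pid_file, "r");
	if (fp == NULL)
		return false;

	ok = (fscanf(fp, "%ld", pid_out) == 1);

	/* always close the file, including when no PID could be parsed */
	fclose(fp);
	return ok;
}

/* Hypothetical helper: is "pid" a live process other than ourselves or our parent? */
bool
sketch_is_other_process_alive(long pid)
{
	/* simplification: kill() gives PIDs <= 0 process-group semantics, so reject them */
	if (pid <= 0)
		return false;

	/* same exclusions as the code in the diff */
	if (pid == (long) getpid() || pid == (long) getppid())
		return false;

	/* signal 0 performs the existence and permission checks without sending a signal */
	return kill((pid_t) pid, 0) == 0;
}
```

Combining the two (read the PID, then probe it) gives the same running / not running decision that the enum `PgDirState` encodes.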


@@ -24,9 +24,8 @@
series will no longer be actively maintained. series will no longer be actively maintained.
</para> </para>
<para> <para>
&repmgr; 2.x supports PostgreSQL 9.0 ~ 9.3. While it is compatible repmgr 2.x supports PostgreSQL 9.0 ~ 9.3. While it is compatible
with PostgreSQL 9.3, we recommend using repmgr 4.x. &repmgr; 2.x is with PostgreSQL 9.3, we recommend using repmgr 4.x.
no longer maintained.
</para> </para>
</sect2> </sect2>
@@ -36,7 +35,7 @@
Replication slots, introduced in PostgreSQL 9.4, ensure that the Replication slots, introduced in PostgreSQL 9.4, ensure that the
primary server will retain WAL files until they have been consumed primary server will retain WAL files until they have been consumed
by all standby servers. This makes WAL file management much easier, by all standby servers. This makes WAL file management much easier,
and if used &repmgr; will no longer insist on a fixed minimum number and if used `repmgr` will no longer insist on a fixed minimum number
(default: 5000) of WAL files being retained. (default: 5000) of WAL files being retained.
</para> </para>
<para> <para>
@@ -70,50 +69,12 @@
in a streaming replication cluster. in a streaming replication cluster.
</para> </para>
</sect2> </sect2>
<sect2 id="faq-upgrades" xreflabel="Upgrading PostgreSQL with repmgr">
<title>Can &repmgr; assist with upgrading a PostgreSQL cluster?</title>
<para>
For <emphasis>minor</emphasis> version upgrades, e.g. from 9.6.7 to 9.6.8, a common
approach is to upgrade a standby to the latest version, perform a
<link linkend="performing-switchover">switchover</link> promoting it to a primary,
then upgrade the former primary.
</para>
<para>
For <emphasis>major</emphasis> version upgrades (e.g. from PostgreSQL 9.6 to PostgreSQL 10),
the traditional approach is to "reseed" a cluster by upgrading a single
node with <ulink url="https://www.postgresql.org/docs/current/static/pgupgrade.html">pg_upgrade</ulink>
and recloning standbys from this.
</para>
<para>
To minimize downtime during major upgrades, for more recent PostgreSQL
versions (PostgreSQL 9.4 and later),
<ulink url="https://www.2ndquadrant.com/en/resources/pglogical/">pglogical</ulink>
can be used to set up a parallel cluster using the newer PostgreSQL version,
which can be kept in sync with the existing production cluster until the
new cluster is ready to be put into production.
</para>
</sect2>
<sect2 id="faq-libdir-repmgr-error">
<title>What does this error mean: <literal>ERROR: could not access file "$libdir/repmgr"</literal>?</title>
<para>
It means the &repmgr; extension code is not installed in the
PostgreSQL application directory. This typically happens when using PostgreSQL
packages provided by a third-party vendor, which often have different
filesystem layouts.
</para>
<para>
Either use PostgreSQL packages provided by the community or 2ndQuadrant; if this
is not possible, contact your vendor for assistance.
</para>
</sect2>
</sect1> </sect1>
<sect1 id="faq-repmgr" xreflabel="repmgr"> <sect1 id="faq-repmgr" xreflabel="repmgr">
<title><command>repmgr</command></title> <title><command>repmgr</command></title>
<sect2 id="faq-register-existing-node" xreflabel="registering an existing node"> <sect2 id="faq-register-existing-node" xreflabel="">
<title>Can I register an existing PostgreSQL server with repmgr?</title> <title>Can I register an existing PostgreSQL server with repmgr?</title>
<para> <para>
Yes, any existing PostgreSQL server which is part of the same replication Yes, any existing PostgreSQL server which is part of the same replication
@@ -122,26 +83,6 @@
</para> </para>
</sect2> </sect2>
<sect2 id="faq-repmgr-clone-other-source" >
<title>Can I use a standby not cloned by &repmgr; as a &repmgr; node?</title>
<para>
For a standby which has been manually cloned or recovered from an external
backup manager such as Barman, the command
<command><link linkend="repmgr-standby-clone">repmgr standby clone --recovery-conf-only</link></command>
can be used to create the correct <filename>recovery.conf</filename> file for
use with &repmgr; (and will create a replication slot if required). Once this has been done,
<link linkend="repmgr-standby-register">register the node</link> as usual.
</para>
</sect2>
<sect2 id="faq-repmgr-recovery-conf" >
<title>What does &repmgr; write in <filename>recovery.conf</filename>, and what options can be set there?</title>
<para>
See section <link linkend="repmgr-standby-clone-recovery-conf">Customising recovery.conf</link>.
</para>
</sect2>
<sect2 id="faq-repmgr-failed-primary-standby" xreflabel="Reintegrate a failed primary as a standby"> <sect2 id="faq-repmgr-failed-primary-standby" xreflabel="Reintegrate a failed primary as a standby">
<title>How can a failed primary be re-added as a standby?</title> <title>How can a failed primary be re-added as a standby?</title>
<para> <para>
@@ -150,23 +91,19 @@
needs to be re-registered as a standby. needs to be re-registered as a standby.
</para> </para>
<para> <para>
It's possible to use <command>pg_rewind</command> to re-synchronise the existing data In PostgreSQL 9.5 and later, it's possible to use <command>pg_rewind</command>
directory, which will usually be much to re-synchronise the existing data directory, which will usually be much
faster than re-cloning the server. However <command>pg_rewind</command> can only faster than re-cloning the server. However <command>pg_rewind</command> can only
be used if PostgreSQL either has <varname>wal_log_hints</varname> enabled, or be used if PostgreSQL either has <varname>wal_log_hints</varname> enabled, or
data checksums were enabled when the cluster was initialized. data checksums were enabled when the cluster was initialized.
</para> </para>
<para>
Note that <command>pg_rewind</command> is available as part of the core PostgreSQL
distribution from PostgreSQL 9.5, and as a third-party utility for PostgreSQL 9.3 and 9.4.
</para>
<para> <para>
&repmgr; provides the command <command>repmgr node rejoin</command> which can &repmgr; provides the command <command>repmgr node rejoin</command> which can
optionally execute <command>pg_rewind</command>; see the <xref linkend="repmgr-node-rejoin"> optionally execute <command>pg_rewind</command>; see the <xref linkend="repmgr-node-rejoin">
documentation for details, in particular the section <xref linkend="repmgr-node-rejoin-pg-rewind">. documentation for details.
</para> </para>
<para> <para>
If <command>pg_rewind</command> cannot be used, then the data directory will need If <command>pg_rewind</command> cannot be used, then the data directory will have
to be re-cloned from scratch. to be re-cloned from scratch.
</para> </para>
@@ -243,9 +180,6 @@
</para> </para>
</sect2> </sect2>
</sect1> </sect1>
<sect1 id="faq-repmgrd" xreflabel="repmgrd"> <sect1 id="faq-repmgrd" xreflabel="repmgrd">


@@ -1,119 +1,48 @@
<appendix id="appendix-packages" xreflabel="Package details"> <appendix id="appendix-packages" xreflabel="Package details">
<indexterm> <indexterm>
<primary>packages</primary> <primary>packages</primary>
</indexterm> </indexterm>
<title>&repmgr; package details</title> <title>&repmgr; package details</title>
<para>
This section provides technical details about various &repmgr; binary
packages, such as location of the installed binaries and
configuration files.
</para>
<sect1 id="packages-centos" xreflabel="CentOS packages">
<title>CentOS, RHEL, Scientific Linux etc.</title>
<para> <para>
This section provides technical details about various &repmgr; binary Currently packages are provided for versions 6.x and 7.x of CentOS et al.
packages, such as location of the installed binaries and
configuration files.
</para> </para>
<sect1 id="packages-centos" xreflabel="CentOS packages"> <note>
<title>CentOS Packages</title>
<indexterm>
<primary>packages</primary>
<secondary>CentOS packages</secondary>
</indexterm>
<para> <para>
Currently, &repmgr; RPM packages are provided for versions 6.x and 7.x of CentOS. These should also For PostgreSQL 9.6 and lower, the CentOS packages use a mixture of <literal>9.6</literal>
work on matching versions of Red Hat Enterprise Linux, Scientific Linux and Oracle Enterprise Linux; and <literal>96</literal> in various places to designate the major version;
together with CentOS, these are the same RedHat-based distributions for which the main community project from PostgreSQL 10, the first part of the version number (e.g. <literal>10</literal>) is
(PGDG) provides packages (see the <ulink url="https://yum.postgresql.org/">PostgreSQL RPM Building Project</ulink> the major version, so there is more consistency in file/path/package naming.
page for details).
</para> </para>
</note>
<para>
Note these &repmgr; RPM packages are not designed to work with SuSE/OpenSuSE.
</para>
<note>
<para>
&repmgr; packages are designed to be compatible with community-provided PostgreSQL packages.
They may not work with vendor-specific packages such as those provided by RedHat for RHEL
customers, as the filesystem layout may be different to the community RPMs.
Please contact your support vendor for assistance.
</para>
</note>
<sect2 id="packages-centos-repositories">
<title>CentOS repositories</title>
<para>
&repmgr; packages are available from the public 2ndQuadrant repository, and also the
PostgreSQL community repository. The 2ndQuadrant repository is updated immediately
after each
&repmgr; release.
</para>
<table id="centos-2ndquadrant-repository">
<title>2ndQuadrant public repository</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://rpm.2ndquadrant.com/">https://rpm.2ndquadrant.com/</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
<entry><ulink url="https://repmgr.org/docs/4.0/installation-packages.html#INSTALLATION-PACKAGES-REDHAT-2NDQ">https://repmgr.org/docs/4.0/installation-packages.html#INSTALLATION-PACKAGES-REDHAT-2NDQ</ulink></entry>
</row>
</tbody>
</tgroup>
</table>
<table id="centos-pgdg-repository">
<title>PostgreSQL community repository (PGDG)</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://yum.postgresql.org/repopackages.php">https://yum.postgresql.org/repopackages.php</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
<entry><ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink></entry>
</row>
</tbody>
</tgroup>
</table>
</sect2>
<sect2 id="packages-centos-details">
<title>CentOS package details</title>
<para>
The two tables below list relevant information, paths, commands etc. for the &repmgr; packages on
CentOS 7 (with systemd) and CentOS 6 (no systemd). Substitute the appropriate PostgreSQL major
version number for your installation.
</para>
<note>
<para>
For PostgreSQL 9.6 and lower, the CentOS packages use a mixture of <literal>9.6</literal>
and <literal>96</literal> in various places to designate the major version; e.g. the
package name is <literal>repmgr96</literal>, but the data directory is
<filename>/var/lib/pgsql/9.6/data</filename>.
</para>
<para>
From PostgreSQL 10, the first part of the version number (e.g. <literal>10</literal>) is
the major version, so there is more consistency in file/path/package naming
(package <literal>repmgr10</literal>, data directory <filename>/var/lib/pgsql/10/data</filename>).
</para>
</note>
<table id="centos-7-packages"> <table id="centos-7-packages">
<title>CentOS 7 packages</title> <title>CentOS 7 packages</title>
<tgroup cols="2"> <tgroup cols="2">
<tbody> <tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://yum.postgresql.org/repopackages.php">https://yum.postgresql.org/repopackages.php</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
<entry><ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink></entry>
</row>
<row> <row>
<entry>Package name example:</entry> <entry>Package name example:</entry>
<entry><filename>repmgr10-4.0.4-1.rhel7.x86_64</filename></entry> <entry><filename>repmgr10-4.0.0-1.rhel7.x86_64</filename></entry>
</row> </row>
<row> <row>
@@ -123,7 +52,7 @@
<row> <row>
<entry>Installation command:</entry> <entry>Installation command:</entry>
<entry><literal>yum install repmgr10</literal></entry> <entry><literal>yum install -y repmgr10</literal></entry>
</row> </row>
<row> <row>
@@ -132,7 +61,7 @@
</row> </row>
<row> <row>
<entry>repmgr in default path:</entry> <entry>In default path:</entry>
<entry>NO</entry> <entry>NO</entry>
</row> </row>
@@ -141,14 +70,9 @@
<entry><filename>/etc/repmgr/10/repmgr.conf</filename></entry> <entry><filename>/etc/repmgr/10/repmgr.conf</filename></entry>
</row> </row>
<row>
<entry>Data directory:</entry>
<entry><filename>/var/lib/pgsql/10/data</filename></entry>
</row>
<row> <row>
<entry>repmgrd service command:</entry> <entry>repmgrd service command:</entry>
<entry><command>systemctl [start|stop|restart|reload] repmgr10</command></entry> <entry><literal>service repmgr10</literal></entry>
</row> </row>
<row> <row>
@@ -158,7 +82,7 @@
<row> <row>
<entry>repmgrd log file location:</entry> <entry>repmgrd log file location:</entry>
<entry>(not specified by package; set in <filename>repmgr.conf</filename>)</entry> <entry>(not specified)</entry>
</row> </row>
</tbody> </tbody>
@@ -170,20 +94,29 @@
<tgroup cols="2"> <tgroup cols="2">
<tbody> <tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="https://yum.postgresql.org/repopackages.php">https://yum.postgresql.org/repopackages.php</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
<entry><ulink url="https://yum.postgresql.org/">https://yum.postgresql.org/</ulink></entry>
</row>
<row> <row>
<entry>Package name example:</entry> <entry>Package name example:</entry>
<entry><filename>repmgr96-4.0.4-1.rhel6.x86_64</filename></entry> <entry><filename>repmgr96-4.0.0-1.rhel6.x86_64</filename></entry>
</row> </row>
<row> <row>
<entry>Metapackage:</entry> <entry>Metapackage:</entry>
<entry>(none)</entry> <entry>NO</entry>
</row> </row>
<row> <row>
<entry>Installation command:</entry> <entry>Installation command:</entry>
<entry><literal>yum install repmgr96</literal></entry> <entry><literal>yum install -y repmgr96</literal></entry>
</row> </row>
<row> <row>
@@ -192,7 +125,7 @@
</row> </row>
<row> <row>
<entry>repmgr in default path:</entry> <entry>In default path:</entry>
<entry>NO</entry> <entry>NO</entry>
</row> </row>
@@ -201,14 +134,9 @@
<entry><filename>/etc/repmgr/9.6/repmgr.conf</filename></entry> <entry><filename>/etc/repmgr/9.6/repmgr.conf</filename></entry>
</row> </row>
<row>
<entry>Data directory:</entry>
<entry><filename>/var/lib/pgsql/9.6/data</filename></entry>
</row>
<row> <row>
<entry>repmgrd service command:</entry> <entry>repmgrd service command:</entry>
<entry><literal>service repmgr-9.6 [start|stop|restart|reload]</literal></entry> <entry>service repmgr-9.6</entry>
</row> </row>
<row> <row>
@@ -225,187 +153,6 @@
</tgroup> </tgroup>
</table> </table>
</sect2>
</sect1> </sect1>
<sect1 id="packages-debian-ubuntu" xreflabel="Debian/Ubuntu packages">
<title>Debian/Ubuntu Packages</title>
<indexterm>
<primary>packages</primary>
<secondary>Debian/Ubuntu packages</secondary>
</indexterm>
<para>
&repmgr; <literal>.deb</literal> packages are provided via the
PostgreSQL Community APT repository, and are available for each community-supported
PostgreSQL version, currently supported Debian releases, and currently supported
Ubuntu LTS releases.
</para>
<sect2 id="packages-apt-repository">
<title>APT repository</title>
<para>
&repmgr; packages are available from the PostgreSQL Community APT repository,
which is updated immediately after each &repmgr; release.
</para>
<table id="apt-repository">
<title>PostgreSQL Community APT repository (PGDG)</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Repository URL:</entry>
<entry><ulink url="http://apt.postgresql.org/">http://apt.postgresql.org/</ulink></entry>
</row>
<row>
<entry>Repository documentation:</entry>
<entry><ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink></entry>
</row>
</tbody>
</tgroup>
</table>
</sect2>
<sect2 id="packages-debian-details">
<title>Debian/Ubuntu package details</title>
<para>
The table below lists relevant information, paths, commands etc. for the &repmgr; packages on
Debian 9.x ("Stretch"). Substitute the appropriate PostgreSQL major
version number for your installation.
</para>
<para>
See also <xref linkend="repmgrd-configuration-debian-ubuntu"> for some specifics related
to configuring the <application>repmgrd</application> daemon.
</para>
<table id="debian-9-packages">
<title>Debian 9.x packages</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Package name example:</entry>
<entry><filename>postgresql-10-repmgr</filename></entry>
</row>
<row>
<entry>Metapackage:</entry>
<entry><filename>repmgr-common</filename></entry>
</row>
<row>
<entry>Installation command:</entry>
<entry><literal>apt-get install postgresql-10-repmgr</literal></entry>
</row>
<row>
<entry>Binary location:</entry>
<entry><filename>/usr/lib/postgresql/10/bin</filename></entry>
</row>
<row>
<entry>repmgr in default path:</entry>
<entry>Yes (via wrapper script <filename>/usr/bin/repmgr</filename>)</entry>
</row>
<row>
<entry>Configuration file location:</entry>
<entry>(not set by package)</entry>
</row>
<row>
<entry>Data directory:</entry>
<entry><filename>/var/lib/postgresql/10/main</filename></entry>
</row>
<row>
<entry>PostgreSQL service command:</entry>
<entry><command>systemctl [start|stop|restart|reload] postgresql@10-main</command></entry>
</row>
<row>
<entry>repmgrd service command:</entry>
<entry><command>systemctl [start|stop|restart|reload] repmgrd</command></entry>
</row>
<row>
<entry>repmgrd service file location:</entry>
<entry><filename>/etc/init.d/repmgrd</filename> (defaults in: <filename>/etc/default/repmgrd</filename>)</entry>
</row>
<row>
<entry>repmgrd log file location:</entry>
<entry>(not specified by package; set in <filename>repmgr.conf</filename>)</entry>
</row>
</tbody>
</tgroup>
</table>
<note>
<para>
Instead of using the <application>systemd</application> service command directly,
it's recommended to execute <command>pg_ctlcluster</command> (as <literal>root</literal>,
either directly or via <command>sudo</command>), e.g.:
<programlisting>
<command>pg_ctlcluster 10 main [start|stop|restart|reload]</command></programlisting>
</para>
<para>
For pre-<application>systemd</application> systems, <command>pg_ctlcluster</command>
can be executed directly by the <literal>postgres</literal> user.
</para>
</note>
</sect2>
</sect1>
<sect1 id="packages-packager-info" xreflabel="Information for packagers">
<title>Information for packagers</title>
<indexterm>
<primary>packages</primary>
<secondary>information for packagers</secondary>
</indexterm>
<para>
We recommend patching the following parameters when
building the package as built-in default values for user convenience.
These values can nevertheless be overridden by the user, if desired.
</para>
<itemizedlist>
<listitem>
<para>
Configuration file location: the default configuration file location
can be hard-coded by patching <varname>package_conf_file</varname>
in <filename>configfile.c</filename>:
<programlisting>
/* packagers: if feasible, patch configuration file path into "package_conf_file" */
char package_conf_file[MAXPGPATH] = "";</programlisting>
</para>
<para>
See also: <xref linkend="configuration-file">
</para>
</listitem>
<listitem>
<para>
PID file location: the default <application>repmgrd</application> PID file
location can be hard-coded by patching <varname>package_pid_file</varname>
in <filename>repmgrd.c</filename>:
<programlisting>
/* packagers: if feasible, patch PID file path into "package_pid_file" */
char package_pid_file[MAXPGPATH] = "";</programlisting>
</para>
<para>
See also: <xref linkend="repmgrd-pid-file">
</para>
</listitem>
</itemizedlist>
</sect1>
</appendix> </appendix>


@@ -11,751 +11,18 @@
before performing an upgrade, as there may be version-specific upgrade steps. before performing an upgrade, as there may be version-specific upgrade steps.
</para> </para>
<para> <para>
See also: <xref linkend="upgrading-repmgr"> See also: <xref linkend="upgrading-repmgr">
</para> </para>
<sect1 id="release-4.1.0">
<title>Release 4.1.0</title>
<para><emphasis>???? ??, 2018</emphasis></para>
<para>
&repmgr; 4.1.0 introduces some changes to <application>repmgrd</application>
behaviour and some additional configuration parameters.
</para>
<para>
This release can be installed as a simple package upgrade from repmgr 4.0 ~ 4.0.6.
The following post-upgrade steps must be carried out:
<itemizedlist>
<listitem>
<para>
<application>repmgrd</application> (if running) must be restarted.
</para>
</listitem>
<listitem>
<para>
Execute <command>ALTER EXTENSION repmgr UPDATE</command>
on the primary server in the database where &repmgr; is installed.
</para>
</listitem>
</itemizedlist>
A restart of the PostgreSQL server is <emphasis>not</emphasis> required
for this release.
</para>
<para>
See <xref linkend="upgrading-repmgr-extension"> for more details.
</para>
<para>
Configuration changes are backwards-compatible and no changes to
<filename>repmgr.conf</filename> are required. However users should
review the changes listed below.
</para>
<sect2>
<title>Configuration file changes</title>
<para>
<itemizedlist>
<listitem>
<para>
Default for <xref linkend="repmgr-conf-log-level"> is now <option>INFO</option>.
This produces additional informative log output, without creating excessive additional
log file volume, and matches the setting assumed for examples in the documentation.
(GitHub #470).
</para>
</listitem>
<listitem>
<para>
<varname>recovery_min_apply_delay</varname> now accepts a minimum value
of <literal>zero</literal> (GitHub #448).
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>repmgr enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
<application>repmgr</application>: always exit with an error if an unrecognised
command line option is provided. This matches the behaviour of other PostgreSQL
utilities such as <application>psql</application>. (GitHub #464).
</para>
</listitem>
<listitem>
<para>
<application>repmgr</application>: add <option>-q/--quiet</option> option to suppress non-error
output. (GitHub #468).
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-cluster-show">repmgr cluster show</link></command>,
<command><link linkend="repmgr-node-check">repmgr node check</link></command> and
<command><link linkend="repmgr-node-status">repmgr node status</link></command>
return non-zero exit code if node status issues detected. (GitHub #456).
</para>
</listitem>
<listitem>
<para>
Add <option>--csv</option> output option for
<command><link linkend="repmgr-cluster-event">repmgr cluster event</link></command>.
(GitHub #471).
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-witness-unregister">repmgr witness unregister</link></command>
can be run on any node, by providing the ID of the witness node with <option>--node-id</option>.
(GitHub #472).
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
will refuse to run if an exclusive backup is taking place on the current primary.
(GitHub #476).
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>repmgrd enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
<application>repmgrd</application>: create a PID file by default
(GitHub #457). For details, see <xref linkend="repmgrd-pid-file">.
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: daemonize process by default.
In case, for whatever reason, the user does not wish to daemonize the
process, provide <option>--daemonize=false</option>.
(GitHub #458).
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
<command><link linkend="repmgr-standby-register">repmgr standby register --wait-sync</link></command>:
fix behaviour when no timeout provided.
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-cluster-cleanup">repmgr cluster cleanup</link></command>:
add missing help options. (GitHub #461/#462).
</para>
</listitem>
<listitem>
<para>
Ensure witness node follows new primary after switchover. (GitHub #453).
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-node-check">repmgr node check</link></command> and
<command><link linkend="repmgr-node-status">repmgr node status</link></command>:
fix witness node handling. (GitHub #451).
</para>
</listitem>
<listitem>
<para>
When using <command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>
with <option>--recovery-conf-only</option> and replication slots, ensure
<varname>primary_slot_name</varname> is set correctly. (GitHub #474).
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-4.0.6">
<title>Release 4.0.6</title>
<para><emphasis>June 14, 2018</emphasis></para>
<para>
&repmgr; 4.0.6 contains a number of bug fixes and usability enhancements.
</para>
<para>
We recommend upgrading to this version as soon as possible.
This release can be installed as a simple package upgrade from repmgr 4.0 ~ 4.0.5;
<application>repmgrd</application> (if running) should be restarted. See <xref linkend="upgrading-repmgr">
for more details.
</para>
<sect2>
<title>Usability enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
<command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command> and
<command><link linkend="repmgr-cluster-matrix">repmgr cluster matrix</link></command>:
return non-zero exit code if node connection issues detected (GitHub #447)
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>:
Improve handling of external configuration file copying, including consideration in
<option>--dry-run</option> check
(GitHub #443)
</para>
</listitem>
<listitem>
<para>
When using <option>--dry-run</option>, force log level to <literal>INFO</literal>
to ensure output will always be displayed
(GitHub #441)
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>:
Improve documentation of <option>--recovery-conf-only</option> mode
(GitHub #438)
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-standby-clone">repmgr standby clone</link></command>:
Don't require presence of <varname>user</varname> parameter in conninfo string
(GitHub #437)
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
<command><link linkend="repmgr-witness-register">repmgr witness register</link></command>:
prevent registration of a witness server with the same name as an existing node
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command>:
check node has actually connected to new primary before reporting success
(GitHub #444)
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-node-rejoin">repmgr node rejoin</link></command>:
Fix bug when parsing <option>--config-files</option> parameter
(GitHub #442)
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: ensure local node is counted as quorum member
(GitHub #439)
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-4.0.5">
<title>Release 4.0.5</title>
<para><emphasis>Wed May 2, 2018</emphasis></para>
<para>
&repmgr; 4.0.5 contains a number of usability enhancements related to
<application>pg_rewind</application> usage, <filename>recovery.conf</filename>
generation and (in <application>repmgrd</application>) handling of various
corner-case situations, as well as a number of bug fixes.
</para>
<sect2>
<title>Usability enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
Various documentation improvements, with particular emphasis on
the importance of setting appropriate <link linkend="configuration-file-service-commands">service commands</link>
instead of relying on <application>pg_ctl</application>.
</para>
</listitem>
<listitem>
<para>
Poll demoted primary after restart as a standby during a switchover operation (GitHub #408).
</para>
</listitem>
<listitem>
<para>
Add configuration parameter <option>config_directory</option> (GitHub #424).
</para>
</listitem>
<listitem>
<para>
Add sanity check if <option>--upstream-node-id</option> not supplied when executing
<xref linkend="repmgr-standby-register"> (GitHub #395).
</para>
</listitem>
<listitem>
<para>
Enable <link linkend="repmgr-node-rejoin-pg-rewind">pg_rewind</link> to be used with
PostgreSQL 9.3/9.4 (GitHub #413).
</para>
</listitem>
<listitem>
<para>
When generating replication connection strings, set <literal>dbname=replication</literal>
if appropriate (GitHub #421).
</para>
</listitem>
<listitem>
<para>
Enable provision of <option>archive_cleanup_command</option> in <filename>recovery.conf</filename>
(GitHub #416).
</para>
</listitem>
<listitem>
<para>
Actively check for node to <link linkend="repmgr-node-rejoin">rejoin</link> cluster (GitHub #415).
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: set <literal>connect_timeout=2</literal> (if not explicitly set)
when pinging a server.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
Fix display of conninfo parsing error messages.
</para>
</listitem>
<listitem>
<para>
Fix minimum accepted value for <varname>degraded_monitoring_timeout</varname> (GitHub #411).
</para>
</listitem>
<listitem>
<para>
Fix superuser password handling (GitHub #400)
</para>
</listitem>
<listitem>
<para>
Fix parsing of <varname>archive_ready_critical</varname> configuration file parameter (GitHub #426).
</para>
</listitem>
<listitem>
<para>
Fix <command><link linkend="repmgr-cluster-crosscheck">repmgr cluster crosscheck</link></command>
output (GitHub #389)
</para>
</listitem>
<listitem>
<para>
Fix memory leaks in witness code (GitHub #402).
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: handle <command>pg_ctl promote</command> timeout (GitHub #425).
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: handle failover situation with only two nodes in the primary
location, and at least one node in another location (GitHub #407).
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: prevent standby connection handle from going stale.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-4.0.4">
<title>Release 4.0.4</title>
<para><emphasis>Fri Mar 9, 2018</emphasis></para>
<para>
&repmgr; 4.0.4 contains some bug fixes and a number of
usability enhancements related to logging/diagnostics,
event notifications and pre-action checks.
</para>
<para>
This release can be installed as a simple package upgrade from repmgr 4.0 ~ 4.0.3;
<application>repmgrd</application> (if running) should be restarted. See <xref linkend="upgrading-repmgr">
for more details.
</para>
<note>
<para>
It is not possible to perform a switchover where the demotion candidate is
running &repmgr; 4.0.2 or lower; all nodes should be upgraded to the latest version (4.0.4).
This is due to additional checks introduced in 4.0.3 which require the presence of
4.0.3 or later versions on all nodes.
</para>
</note>
<sect2>
<title>Usability enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
add <command><link linkend="repmgr-standby-clone">repmgr standby clone --recovery-conf-only</link></command>
option to enable integration of a standby cloned from another source into a &repmgr; cluster (GitHub #382)
</para>
</listitem>
<listitem>
<para>
remove restriction on using replication slots when cloning from a Barman server (GitHub #379)
</para>
</listitem>
<listitem>
<para>
make <command><link linkend="repmgr-standby-promote">repmgr standby promote</link></command>
timeout values configurable (GitHub #387)
</para>
</listitem>
<listitem>
<para>
add missing options to main <literal>--help</literal> output (GitHub #391, #392)
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
ensure <command><link linkend="repmgr-node-rejoin">repmgr node rejoin</link></command>
honours the <option>--dry-run</option> option (GitHub #383)
</para>
</listitem>
<listitem>
<para>
improve replication slot warnings generated by
<command><link linkend="repmgr-node-status">repmgr node status</link></command>
(GitHub #385)
</para>
</listitem>
<listitem>
<para>
fix <option>--superuser</option> handling when cloning a standby (GitHub #380)
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: improve detection of status change from primary to
standby
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: improve reconnection to the local node after a
failover (previously a connection error due to the node starting up was being
interpreted as the node being unavailable)
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: when running on a witness server, correctly connect
to new primary after a failover
</para>
</listitem>
<listitem>
<para>
<application>repmgrd</application>: add <link linkend="event-notifications">event notification</link>
<literal>repmgrd_shutdown</literal> (GitHub #393)
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-4.0.3">
<title>Release 4.0.3</title>
<para><emphasis>Thu Feb 15, 2018</emphasis></para>
<para>
&repmgr; 4.0.3 contains some bug fixes and a number of
usability enhancements related to logging/diagnostics,
event notifications and pre-action checks.
</para>
<para>
This release can be installed as a simple package upgrade from repmgr 4.0 ~ 4.0.2;
repmgrd (if running) should be restarted.
</para>
<note>
<para>
It is not possible to perform a switchover where the demotion candidate is
running &repmgr; 4.0.2 or lower; all nodes should be upgraded to 4.0.3. This is due
to additional checks introduced in 4.0.3 which require the presence of
4.0.3 or later versions on all nodes.
</para>
</note>
<sect2>
<title>Usability enhancements</title>
<para>
<itemizedlist>
<listitem>
<para>
improve <command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
behaviour when <command>pg_ctl</command> is used to control the server and logging output is
not explicitly redirected
</para>
</listitem>
<listitem>
<para>
improve <command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
log messages and provide new exit code <literal>ERR_SWITCHOVER_INCOMPLETE</literal> when old primary could
not be shut down cleanly
</para>
</listitem>
<listitem>
<para>
add check to verify the demotion candidate can make a replication connection to the
promotion candidate before executing a switchover (GitHub #370)
</para>
</listitem>
<listitem>
<para>
add check for sufficient walsenders and replication slots on the promotion candidate before executing
<command><link linkend="repmgr-standby-switchover">repmgr standby switchover</link></command>
(GitHub #371)
</para>
</listitem>
<listitem>
<para>
add <option>--dry-run</option> mode to <command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command>
(GitHub #368)
</para>
</listitem>
<listitem>
<para>
provide information about the primary node for
<command><link linkend="repmgr-standby-register">repmgr standby register</link></command> and
<command><link linkend="repmgr-standby-follow">repmgr standby follow</link></command> event notifications (GitHub #375)
</para>
</listitem>
<listitem>
<para>
add <literal>standby_register_sync</literal> <link linkend="event-notifications">event notification</link>, which is fired when
<command><link linkend="repmgr-standby-register">repmgr standby register</link></command>
is run with the <option>--wait-sync</option> option and the new or updated standby node
record has synchronised to the standby (GitHub #374)
</para>
</listitem>
<listitem>
<para>
when running <command><link linkend="repmgr-cluster-show">repmgr cluster show</link></command>,
if any node is unreachable, output the error message encountered in the list of warnings
(GitHub #369)
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2>
<title>Bug fixes</title>
<para>
<itemizedlist>
<listitem>
<para>
ensure an inactive data directory can be overwritten when
cloning a standby (GitHub #366)
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-node-status">repmgr node status</link></command>
upstream node display fixed (GitHub #363)
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-primary-unregister">repmgr primary unregister</link></command>:
clarify usage and fix <literal>--help</literal> output (GitHub #373)
</para>
</listitem>
<listitem>
<para>
parsing of <varname>pg_basebackup_options</varname> fixed (GitHub #376)
</para>
</listitem>
<listitem>
<para>
ensure the <filename>pg_subtrans</filename> directory is created when cloning a
standby in Barman mode
</para>
</listitem>
<listitem>
<para>
<command><link linkend="repmgr-witness-register">repmgr witness register</link></command>:
fix primary node check (GitHub #377).
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1 id="release-4.0.2"> <sect1 id="release-4.0.2">
<title>Release 4.0.2</title> <title>Release 4.0.2</title>
<para><emphasis>Thu Jan 18, 2018</emphasis></para> <para><emphasis>Thu Jan 18, 2018</emphasis></para>
<para> <para>
&repmgr; 4.0.2 contains some bug fixes and small usability enhancements. repmgr 4.0.2 contains some bug fixes and minor usability enhancements.
</para> </para>
<para>
This release can be installed as a simple package upgrade from &repmgr; 4.0.1 or 4.0;
<application>repmgrd</application> (if running) should be restarted.
</para>
<sect2> <sect2>
<title>Usability enhancements</title> <title>Usability enhancements</title>
@@ -854,7 +121,7 @@
<para><emphasis>Wed Dec 13, 2017</emphasis></para> <para><emphasis>Wed Dec 13, 2017</emphasis></para>
<para> <para>
&repmgr; 4.0.1 is a bugfix release. repmgr 4.0.1 is a bugfix release.
</para> </para>
<sect2> <sect2>
<title>Bug fixes</title> <title>Bug fixes</title>

View File

@@ -33,5 +33,34 @@
</sect1> </sect1>
<sect1 id="repmgr-rpm-key" xreflabel="repmgr rpm key">
<title>repmgr RPM signing key</title>
<para>
The signing key ID used for <application>repmgr</application> source code bundles is:
<ulink url="http://packages.2ndquadrant.com/repmgr/RPM-GPG-KEY-repmgr">
<literal>0x702D883A</literal></ulink>.
</para>
<para>
To download the <application>repmgr</application> source key to your computer:
<programlisting>
curl -s http://packages.2ndquadrant.com/repmgr/RPM-GPG-KEY-repmgr | gpg --import
gpg --fingerprint 0x702D883A
</programlisting>
then verify that the fingerprint is the expected value:
<programlisting>
AE4E 390E A58E 0037 6148 3F29 888D 018B 702D 883A</programlisting>
</para>
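<para>
The fingerprint comparison can also be scripted rather than checked by eye. The following
is a sketch under stated assumptions: the helper names <literal>normalize_fpr</literal> and
<literal>fingerprint_matches</literal> are hypothetical, and the expected value is the
fingerprint shown above with the spaces removed.
</para>

```shell
#!/bin/sh
# Expected fingerprint, taken from the documentation above with spaces removed.
EXPECTED_FPR="AE4E390EA58E003761483F29888D018B702D883A"

# normalize_fpr: hypothetical helper; strips spaces and upper-cases hex digits.
normalize_fpr() {
    printf '%s' "$1" | tr -d ' ' | tr 'a-f' 'A-F'
}

# fingerprint_matches: hypothetical helper; succeeds only if the supplied
# fingerprint equals the expected one after normalization.
fingerprint_matches() {
    [ "$(normalize_fpr "$1")" = "$EXPECTED_FPR" ]
}
```

<para>
For example, after importing the key, pass the fingerprint reported by
<literal>gpg --fingerprint 0x702D883A</literal> to <literal>fingerprint_matches</literal>
and stop if it returns a non-zero status.
</para>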
<para>
To check a repository RPM, use <application>rpmkeys</application> to load the
packaging signing key into the RPM database then use <literal>rpm -K</literal>, e.g.:
<programlisting>
sudo rpmkeys --import http://packages.2ndquadrant.com/repmgr/RPM-GPG-KEY-repmgr
rpm -K postgresql-bdr94-2ndquadrant-redhat-1.0-2.noarch.rpm
</programlisting>
</para>
</sect1>
</appendix> </appendix>

View File

@@ -51,7 +51,7 @@
</itemizedlist> </itemizedlist>
</para> </para>
<sect2 id="cloning-from-barman-prerequisites"> <sect2 id="cloning-from-barman-prerequisites" xreflabel="Prerequisites for cloning from Barman">
<title>Prerequisites for cloning from Barman</title> <title>Prerequisites for cloning from Barman</title>
<para> <para>
In order to enable Barman support for <command>repmgr standby clone</command>, following In order to enable Barman support for <command>repmgr standby clone</command>, following
@@ -356,7 +356,7 @@
By default, <command>pg_basebackup</command> performs a checkpoint before beginning the backup By default, <command>pg_basebackup</command> performs a checkpoint before beginning the backup
process. However, a normal checkpoint may take some time to complete; process. However, a normal checkpoint may take some time to complete;
a fast checkpoint can be forced with the <literal>-c/--fast-checkpoint</literal> option. a fast checkpoint can be forced with the <literal>-c/--fast-checkpoint</literal> option.
Note that this may impact performance of the server being cloned from (typically the primary) However this may impact performance of the server being cloned from (typically the primary)
so should be used with care. so should be used with care.
</para> </para>
<tip> <tip>
@@ -384,16 +384,11 @@
<sect2 id="cloning-advanced-managing-passwords" xreflabel="Managing passwords"> <sect2 id="cloning-advanced-managing-passwords" xreflabel="Managing passwords">
<title>Managing passwords</title> <title>Managing passwords</title>
<indexterm>
<primary>cloning</primary>
<secondary>using passwords</secondary>
</indexterm>
<para> <para>
If replication connections to a standby's upstream server are password-protected, If replication connections to a standby's upstream server are password-protected,
the standby must be able to provide the password so it can begin streaming replication. the standby must be able to provide the password so it can begin streaming
replication.
</para> </para>
<para> <para>
The recommended way to do this is to store the password in the <literal>postgres</literal> system The recommended way to do this is to store the password in the <literal>postgres</literal> system
user's <filename>~/.pgpass</filename> file. It's also possible to store the password in the user's <filename>~/.pgpass</filename> file. It's also possible to store the password in the
@@ -401,17 +396,6 @@
security reasons. For more details see the security reasons. For more details see the
<ulink url="https://www.postgresql.org/docs/current/static/libpq-pgpass.html">PostgreSQL password file documentation</ulink>. <ulink url="https://www.postgresql.org/docs/current/static/libpq-pgpass.html">PostgreSQL password file documentation</ulink>.
</para> </para>
<note>
<para>
If using a <filename>pgpass</filename> file, an entry for the replication user (by default the
user who connects to the <literal>repmgr</literal> database) <emphasis>must</emphasis>
be provided, with database name set to <literal>replication</literal>, e.g.:
<programlisting>
node1:5432:replication:repmgr:12345</programlisting>
</para>
</note>
<para> <para>
If, for whatever reason, you wish to include the password in <filename>recovery.conf</filename>, If, for whatever reason, you wish to include the password in <filename>recovery.conf</filename>,
set <varname>use_primary_conninfo_password</varname> to <literal>true</literal> in set <varname>use_primary_conninfo_password</varname> to <literal>true</literal> in
@@ -423,7 +407,8 @@
</para> </para>
<para> <para>
It is of course also possible to include the password value in the <varname>conninfo</varname> It is of course also possible to include the password value in the <varname>conninfo</varname>
string for each node, but this is obviously a security risk and should be avoided. string for each node, but this is obviously a security risk and should be
avoided.
</para> </para>
<para> <para>
From PostgreSQL 9.6, <application>libpq</application> supports the <varname>passfile</varname> From PostgreSQL 9.6, <application>libpq</application> supports the <varname>passfile</varname>

View File

@@ -1,107 +0,0 @@
<sect1 id="configuration-file-log-settings" xreflabel="log settings">
<indexterm>
<primary>repmgr.conf</primary>
<secondary>log settings</secondary>
</indexterm>
<indexterm>
<primary>log settings</primary>
<secondary>configuration in repmgr.conf</secondary>
</indexterm>
<title>Log settings</title>
<para>
By default, &repmgr; and <application>repmgrd</application> write log output to
<literal>STDERR</literal>. An alternative log destination can be specified
(either a file or <literal>syslog</literal>).
</para>
<note>
<para>
The &repmgr; application itself will continue to write log output to <literal>STDERR</literal>
even if another log destination is configured, as otherwise any output resulting from a command
line operation will "disappear" into the log.
</para>
<para>
This behaviour can be overridden with the command line option <option>--log-to-file</option>,
which will redirect all logging output to the configured log destination. This is recommended
when &repmgr; is executed by another application, particularly <application>repmgrd</application>,
to enable log output generated by the &repmgr; application to be stored for later reference.
</para>
</note>
<variablelist>
<varlistentry id="repmgr-conf-log-level" xreflabel="log_level">
<term><varname>log_level</varname> (<type>string</type>)
<indexterm>
<primary><varname>log_level</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
One of <option>DEBUG</option>, <option>INFO</option>, <option>NOTICE</option>,
<option>WARNING</option>, <option>ERROR</option>, <option>ALERT</option>, <option>CRIT</option>
or <option>EMERG</option>.
</para>
<para>
Default is <option>INFO</option>.
</para>
<para>
Note that <option>DEBUG</option> will produce a substantial amount of log output
and should not be enabled in normal use.
</para>
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-log-facility" xreflabel="log_facility">
<term><varname>log_facility</varname> (<type>string</type>)
<indexterm>
<primary><varname>log_facility</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Logging facility: possible values are <option>STDERR</option> (default), or for
syslog integration, one of <option>LOCAL0</option>, <option>LOCAL1</option>, <option>...</option>,
<option>LOCAL7</option>, <option>USER</option>.
</para>
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-log-file" xreflabel="log_file">
<term><varname>log_file</varname> (<type>string</type>)
<indexterm>
<primary><varname>log_file</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
If <xref linkend="repmgr-conf-log-facility"> is set to <option>STDERR</option>, log output
can be redirected to the specified file.
</para>
<para>
See <xref linkend="repmgrd-log-rotation"> for information on configuring log rotation.
</para>
</listitem>
</varlistentry>
<varlistentry id="repmgr-conf-log-status-interval" xreflabel="log_status_interval">
<term><varname>log_status_interval</varname> (<type>integer</type>)
<indexterm>
<primary><varname>log_status_interval</varname> configuration file parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This setting causes <application>repmgrd</application> to emit a status log
line at the specified interval (in seconds, default <literal>300</literal>)
describing <application>repmgrd</application>'s current state, e.g.:
</para>
<programlisting>
[2018-07-12 00:47:32] [INFO] monitoring connection to upstream node "node1" (node ID: 1)</programlisting>
</listitem>
</varlistentry>
</variablelist>
</sect1>

View File

@@ -1,123 +0,0 @@
<sect1 id="configuration-file-service-commands" xreflabel="service command settings">
<indexterm>
<primary>repmgr.conf</primary>
<secondary>service command settings</secondary>
</indexterm>
<indexterm>
<primary>service command settings</primary>
<secondary>configuration in repmgr.conf</secondary>
</indexterm>
<title>Service command settings</title>
<para>
In some circumstances, &repmgr; (and <application>repmgrd</application>) need to
be able to stop, start or restart PostgreSQL. &repmgr; commands which need to do this
include <link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>,
<link linkend="repmgr-standby-switchover"><command>repmgr standby switchover</command></link> and
<link linkend="repmgr-node-rejoin"><command>repmgr node rejoin</command></link>.
</para>
<para>
By default, &repmgr; will use PostgreSQL's <command>pg_ctl</command> to control the PostgreSQL
server. However this can lead to various problems, particularly when PostgreSQL has been
installed from packages, and especially so if <application>systemd</application> is in use.
</para>
<note>
<para>
If using <application>systemd</application>, ensure you have <varname>RemoveIPC</varname> set to <literal>off</literal>.
See the <ulink url="https://wiki.postgresql.org/wiki/Systemd">systemd</ulink>
entry in the <ulink url="https://wiki.postgresql.org/wiki/Main_Page">PostgreSQL wiki</ulink> for details.
</para>
</note>
<para>
With this in mind, we recommend <emphasis>always</emphasis> configuring &repmgr; to use the
available system service commands.
</para>
<para>
To do this, specify the appropriate command for each action
in <filename>repmgr.conf</filename> using the following configuration
parameters:
<programlisting>
service_start_command
service_stop_command
service_restart_command
service_reload_command</programlisting>
</para>
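<para>
As a quick sanity check, a configuration file can be scanned for these four parameters.
The following is a sketch only: the function name <literal>missing_service_commands</literal>
is hypothetical, and it assumes each parameter is written as <literal>name = value</literal>
at the start of a line, with commented-out lines not counting as set.
</para>

```shell
#!/bin/sh
# missing_service_commands: hypothetical helper, not part of repmgr.
# Prints the name of each service command parameter not set in the given file.
missing_service_commands() {
    conf_file="$1"
    for param in service_start_command service_stop_command \
                 service_restart_command service_reload_command; do
        # Match "param = ..." or "param=..." at the start of a line;
        # lines beginning with "#" do not match and so count as unset.
        if ! grep -Eq "^[[:space:]]*${param}[[:space:]]*=" "$conf_file"; then
            printf '%s\n' "$param"
        fi
    done
}
```

<para>
For example, <literal>missing_service_commands /etc/repmgr.conf</literal> prints nothing
when all four parameters are present.
</para>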
<note>
<para>
It's also possible to specify a <varname>service_promote_command</varname>.
This is intended for systems which provide a package-level promote command,
such as Debian's <application>pg_ctlcluster</application>, to promote the
PostgreSQL instance from standby to primary.
</para>
<para>
If your packaging system does not provide such a command, it can be left empty,
and &repmgr; will generate the appropriate <command>pg_ctl ... promote</command> command.
</para>
<para>
Do not confuse this with <varname>promote_command</varname>, which is used
by <application>repmgrd</application> to execute <xref linkend="repmgr-standby-promote">.
</para>
</note>
<para>
To confirm which command &repmgr; will execute for each action, use
<command>repmgr node service --list --action=...</command>, e.g.:
<programlisting>
repmgr -f /etc/repmgr.conf node service --list --action=stop
repmgr -f /etc/repmgr.conf node service --list --action=start
repmgr -f /etc/repmgr.conf node service --list --action=restart
repmgr -f /etc/repmgr.conf node service --list --action=reload</programlisting>
</para>
<para>
These commands will be executed by the system user which &repmgr; runs as (usually <literal>postgres</literal>)
and will probably require passwordless sudo access to be able to execute the command.
</para>
<para>
For example, using <application>systemd</application> on CentOS 7, the service commands can be
set as follows:
<programlisting>
service_start_command = 'sudo systemctl start postgresql-9.6'
service_stop_command = 'sudo systemctl stop postgresql-9.6'
service_restart_command = 'sudo systemctl restart postgresql-9.6'
service_reload_command = 'sudo systemctl reload postgresql-9.6'</programlisting>
and <filename>/etc/sudoers</filename> should be set as follows:
<programlisting>
Defaults:postgres !requiretty
postgres ALL = NOPASSWD: /usr/bin/systemctl stop postgresql-9.6, \
/usr/bin/systemctl start postgresql-9.6, \
/usr/bin/systemctl restart postgresql-9.6, \
/usr/bin/systemctl reload postgresql-9.6</programlisting>
</para>
<important>
<indexterm>
<primary>pg_ctlcluster</primary>
<secondary>service command settings</secondary>
</indexterm>
<para>
Debian/Ubuntu users: instead of calling <command>sudo systemctl</command> directly, use
<command>sudo pg_ctlcluster</command>, e.g.:
<programlisting>
service_start_command = 'sudo pg_ctlcluster 9.6 main start'
service_stop_command = 'sudo pg_ctlcluster 9.6 main stop'
service_restart_command = 'sudo pg_ctlcluster 9.6 main restart'
service_reload_command = 'sudo pg_ctlcluster 9.6 main reload'</programlisting>
and set <filename>/etc/sudoers</filename> accordingly.
</para>
<para>
While <command>pg_ctlcluster</command> will work when executed as user <literal>postgres</literal>,
it's strongly recommended to use <command>sudo pg_ctlcluster</command> on <application>systemd</application>
systems, to ensure <application>systemd</application> has a correct picture of
the PostgreSQL application state.
</para>
</important>
</sect1>

View File

@@ -1,10 +1,10 @@
<sect1 id="configuration-file-settings" xreflabel="required configuration file settings"> <sect1 id="configuration-file-settings" xreflabel="configuration file settings">
<indexterm> <indexterm>
<primary>repmgr.conf</primary> <primary>repmgr.conf</primary>
<secondary>required settings</secondary> <secondary>settings</secondary>
</indexterm> </indexterm>
<title>Required configuration file settings</title> <title>Configuration file settings</title>
<para> <para>
Each <filename>repmgr.conf</filename> file must contain the following parameters: Each <filename>repmgr.conf</filename> file must contain the following parameters:
</para> </para>
@@ -92,10 +92,7 @@
<para> <para>
For a full list of annotated configuration items, see the file For a full list of annotated configuration items, see the file
<ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink>. <ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</>.
</para>
<para>
For <application>repmgrd</application>-specific settings, see <xref linkend="repmgrd-configuration">.
</para> </para>
<note> <note>

View File

@@ -2,17 +2,15 @@
<title>repmgr configuration</title> <title>repmgr configuration</title>
&configuration-file; &configuration-file;
&configuration-file-required-settings; &configuration-file-settings;
&configuration-file-log-settings;
&configuration-file-service-commands;
<sect1 id="configuration-permissions" xreflabel="Database user permissions"> <sect1 id="configuration-permissions" xreflabel="User permissions">
<indexterm> <indexterm>
<primary>configuration</primary> <primary>configuration</primary>
<secondary>database user permissions</secondary> <secondary>user permissions</secondary>
</indexterm> </indexterm>
<title>repmgr database user permissions</title> <title>repmgr user permissions</title>
<para> <para>
&repmgr; will create an extension database containing objects &repmgr; will create an extension database containing objects
for administering &repmgr; metadata. The user defined in the <varname>conninfo</varname> for administering &repmgr; metadata. The user defined in the <varname>conninfo</varname>

View File

@@ -37,7 +37,7 @@
<filename>repmgr.conf</filename>. <filename>repmgr.conf</filename>.
</para> </para>
<para> <para>
The following format placeholders are provided for all event notifications: This parameter accepts the following format placeholders:
</para> </para>
<variablelist> <variablelist>
@@ -84,8 +84,18 @@
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
</variablelist>
<varlistentry>
<term><option>%p</option></term>
<listitem>
<para>
node ID of the demoted primary (<xref linkend="repmgr-standby-switchover"> only)
</para>
</listitem>
</varlistentry>
</variablelist>
<para> <para>
The values provided for <literal>%t</literal> and <literal>%d</literal> The values provided for <literal>%t</literal> and <literal>%d</literal>
will probably contain spaces, so should be quoted in the provided command will probably contain spaces, so should be quoted in the provided command
@@ -94,60 +104,34 @@
event_notification_command='/path/to/some/script %n %e %s "%t" "%d"' event_notification_command='/path/to/some/script %n %e %s "%t" "%d"'
</programlisting> </programlisting>
</para> </para>
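<para>
A script receiving these placeholders might look like the following sketch. It assumes
the argument order from the example above (<literal>%n %e %s "%t" "%d"</literal>) and that
<literal>%s</literal> is <literal>1</literal> on success and <literal>0</literal> on failure;
the function name <literal>format_event</literal> and the output format are hypothetical.
</para>

```shell
#!/bin/sh
# Hypothetical handler for event_notification_command.
# Assumed invocation: /path/to/some/script %n %e %s "%t" "%d"
format_event() {
    node_id="$1"    # %n
    event="$2"      # %e
    success="$3"    # %s (assumed: 1 = success, 0 = failure)
    timestamp="$4"  # %t (quoted, may contain spaces)
    details="$5"    # %d (quoted, may contain spaces)
    if [ "$success" = "1" ]; then
        status="OK"
    else
        status="FAILED"
    fi
    printf '[%s] node %s: %s %s (%s)\n' \
        "$timestamp" "$node_id" "$event" "$status" "$details"
}

# When executed with the five expected arguments, print one formatted line.
if [ "$#" -eq 5 ]; then
    format_event "$@"
fi
```

<para>
Because <literal>%t</literal> and <literal>%d</literal> are quoted in the configured command,
each arrives as a single argument even when it contains spaces.
</para>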
<para> <para>
The following parameters are provided for a subset of event notifications: Additionally the following format placeholders are available for the event
type <varname>bdr_failover</varname> and optionally <varname>bdr_recovery</varname>:
</para> </para>
<variablelist> <variablelist>
<varlistentry>
<term><option>%p</option></term>
<listitem>
<para>
node ID of the current primary (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
</para>
<para>
node ID of the demoted primary (<xref linkend="repmgr-standby-switchover"> only)
</para>
</listitem>
</varlistentry>
<varlistentry> <varlistentry>
<term><option>%c</option></term> <term><option>%c</option></term>
<listitem> <listitem>
<para> <para>
<literal>conninfo</literal> string of the primary node conninfo string of the next available node
(<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">)
</para>
<para>
<literal>conninfo</literal> string of the next available node
(<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term><option>%a</option></term> <term><option>%a</option></term>
<listitem> <listitem>
<para> <para>
name of the current primary node (<xref linkend="repmgr-standby-register"> and <xref linkend="repmgr-standby-follow">) name of the next available node
</para>
<para>
name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
</variablelist> </variablelist>
<para> <para>
The values provided for <literal>%c</literal> and <literal>%a</literal> These should always be quoted.
will probably contain spaces, so should always be quoted.
</para> </para>
<para> <para>
By default, all notification types will be passed to the designated script; By default, all notification types will be passed to the designated script;
the notification types can be filtered to explicitly named ones using the the notification types can be filtered to explicitly named ones:
<varname>event_notifications</varname> parameter:
<itemizedlist spacing="compact" mark="bullet"> <itemizedlist spacing="compact" mark="bullet">
@@ -160,9 +144,6 @@
<listitem> <listitem>
<simpara><literal>standby_register</literal></simpara> <simpara><literal>standby_register</literal></simpara>
</listitem> </listitem>
<listitem>
<simpara><literal>standby_register_sync</literal></simpara>
</listitem>
<listitem> <listitem>
<simpara><literal>standby_unregister</literal></simpara> <simpara><literal>standby_unregister</literal></simpara>
</listitem> </listitem>
@@ -205,18 +186,6 @@
<listitem> <listitem>
<simpara><literal>repmgrd_failover_follow</literal></simpara> <simpara><literal>repmgrd_failover_follow</literal></simpara>
</listitem> </listitem>
<listitem>
<simpara><literal>repmgrd_failover_aborted</literal></simpara>
</listitem>
<listitem>
<simpara><literal>repmgrd_upstream_disconnect</literal></simpara>
</listitem>
<listitem>
<simpara><literal>repmgrd_upstream_reconnect</literal></simpara>
</listitem>
<listitem>
<simpara><literal>repmgrd_promote_error</literal></simpara>
</listitem>
<listitem> <listitem>
<simpara><literal>bdr_failover</literal></simpara> <simpara><literal>bdr_failover</literal></simpara>
</listitem> </listitem>
@@ -235,7 +204,6 @@
</itemizedlist> </itemizedlist>
</para> </para>
<para> <para>
Note that under some circumstances (e.g. when no replication cluster primary Note that under some circumstances (e.g. when no replication cluster primary
could be located), it will not be possible to write an entry into the could be located), it will not be possible to write an entry into the


@@ -38,9 +38,7 @@
<!ENTITY quickstart SYSTEM "quickstart.sgml"> <!ENTITY quickstart SYSTEM "quickstart.sgml">
<!ENTITY configuration SYSTEM "configuration.sgml"> <!ENTITY configuration SYSTEM "configuration.sgml">
<!ENTITY configuration-file SYSTEM "configuration-file.sgml"> <!ENTITY configuration-file SYSTEM "configuration-file.sgml">
<!ENTITY configuration-file-required-settings SYSTEM "configuration-file-required-settings.sgml"> <!ENTITY configuration-file-settings SYSTEM "configuration-file-settings.sgml">
<!ENTITY configuration-file-log-settings SYSTEM "configuration-file-log-settings.sgml">
<!ENTITY configuration-file-service-commands SYSTEM "configuration-file-service-commands.sgml">
<!ENTITY cloning-standbys SYSTEM "cloning-standbys.sgml"> <!ENTITY cloning-standbys SYSTEM "cloning-standbys.sgml">
<!ENTITY promoting-standby SYSTEM "promoting-standby.sgml"> <!ENTITY promoting-standby SYSTEM "promoting-standby.sgml">
<!ENTITY follow-new-primary SYSTEM "follow-new-primary.sgml"> <!ENTITY follow-new-primary SYSTEM "follow-new-primary.sgml">


@@ -5,107 +5,83 @@
system. system.
</para> </para>
<sect2 id="installation-packages-redhat" xreflabel="Installing from packages on RHEL, CentOS and Fedora"> <sect2 id="installation-packages-redhat" xreflabel="Installing from packages on RHEL, Fedora and CentOS">
<indexterm> <indexterm>
<primary>installation</primary> <primary>installation</primary>
<secondary>on Red Hat/CentOS/Fedora etc.</secondary> <secondary>on Redhat/CentOS/Fedora etc.</secondary>
</indexterm> </indexterm>
<title>RedHat/CentOS/Fedora</title> <title>RedHat/Fedora/CentOS</title>
<para> <para>
&repmgr; RPM packages for RedHat/CentOS variants and Fedora are available from the RPM packages for &repmgr; are available via Yum through
<ulink url="https://2ndquadrant.com">2ndQuadrant</ulink>
<ulink url="https://rpm.2ndquadrant.com/">public RPM repository</ulink>; see following
section for details.
</para>
<para>
RPM packages for &repmgr; are also available via Yum through
the PostgreSQL Global Development Group RPM repository the PostgreSQL Global Development Group RPM repository
(<ulink url="https://yum.postgresql.org/">http://yum.postgresql.org/</ulink>). (<ulink url="https://yum.postgresql.org/">http://yum.postgresql.org/</ulink>).
Follow the instructions for your distribution (RedHat, CentOS, Follow the instructions for your distribution (RedHat, CentOS,
Fedora, etc.) and architecture as detailed there. Note that it can take some days Fedora, etc.) and architecture as detailed there.
for new &repmgr; packages to become available via the this repository.
</para> </para>
<note>
<para>
&repmgr; packages are designed to be compatible with the community-provided PostgreSQL packages.
They may not work with vendor-specific packages such as those provided by RedHat for RHEL
customers, as the filesystem layout may be different to the community RPMs.
Please contact your support vendor for assistance.
</para>
</note>
<para> <para>
For more information on the package contents, including details of installation <ulink url="https://2ndquadrant.com">2ndQuadrant</ulink> also provides its
paths and relevant <link linkend="configuration-file-service-commands">service commands</link>, own RPM packages which are made available
see the appendix section <xref linkend="packages-centos">. at the same time as each &repmgr; release, as it can take some days for
them to become available via the main PGDG repository. See following section for details:
</para> </para>
<sect3 id="installation-packages-redhat-2ndq"> <sect3 id="installation-packages-redhat-2ndq">
<title>2ndQuadrant public RPM yum repository</title> <title>2ndQuadrant repmgr yum repository</title>
<note>
<para>
<ulink url="https://2ndquadrant.com">2ndQuadrant</ulink> previously provided a dedicated
&repmgr; repository at
<ulink url="http://packages.2ndquadrant.com/repmgr/">http://packages.2ndquadrant.com/repmgr/</ulink>.
This repository will be deprecated in a future release as it is now replaced by
the <ulink url="https://rpm.2ndquadrant.com/">public RPM repository</ulink>
documented below.
</para>
</note>
<para> <para>
Beginning with <ulink url="https://repmgr.org/docs/4.0/release-4.0.5.html">repmgr 4.0.5</ulink>, Beginning with <ulink url="http://repmgr.org/release-notes-3.1.3.html">repmgr 3.1.3</ulink>,
<ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides a dedicated <literal>yum</literal> <ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides a dedicated <literal>yum</literal>
<ulink url="https://rpm.2ndquadrant.com/">public RPM repository</ulink> for 2ndQuadrant software, repository for &repmgr; releases. This repository complements the main
including &repmgr;. We recommend using this for all future &repmgr; releases. <ulink url="https://yum.postgresql.org/repopackages.php">PGDG community repository</ulink>,
</para> but enables repmgr users to access the latest &repmgr; packages before they are
<para> available via the PGDG repository, which can take several days to be updated following
General instructions for using this repository can be found on its a fresh &repmgr; release.
<ulink url="https://rpm.2ndquadrant.com/">homepage</ulink>. Specific instructions </para>
for installing &repmgr; follow below.
</para>
<para> <para>
<emphasis>Installation</emphasis> <emphasis>Installation</emphasis>
<itemizedlist> <itemizedlist>
<listitem>
<para>
Locate the repository RPM for your PostgreSQL version from the list at:
<ulink url="https://rpm.2ndquadrant.com/">https://rpm.2ndquadrant.com/</ulink>
</para>
</listitem>
<listitem> <listitem>
<para> <para>
Install the repository RPM for your distribution and PostgreSQL version Import the repository public key (optional but recommended):
(this enables the 2ndQuadrant repository as a source of &repmgr; packages).
</para>
<para>
For example, for PostgreSQL 10 on CentOS, execute:
<programlisting>
sudo yum install https://rpm.2ndquadrant.com/site/content/2ndquadrant-repo-10-1-1.el7.noarch.rpm
</programlisting>
</para>
<para>
Verify that the repository is installed with:
<programlisting>
sudo yum repolist</programlisting>
The output should contain two entries like this:
<programlisting>
2ndquadrant-repo-10/7/x86_64 2ndQuadrant packages for PG10 for rhel 7 - x86_64 1
2ndquadrant-repo-10-debug/7/x86_64 2ndQuadrant packages for PG10 for rhel 7 - x86_64 - Debug 1</programlisting>
</para>
</listitem>
<listitem>
<para>
Install the &repmgr version appropriate for your PostgreSQL version (e.g. <literal>repmgr10</literal>):
<programlisting> <programlisting>
$ yum install repmgr10</programlisting> rpm --import http://packages.2ndquadrant.com/repmgr/RPM-GPG-KEY-repmgr</programlisting>
</para>
</listitem>
<listitem>
<para>
Install the repository RPM for your distribution (this enables the 2ndQuadrant
repository as a source of repmgr packages):
<itemizedlist>
<listitem>
<simpara>
<emphasis>Fedora:</emphasis>
<ulink url="http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-fedora-1.0-1.noarch.rpm">http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-fedora-1.0-1.noarch.rpm</ulink>
</simpara>
</listitem>
<listitem>
<simpara>
<emphasis>RHEL, CentOS etc:</emphasis>
<ulink url="http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm">http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm</ulink>
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
e.g.:
<programlisting>
$ yum install http://packages.2ndquadrant.com/repmgr/yum-repo-rpms/repmgr-rhel-1.0-1.noarch.rpm</programlisting>
</para>
</listitem>
<listitem>
<para>
Install the repmgr version appropriate for your PostgreSQL version (e.g. <literal>repmgr96</literal>), e.g.:
<programlisting>
$ yum install repmgr96</programlisting>
</para> </para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
@@ -115,13 +91,13 @@ $ yum install repmgr10</programlisting>
<emphasis>Compatibility with PGDG Repositories</emphasis> <emphasis>Compatibility with PGDG Repositories</emphasis>
</para> </para>
<para> <para>
The 2ndQuadrant &repmgr; yum repository packages use the same definitions and file system layout as the The 2ndQuadrant &repmgr; yum repository uses exactly the same package definitions as the
main PGDG repository. main PGDG repository and is effectively a selective mirror for &repmgr; packages only.
</para> </para>
<para> <para>
Normally <application>yum</application> will prioritize the repository with the most recent &repmgr; version. Normally yum should prioritize the repository with the most recent &repmgr; version.
Once the PGDG repository has been updated, it doesn't matter which repository Once the PGDG repository has been updated, it doesn't matter which repository
the packages are installed from. the packages are installed from.
</para> </para>
<para> <para>
To ensure the 2ndQuadrant repository is always prioritised, install <literal>yum-plugin-priorities</literal> To ensure the 2ndQuadrant repository is always prioritised, install <literal>yum-plugin-priorities</literal>
@@ -135,23 +111,30 @@ $ yum install repmgr10</programlisting>
To install a specific package version, execute <command>yum --showduplicates list</command> To install a specific package version, execute <command>yum --showduplicates list</command>
for the package in question: for the package in question:
<programlisting> <programlisting>
[root@localhost ~]# yum --showduplicates list repmgr10 [root@localhost ~]# yum --showduplicates list repmgr96
Loaded plugins: fastestmirror Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile Loading mirror speeds from cached hostfile
* base: ftp.iij.ad.jp * base: ftp.iij.ad.jp
* extras: ftp.iij.ad.jp * extras: ftp.iij.ad.jp
* updates: ftp.iij.ad.jp * updates: ftp.iij.ad.jp
Available Packages Available Packages
repmgr10.x86_64 4.0.3-1.rhel7 pgdg10 repmgr96.x86_64 3.2-1.el6 2ndquadrant-repmgr
repmgr10.x86_64 4.0.4-1.rhel7 pgdg10 repmgr96.x86_64 3.2.1-1.el6 2ndquadrant-repmgr
repmgr10.x86_64 4.0.5-1.el7 2ndquadrant-repo-10</programlisting> repmgr96.x86_64 3.3-1.el6 2ndquadrant-repmgr
repmgr96.x86_64 3.3.1-1.el6 2ndquadrant-repmgr
repmgr96.x86_64 3.3.2-1.el6 2ndquadrant-repmgr
repmgr96.x86_64 3.3.2-1.rhel6 pgdg96
repmgr96.x86_64 4.0.0-1.el6 2ndquadrant-repmgr
repmgr96.x86_64 4.0.0-1.rhel6 pgdg96</programlisting>
then append the appropriate version number to the package name with a hyphen, e.g.: then append the appropriate version number to the package name with a hyphen, e.g.:
<programlisting> <programlisting>
[root@localhost ~]# yum install repmgr10-4.0.3-1.rhel7</programlisting> [root@localhost ~]# yum install repmgr96-3.3.2-1.el6</programlisting>
</para> </para>
</sect3> </sect3>
</sect2> </sect2>
<sect2 id="installation-packages-debian" xreflabel="Installing from packages on Debian or Ubuntu"> <sect2 id="installation-packages-debian" xreflabel="Installing from packages on Debian or Ubuntu">
<indexterm> <indexterm>
@@ -165,85 +148,6 @@ $ yum install repmgr10</programlisting>
Instructions can be found in the APT section of the PostgreSQL Wiki Instructions can be found in the APT section of the PostgreSQL Wiki
(<ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink>). (<ulink url="https://wiki.postgresql.org/wiki/Apt">https://wiki.postgresql.org/wiki/Apt</ulink>).
</para> </para>
<para>
For more information on the package contents, including details of installation
paths and relevant <link linkend="configuration-file-service-commands">service commands</link>,
see the appendix section <xref linkend="packages-debian-ubuntu">.
</para>
<sect3 id="installation-packages-debian-ubuntu-2ndq">
<title>2ndQuadrant public apt repository for Debian/Ubuntu</title>
<para>
Beginning with <ulink url="https://repmgr.org/docs/4.0/release-4.0.5.html">repmgr 4.0.5</ulink>,
<ulink url="https://2ndquadrant.com/">2ndQuadrant</ulink> provides a
<ulink url="https://apt.2ndquadrant.com/">public apt repository</ulink> for 2ndQuadrant software,
including &repmgr;.
</para>
<para>
General instructions for using this repository can be found on its
<ulink url="https://apt.2ndquadrant.com/">homepage</ulink>. Specific instructions
for installing &repmgr; follow below.
</para>
<para>
<emphasis>Installation</emphasis>
<itemizedlist>
<listitem>
<para>
If not already present, install the <application>apt-transport-https</application> package:
<programlisting>
sudo apt-get install apt-transport-https</programlisting>
</para>
</listitem>
<listitem>
<para>
Create <filename>/etc/apt/sources.list.d/2ndquadrant.list</filename> as follows:
<programlisting>
sudo sh -c 'echo "deb https://apt.2ndquadrant.com/ $(lsb_release -cs)-2ndquadrant main" > /etc/apt/sources.list.d/2ndquadrant.list'</programlisting>
</para>
</listitem>
<listitem>
<para>
Install the 2ndQuadrant <ulink url="https://apt.2ndquadrant.com/site/keys/9904CD4BD6BAF0C3.asc">repository key</ulink>:
<programlisting>
sudo apt-get install curl ca-certificates
curl https://apt.2ndquadrant.com/site/keys/9904CD4BD6BAF0C3.asc | sudo apt-key add -</programlisting>
</para>
</listitem>
<listitem>
<para>
Update the package list
<programlisting>
sudo apt-get update</programlisting>
</para>
</listitem>
<listitem>
<para>
Install the &repmgr version appropriate for your PostgreSQL version (e.g. <literal>repmgr10</literal>):
<programlisting>
$ apt-get install postgresql-10-repmgr</programlisting>
</para>
<note>
<para>
For packages for PostgreSQL 9.6 and earlier, the package name includes
a period between major and minor version numbers, e.g.
<literal>postgresql-9.6-repmgr</literal>.
</para>
</note>
</listitem>
</itemizedlist>
</para>
</sect3>
</sect2> </sect2>
</sect1> </sect1>


@@ -80,7 +80,7 @@
</para> </para>
<para> <para>
There are also tags for each &repmgr; release, e.g. <filename>4.0.5</filename>. There are also tags for each &repmgr; release, e.g. <filename>REL4_0_STABLE</filename>.
</para> </para>
<para> <para>


@@ -2,8 +2,7 @@
<title>repmgr overview</title> <title>repmgr overview</title>
<para> <para>
This chapter provides a high-level overview of &repmgr;'s components and This chapter provides a high-level overview of repmgr's components and functionality.
functionality.
</para> </para>
<sect1 id="repmgr-concepts" xreflabel="Concepts"> <sect1 id="repmgr-concepts" xreflabel="Concepts">
@@ -179,8 +178,8 @@
<para> <para>
In order to effectively manage a replication cluster, &repmgr; needs to store In order to effectively manage a replication cluster, &repmgr; needs to store
information about the servers in the cluster in a dedicated database schema. information about the servers in the cluster in a dedicated database schema.
This schema is automatically created by the &repmgr; extension, which is installed This schema is automatically by the &repmgr; extension, which is installed
during the first step in initializing a &repmgr;-administered cluster during the first step in initialising a &repmgr;-administered cluster
(<command><link linkend="repmgr-primary-register">repmgr primary register</link></command>) (<command><link linkend="repmgr-primary-register">repmgr primary register</link></command>)
and contains the following objects: and contains the following objects:
<variablelist> <variablelist>


@@ -234,7 +234,7 @@
<para> <para>
<filename>repmgr.conf</filename> should not be stored inside the PostgreSQL data directory, <filename>repmgr.conf</filename> should not be stored inside the PostgreSQL data directory,
as it could be overwritten when setting up or reinitialising the PostgreSQL as it could be overwritten when setting up or reinitialising the PostgreSQL
server. See sections <xref linkend="configuration"> and <xref linkend="configuration-file"> server. See sections on <xref linkend="configuration-file"> and <xref linkend="configuration-file-settings">
for further details about <filename>repmgr.conf</filename>. for further details about <filename>repmgr.conf</filename>.
</para> </para>
<tip> <tip>

doc/repmgr-bdr.sgml Normal file

@@ -0,0 +1,37 @@
<chapter id="repmgrd-bdr">
<indexterm>
<primary>repmgrd</primary>
<secondary>BDR</secondary>
</indexterm>
<indexterm>
<primary>BDR</primary>
</indexterm>
<title>BDR failover with repmgrd</title>
<para>
&repmgr; 4.x provides support for monitoring BDR nodes and taking action in
case one of the nodes fails.
</para>
<note>
<simpara>
Due to the nature of BDR, it's only safe to use this solution for
a two-node scenario. Introducing additional nodes will create an inherent
risk of node desynchronisation if a node goes down without being cleanly
removed from the cluster.
</simpara>
</note>
<para>
In contrast to streaming replication, there's no concept of "promoting" a new
primary node with BDR. Instead, "failover" involves monitoring both nodes
with `repmgrd` and redirecting queries from the failed node to the remaining
active node. This can be done by using an
<link linkend="event-notifications">event notification</link> script
which is called by <application>repmgrd</application> to dynamically
reconfigure a proxy server/connection pooler such as <application>PgBouncer</application>.
</para>
<sect1 id="prerequisites" xreflable="BDR prequisites">
</sect1>
</chapter>


@@ -38,34 +38,5 @@
and therefore determine the state of outbound connections from that node. and therefore determine the state of outbound connections from that node.
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Exit codes</title>
<para>
Following exit codes can be emitted by <command>repmgr cluster crosscheck</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The check completed successfully and all nodes are reachable.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NODE_STATUS (25)</option></term>
<listitem>
<para>
One or more nodes could not be reached.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
</refentry> </refentry>


@@ -49,22 +49,6 @@
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Output format</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>--csv</literal>: generate output in CSV format. Note that the <literal>Details</literal>
column will currently not be emitted in CSV format.
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1> <refsect1>
<title>Example</title> <title>Example</title>
<para> <para>


@@ -97,35 +97,5 @@
useful result. useful result.
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Exit codes</title>
<para>
Following exit codes can be emitted by <command>repmgr cluster matrix</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The check completed successfully and all nodes are reachable.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NODE_STATUS (25)</option></term>
<listitem>
<para>
One or more nodes could not be reached.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
</refentry> </refentry>


@@ -113,40 +113,4 @@
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Exit codes</title>
<para>
Following exit codes can be emitted by <command>repmgr cluster show</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
No issues were detected.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NODE_STATUS (25)</option></term>
<listitem>
<para>
One or more issues were detected.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-node-status">, <xref linkend="repmgr-node-check">
</para>
</refsect1>
</refentry> </refentry>


@@ -61,9 +61,7 @@
<listitem> <listitem>
<simpara> <simpara>
<literal>--archive-ready</literal>: checks for WAL files which have not yet been archived, <literal>--archive-ready</literal>: checks for WAL files which have not yet been archived
and returns <literal>WARNING</literal> or <literal>CRITICAL</literal> if the number
exceeds <varname>archive_ready_warning</varname> or <varname>archive_ready_critical</varname> respectively.
</simpara> </simpara>
</listitem> </listitem>
@@ -79,110 +77,11 @@
</simpara> </simpara>
</listitem> </listitem>
<listitem>
<simpara>
<literal>--missing-slots</literal>: checks there are no missing replication slots
</simpara>
</listitem>
</itemizedlist> </itemizedlist>
</para> </para>
<para>
Individual checks can also be output in a Nagios-compatible format by additionally
providing the option <literal>--nagios</literal>.
</para>
</refsect1> </refsect1>
<refsect1>
<title>Output format</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>--csv</literal>: generate output in CSV format (not available
for individual checks)
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--nagios</literal>: generate output in a Nagios-compatible format
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
When executing <command>repmgr node check</command> with one of the individual
checks listed above, &repmgr; will emit one of the following Nagios-style exit codes
(even if <literal>--nagios</literal> is not supplied):
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>0</literal>: OK
</simpara>
</listitem>
<listitem>
<simpara>
<literal>1</literal>: WARNING
</simpara>
</listitem>
<listitem>
<simpara>
<literal>2</literal>: ERROR
</simpara>
</listitem>
<listitem>
<simpara>
<literal>3</literal>: UNKNOWN
</simpara>
</listitem>
</itemizedlist>
</para>
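As an illustration of how a monitoring wrapper might consume these codes, the following sketch maps the exit status of an individual check to the labels listed above. The function name <literal>nagios_status_label</literal> is hypothetical, and the configuration file path in the usage comment is an assumption:

```shell
#!/bin/sh
# Map a Nagios-style exit code from an individual "repmgr node check"
# to the status label listed above. The function name is illustrative.
nagios_status_label() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "ERROR" ;;
        3) echo "UNKNOWN" ;;
        *) echo "UNEXPECTED($1)" ;;   # anything outside the documented range
    esac
}

# Usage sketch (requires repmgr; the config path is an assumption):
#   repmgr -f /etc/repmgr.conf node check --archive-ready
#   rc=$?
#   echo "archive-ready: $(nagios_status_label "$rc")"
```

Capturing the exit status in a variable immediately after the check avoids it being overwritten by a later command before it is translated.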
<para>
Following exit codes can be emitted by <command>repmgr status check</command>
if no individual check was specified.
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
No issues were detected.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NODE_STATUS (25)</option></term>
<listitem>
<para>
One or more issues were detected.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>See also</title>
<para>
<xref linkend="repmgr-node-status">, <xref linkend="repmgr-cluster-show">
</para>
</refsect1>
</refentry> </refentry>


@@ -45,94 +45,6 @@
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually execute the rejoin.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force-rewind[=/path/to/pg_rewind]</option></term>
<listitem>
<para>
Execute <application>pg_rewind</application> if necessary.
</para>
<para>
It is only necessary to provide the <application>pg_rewind</application>
if using PostgreSQL 9.3 or 9.4, and <application>pg_rewind</application>
is not installed in the PostgreSQL <filename>bin</filename> directory.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--config-files</option></term>
<listitem>
<para>
comma-separated list of configuration files to retain after
executing <application>pg_rewind</application>.
</para>
<para>
Currently <application>pg_rewind</application> will overwrite
the local node's configuration files with the files from the source node,
so it's advisable to use this option to ensure they are kept.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--config-archive-dir</option></term>
<listitem>
<para>
Directory to temporarily store configuration files specified with
<option>--config-files</option>; default: <filename>/tmp</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-W/--no-wait</option></term>
<listitem>
<para>
Don't wait for the node to rejoin cluster.
</para>
<para>
If this option is supplied, &repmgr; will restart the node but
not wait for it to connect to the primary.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Configuration file settings</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>node_rejoin_timeout</literal>:
the maximum length of time (in seconds) to wait for
the node to reconnect to the replication cluster (defaults to
the value set in <literal>standby_reconnect_timeout</literal>,
60 seconds).
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
@@ -165,18 +77,11 @@
</refsect1> </refsect1>
<refsect1 id="repmgr-node-rejoin-pg-rewind" xreflabel="Using pg_rewind"> <refsect1 id="repmgr-node-rejoin-pg-rewind" xreflabel="Using pg_rewind">
<indexterm>
<primary>pg_rewind</primary>
<secondary>using with "repmgr node rejoin"</secondary>
</indexterm>
<title>Using <command>pg_rewind</command></title> <title>Using <command>pg_rewind</command></title>
<para> <para>
<command>repmgr node rejoin</command> can optionally use <command>pg_rewind</command> to re-integrate a <command>repmgr node rejoin</command> can optionally use <command>pg_rewind</command> to re-integrate a
node which has diverged from the rest of the cluster, typically a failed primary. node which has diverged from the rest of the cluster, typically a failed primary.
<command>pg_rewind</command> is available in PostgreSQL 9.5 and later as part of the core distribution, <command>pg_rewind</command> is available in PostgreSQL 9.5 and later.
and can be installed from external sources for PostgreSQL 9.3 and 9.4.
</para> </para>
<note> <note>
<para> <para>


@@ -24,7 +24,7 @@
<title>Example</title> <title>Example</title>
<para> <para>
<programlisting> <programlisting>
$ repmgr -f /etc/repmgr.conf node status $ repmgr -f /etc/repmgr.comf node status
Node "node1": Node "node1":
PostgreSQL version: 10beta1 PostgreSQL version: 10beta1
Total data size: 30 MB Total data size: 30 MB
@@ -38,54 +38,10 @@
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Output format</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>--csv</literal>: generate output in CSV format
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
Following exit codes can be emitted by <command>repmgr node status</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
No issues were detected.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_NODE_STATUS (25)</option></term>
<listitem>
<para>
One or more issues were detected.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>See also</title> <title>See also</title>
<para> <para>
See <xref linkend="repmgr-node-check"> to diagnose issues and <xref linkend="repmgr-cluster-show"> See <xref linkend="repmgr-node-check"> to diagnose issues.
for an overview of all nodes in the cluster.
</para> </para>
</refsect1> </refsect1>
</refentry> </refentry>


@@ -17,7 +17,7 @@
<title>Description</title> <title>Description</title>
<para> <para>
<command>repmgr primary register</command> registers a primary node in a <command>repmgr primary register</command> registers a primary node in a
streaming replication cluster, and configures it for use with &repmgr;, including streaming replication cluster, and configures it for use with repmgr, including
installing the &repmgr; extension. This command needs to be executed before any installing the &repmgr; extension. This command needs to be executed before any
standby nodes are registered. standby nodes are registered.
</para> </para>
@@ -26,7 +26,7 @@
<refsect1> <refsect1>
<title>Execution</title> <title>Execution</title>
<para> <para>
Execute with the <option>--dry-run</option> option to check what would happen without Execute with the <literal>--dry-run</literal> option to check what would happen without
actually registering the primary. actually registering the primary.
</para> </para>
<para> <para>
@@ -36,7 +36,7 @@
<note> <note>
<para> <para>
If providing the configuration file location with <option>-f/--config-file</option>, If providing the configuration file location with <literal>-f/--config-file</literal>,
avoid using a relative path, as &repmgr; stores the configuration file location avoid using a relative path, as &repmgr; stores the configuration file location
in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during in the repmgr metadata for use when &repmgr; is executed remotely (e.g. during
<xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert the <xref linkend="repmgr-standby-switchover">). &repmgr; will attempt to convert the
@@ -48,33 +48,6 @@
</note> </note>
</refsect1> </refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually register the primary.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option>, <option>--force</option></term>
<listitem>
<para>
Overwrite an existing node record
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>

View File

@@ -21,10 +21,6 @@
<refsect1> <refsect1>
<title>Execution</title> <title>Execution</title>
<para>
<command>repmgr primary unregister</command> can be run on any active &repmgr; node,
with the ID of the node to unregister passed as <option>--node-id</option>.
</para>
<para> <para>
Execute with the <literal>--dry-run</literal> option to check what would happen without Execute with the <literal>--dry-run</literal> option to check what would happen without
actually unregistering the node. actually unregistering the node.
@@ -36,34 +32,6 @@
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually unregister the primary.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--node-id</option></term>
<listitem>
<para>
ID of the inactive primary to be unregistered.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>

View File

@@ -25,11 +25,9 @@
<note> <note>
<simpara> <simpara>
<command>repmgr standby clone</command> does not start the standby, and after cloning <command>repmgr standby clone</command> does not start the standby, and after cloning
a standby, the command <command>repmgr standby register</command> must be executed to <command>repmgr standby register</command> must be executed to notify &repmgr; of its presence.
notify &repmgr; of its existence.
</simpara> </simpara>
</note> </note>
</refsect1> </refsect1>
@@ -67,71 +65,7 @@
</tip> </tip>
</refsect1> </refsect1>
<refsect1 id="repmgr-standby-clone-recovery-conf"> <refsect1 id="repmgr-standby-clone-wal-management" xreflabel="Managing WAL during the cloning process">
<indexterm>
<primary>recovery.conf</primary>
<secondary>customising with "repmgr standby clone"</secondary>
</indexterm>
<title>Customising recovery.conf</title>
<para>
By default, &repmgr; will create a minimal <filename>recovery.conf</filename>
containing the following parameters:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><varname>standby_mode</varname> (always <literal>'on'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>recovery_target_timeline</varname> (always <literal>'latest'</literal>)</simpara>
</listitem>
<listitem>
<simpara><varname>primary_conninfo</varname></simpara>
</listitem>
<listitem>
<simpara><varname>primary_slot_name</varname> (if replication slots in use)</simpara>
</listitem>
</itemizedlist>
<para>
The following additional parameters can be specified in <filename>repmgr.conf</filename>
for inclusion in <filename>recovery.conf</filename>:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara><varname>restore_command</varname></simpara>
</listitem>
<listitem>
<simpara><varname>archive_cleanup_command</varname></simpara>
</listitem>
<listitem>
<simpara><varname>recovery_min_apply_delay</varname></simpara>
</listitem>
</itemizedlist>
<note>
<para>
We recommend using <ulink url="https://www.pgbarman.org/">Barman</ulink> to manage
WAL file archiving. For more details on combining &repmgr; and <application>Barman</application>,
in particular using <varname>restore_command</varname> to configure Barman as a backup source of
WAL files, see <xref linkend="cloning-from-barman">.
</para>
</note>
</refsect1>
<refsect1 id="repmgr-standby-clone-wal-management">
<title>Managing WAL during the cloning process</title> <title>Managing WAL during the cloning process</title>
<para> <para>
When initially cloning a standby, you will need to ensure When initially cloning a standby, you will need to ensure
@@ -166,173 +100,6 @@
</note> </note>
</refsect1> </refsect1>
<refsect1 id="repmgr-standby-create-recovery-conf">
<indexterm>
<primary>recovery.conf</primary>
<secondary>generating for a standby cloned by another method</secondary>
</indexterm>
<title>Using a standby cloned by another method</title>
<para>
&repmgr; supports standbys cloned by another method (e.g. using <application>barman</application>'s
<command><ulink url="http://docs.pgbarman.org/release/2.4/#recover">barman recover</ulink></command> command).
</para>
<para>
To integrate the standby as a &repmgr; node, ensure the <filename>repmgr.conf</filename>
file is created for the node, and that it has been registered using
<command><link linkend="repmgr-standby-register">repmgr standby register</link></command>.
Then execute the command <command>repmgr standby clone --recovery-conf-only</command>.
This will create the <filename>recovery.conf</filename> file needed to attach
the node to its upstream, and will also create a replication slot on the
upstream node if required.
</para>
<para>
Note that the upstream node must be running. An existing
<filename>recovery.conf</filename> will not be overwritten unless the
<option>-F/--force</option> option is provided.
</para>
<para>
Execute <command>repmgr standby clone --recovery-conf-only --dry-run</command>
to check the prerequisites for creating the <filename>recovery.conf</filename> file,
and display the contents of the file without actually creating it.
</para>
<note>
<para>
<option>--recovery-conf-only</option> was introduced in &repmgr; <link linkend="release-4.0.4">4.0.4</link>.
</para>
</note>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>-d, --dbname=CONNINFO</option></term>
<listitem>
<para>
Connection string of the upstream node to use for cloning.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually clone the standby.
</para>
<para>
If <option>--recovery-conf-only</option> specified, the contents of
the generated <filename>recovery.conf</filename> file will be displayed
but the file itself not written.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-c, --fast-checkpoint</option></term>
<listitem>
<para>
Force fast checkpoint (not effective when cloning from Barman).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--copy-external-config-files[={samepath|pgdata}]</option></term>
<listitem>
<para>
Copy configuration files located outside the data directory on the source
node to the same path on the standby (default) or to the
PostgreSQL data directory.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--no-upstream-connection</option></term>
<listitem>
<para>
When using Barman, do not connect to upstream node.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-R, --remote-user=USERNAME</option></term>
<listitem>
<para>
Remote system username for SSH operations (default: current local system username).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--recovery-conf-only</option></term>
<listitem>
<para>
Create a <filename>recovery.conf</filename> file for a previously cloned instance (&repmgr; 4.0.4 and later).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--replication-user</option></term>
<listitem>
<para>
User to make replication connections with (optional, not usually required).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--superuser</option></term>
<listitem>
<para>
If the &repmgr; user is not a superuser, the name of a valid superuser must
be provided with this option.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--upstream-conninfo</option></term>
<listitem>
<para>
<literal>primary_conninfo</literal> value to write in recovery.conf
when the intended upstream server does not yet exist.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--upstream-node-id</option></term>
<listitem>
<para>
ID of the upstream node to replicate from (optional, defaults to primary node)
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--without-barman</option></term>
<listitem>
<para>
Do not use Barman even if configured.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
@@ -340,11 +107,5 @@
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>See also</title>
<para>
See <xref linkend="cloning-standbys"> for details about various aspects of cloning.
</para>
</refsect1>
</refentry> </refentry>

View File

@@ -26,19 +26,10 @@
running. It can only be used to attach an active standby to the current primary node running. It can only be used to attach an active standby to the current primary node
(and not to another standby). (and not to another standby).
</para> </para>
<tip> <para>
<para> To re-add an inactive node to the replication cluster, see
To re-add an inactive node to the replication cluster, use <xref linkend="repmgr-node-rejoin">
<xref linkend="repmgr-node-rejoin">. </para>
</para>
</tip>
<para>
<command>repmgr standby follow</command> will wait up to
<varname>standby_follow_timeout</varname> seconds (default: <literal>30</literal>)
to verify the standby has actually connected to the new primary.
</para>
</refsect1> </refsect1>
<refsect1> <refsect1>
@@ -57,56 +48,14 @@
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually follow a new primary.
</para>
<important>
<para>
This does not guarantee the standby can follow the primary; in
particular, whether the primary and standby timelines have diverged
can currently only be determined by actually attempting to
attach the standby to the primary.
</para>
</important>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-w</option></term>
<term><option>--wait</option></term>
<listitem>
<para>
Wait for a primary to appear. &repmgr; will wait for up to
<varname>primary_follow_timeout</varname> seconds
(default: 60 seconds) to verify that the standby is following the new primary.
This value can be defined in <filename>repmgr.conf</filename>.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>standby_follow</literal> <link linkend="event-notifications">event notification</link> will be generated. A <literal>standby_follow</literal> <link linkend="event-notifications">event notification</link> will be generated.
</para> </para>
<para>
If provided, &repmgr; will substitute the placeholders <literal>%p</literal> with the node ID of the primary
being followed, <literal>%c</literal> with its <literal>conninfo</literal> string, and
<literal>%a</literal> with its node name.
</para>
</refsect1> </refsect1>
<refsect1> <refsect1>
<title>See also</title> <title>See also</title>
<para> <para>
<xref linkend="repmgr-node-rejoin"> <xref linkend="repmgr-node-rejoin">

View File

@@ -26,13 +26,6 @@
by using <xref linkend="repmgr-standby-follow">; if <application>repmgrd</application> by using <xref linkend="repmgr-standby-follow">; if <application>repmgrd</application>
is active, it will handle this automatically. is active, it will handle this automatically.
</para> </para>
<para>
Note that &repmgr; will wait for up to <varname>promote_check_timeout</varname> seconds
(default: 60 seconds) to verify that the standby has been promoted, and will
check the promotion every <varname>promote_check_interval</varname> seconds (default: 1 second).
Both values can be defined in <filename>repmgr.conf</filename>.
</para>
</refsect1> </refsect1>
<refsect1> <refsect1>
@@ -49,7 +42,6 @@
</para> </para>
</refsect1> </refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>

View File

@@ -57,16 +57,16 @@
<refsect1 id="repmgr-standby-register-wait-sync" xreflabel="repmgr standby register --wait-sync"> <refsect1 id="repmgr-standby-register-wait-sync" xreflabel="repmgr standby register --wait-sync">
<title>Waiting for the registration to propagate to the standby</title> <title>Waiting for the registration to propagate to the standby</title>
<para> <para>
Depending on your environment and workload, it may take some time for the standby's node record Depending on your environment and workload, it may take some time for
to propagate from the primary to the standby. Some actions (such as starting the standby's node record to propagate from the primary to the standby. Some
<application>repmgrd</application>) require that the standby's node record actions (such as starting <application>repmgrd</application>) require that the standby's node record
is present and up-to-date to function correctly. is present and up-to-date to function correctly.
</para> </para>
<para> <para>
By providing the option <option>--wait-sync</option> to the By providing the option <literal>--wait-sync</literal> to the
<command>repmgr standby register</command> command, &repmgr; will wait <command>repmgr standby register</command> command, &repmgr; will wait
until the record is synchronised before exiting. An optional timeout (in until the record is synchronised before exiting. An optional timeout (in
seconds) can be added to this option (e.g. <option>--wait-sync=60</option>). seconds) can be added to this option (e.g. <literal>--wait-sync=60</literal>).
</para> </para>
</refsect1> </refsect1>
@@ -75,109 +75,29 @@
<para> <para>
Under some circumstances you may wish to register a standby which is not Under some circumstances you may wish to register a standby which is not
yet running; this can be the case when using provisioning tools to create yet running; this can be the case when using provisioning tools to create
a complex replication cluster. In this case, by using the <option>-F/--force</option> a complex replication cluster. In this case, by using the <literal>-F/--force</literal>
option and providing the connection parameters to the primary server, option and providing the connection parameters to the primary server,
the standby can be registered. the standby can be registered.
</para> </para>
<para> <para>
Similarly, with cascading replication it may be necessary to register Similarly, with cascading replication it may be necessary to register
a standby whose upstream node has not yet been registered - in this case, a standby whose upstream node has not yet been registered - in this case,
using <option>-F/--force</option> will result in the creation of an inactive placeholder using <literal>-F/--force</literal> will result in the creation of an inactive placeholder
record for the upstream node, which will however later need to be registered record for the upstream node, which will however later need to be registered
with the <option>-F/--force</option> option too. with the <literal>-F/--force</literal> option too.
</para> </para>
<para> <para>
When used with <command>repmgr standby register</command>, care should be taken that use of the When used with <command>repmgr standby register</command>, care should be taken that use of the
<option>-F/--force</option> option does not result in an incorrectly configured cluster. <literal>-F/--force</literal> option does not result in an incorrectly configured cluster.
</para> </para>
</refsect1> </refsect1>
<refsect1 id="repmgr-standby-register-node-cloned-other-source">
<title>Registering a node not cloned by repmgr</title>
<para>
If you've cloned a standby using another method (e.g. <application>barman</application>'s
<command>barman recover</command> command), first execute
<link linkend="repmgr-standby-create-recovery-conf">repmgr standby clone --recovery-conf-only</link>
to add the <filename>recovery.conf</filename> file, then register the standby as usual.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually register the standby.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-F</option>, <option>--force</option></term>
<listitem>
<para>
Overwrite an existing node record
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--upstream-node-id</option></term>
<listitem>
<para>
ID of the upstream node to replicate from (optional)
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--wait-start</option></term>
<listitem>
<para>
Wait for the standby to start (timeout in seconds; default: 30 seconds).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--wait-sync</option></term>
<listitem>
<para>
Wait for the node record to synchronise to the standby (optional timeout in seconds).
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>
A <literal>standby_register</literal> <link linkend="event-notifications">event notification</link> A <literal>standby_register</literal> <link linkend="event-notifications">event notification</link>
will be generated immediately after the node record is updated on the primary. will be generated.
</para> </para>
<para>
If the <option>--wait-sync</option> option is provided, a <literal>standby_register_sync</literal>
event notification will be generated immediately after the node record has synchronised to the
standby.
</para>
<para>
If provided, &repmgr; will substitute the placeholders <literal>%p</literal> with the node ID of the
primary node, <literal>%c</literal> with its <literal>conninfo</literal> string, and
<literal>%a</literal> with its node name.
</para>
</refsect1> </refsect1>
</refentry> </refentry>
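Several of the commands touched by this diff document a <option>--dry-run</option> option that checks prerequisites without making changes. The following is a minimal sketch, assuming a POSIX shell, of a wrapper that runs the dry run first and only proceeds if it succeeds. The function name run_after_dry_run is hypothetical and not part of repmgr; the repmgr invocation in the comment is illustrative only.

```shell
# run_after_dry_run CMD [ARGS...]
# Hypothetical helper (not part of repmgr): run the command with --dry-run
# appended, and run it for real only if the dry run exits with status 0.
run_after_dry_run() {
    # If the dry run fails, return its exit status and do not run the real command.
    "$@" --dry-run || return $?
    "$@"
}

# Illustrative usage (configuration path is an assumption):
#   run_after_dry_run repmgr -f /etc/repmgr.conf standby register
```

A successful dry run only confirms that the prerequisites are met; it does not guarantee that the real operation will complete.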

View File

@@ -12,7 +12,6 @@
<refpurpose>promote a standby to primary and demote the existing primary to a standby</refpurpose> <refpurpose>promote a standby to primary and demote the existing primary to a standby</refpurpose>
</refnamediv> </refnamediv>
<refsect1> <refsect1>
<title>Description</title> <title>Description</title>
@@ -23,34 +22,9 @@
</para> </para>
<para> <para>
If other standbys are connected to the demotion candidate, &repmgr; can instruct If other standbys are connected to the demotion candidate, &repmgr; can instruct
these to follow the new primary if the option <literal>--siblings-follow</literal> these to follow the new primary if the option <literal>--siblings-follow</literal>
is specified. This requires a passwordless SSH connection between the promotion is specified.
candidate (new primary) and the standbys attached to the demotion candidate
(existing primary).
</para> </para>
<note>
<para>
Performing a switchover is a non-trivial operation. In particular it
relies on the current primary being able to shut down cleanly and quickly.
&repmgr; will attempt to check for potential issues but cannot guarantee
a successful switchover.
</para>
</note>
<para>
For more details on performing a switchover, including preparation and configuration,
see section <xref linkend="performing-switchover">.
</para>
<note>
<para>
<application>repmgrd</application> should not be active on any nodes while a switchover is being
executed. This restriction may be lifted in a later version.
</para>
<para>
&repmgr; will not perform the switchover if an exclusive backup is running on the current primary.
</para>
</note>
</refsect1> </refsect1>
<refsect1> <refsect1>
@@ -73,13 +47,6 @@
<para> <para>
Check prerequisites but don't actually execute a switchover. Check prerequisites but don't actually execute a switchover.
</para> </para>
<important>
<para>
Success of <option>--dry-run</option> does not imply the switchover will
complete successfully, only that
the prerequisites for performing the operation are met.
</para>
</important>
</listitem> </listitem>
</varlistentry> </varlistentry>
@@ -90,24 +57,15 @@
<para> <para>
Ignore warnings and continue anyway. Ignore warnings and continue anyway.
</para> </para>
<para>
Specifically, if a problem is encountered when shutting down the current primary,
using <option>-F/--force</option> will cause &repmgr; to continue by promoting
the standby to be the new primary, and if <option>--siblings-follow</option> is
specified, attach any other standbys to the new primary.
</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term><option>--force-rewind[=/path/to/pg_rewind]</option></term> <term><option>--force-rewind</option></term>
<listitem> <listitem>
<para> <para>
Use <application>pg_rewind</application> to reintegrate the old primary if necessary Use <application>pg_rewind</application> to reintegrate the old primary if necessary
(and the prerequisites for using <application>pg_rewind</application> are met). (PostgreSQL 9.5 and later).
If using PostgreSQL 9.3 or 9.4, and the <application>pg_rewind</application>
binary is not installed in the PostgreSQL <filename>bin</filename> directory,
provide its full path. For more details see also <xref linkend="switchover-pg-rewind">.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
@@ -134,48 +92,6 @@
</refsect1> </refsect1>
<refsect1>
<title>Configuration file settings</title>
<para>
Note that the following parameters in <filename>repmgr.conf</filename> are relevant to the
switchover operation:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>reconnect_attempts</literal>: number of times to check the original primary
for a clean shutdown after executing the shutdown command, before aborting
</simpara>
</listitem>
<listitem>
<simpara>
<literal>reconnect_interval</literal>: interval (in seconds) to check the original
primary for a clean shutdown after executing the shutdown command (up to a maximum
of <literal>reconnect_attempts</literal> tries)
</simpara>
</listitem>
<listitem>
<simpara>
<literal>replication_lag_critical</literal>:
if replication lag (in seconds) on the standby exceeds this value, the
switchover will be aborted (unless the <literal>-F/--force</literal> option
is provided)
</simpara>
</listitem>
<listitem>
<simpara>
<literal>standby_reconnect_timeout</literal>:
number of seconds to wait for the demoted primary
to reconnect to the promoted primary (default: 60 seconds)
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1> <refsect1>
<title>Execution</title> <title>Execution</title>
@@ -183,16 +99,9 @@
Execute with the <literal>--dry-run</literal> option to test the switchover as far as Execute with the <literal>--dry-run</literal> option to test the switchover as far as
possible without actually changing the status of either node. possible without actually changing the status of either node.
</para> </para>
<important>
<para>
<application>repmgrd</application> must be shut down on all nodes while a switchover is being
executed. This restriction will be removed in a future &repmgr; version.
</para>
</important>
<para> <para>
External database connections, e.g. from an application, should not be permitted while <application>repmgrd</application> should not be active on any nodes while a switchover is being
the switchover is taking place. In particular, active transactions on the primary executed. This restriction may be lifted in a later version.
can potentially disrupt the shutdown process.
</para> </para>
</refsect1> </refsect1>
@@ -206,48 +115,10 @@
<para> <para>
If using an event notification script, <literal>standby_switchover</literal> If using an event notification script, <literal>standby_switchover</literal>
will populate the placeholder parameter <literal>%p</literal> with the node ID of will populate the placeholder parameter <literal>%p</literal> with the node ID of
the former primary. the former standby.
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Exit codes</title>
<para>
The following exit codes can be emitted by <command>repmgr standby switchover</command>:
</para>
<variablelist>
<varlistentry>
<term><option>SUCCESS (0)</option></term>
<listitem>
<para>
The switchover completed successfully.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_SWITCHOVER_FAIL (18)</option></term>
<listitem>
<para>
The switchover could not be executed.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>ERR_SWITCHOVER_INCOMPLETE (22)</option></term>
<listitem>
<para>
The switchover was executed but a problem was encountered.
Typically this means the former primary could not be reattached
as a standby. Check preceding log messages for more information.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>See also</title> <title>See also</title>

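The exit codes listed in the "Exit codes" section on the left-hand side of this diff (0, 18 and 22) can be checked from a calling script. A minimal sketch, assuming a POSIX shell; the function name describe_switchover_exit is hypothetical, and the repmgr invocation in the comment is illustrative only.

```shell
#!/bin/sh
# describe_switchover_exit CODE
# Hypothetical helper: translate the documented exit codes of
# "repmgr standby switchover" into a short message.
describe_switchover_exit() {
    case "$1" in
        0)  echo "SUCCESS: switchover completed successfully" ;;
        18) echo "ERR_SWITCHOVER_FAIL: switchover could not be executed" ;;
        22) echo "ERR_SWITCHOVER_INCOMPLETE: switchover executed, but a problem was encountered" ;;
        *)  echo "unexpected exit code: $1" ;;
    esac
}

# Illustrative usage (configuration path is an assumption):
#   repmgr -f /etc/repmgr.conf standby switchover --siblings-follow
#   describe_switchover_exit $?
```

For code 22 the documentation advises checking the preceding log messages, since it typically means the former primary could not be reattached as a standby.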
View File

@@ -43,22 +43,6 @@
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--node-id</option></term>
<listitem>
<para>
<varname>node_id</varname> of the node to unregister (optional)
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>
<para> <para>

View File

@@ -20,10 +20,7 @@
</para> </para>
<para> <para>
The node does not have to be running to be unregistered; however, if it is not The node does not have to be running to be unregistered; however, if it is not
running, either provide connection information for the primary server, or running, connection information for the primary server must be provided.
execute <command>repmgr witness unregister</command> on a running node and
provide the parameter <option>--node-id</option> with the node ID of the
witness server.
</para> </para>
<para> <para>
Execute with the <literal>--dry-run</literal> option to check what would happen Execute with the <literal>--dry-run</literal> option to check what would happen
@@ -39,17 +36,17 @@
INFO: connecting to witness node "node3" (ID: 3) INFO: connecting to witness node "node3" (ID: 3)
INFO: unregistering witness node 3 INFO: unregistering witness node 3
INFO: witness unregistration complete INFO: witness unregistration complete
DETAIL: witness node with ID 3 successfully unregistered</programlisting> DETAIL: witness node with id 3 (conninfo: host=node3 dbname=repmgr user=repmgr port=5499) successfully unregistered</programlisting>
</para> </para>
<para> <para>
Unregistering a non-running witness node: Unregistering a non-running witness node:
<programlisting> <programlisting>
$ repmgr -f /etc/repmgr.conf witness unregister -h node1 -p 5501 -F $ repmgr -f /etc/repmgr.conf witness unregister -h node1 -p 5501 -F
INFO: connecting to node "node3" (ID: 3) INFO: connecting to witness node "node3" (ID: 3)
NOTICE: unable to connect to node "node3" (ID: 3), removing node record on cluster primary only NOTICE: unable to connect to witness node "node3" (ID: 3), removing node record on cluster primary only
INFO: unregistering witness node 3 INFO: unregistering witness node 3
INFO: witness unregistration complete INFO: witness unregistration complete
DETAIL: witness node with ID 3 successfully unregistered</programlisting> DETAIL: witness node with id 3 (conninfo: host=node3 dbname=repmgr user=repmgr port=5499) successfully unregistered</programlisting>
</para> </para>
</refsect1> </refsect1>
@@ -65,32 +62,6 @@
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Options</title>
<variablelist>
<varlistentry>
<term><option>--dry-run</option></term>
<listitem>
<para>
Check prerequisites but don't actually unregister the witness.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--node-id</option></term>
<listitem>
<para>
Unregister witness server with the specified node ID.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1> <refsect1>
<title>Event notifications</title> <title>Event notifications</title>

View File

@@ -24,7 +24,7 @@
<para> <para>
In contrast to streaming replication, there's no concept of "promoting" a new In contrast to streaming replication, there's no concept of "promoting" a new
primary node with BDR. Instead, "failover" involves monitoring both nodes primary node with BDR. Instead, "failover" involves monitoring both nodes
with <application>repmgrd</application> and redirecting queries from the failed node to the remaining with `repmgrd` and redirecting queries from the failed node to the remaining
active node. This can be done by using an active node. This can be done by using an
<link linkend="event-notifications">event notification</link> script <link linkend="event-notifications">event notification</link> script
which is called by <application>repmgrd</application> to dynamically which is called by <application>repmgrd</application> to dynamically
@@ -99,16 +99,15 @@
replication cluster. The database must be the BDR-enabled database. replication cluster. The database must be the BDR-enabled database.
</para> </para>
<para> <para>
If defined, the <varname>event_notifications</varname> parameter will restrict If defined, the evenr <application>event_notifications</application> parameter
execution of the script defined in <varname>event_notification_command</varname> will restrict execution of <varname>event_notification_command</varname>
to the specified event(s). to the specified event(s).
</para> </para>
<note> <note>
<simpara> <simpara>
<varname>event_notification_command</varname> is the script which does the actual "heavy lifting" <varname>event_notification_command</varname> is the script which does the actual "heavy lifting"
of reconfiguring the proxy server / connection pooler. It is fully of reconfiguring the proxy server / connection pooler. It is fully
user-definable; see section <xref linkend="bdr-event-notification-command"> for a reference user-definable; a reference implementation is documented below.
implementation.
</simpara> </simpara>
</note> </note>
@@ -170,18 +169,22 @@
</para> </para>
</sect1> </sect1>
<sect1 id="bdr-event-notification-command" xreflabel="Defining the BDR failover &quot;event_notification command&quot;"> <sect1 id="bdr-event-notification-command" xreflabel="BDR failover event notification command">
<title>Defining the BDR failover "event_notification_command"</title> <title>Defining the "event_notification_command"</title>
<para> <para>
Key to "failover" execution is the <literal>event_notification_command</literal>, Key to "failover" execution is the <literal>event_notification_command</literal>,
which is a user-definable script specified in <filename>repmgr.conf</filename> which is a user-definable script specified in <filename>repmgr.conf</filename>
and which can use a &repmgr; <link linkend="event-notifications">event notification</link> and which should reconfigure the proxy server/ connection pooler to point
to reconfigure the proxy server / connection pooler so it points to the other, still-active node. to the other, still-active node.
Details of the event will be passed as parameters to the script.
</para> </para>
<para> <para>
The following parameter placeholders are available for the script definition in <filename>repmgr.conf</filename>; Each time &repmgr; (or <application>repmgrd</application>) records an event,
these will be replaced with the appropriate value when the script is executed: it can optionally execute the script defined in
<literal>event_notification_command</literal> to take further action;
details of the event will be passed as parameters.
</para>
<para>
Following placeholders are available to the script:
</para> </para>
<variablelist> <variablelist>
@@ -228,37 +231,20 @@
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry>
<term><option>%c</option></term>
<listitem>
<para>
conninfo string of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>%a</option></term>
<listitem>
<para>
name of the next available node (<varname>bdr_failover</varname> and <varname>bdr_recovery</varname>)
</para>
</listitem>
</varlistentry>
</variablelist> </variablelist>
<para> <para>
Note that <literal>%c</literal> and <literal>%a</literal> are only provided with Note that <literal>%c</literal> and <literal>%a</literal> will only be provided during
particular failover events, in this case <varname>bdr_failover</varname>. <varname>bdr_failover</varname> events, which is what is of interest here.
</para> </para>
<para> <para>
The provided sample script The provided sample script (`scripts/bdr-pgbouncer.sh`) is configured like
(<literal><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/scripts/bdr-pgbouncer.sh">scripts/bdr-pgbouncer.sh</ulink></literal>) this:
is configured as follows:
<programlisting> <programlisting>
event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a"'</programlisting> event_notification_command='/path/to/bdr-pgbouncer.sh %n %e %s "%c" "%a"'</programlisting>
</para> </para>
<para> <para>
and parses the placeholder parameters like this: and parses the configures parameters like this:
<programlisting> <programlisting>
NODE_ID=$1 NODE_ID=$1
EVENT_TYPE=$2 EVENT_TYPE=$2
@@ -266,14 +252,12 @@
NEXT_CONNINFO=$4 NEXT_CONNINFO=$4
NEXT_NODE_NAME=$5</programlisting> NEXT_NODE_NAME=$5</programlisting>
</para> </para>
<note> <para>
<para> The script also contains some hard-coded values about the <application>PgBouncer</application>
The sample script also contains some hard-coded values for the <application>PgBouncer</application> configuration for both nodes; these will need to be adjusted for your local environment
configuration for both nodes; these will need to be adjusted for your local environment (ideally the scripts would be maintained as templates and generated by some
(ideally the scripts would be maintained as templates and generated by some kind of provisioning system).
kind of provisioning system). </para>
</para>
</note>
<para> <para>
The script performs following steps: The script performs following steps:


@@ -1,298 +1,60 @@
<chapter id="repmgrd-configuration"> <chapter id="repmgrd-configuration">
<indexterm> <indexterm>
<primary>repmgrd</primary> <primary>repmgrd</primary>
<secondary>configuration</secondary> <secondary>configuration</secondary>
</indexterm> </indexterm>
<title>repmgrd configuration</title> <title>repmgrd configuration</title>
<para>
To use <application>repmgrd</application>, its associated function library must be
included in <filename>postgresql.conf</filename> with:
<para> <programlisting>
<application>repmgrd</application> is a daemon which runs on each PostgreSQL node, shared_preload_libraries = 'repmgr'</programlisting>
monitoring the local node, and (unless it's the primary node) the upstream server </para>
(the primary server or with cascading replication, another standby) which it's <para>
connected to. Changing this setting requires a restart of PostgreSQL; for more details see
</para> the <ulink url="https://www.postgresql.org/docs/current/static/runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES">PostgreSQL documentation</ulink>.
<para> </para>
<application>repmgrd</application> can be configured to provide failover <para>
capability in case the primary upstream node becomes unreachable, and/or Additionally the following <application>repmgrd</application> options *must* be set in
provide monitoring data to the &repmgr; metadatabase. <filename>repmgr.conf</filename> (adjust configuration file locations as appropriate):
</para> <programlisting>
failover=automatic
<sect1 id="repmgrd-basic-configuration"> promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'
<title>repmgrd basic configuration</title> follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'</programlisting>
</para>
<para> <para>
To use <application>repmgrd</application>, its associated function library <emphasis>must</emphasis> be Note that the <literal>--log-to-file</literal> option will cause
included via <filename>postgresql.conf</filename> with: output generated by the &repmgr; command, when executed by <application>repmgrd</application>,
to be logged to the same destination configured to receive log output for <application>repmgrd</application>.
<programlisting> See <filename>repmgr.conf.sample</filename> for further <application>repmgrd</application>-specific settings.
shared_preload_libraries = 'repmgr'</programlisting> </para>
</para> <para>
<para> When <varname>failover</varname> is set to <literal>automatic</literal>, upon detecting failure
Changing this setting requires a restart of PostgreSQL; for more details see of the current primary, <application>repmgrd</application> will execute one of
the <ulink url="https://www.postgresql.org/docs/current/static/runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES">PostgreSQL documentation</ulink>. <varname>promote_command</varname> or <varname>follow_command</varname>,
</para> depending on whether the current server is to become the new primary, or
needs to follow another server which has become the new primary. Note that
<para> these commands can be any valid shell script which results in one of these
To apply configuration file changes to a running <application>repmgrd</application> two actions happening, but if &repmgr;'s <command>standby follow</command> or
daemon, execute the operating system's <application>repmgrd</application> service reload command <command>standby promote</command>
(see <xref linkend="appendix-packages"> for examples), commands are not executed (either directly as shown here, or from a script which
or for instances which were manually started, execute <command>kill -HUP</command>, e.g. performs other actions), the &repmgr; metadata will not be updated and
<command>kill -HUP `cat /tmp/repmgrd.pid`</command>. &repmgr; will no longer function reliably.
</para> </para>
<note> <para>
<para> The <varname>follow_command</varname> should provide the <literal>--upstream-node-id=%n</literal>
Check the <application>repmgrd</application> log to see what changes were option to <command>repmgr standby follow</command>; the <literal>%n</literal> will be replaced by
applied, or if any issues were encountered when reloading the configuration. <application>repmgrd</application> with the ID of the new primary node. If this is not provided, &repmgr;
</para> will attempt to determine the new primary by itself, but if the
</note> original primary comes back online after the new primary is promoted, there is a risk that
<para> <command>repmgr standby follow</command> will result in the node continuing to follow
Note that only a subset of configuration file parameters can be changed on a the original primary.
running <application>repmgrd</application> daemon. </para>
</para> <sect1 id="repmgrd-connection-settings">
<title>repmgrd connection settings</title>
<sect2 id="repmgrd-automatic-failover-configuration">
<title>automatic failover configuration</title>
<para>
If using automatic failover, the following <application>repmgrd</application> options *must* be set in
<filename>repmgr.conf</filename> :
<programlisting>
failover=automatic
promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'</programlisting>
</para>
<para>
Adjust file paths as appropriate; we recommend specifying the full path to the &repmgr; binary.
</para>
<para>
Note that the <literal>--log-to-file</literal> option will cause
output generated by the &repmgr; command, when executed by <application>repmgrd</application>,
to be logged to the same destination configured to receive log output for <application>repmgrd</application>.
See <filename><ulink url="https://raw.githubusercontent.com/2ndQuadrant/repmgr/master/repmgr.conf.sample">repmgr.conf.sample</ulink></filename>
for further <application>repmgrd</application>-specific settings.
</para>
<para>
When <varname>failover</varname> is set to <literal>automatic</literal>, upon detecting failure
of the current primary, <application>repmgrd</application> will execute one of:
</para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<varname>promote_command</varname> (if the current server is to become the new primary)
</simpara>
</listitem>
<listitem>
<simpara>
<varname>follow_command</varname> (if the current server needs to follow another server which has
become the new primary)
</simpara>
</listitem>
</itemizedlist>
<note>
<para>
These commands can be any valid shell script which results in one of these
two actions happening, but if &repmgr;'s <command>standby follow</command> or
<command>standby promote</command>
commands are not executed (either directly as shown here, or from a script which
performs other actions), the &repmgr; metadata will not be updated and
&repmgr; will no longer function reliably.
</para>
</note>
<para>
The <varname>follow_command</varname> should provide the <literal>--upstream-node-id=%n</literal>
option to <command>repmgr standby follow</command>; the <literal>%n</literal> will be replaced by
<application>repmgrd</application> with the ID of the new primary node. If this is not provided, &repmgr;
will attempt to determine the new primary by itself, but if the
original primary comes back online after the new primary is promoted, there is a risk that
<command>repmgr standby follow</command> will result in the node continuing to follow
the original primary.
</para>
</sect2>
<sect2 id="repmgrd-service-configuration">
<indexterm>
<primary>repmgrd</primary>
<secondary>PostgreSQL service configuration</secondary>
</indexterm>
<title>PostgreSQL service configuration</title>
<para>
If using automatic failover, currently <application>repmgrd</application> will need to execute
<link linkend="repmgr-standby-follow"><command>repmgr standby follow</command></link>
to restart PostgreSQL on standbys to have them follow a new primary.
</para>
<para>
To ensure this happens smoothly, it's essential to provide the appropriate system/service restart
command appropriate to your operating system via <varname>service_restart_command</varname>
in <filename>repmgr.conf</filename>. If you don't do this, <application>repmgrd</application>
will default to using <command>pg_ctl</command>, which can result in unexpected problems,
particularly on <application>systemd</application>-based systems.
</para>
<para>
For more details, see <xref linkend="configuration-file-service-commands">.
</para>
</sect2>
<sect2 id="repmgrd-monitoring-configuration">
<indexterm>
<primary>repmgrd</primary>
<secondary>monitoring configuration</secondary>
</indexterm>
<title>Monitoring configuration</title>
<para>
To enable monitoring, set:
<programlisting>
monitoring_history=yes</programlisting>
in <filename>repmgr.conf</filename>.
</para>
<para>
The default monitoring interval is 2 seconds; this value can be explicitly set using:
<programlisting>
monitor_interval_secs=&lt;seconds&gt;</programlisting>
in <filename>repmgr.conf</filename>.
</para>
<para>
For more details on monitoring, see <xref linkend="repmgrd-monitoring">.
</para>
</sect2>
</sect1>
<sect1 id="repmgrd-daemon">
<indexterm>
<primary>repmgrd</primary>
<secondary>starting and stopping</secondary>
</indexterm>
<title>repmgrd daemon</title>
<para>
If installed from a package, the <application>repmgrd</application> can be started
via the operating system's service command, e.g. in <application>systemd</application>
using <command>systemctl</command>.
</para>
<para>
See appendix <xref linkend="appendix-packages"> for details of service commands
for different distributions.
</para>
<para>
<application>repmgrd</application> can be started manually like this:
<programlisting>
repmgrd -f /etc/repmgr.conf --pid-file /tmp/repmgrd.pid</programlisting>
and stopped with <command>kill `cat /tmp/repmgrd.pid`</command>. Adjust paths as appropriate.
</para>
<sect2 id="repmgrd-pid-file" xreflabel="repmgrd's PID file">
<indexterm>
<primary>repmgrd</primary>
<secondary>PID file</secondary>
</indexterm>
<indexterm>
<primary>PID file</primary>
<secondary>repmgrd</secondary>
</indexterm>
<title>repmgrd's PID file</title>
<para>
<application>repmgrd</application> will generate a PID file by default.
</para>
<note>
<simpara>
This is a behaviour change from previous versions (earlier than 4.1), where
the PID file had to be explicitly specified with the command line
parameter <option> --pid-file</option>.
</simpara>
</note>
<para>
The PID file can be specified in <filename>repmgr.conf</filename> with the configuration
parameter <varname>repmgrd_pid_file</varname>.
</para>
<para>
It can also be specified on the command line (as in previous versions) with
the command line parameter <option>--pid-file</option>. Note this will override
any value set in <filename>repmgr.conf</filename> with <varname>repmgrd_pid_file</varname>.
<option>--pid-file</option> may be deprecated in future releases.
</para>
<para>
If a PID file location was specified by the package maintainer, <application>repmgrd</application>
will use that. This only applies if &repmgr; was installed from a package and the package
maintainer has specified the PID file location.
</para>
<para>
If none of the above apply, <application>repmgrd</application> will create a PID file
in the operating system's temporary directory (as determined by the environment variable
<varname>TMPDIR</varname>, or if that is not set, will use <filename>/tmp</filename>).
</para>
<para>
To prevent a PID file being generated at all, provide the command line option
<option>--no-pid-file</option>.
</para>
<para>
To see which PID file <application>repmgrd</application> would use, execute <application>repmgrd</application>
with the option <option>--show-pid-file</option>. <application>repmgrd</application>
will not start if this option is provided. Note that the value shown is the
file <application>repmgrd</application> would use next time it starts, and is
not necessarily the PID file currently in use.
</para>
</sect2>
<sect2 id="repmgrd-configuration-debian-ubuntu">
<indexterm>
<primary>repmgrd</primary>
<secondary>Debian/Ubuntu and daemon configuration</secondary>
</indexterm>
<indexterm>
<primary>Debian/Ubuntu</primary>
<secondary>repmgrd daemon configuration</secondary>
</indexterm>
<title>repmgrd daemon configuration on Debian/Ubuntu</title>
<para>
If &repmgr; was installed from Debian/Ubuntu packages, additional configuration
is required before <application>repmgrd</application> is started as a daemon.
</para>
<para>
This is done via the file <filename>/etc/default/repmgrd</filename>, which by default
looks like this:
<programlisting>
# default settings for repmgrd. This file is source by /bin/sh from
# /etc/init.d/repmgrd
# disable repmgrd by default so it won't get started upon installation
# valid values: yes/no
REPMGRD_ENABLED=no
# configuration file (required)
#REPMGRD_CONF="/path/to/repmgr.conf"
# additional options
#REPMGRD_OPTS=""
# user to run repmgrd as
#REPMGRD_USER=postgres
# repmgrd binary
#REPMGRD_BIN=/usr/bin/repmgrd
# pid file
#REPMGRD_PIDFILE=/var/run/repmgrd.pid</programlisting>
</para>
<para>
Set <varname>REPMGRD_ENABLED</varname> to <literal>yes</literal>, and <varname>REPMGRD_CONF</varname>
to the <filename>repmgr.conf</filename> file you are using.
</para>
<para>
If using <application>systemd</application>, you may need to execute <command>systemctl daemon-reload</command>.
Also, if you attempted to start <application>repmgrd</application> using <command>systemctl start repmgrd</command>,
you'll need to execute <command>systemctl stop repmgrd</command>. Because that's how <application>systemd</application>
rolls.
</para>
</sect2>
</sect1>
<sect1 id="repmgrd-connection-settings">
<title>repmgrd connection settings</title>
<para> <para>
In addition to the &repmgr; configuration settings, parameters in the In addition to the &repmgr; configuration settings, parameters in the
<varname>conninfo</varname> string influence how &repmgr; makes a network connection to <varname>conninfo</varname> string influence how &repmgr; makes a network connection to
@@ -314,21 +76,12 @@ REPMGRD_ENABLED=no
<ulink url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS">PostgreSQL documentation</ulink>. <ulink url="https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS">PostgreSQL documentation</ulink>.
</para> </para>
</sect1> </sect1>
<sect1 id="repmgrd-log-rotation"> <sect1 id="repmgrd-log-rotation">
<indexterm>
<primary>log rotation</primary>
<secondary>repmgrd</secondary>
</indexterm>
<title>repmgrd log rotation</title> <title>repmgrd log rotation</title>
<para> <para>
To ensure the current <application>repmgrd</application> logfile To ensure the current <application>repmgrd</application> logfile does not grow
(specified in <filename>repmgr.conf</filename> with the parameter indefinitely, configure your system's <command>logrotate</command> to
<option>log_file</option>) does not grow indefinitely, configure your regularly rotate it.
system's <command>logrotate</command> to regularly rotate it.
</para> </para>
<para> <para>
Sample configuration to rotate logfiles weekly with retention for Sample configuration to rotate logfiles weekly with retention for
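
Review note on the PID file section in this file's diff: the rules it describes amount to a precedence order (command line, then repmgr.conf, then the package default, then the temporary directory). The sketch below is only an illustration of that order, not repmgrd's actual code. The function name resolve_pid_file, its parameters, the file name repmgrd.pid used for the temporary-directory fallback, and the assumption that --no-pid-file wins over every other setting are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* True if a setting was provided and is not an empty string. */
static int
is_set(const char *value)
{
    return value != NULL && strlen(value) > 0;
}

/*
 * Hypothetical helper (NOT repmgrd's implementation): choose the PID file
 * path using the precedence described in the documentation.
 *
 *   no_pid_file      - non-zero if --no-pid-file was given (assumed to win)
 *   cli_pid_file     - value of --pid-file, or NULL
 *   conf_pid_file    - value of repmgrd_pid_file in repmgr.conf, or NULL
 *   package_pid_file - location set by the package maintainer, or NULL
 *   tmpdir           - value of the TMPDIR environment variable, or NULL
 *
 * Returns NULL when no PID file should be written; otherwise a pointer to
 * the chosen path (either one of the inputs or the caller-supplied buffer).
 */
const char *
resolve_pid_file(int no_pid_file,
                 const char *cli_pid_file,
                 const char *conf_pid_file,
                 const char *package_pid_file,
                 const char *tmpdir,
                 char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    if (no_pid_file)
        return NULL;
    if (is_set(cli_pid_file))        /* overrides repmgr.conf */
        return cli_pid_file;
    if (is_set(conf_pid_file))
        return conf_pid_file;
    if (is_set(package_pid_file))
        return package_pid_file;

    /* Fallback: TMPDIR if set, else /tmp; the file name is an assumption. */
    snprintf(buf, buflen, "%s/repmgrd.pid", is_set(tmpdir) ? tmpdir : "/tmp");
    return buf;
}
```

A caller would typically pass getenv("TMPDIR") as the tmpdir argument. To see which file a given installation would really use, the documented repmgrd --show-pid-file option is the authoritative check.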


@@ -40,7 +40,7 @@
</listitem> </listitem>
<listitem> <listitem>
<simpara>repmgrd is monitoring the primary node, but it is not available (and no other node has been promoted as primary)</simpara> <simpara>repmgrd is monitoring the primary node, but it is not available</simpara>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
</para> </para>
@@ -69,15 +69,7 @@
By default, <literal>repmgrd</literal> will continue in degraded monitoring mode indefinitely. By default, <literal>repmgrd</literal> will continue in degraded monitoring mode indefinitely.
However a timeout (in seconds) can be set with <varname>degraded_monitoring_timeout</varname>, However a timeout (in seconds) can be set with <varname>degraded_monitoring_timeout</varname>,
after which <application>repmgrd</application> will terminate. after which <application>repmgrd</application> will terminate.
</para> </para>
<note>
<para>
If <application>repmgrd</application> is monitoring a primary node which has been stopped
and manually restarted as a standby attached to a new primary, it will automatically detect
the status change and update the node record to reflect the node's new status
as an active standby. It will then resume monitoring the node as a standby.
</para>
</note>
</chapter> </chapter>


@@ -3,10 +3,6 @@
<primary>repmgrd</primary> <primary>repmgrd</primary>
<secondary>monitoring</secondary> <secondary>monitoring</secondary>
</indexterm> </indexterm>
<indexterm>
<primary>monitoring</primary>
<secondary>with repmgrd</secondary>
</indexterm>
<title>Monitoring with repmgrd</title> <title>Monitoring with repmgrd</title>
<para> <para>


@@ -57,31 +57,16 @@
<para> <para>
As mentioned in the previous section, success of the switchover operation depends on As mentioned in the previous section, success of the switchover operation depends on
&repmgr; being able to shut down the current primary server quickly and cleanly. &repmgr; being able to shut down the current primary server quickly and cleanly.
</para> </para>
<para>
Ensure that a passwordless SSH connection is possible from the promotion candidate
(standby) to the demotion candidate (current primary). If <literal>--siblings-follow</literal>
will be used, ensure that passwordless SSH connections are possible from the
promotion candidate to all standbys attached to the demotion candidate.
</para>
<note>
<simpara>
&repmgr; expects to find the &repmgr; binary in the same path on the remote
server as on the local server.
</simpara>
</note>
<para> <para>
Double-check which commands will be used to stop/start/restart the current Double-check which commands will be used to stop/start/restart the current
primary; on the current primary execute: primary; on the primary execute:
<programlisting> <programlisting>
repmgr -f /etc/repmgr.conf node service --list --action=stop repmgr -f /etc/repmgr.conf node service --list --action=stop
repmgr -f /etc/repmgr.conf node service --list --action=start repmgr -f /etc/repmgr.conf node service --list --action=start
repmgr -f /etc/repmgr.conf node service --list --action=restart</programlisting> repmgr -f /etc/repmgr.conf node service --list --action=restart</programlisting>
</para> </para>
<para> <para>
@@ -100,11 +85,7 @@
<para> <para>
If the <option>service_*_command</option> options aren't defined, &repmgr; will If the <option>service_*_command</option> options aren't defined, &repmgr; will
fall back to using <application>pg_ctl</application> to stop/start/restart fall back to using <application>pg_ctl</application> to stop/start/restart
PostgreSQL, which may not work properly, particularly when executed on a remote PostgreSQL, which may not work properly.
server.
</para>
<para>
For more details, see <xref linkend="configuration-file-service-commands">.
</para> </para>
</important> </important>
@@ -122,20 +103,13 @@
</note> </note>
<para> <para>
Check that access from applications is minimized or preferably blocked Check that access from applications is minimized or preferably blocked
completely, so applications are not unexpectedly interrupted. completely, so applications are not unexpectedly interrupted.
</para> </para>
<note>
<para>
If an exclusive backup is running on the current primary, &repmgr; will not perform the
switchover.
</para>
</note>
<para> <para>
Check there is no significant replication lag on standbys attached to the Check there is no significant replication lag on standbys attached to the
current primary. current primary.
</para> </para>
<para> <para>
@@ -146,13 +120,10 @@
manually with <command>repmgr node check --archive-ready</command>. manually with <command>repmgr node check --archive-ready</command>.
</para> </para>
<note> <para>
<para> Ensure that <application>repmgrd</application> is *not* running anywhere to prevent it unintentionally
Ensure that <application>repmgrd</application> is *not* running anywhere to prevent it unintentionally promoting a node.
promoting a node. This restriction will be removed in a future &repmgr; version. </para>
</para>
</note>
<para> <para>
Finally, consider executing <command>repmgr standby switchover</command> with the Finally, consider executing <command>repmgr standby switchover</command> with the
@@ -185,60 +156,34 @@
</para> </para>
</important> </important>
<para>
Note that following parameters in <filename>repmgr.conf</filename> are relevant to the
switchover operation:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<literal>reconnect_attempts</literal>: number of times to check the original primary
for a clean shutdown after executing the shutdown command, before aborting
</simpara>
</listitem>
<listitem>
<simpara>
<literal>reconnect_interval</literal>: interval (in seconds) to check the original
primary for a clean shutdown after executing the shutdown command (up to a maximum
of <literal>reconnect_attempts</literal> tries)
</simpara>
</listitem>
<listitem>
<simpara>
<literal>replication_lag_critical</literal>:
if replication lag (in seconds) on the standby exceeds this value, the
switchover will be aborted (unless the <literal>-F/--force</literal> option
is provided)
</simpara>
</listitem>
<note> </itemizedlist>
<simpara> </para>
See <xref linkend="repmgr-standby-switchover"> for a full list of available
command line options and <filename>repmgr.conf</filename> settings relevant
to performing a switchover.
</simpara>
</note>
<sect2 id="switchover-pg-rewind" xreflabel="Switchover and pg_rewind">
<indexterm>
<primary>pg_rewind</primary>
<secondary>using with "repmgr standby switchover"</secondary>
</indexterm>
<title>Switchover and pg_rewind</title>
<para>
If the demotion candidate does not shut down smoothly or cleanly, there's a risk it
will have a slightly divergent timeline and will not be able to attach to the new
primary. To fix this situation without needing to reclone the old primary, it's
possible to use the <application>pg_rewind</application> utility, which will usually be
able to resync the two servers.
</para>
<para>
To have &repmgr; execute <application>pg_rewind</application> if it detects this
situation after promoting the new primary, add the <option>--force-rewind</option>
option.
</para>
<note>
<simpara>
If &repmgr; detects a situation where it needs to execute <application>pg_rewind</application>,
it will execute a <literal>CHECKPOINT</literal> on the new primary before executing
<application>pg_rewind</application>.
</simpara>
</note>
<para>
For more details on <application>pg_rewind</application>, see:
<ulink url="https://www.postgresql.org/docs/current/static/app-pgrewind.html">https://www.postgresql.org/docs/current/static/app-pgrewind.html</ulink>.
</para>
<para>
<application>pg_rewind</application> has been part of the core PostgreSQL distribution since
version 9.5. Users of versions 9.3 and 9.4 will need to manually install it; the source code is available here:
<ulink url="https://github.com/vmware/pg_rewind">https://github.com/vmware/pg_rewind</ulink>.
If the <application>pg_rewind</application>
binary is not installed in the PostgreSQL <filename>bin</filename> directory, provide
its full path on the demotion candidate with <option>--force-rewind</option>.
</para>
<para>
Note that building the 9.3/9.4 version of <application>pg_rewind</application> requires the PostgreSQL
source code. Also, PostgreSQL 9.3 does not provide <varname>wal_log_hints</varname>,
meaning data checksums must have been enabled when the database was initialized.
</para>
</sect2>
</sect1> </sect1>
<sect1 id="switchover-execution" xreflabel="Executing the switchover command"> <sect1 id="switchover-execution" xreflabel="Executing the switchover command">
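
Review note on the reconnect_attempts / reconnect_interval settings listed in this file's diff: together they describe a bounded polling loop that checks the original primary for a clean shutdown and gives up after a fixed number of tries. The sketch below shows that pattern only; it is not repmgr's implementation. The names wait_for_clean_shutdown, shutdown_check_fn and countdown_check are hypothetical, and the real check (and whether the wait happens before or after each check) may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical callback type: returns true once the node has shut down cleanly. */
typedef bool (*shutdown_check_fn)(void *ctx);

/*
 * Poll up to reconnect_attempts times, sleeping reconnect_interval seconds
 * between tries. Returns the 1-based attempt on which the clean shutdown
 * was confirmed, or 0 if every attempt failed (the caller would then abort).
 */
int
wait_for_clean_shutdown(shutdown_check_fn is_shut_down, void *ctx,
                        int reconnect_attempts, unsigned int reconnect_interval)
{
    assert(is_shut_down != NULL);

    for (int attempt = 1; attempt <= reconnect_attempts; attempt++)
    {
        if (is_shut_down(ctx))
            return attempt;

        /* No point sleeping after the final failed attempt. */
        if (attempt < reconnect_attempts)
            sleep(reconnect_interval);
    }

    return 0;
}

/*
 * Stand-in check for demonstration: ctx points to an int holding the number
 * of polls that should still report "not shut down yet".
 */
bool
countdown_check(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0)
    {
        (*remaining)--;
        return false;
    }
    return true;
}
```

With this shape, the worst-case wait before aborting is roughly (reconnect_attempts - 1) * reconnect_interval seconds, which may help when choosing values for a primary that is slow to shut down.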


@@ -11,92 +11,22 @@
containing bugfixes and other minor improvements. Any substantial new containing bugfixes and other minor improvements. Any substantial new
functionality will be included in a feature release (e.g. 4.0.x to 4.1.x). functionality will be included in a feature release (e.g. 4.0.x to 4.1.x).
</para> </para>
<para>
&repmgr; is implemented as a PostgreSQL extension; to upgrade it, first
install the updated package (or compile the updated source), then in the
database where the &repmgr; extension is installed, execute
<command>ALTER EXTENSION repmgr UPDATE</command>.
</para>
<para>
If <application>repmgrd</application> is running, it may be necessary to restart
the PostgreSQL server if the upgrade contains changes to the shared object
file used by <application>repmgrd</application>; check the release notes for details.
</para>
<sect1 id="upgrading-repmgr-extension" xreflabel="Upgrading repmgr 4.x and later"> <para>
<indexterm> Please check the <link linkend="appendix-release-notes">release notes</link> for every
<primary>upgrading</primary> release as they may contain upgrade instructions particular to individual versions.
<secondary>repmgr 4.x and later</secondary> </para>
</indexterm>
<title>Upgrading repmgr 4.x and later</title>
<para>
&repmgr; 4.x is implemented as a PostgreSQL extension; normally the upgrade consists
of the following steps:
<orderedlist>
<listitem>
<simpara>
Install the updated package (or compile the updated source)
</simpara>
</listitem>
<listitem>
<simpara>
<application>repmgrd</application> (if running) must be restarted.
</simpara>
</listitem>
<listitem>
<simpara>
For major releases, e.g. from <literal>4.0.x</literal> to <literal>4.1</literal>,
execute <command>ALTER EXTENSION repmgr UPDATE</command>
on the primary node in the database where the &repmgr; extension is installed.
</simpara>
<simpara>
This will update the extension metadata and, if necessary, apply
changes to the &repmgr; extension objects.
</simpara>
</listitem>
</orderedlist>
</para>
<para>
Always check the <link linkend="appendix-release-notes">release notes</link> for every
release as they may contain upgrade instructions particular to individual versions.
</para>
<para>
Note that it may be necessary to restart the PostgreSQL server if the upgrade contains
changes to the shared object file used by <application>repmgrd</application>; check the
release notes for details.
</para>
</sect1>
<sect1 id="upgrading-and-pg-upgrade" xreflabel="pg_upgrade and repmgr">
<indexterm>
<primary>upgrading</primary>
<secondary>pg_upgrade</secondary>
</indexterm>
<indexterm>
<primary>pg_upgrade</primary>
</indexterm>
<title>pg_upgrade and repmgr</title>
<para>
<application>pg_upgrade</application> requires that if any functions are
dependent on a shared library, this library must be present in both
the old and new installations before <application>pg_upgrade</application>
can be executed.
</para>
<para>
To minimize the risk of any upgrade issues (particularly if an upgrade to
a new major &repmgr; version is involved), we recommend upgrading
&repmgr; on the old server <emphasis>before</emphasis> running
<application>pg_upgrade</application> to ensure that old and new
versions are the same.
</para>
<note>
<simpara>
This issue applies to any PostgreSQL extension which has
dependencies on a shared library.
</simpara>
</note>
<para>
For further details please see the <ulink url="https://www.postgresql.org/docs/current/static/pgupgrade.html">pg_upgrade documentation</ulink>.
</para>
<para>
If replication slots are in use, bear in mind these will <emphasis>not</emphasis>
be recreated by <application>pg_upgrade</application>. These will need to
be recreated manually.
</para>
</sect1>
<sect1 id="upgrading-from-repmgr-3" xreflabel="Upgrading from repmgr 3.x"> <sect1 id="upgrading-from-repmgr-3" xreflabel="Upgrading from repmgr 3.x">
<indexterm> <indexterm>

View File

@@ -1 +1 @@
<!ENTITY repmgrversion "4.1.0"> <!ENTITY repmgrversion "4.0.2">


@@ -43,9 +43,5 @@
#define ERR_BARMAN 19 #define ERR_BARMAN 19
#define ERR_REGISTRATION_SYNC 20 #define ERR_REGISTRATION_SYNC 20
#define ERR_OUT_OF_MEMORY 21 #define ERR_OUT_OF_MEMORY 21
#define ERR_SWITCHOVER_INCOMPLETE 22
#define ERR_FOLLOW_FAIL 23
#define ERR_REJOIN_FAIL 24
#define ERR_NODE_STATUS 25
#endif /* _ERRCODE_H_ */ #endif /* _ERRCODE_H_ */

log.c

@@ -42,7 +42,7 @@ _stderr_log_with_level(const char *level_name, int level, const char *fmt, va_li
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0)));
 int log_type = REPMGR_STDERR;
-int log_level = LOG_INFO;
+int log_level = LOG_NOTICE;
 int last_log_level = LOG_INFO;
 int verbose_logging = false;
 int terse_logging = false;
@@ -70,7 +70,7 @@ _stderr_log_with_level(const char *level_name, int level, const char *fmt, va_li
 /*
  * Store the requested level so that if there's a subsequent log_hint() or
- * log_detail(), we can suppress that if --terse was specified,
+ * log_detail(), we can suppress that if appropriate.
  */
 last_log_level = level;
@@ -329,21 +329,6 @@ logger_set_terse(void)
 }
-void
-logger_set_level(int new_log_level)
-{
-    log_level = new_log_level;
-}
-void
-logger_set_min_level(int min_log_level)
-{
-    if (min_log_level > log_level)
-        log_level = min_log_level;
-}
 int
 detect_log_level(const char *level)
 {
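
The two level setters that differ between the compared versions of log.c behave differently: one overwrites the level, the other only ever raises it. A minimal sketch of that difference, assuming syslog-style numbering where a larger value means more verbose output; the `sketch_` names and the constant values are illustrative assumptions, not repmgr's own definitions:

```c
/* Assumed numeric levels, mirroring syslog(3): larger value = more verbose. */
enum
{
    SKETCH_LOG_NOTICE = 5,
    SKETCH_LOG_INFO = 6,
    SKETCH_LOG_DEBUG = 7
};

/* Hypothetical stand-in for the global log_level variable. */
static int sketch_log_level = SKETCH_LOG_INFO;

/* Unconditionally replace the current level. */
void
sketch_set_level(int new_log_level)
{
    sketch_log_level = new_log_level;
}

/* Raise the level to at least min_log_level; never lower it. */
void
sketch_set_min_level(int min_log_level)
{
    if (min_log_level > sketch_log_level)
        sketch_log_level = min_log_level;
}

/* Accessor so callers can inspect the current level. */
int
sketch_get_level(void)
{
    return sketch_log_level;
}
```

Under these assumptions, calling the "min" variant with a less verbose level than the current one is a no-op, which is what lets a caller request extra verbosity without overriding a more verbose setting made elsewhere.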

2
log.h

@@ -128,8 +128,6 @@ bool logger_shutdown(void);
 void logger_set_verbose(void);
 void logger_set_terse(void);
-void logger_set_min_level(int min_log_level);
-void logger_set_level(int new_log_level);
 void
 log_detail(const char *fmt,...)

@@ -1,2 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit

@@ -1,167 +0,0 @@
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION repmgr" to load this file. \quit
CREATE TABLE repmgr.nodes (
node_id INTEGER PRIMARY KEY,
upstream_node_id INTEGER NULL REFERENCES nodes (node_id) DEFERRABLE,
active BOOLEAN NOT NULL DEFAULT TRUE,
node_name TEXT NOT NULL,
type TEXT NOT NULL CHECK (type IN('primary','standby','witness','bdr')),
location TEXT NOT NULL DEFAULT 'default',
priority INT NOT NULL DEFAULT 100,
conninfo TEXT NOT NULL,
repluser VARCHAR(63) NOT NULL,
slot_name TEXT NULL,
config_file TEXT NOT NULL
);
CREATE TABLE repmgr.events (
node_id INTEGER NOT NULL,
event TEXT NOT NULL,
successful BOOLEAN NOT NULL DEFAULT TRUE,
event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
details TEXT NULL
);
DO $repmgr$
DECLARE
DECLARE server_version_num INT;
BEGIN
SELECT setting
FROM pg_catalog.pg_settings
WHERE name = 'server_version_num'
INTO server_version_num;
IF server_version_num >= 90400 THEN
EXECUTE $repmgr_func$
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location PG_LSN NOT NULL,
last_wal_standby_location PG_LSN,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
)
$repmgr_func$;
ELSE
EXECUTE $repmgr_func$
CREATE TABLE repmgr.monitoring_history (
primary_node_id INTEGER NOT NULL,
standby_node_id INTEGER NOT NULL,
last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL,
last_apply_time TIMESTAMP WITH TIME ZONE,
last_wal_primary_location TEXT NOT NULL,
last_wal_standby_location TEXT,
replication_lag BIGINT NOT NULL,
apply_lag BIGINT NOT NULL
)
$repmgr_func$;
END IF;
END$repmgr$;
CREATE INDEX idx_monitoring_history_time
ON repmgr.monitoring_history (last_monitor_time, standby_node_id);
CREATE VIEW repmgr.show_nodes AS
SELECT n.node_id,
n.node_name,
n.active,
n.upstream_node_id,
un.node_name AS upstream_node_name,
n.type,
n.priority,
n.conninfo
FROM repmgr.nodes n
LEFT JOIN repmgr.nodes un
ON un.node_id = n.upstream_node_id;
/* XXX update upgrade scripts! */
CREATE TABLE repmgr.voting_term (
term INT NOT NULL
);
CREATE UNIQUE INDEX voting_term_restrict
ON repmgr.voting_term ((TRUE));
CREATE RULE voting_term_delete AS
ON DELETE TO repmgr.voting_term
DO INSTEAD NOTHING;
/* ================= */
/* repmgrd functions */
/* ================= */
/* monitoring functions */
CREATE FUNCTION set_local_node_id(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'set_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION get_local_node_id()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_local_node_id'
LANGUAGE C STRICT;
CREATE FUNCTION standby_set_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_set_last_updated'
LANGUAGE C STRICT;
CREATE FUNCTION standby_get_last_updated()
RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'standby_get_last_updated'
LANGUAGE C STRICT;
/* failover functions */
CREATE FUNCTION notify_follow_primary(INT)
RETURNS VOID
AS 'MODULE_PATHNAME', 'notify_follow_primary'
LANGUAGE C STRICT;
CREATE FUNCTION get_new_primary()
RETURNS INT
AS 'MODULE_PATHNAME', 'get_new_primary'
LANGUAGE C STRICT;
CREATE FUNCTION reset_voting_status()
RETURNS VOID
AS 'MODULE_PATHNAME', 'reset_voting_status'
LANGUAGE C STRICT;
CREATE FUNCTION am_bdr_failover_handler(INT)
RETURNS BOOL
AS 'MODULE_PATHNAME', 'am_bdr_failover_handler'
LANGUAGE C STRICT;
CREATE FUNCTION unset_bdr_failover_handler()
RETURNS VOID
AS 'MODULE_PATHNAME', 'unset_bdr_failover_handler'
LANGUAGE C STRICT;
CREATE VIEW repmgr.replication_status AS
SELECT m.primary_node_id, m.standby_node_id, n.node_name AS standby_name,
n.type AS node_type, n.active, last_monitor_time,
CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location,
m.last_wal_standby_location,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag,
CASE WHEN n.type='standby' THEN
CASE WHEN replication_lag > 0 THEN age(now(), m.last_apply_time) ELSE '0'::INTERVAL END
ELSE NULL
END AS replication_time_lag,
CASE WHEN n.type='standby' THEN pg_catalog.pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag,
AGE(NOW(), CASE WHEN pg_catalog.pg_is_in_recovery() THEN repmgr.standby_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag
FROM repmgr.monitoring_history m
JOIN repmgr.nodes n ON m.standby_node_id = n.node_id
WHERE (m.standby_node_id, m.last_monitor_time) IN (
SELECT m1.standby_node_id, MAX(m1.last_monitor_time)
FROM repmgr.monitoring_history m1 GROUP BY 1
);

@@ -83,10 +83,9 @@ do_bdr_register(void)
     exit(ERR_BAD_CONFIG);
 }
-/* BDR 2 implementation is for 2 nodes only */
-if (get_bdr_version_num() < 3 && bdr_nodes.node_count > 2)
+if (bdr_nodes.node_count > 2)
 {
-    log_error(_("repmgr can only support BDR 2.x clusters with 2 nodes"));
+    log_error(_("repmgr can only support BDR clusters with 2 nodes"));
     log_detail(_("this BDR cluster has %i nodes"), bdr_nodes.node_count);
     PQfinish(conn);
     pfree(dbname);
@@ -177,7 +176,6 @@ do_bdr_register(void)
 if (bdr_node_has_repmgr_set(conn, config_file_options.node_name) == false)
 {
-    log_debug("bdr_node_has_repmgr_set() = false");
     bdr_node_set_repmgr_set(conn, config_file_options.node_name);
 }
@@ -203,7 +201,6 @@ do_bdr_register(void)
 if (bdr_nodes.node_count == 0)
 {
     log_error(_("unable to retrieve any BDR node records"));
-    log_detail("%s", PQerrorMessage(conn));
     PQfinish(conn);
     exit(ERR_BAD_CONFIG);
 }
@@ -255,35 +252,7 @@ do_bdr_register(void)
 }
 /* Add the repmgr extension tables to a replication set */
-if (get_bdr_version_num() < 3)
-{
-    add_extension_tables_to_bdr_replication_set(conn);
-}
-else
-{
-    /* this is the only table we need to replicate */
-    char *replication_set = get_default_bdr_replication_set(conn);
-    /*
-     * this probably won't happen, but we need to be sure we're using
-     * the replication set metadata correctly...
-     */
-    if (conn == NULL)
-    {
-        log_error(_("unable to retrieve default BDR replication set"));
-        log_hint(_("see preceding messages"));
-        log_debug("check query in get_default_bdr_replication_set()");
-        exit(ERR_BAD_CONFIG);
-    }
-    if (is_table_in_bdr_replication_set(conn, "nodes", replication_set) == false)
-    {
-        add_table_to_bdr_replication_set(conn, "nodes", replication_set);
-    }
-    pfree(replication_set);
-}
+add_extension_tables_to_bdr_replication_set(conn);
 initPQExpBuffer(&event_details);

@@ -82,8 +82,6 @@ do_cluster_show(void)
 NodeInfoListCell *cell = NULL;
 int i = 0;
 ItemList warnings = {NULL, NULL};
-bool success = false;
-bool error_found = false;
 /* Connect to local database to obtain cluster connection data */
 log_verbose(LOG_INFO, _("connecting to database"));
@@ -93,19 +91,11 @@ do_cluster_show(void)
 else
     conn = establish_db_connection_by_params(&source_conninfo, true);
-success = get_all_node_records_with_upstream(conn, &nodes);
-if (success == false)
-{
-    /* get_all_node_records_with_upstream() will print error message */
-    PQfinish(conn);
-    exit(ERR_BAD_CONFIG);
-}
+get_all_node_records_with_upstream(conn, &nodes);
 if (nodes.node_count == 0)
 {
-    log_error(_("no node records were found"));
-    log_hint(_("ensure at least one node is registered"));
+    log_error(_("unable to retrieve any node records"));
     PQfinish(conn);
     exit(ERR_BAD_CONFIG);
 }
@@ -141,14 +131,8 @@ do_cluster_show(void)
 }
 else
 {
-    char error[MAXLEN];
-    strncpy(error, PQerrorMessage(cell->node_info->conn), MAXLEN);
     cell->node_info->node_status = NODE_STATUS_DOWN;
     cell->node_info->recovery_type = RECTYPE_UNKNOWN;
-    item_list_append_format(&warnings,
-        "when attempting to connect to node \"%s\" (ID: %i), following error encountered :\n\"%s\"",
-        cell->node_info->node_name, cell->node_info->node_id, trim(error));
 }
 initPQExpBuffer(&details);
@@ -174,13 +158,15 @@ do_cluster_show(void)
     break;
 case RECTYPE_STANDBY:
     appendPQExpBuffer(&details, "! running as standby");
-    item_list_append_format(&warnings,
+    item_list_append_format(
+        &warnings,
         "node \"%s\" (ID: %i) is registered as primary but running as standby",
         cell->node_info->node_name, cell->node_info->node_id);
     break;
 case RECTYPE_UNKNOWN:
     appendPQExpBuffer(&details, "! unknown");
-    item_list_append_format(&warnings,
+    item_list_append_format(
+        &warnings,
         "node \"%s\" (ID: %i) has unknown replication status",
         cell->node_info->node_name, cell->node_info->node_id);
     break;
@@ -191,14 +177,16 @@ do_cluster_show(void)
 if (cell->node_info->recovery_type == RECTYPE_PRIMARY)
 {
     appendPQExpBuffer(&details, "! running");
-    item_list_append_format(&warnings,
+    item_list_append_format(
+        &warnings,
         "node \"%s\" (ID: %i) is running but the repmgr node record is inactive",
         cell->node_info->node_name, cell->node_info->node_id);
 }
 else
 {
     appendPQExpBuffer(&details, "! running as standby");
-    item_list_append_format(&warnings,
+    item_list_append_format(
+        &warnings,
         "node \"%s\" (ID: %i) is registered as an inactive primary but running as standby",
         cell->node_info->node_name, cell->node_info->node_id);
 }
@@ -211,7 +199,8 @@ do_cluster_show(void)
 if (cell->node_info->active == true)
 {
     appendPQExpBuffer(&details, "? unreachable");
-    item_list_append_format(&warnings,
+    item_list_append_format(
+        &warnings,
         "node \"%s\" (ID: %i) is registered as an active primary but is unreachable",
         cell->node_info->node_name, cell->node_info->node_id);
 }
@@ -219,7 +208,6 @@ do_cluster_show(void)
 else
 {
     appendPQExpBuffer(&details, "- failed");
-    error_found = true;
 }
 }
 }
@@ -238,7 +226,8 @@ do_cluster_show(void)
     break;
 case RECTYPE_PRIMARY:
     appendPQExpBuffer(&details, "! running as primary");
-    item_list_append_format(&warnings,
+    item_list_append_format(
+        &warnings,
         "node \"%s\" (ID: %i) is registered as standby but running as primary",
         cell->node_info->node_name, cell->node_info->node_id);
     break;
@@ -256,14 +245,16 @@ do_cluster_show(void)
 if (cell->node_info->recovery_type == RECTYPE_STANDBY)
 {
     appendPQExpBuffer(&details, "! running");
-    item_list_append_format(&warnings,
+    item_list_append_format(
+        &warnings,
         "node \"%s\" (ID: %i) is running but the repmgr node record is inactive",
         cell->node_info->node_name, cell->node_info->node_id);
 }
 else
 {
     appendPQExpBuffer(&details, "! running as primary");
-    item_list_append_format(&warnings,
+    item_list_append_format(
+        &warnings,
         "node \"%s\" (ID: %i) is running as primary but the repmgr node record is inactive",
         cell->node_info->node_name, cell->node_info->node_id);
 }
@@ -276,14 +267,14 @@ do_cluster_show(void)
 if (cell->node_info->active == true)
 {
     appendPQExpBuffer(&details, "? unreachable");
-    item_list_append_format(&warnings,
+    item_list_append_format(
+        &warnings,
         "node \"%s\" (ID: %i) is registered as an active standby but is unreachable",
         cell->node_info->node_name, cell->node_info->node_id);
 }
 else
 {
     appendPQExpBuffer(&details, "- failed");
-    error_found = true;
 }
 }
 }
@@ -295,27 +286,17 @@ do_cluster_show(void)
 if (cell->node_info->node_status == NODE_STATUS_UP)
 {
     if (cell->node_info->active == true)
-    {
         appendPQExpBuffer(&details, "* running");
-    }
     else
-    {
         appendPQExpBuffer(&details, "! running");
-        error_found = true;
-    }
 }
 /* node is unreachable */
 else
 {
     if (cell->node_info->active == true)
-    {
         appendPQExpBuffer(&details, "? unreachable");
-    }
     else
-    {
         appendPQExpBuffer(&details, "- failed");
-        error_found = true;
-    }
 }
 }
 break;
@@ -323,7 +304,6 @@ do_cluster_show(void)
 {
     /* this should never happen */
     appendPQExpBuffer(&details, "? unknown node type");
-    error_found = true;
 }
 break;
 }
@@ -428,6 +408,7 @@ do_cluster_show(void)
 PQfinish(conn);
 /* emit any warnings */
 if (warnings.head != NULL && runtime_options.terse == false && runtime_options.output_mode != OM_CSV)
 {
     ItemListCell *cell = NULL;
@@ -435,23 +416,9 @@ do_cluster_show(void)
 printf(_("\nWARNING: following issues were detected\n"));
 for (cell = warnings.head; cell; cell = cell->next)
 {
-    printf(_(" - %s\n"), cell->string);
+    printf(_(" %s\n"), cell->string);
 }
 }
-/*
- * If warnings were noted, even if they're not displayed (e.g. in --csv node),
- * that means something's not right so we need to emit a non-zero exit code.
- */
-if (warnings.head != NULL)
-{
-    error_found = true;
-}
-if (error_found == true)
-{
-    exit(ERR_NODE_STATUS);
-}
 }
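
The exit-status handling at the end of `do_cluster_show()`, present in only one of the compared versions, reduces to a small decision: any recorded warning counts as an error, and any error yields a non-zero status. A minimal sketch of that decision as a pure function; the function name and `SKETCH_SUCCESS` are hypothetical, while the value 25 is taken from the `ERR_NODE_STATUS` definition earlier in this comparison:

```c
#include <stdbool.h>

#define SKETCH_SUCCESS 0            /* assumption: zero means success */
#define SKETCH_ERR_NODE_STATUS 25   /* value of ERR_NODE_STATUS shown earlier */

/*
 * Return the exit status "cluster show" would report, given whether any
 * node was flagged as being in an error state and whether any warnings
 * were collected (even if they were not printed, e.g. in CSV mode).
 */
int
sketch_cluster_show_exit_code(bool error_found, bool warnings_noted)
{
    if (warnings_noted)
        error_found = true;

    return error_found ? SKETCH_ERR_NODE_STATUS : SKETCH_SUCCESS;
}
```

Keeping the decision separate from the printing code is only a presentational choice for this sketch; the code in the diff sets a flag while formatting each node and calls `exit()` at the end.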
@@ -463,7 +430,6 @@ do_cluster_show(void)
  * --all
  * --node-[id|name]
  * --event
- * --csv
  */
 void
@@ -508,12 +474,8 @@ do_cluster_event(void)
 strncpy(headers_event[EV_TIMESTAMP].title, _("Timestamp"), MAXLEN);
 strncpy(headers_event[EV_DETAILS].title, _("Details"), MAXLEN);
-/*
- * If --terse or --csv provided, simply omit the "Details" column.
- * In --csv mode we'd need to quote/escape the contents "Details" column,
- * which is doable but which will remain a TODO for now.
- */
-if (runtime_options.terse == true || runtime_options.output_mode == OM_CSV)
+/* if --terse provided, simply omit the "Details" column */
+if (runtime_options.terse == true)
     column_count --;
 for (i = 0; i < column_count; i++)
@@ -536,64 +498,47 @@ do_cluster_event(void)
 }
-if (runtime_options.output_mode == OM_TEXT)
+for (i = 0; i < column_count; i++)
 {
-    for (i = 0; i < column_count; i++)
-    {
-        if (i == 0)
-            printf(" ");
-        else
-            printf(" | ");
+    if (i == 0)
+        printf(" ");
+    else
+        printf(" | ");
     printf("%-*s",
         headers_event[i].max_length,
         headers_event[i].title);
-    }
-    printf("\n");
-    printf("-");
-    for (i = 0; i < column_count; i++)
-    {
-        int j;
-        for (j = 0; j < headers_event[i].max_length; j++)
-            printf("-");
-        if (i < (column_count - 1))
-            printf("-+-");
-        else
-            printf("-");
-    }
-    printf("\n");
 }
+printf("\n");
+printf("-");
+for (i = 0; i < column_count; i++)
+{
+    int j;
+    for (j = 0; j < headers_event[i].max_length; j++)
+        printf("-");
+    if (i < (column_count - 1))
+        printf("-+-");
+    else
+        printf("-");
+}
+printf("\n");
 for (i = 0; i < PQntuples(res); i++)
 {
     int j;
-    if (runtime_options.output_mode == OM_CSV)
+    printf(" ");
+    for (j = 0; j < column_count; j++)
     {
-        for (j = 0; j < column_count; j++)
-        {
-            printf("%s", PQgetvalue(res, i, j));
-            if ((j + 1) < column_count)
-            {
-                printf(",");
-            }
-        }
-    }
-    else
-    {
-        printf(" ");
-        for (j = 0; j < column_count; j++)
-        {
-            printf("%-*s",
-                headers_event[j].max_length,
-                PQgetvalue(res, i, j));
+        printf("%-*s",
+            headers_event[j].max_length,
+            PQgetvalue(res, i, j));
         if (j < (column_count - 1))
             printf(" | ");
-        }
     }
     printf("\n");
@@ -603,8 +548,7 @@ do_cluster_event(void)
 PQfinish(conn);
-if (runtime_options.output_mode == OM_TEXT)
-    puts("");
+puts("");
 }
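
A comment in one version of `do_cluster_event()` leaves quoting of the "Details" column for `--csv` output as a TODO. The following is a rough sketch of what such quoting could look like, assuming RFC 4180-style rules (wrap the field in double quotes when it contains a comma, double quote, or line break, and double any embedded quotes). `sketch_csv_quote()` is a hypothetical helper written for this illustration and is not part of repmgr:

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy "in" to "out" as a single CSV field.
 * Returns 0 on success, -1 if "out" (of size "outlen") is too small.
 */
int
sketch_csv_quote(const char *in, char *out, size_t outlen)
{
    /* Quoting is only needed when the field contains a special character. */
    int needs_quotes = (strpbrk(in, ",\"\r\n") != NULL);
    size_t o = 0;

    if (outlen == 0)
        return -1;

    if (needs_quotes)
    {
        if (o + 1 >= outlen)
            return -1;
        out[o++] = '"';
    }

    for (; *in != '\0'; in++)
    {
        /* An embedded double quote is emitted twice. */
        size_t need = (*in == '"') ? 2 : 1;

        /* Always keep one byte free for the terminating NUL. */
        if (o + need >= outlen)
            return -1;

        if (*in == '"')
            out[o++] = '"';
        out[o++] = *in;
    }

    if (needs_quotes)
    {
        if (o + 1 >= outlen)
            return -1;
        out[o++] = '"';
    }

    out[o] = '\0';
    return 0;
}
```

A caller would run each value returned by `PQgetvalue()` through a helper like this before printing it, instead of omitting the column.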
@@ -619,8 +563,6 @@ do_cluster_crosscheck(void)
 t_node_status_cube **cube;
-bool error_found = false;
 n = build_cluster_crosscheck(&cube, &name_length);
 if (runtime_options.output_mode == OM_CSV)
 {
@@ -700,11 +642,9 @@ do_cluster_crosscheck(void)
 {
     case -2:
         c = '?';
-        error_found = true;
         break;
     case -1:
         c = 'x';
-        error_found = true;
         break;
     case 0:
         c = '*';
@@ -743,11 +683,6 @@ do_cluster_crosscheck(void)
     free(cube);
 }
-if (error_found == true)
-{
-    exit(ERR_NODE_STATUS);
-}
 }
@@ -763,8 +698,6 @@ do_cluster_matrix()
 t_node_matrix_rec **matrix_rec_list;
-bool error_found = false;
 n = build_cluster_matrix(&matrix_rec_list, &name_length);
 if (runtime_options.output_mode == OM_CSV)
@@ -803,11 +736,9 @@ do_cluster_matrix()
 {
     case -2:
         c = '?';
-        error_found = true;
         break;
     case -1:
         c = 'x';
-        error_found = true;
         break;
     case 0:
         c = '*';
@@ -833,11 +764,6 @@ do_cluster_matrix()
 }
 free(matrix_rec_list);
-if (error_found == true)
-{
-    exit(ERR_NODE_STATUS);
-}
 }
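
Both `do_cluster_crosscheck()` and `do_cluster_matrix()` map a numeric connection status to a display character, and in one of the compared versions they also record whether any non-zero status was seen. A minimal sketch of that mapping for the three cases visible in the hunks above; the function name is hypothetical, and the `default` branch is an assumption because the remaining cases are not shown in this diff:

```c
#include <stdbool.h>

/*
 * Translate a status value into the character shown in the matrix and
 * set *error_found when the status indicates a problem.
 */
char
sketch_status_char(int status, bool *error_found)
{
    switch (status)
    {
        case -2:
            *error_found = true;
            return '?';
        case -1:
            *error_found = true;
            return 'x';
        case 0:
            return '*';
        default:
            /* assumption: other values are not covered by the visible hunks */
            return ' ';
    }
}
```

The flag is passed by pointer so that one variable can accumulate the result across every cell of the matrix, mirroring the single `error_found` variable in the diff.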
@@ -1032,7 +958,8 @@ build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, int *name_length)
 initPQExpBuffer(&command_output);
-(void) remote_command(host,
+(void) remote_command(
+    host,
     runtime_options.remote_user,
     command.data,
     &command_output);
@@ -1211,12 +1138,13 @@ build_cluster_crosscheck(t_node_status_cube ***dest_cube, int *name_length)
 /* fix to work with --node-id */
 if (cube[i]->node_id == config_file_options.node_id)
 {
-    (void) local_command_simple(command.data,
-        &command_output);
+    (void) local_command(
+        command.data,
+        &command_output);
 }
 else
 {
-    t_conninfo_param_list remote_conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER;
+    t_conninfo_param_list remote_conninfo;
     char *host = NULL;
     PQExpBufferData quoted_command;
@@ -1236,7 +1164,8 @@ build_cluster_crosscheck(t_node_status_cube ***dest_cube, int *name_length)
 log_verbose(LOG_DEBUG, "build_cluster_crosscheck(): executing\n %s", quoted_command.data);
-(void) remote_command(host,
+(void) remote_command(
+    host,
     runtime_options.remote_user,
     quoted_command.data,
     &command_output);
@@ -1397,7 +1326,6 @@ do_cluster_help(void)
 printf(_(" %s [OPTIONS] cluster matrix\n"), progname());
 printf(_(" %s [OPTIONS] cluster crosscheck\n"), progname());
 printf(_(" %s [OPTIONS] cluster event\n"), progname());
-printf(_(" %s [OPTIONS] cluster cleanup\n"), progname());
 puts("");
 printf(_("CLUSTER SHOW\n"));
@@ -1437,7 +1365,6 @@ do_cluster_help(void)
 printf(_(" --event filter specific event\n"));
 printf(_(" --node-id restrict entries to node with this ID\n"));
 printf(_(" --node-name restrict entries to node with this name\n"));
-printf(_(" --csv emit output as CSV\n"));
 puts("");
 printf(_("CLUSTER CLEANUP\n"));

File diff suppressed because it is too large

@@ -548,8 +548,7 @@ do_primary_help(void)
 printf(_(" \"primary unregister\" unregisters an inactive primary node.\n"));
 puts("");
 printf(_(" --dry-run check what would happen, but don't actually unregister the primary\n"));
-printf(_(" --node-id ID of the inactive primary node to unregister.\n"));
-printf(_(" -F, --force force removal of an active record\n"));
+printf(_(" -F, --force force removal of the record\n"));
 puts("");

File diff suppressed because it is too large

@@ -28,7 +28,7 @@ extern void do_standby_switchover(void);
 extern void do_standby_help(void);
-extern bool do_standby_follow_internal(PGconn *primary_conn, t_node_info *primary_node_record, PQExpBufferData *output, int *error_code);
+extern bool do_standby_follow_internal(PGconn *primary_conn, t_node_info *primary_node_record, PQExpBufferData *output);

@@ -65,7 +65,7 @@ do_witness_register(void)
 if (recovery_type == RECTYPE_STANDBY)
 {
     log_error(_("provided node is a standby"));
-    log_hint(_("a witness node must run on an independent primary server"));
+    log_error(_("a witness node must run on an independent primary server"));
     PQfinish(witness_conn);
@@ -86,7 +86,6 @@ do_witness_register(void)
 /* connect to primary with provided parameters */
 log_info(_("connecting to primary node"));
 /*
  * Extract the repmgr user and database names from the conninfo string
  * provided in repmgr.conf
@@ -111,12 +110,12 @@ do_witness_register(void)
 }
 /* check primary node's recovery type */
-recovery_type = get_recovery_type(primary_conn);
+recovery_type = get_recovery_type(witness_conn);
 if (recovery_type == RECTYPE_STANDBY)
 {
     log_error(_("provided primary node is a standby"));
-    log_hint(_("provide the connection details of the cluster's primary server"));
+    log_error(_("provide the connection details of the cluster's primary server"));
     PQfinish(witness_conn);
     PQfinish(primary_conn);
@@ -136,11 +135,8 @@ do_witness_register(void)
     exit(ERR_BAD_CONFIG);
 }
-/*
- * TODO: sanity check witness node is not part of main cluster; we could
- * add a random application_name to the respective connections,
- * and do a simple check of pg_stat_activity
- */
+/* XXX sanity check witness node is not part of main cluster */
 /* create repmgr extension, if does not exist */
 if (runtime_options.dry_run == false && !create_repmgr_extension(witness_conn))
@@ -186,6 +182,7 @@ do_witness_register(void)
 log_error(_("witness node is already registered"));
 log_hint(_("use option -F/--force to reregister the node"));
 PQfinish(witness_conn);
 PQfinish(primary_conn);
@@ -193,26 +190,8 @@ do_witness_register(void)
 }
 }
-/*
- * Check that an active node with the same node_name doesn't exist already
- */
-record_status = get_node_record_by_name(primary_conn,
-    config_file_options.node_name,
-    &node_record);
-if (record_status == RECORD_FOUND)
-{
-    if (node_record.active == true && node_record.node_id != config_file_options.node_id)
-    {
-        log_error(_("node %i exists already with node_name \"%s\""),
-            node_record.node_id,
-            config_file_options.node_name);
-        PQfinish(primary_conn);
-        exit(ERR_BAD_CONFIG);
-    }
-}
+// XXX check other node with same name does not exist
 /*
  * if repmgr.nodes contains entries, delete if -F/--force provided,
@@ -243,7 +222,6 @@ do_witness_register(void)
     PQfinish(witness_conn);
     exit(SUCCESS);
 }
 /* create record on primary */
 /*
@@ -310,59 +288,55 @@ do_witness_register(void)
 void
 do_witness_unregister(void)
 {
-    PGconn *local_conn = NULL;
+    PGconn *witness_conn = NULL;
     PGconn *primary_conn = NULL;
     t_node_info node_record = T_NODE_INFO_INITIALIZER;
     RecordStatus record_status = RECORD_NOT_FOUND;
     bool node_record_deleted = false;
-    bool local_node_available = true;
-    int witness_node_id = UNKNOWN_NODE_ID;
-    if (runtime_options.node_id != UNKNOWN_NODE_ID)
-    {
-        /* user has specified the witness node id */
-        witness_node_id = runtime_options.node_id;
-    }
-    else
-    {
-        /* assume witness node is local node */
-        witness_node_id = config_file_options.node_id;
-    }
-    log_info(_("connecting to node \"%s\" (ID: %i)"),
+    bool witness_available = true;
+    log_info(_("connecting to witness node \"%s\" (ID: %i)"),
         config_file_options.node_name,
         config_file_options.node_id);
-    local_conn = establish_db_connection_quiet(config_file_options.conninfo);
-    if (PQstatus(local_conn) != CONNECTION_OK)
+    witness_conn = establish_db_connection_quiet(config_file_options.conninfo);
+    if (PQstatus(witness_conn) != CONNECTION_OK)
     {
         if (!runtime_options.force)
         {
-            log_error(_("unable to connect to node \"%s\" (ID: %i)"),
+            log_error(_("unable to connect to witness node \"%s\" (ID: %i)"),
                 config_file_options.node_name,
                 config_file_options.node_id);
-            log_detail("%s", PQerrorMessage(local_conn));
-            log_hint(_("provide -F/--force to remove the witness record if the server is not running"));
+            log_detail("%s", PQerrorMessage(witness_conn));
             exit(ERR_BAD_CONFIG);
         }
         log_notice(_("unable to connect to witness node \"%s\" (ID: %i), removing node record on cluster primary only"),
             config_file_options.node_name,
             config_file_options.node_id);
-        local_node_available = false;
+        witness_available = false;
     }
-    if (local_node_available == true)
+    if (witness_available == true)
     {
-        primary_conn = get_primary_connection_quiet(local_conn, NULL, NULL);
+        primary_conn = get_primary_connection_quiet(witness_conn, NULL, NULL);
     }
     else
     {
         /*
-         * Assume user has provided connection details for the primary server
+         * Extract the repmgr user and database names from the conninfo string
+         * provided in repmgr.conf
          */
+        get_conninfo_value(config_file_options.conninfo, "user", repmgr_user);
+        get_conninfo_value(config_file_options.conninfo, "dbname", repmgr_db);
+        param_set_ine(&source_conninfo, "user", repmgr_user);
+        param_set_ine(&source_conninfo, "dbname", repmgr_db);
         primary_conn = establish_db_connection_by_params(&source_conninfo, false);
     }
     if (PQstatus(primary_conn) != CONNECTION_OK)
@@ -370,26 +344,26 @@ do_witness_unregister(void)
     log_error(_("unable to connect to primary"));
     log_detail("%s", PQerrorMessage(primary_conn));
-    if (local_node_available == true)
+    if (witness_available == true)
     {
-        PQfinish(local_conn);
+        PQfinish(witness_conn);
     }
-    else if (runtime_options.connection_param_provided == false)
+    else
     {
-        log_hint(_("provide connection details for the primary server"));
+        log_hint(_("provide connection details to primary server"));
     }
     exit(ERR_BAD_CONFIG);
 }
 /* Check node exists and is really a witness */
-record_status = get_node_record(primary_conn, witness_node_id, &node_record);
+record_status = get_node_record(primary_conn, config_file_options.node_id, &node_record);
 if (record_status != RECORD_FOUND)
 {
-    log_error(_("no record found for node %i"), witness_node_id);
-    if (local_node_available == true)
-        PQfinish(local_conn);
+    log_error(_("no record found for node %i"), config_file_options.node_id);
+    if (witness_available == true)
+        PQfinish(witness_conn);
     PQfinish(primary_conn);
     exit(ERR_BAD_CONFIG);
@@ -397,17 +371,11 @@ do_witness_unregister(void)
 if (node_record.type != WITNESS)
 {
-    /*
-     * The node (either explicitly provided with --node-id, or the local node)
-     * is not a witness.
-     *
-     * TODO: scan node list and print hint about identity of known witness servers.
-     */
     log_error(_("node %i is not a witness node"), config_file_options.node_id);
     log_detail(_("node %i is a %s node"), config_file_options.node_id, get_node_type_string(node_record.type));
-    if (local_node_available == true)
-        PQfinish(local_conn);
+    if (witness_available == true)
+        PQfinish(witness_conn);
     PQfinish(primary_conn);
     exit(ERR_BAD_CONFIG);
@@ -416,43 +384,49 @@ do_witness_unregister(void)
if (runtime_options.dry_run == true) if (runtime_options.dry_run == true)
{ {
log_info(_("prerequisites for unregistering the witness node are met")); log_info(_("prerequisites for unregistering the witness node are met"));
if (local_node_available == true) if (witness_available == true)
PQfinish(local_conn); PQfinish(witness_conn);
PQfinish(primary_conn); PQfinish(primary_conn);
exit(SUCCESS); exit(SUCCESS);
} }
log_info(_("unregistering witness node %i"), witness_node_id); log_info(_("unregistering witness node %i"), config_file_options.node_id);
node_record_deleted = delete_node_record(primary_conn, node_record_deleted = delete_node_record(primary_conn,
witness_node_id); config_file_options.node_id);
if (node_record_deleted == false) if (node_record_deleted == false)
{ {
PQfinish(primary_conn); PQfinish(primary_conn);
PQfinish(witness_conn);
exit(ERR_BAD_CONFIG);
}
if (local_node_available == true) /* sync records from primary */
PQfinish(local_conn); if (witness_available == true && witness_copy_node_records(primary_conn, witness_conn) == false)
PQfinish(local_conn); {
log_error(_("unable to copy repmgr node records from primary"));
PQfinish(primary_conn);
PQfinish(witness_conn);
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
/* Log the event */ /* Log the event */
create_event_record(primary_conn, create_event_record(primary_conn,
&config_file_options, &config_file_options,
witness_node_id, config_file_options.node_id,
"witness_unregister", "witness_unregister",
true, true,
NULL); NULL);
PQfinish(primary_conn); PQfinish(primary_conn);
if (local_node_available == true) if (witness_available == true)
PQfinish(local_conn); PQfinish(witness_conn);
log_info(_("witness unregistration complete")); log_info(_("witness unregistration complete"));
log_detail(_("witness node with ID %i successfully unregistered"), log_detail(_("witness node with id %i (conninfo: %s) successfully unregistered"),
witness_node_id); config_file_options.node_id, config_file_options.conninfo);
return; return;
} }
@@ -472,19 +446,16 @@ void do_witness_help(void)
puts(""); puts("");
printf(_(" Requires provision of connection information for the primary\n")); printf(_(" Requires provision of connection information for the primary\n"));
puts(""); puts("");
printf(_(" --dry-run check prerequisites but don't make any changes\n")); printf(_(" --dry-run check prerequisites but don't make any changes\n"));
printf(_(" -F, --force overwrite an existing node record\n")); printf(_(" -F, --force overwrite an existing node record\n"));
puts(""); puts("");
printf(_("WITNESS UNREGISTER\n")); printf(_("WITNESS UNREGISTER\n"));
puts(""); puts("");
printf(_(" \"witness register\" unregisters a witness node.\n")); printf(_(" \"witness register\" unregisters a witness node.\n"));
puts(""); puts("");
printf(_(" --dry-run check prerequisites but don't make any changes\n")); printf(_(" --dry-run check prerequisites but don't make any changes\n"));
printf(_(" -F, --force unregister when witness node not running\n")); printf(_(" -F, --force unregister when witness node not running\n"));
printf(_(" --node-id node ID of the witness node (provide if executing on\n"));
printf(_(" another node)\n"));
puts(""); puts("");
return; return;


@@ -42,12 +42,10 @@ typedef struct
bool force; bool force;
char pg_bindir[MAXLEN]; /* overrides setting in repmgr.conf */ char pg_bindir[MAXLEN]; /* overrides setting in repmgr.conf */
bool wait; bool wait;
bool no_wait;
/* logging options */ /* logging options */
char log_level[MAXLEN]; /* overrides setting in repmgr.conf */ char log_level[MAXLEN]; /* overrides setting in repmgr.conf */
bool log_to_file; bool log_to_file;
bool quiet;
bool terse; bool terse;
bool verbose; bool verbose;
@@ -70,7 +68,6 @@ typedef struct
int node_id; int node_id;
char node_name[MAXLEN]; char node_name[MAXLEN];
char data_dir[MAXPGPATH]; char data_dir[MAXPGPATH];
int remote_node_id;
/* "standby clone" options */ /* "standby clone" options */
bool copy_external_config_files; bool copy_external_config_files;
@@ -82,7 +79,6 @@ typedef struct
char replication_user[MAXLEN]; char replication_user[MAXLEN];
char upstream_conninfo[MAXLEN]; char upstream_conninfo[MAXLEN];
bool without_barman; bool without_barman;
bool recovery_conf_only;
/* "standby clone"/"standby follow" options */ /* "standby clone"/"standby follow" options */
int upstream_node_id; int upstream_node_id;
@@ -94,8 +90,7 @@ typedef struct
/* "standby switchover" options */ /* "standby switchover" options */
bool always_promote; bool always_promote;
bool force_rewind_used; bool force_rewind;
char force_rewind_path[MAXPGPATH];
bool siblings_follow; bool siblings_follow;
/* "node status" options */ /* "node status" options */
@@ -107,9 +102,7 @@ typedef struct
bool replication_lag; bool replication_lag;
bool role; bool role;
bool slots; bool slots;
bool missing_slots;
bool has_passfile; bool has_passfile;
bool replication_connection;
/* "node join" options */ /* "node join" options */
char config_files[MAXLEN]; char config_files[MAXLEN];
@@ -137,30 +130,30 @@ typedef struct
/* configuration metadata */ \ /* configuration metadata */ \
false, false, false, false, \ false, false, false, false, \
/* general configuration options */ \ /* general configuration options */ \
"", false, false, "", false, false, \ "", false, false, "", false, \
/* logging options */ \ /* logging options */ \
"", false, false, false, false, \ "", false, false, false, \
/* output options */ \ /* output options */ \
false, false, false, \ false, false, false, \
/* database connection options */ \ /* database connection options */ \
"", "", "", "", \ "", "", "", "", \
/* other connection options */ \ /* other connection options */ \
"", "", \ "", "", \
/* general node options */ \ /* node options */ \
UNKNOWN_NODE_ID, "", "", UNKNOWN_NODE_ID, \ UNKNOWN_NODE_ID, "", "", \
/* "standby clone" options */ \ /* "standby clone" options */ \
false, CONFIG_FILE_SAMEPATH, false, false, false, "", "", "", \ false, CONFIG_FILE_SAMEPATH, false, false, false, "", "", "", \
false, false, \ false, \
/* "standby clone"/"standby follow" options */ \ /* "standby clone"/"standby follow" options */ \
NO_UPSTREAM_NODE, \ NO_UPSTREAM_NODE, \
/* "standby register" options */ \ /* "standby register" options */ \
false, -1, DEFAULT_WAIT_START, \ false, 0, DEFAULT_WAIT_START, \
/* "standby switchover" options */ \ /* "standby switchover" options */ \
false, false, "", false, \ false, false, false, \
/* "node status" options */ \ /* "node status" options */ \
false, \ false, \
/* "node check" options */ \ /* "node check" options */ \
false, false, false, false, false, false, false, false, \ false, false, false, false, false, false, \
/* "node join" options */ \ /* "node join" options */ \
"", \ "", \
/* "node service" options */ \ /* "node service" options */ \
@@ -169,7 +162,7 @@ typedef struct
false, "", CLUSTER_EVENT_LIMIT, \ false, "", CLUSTER_EVENT_LIMIT, \
/* "cluster cleanup" options */ \ /* "cluster cleanup" options */ \
0, \ 0, \
/* following options for internal use */ \ /* Following options for internal use */ \
"/tmp", OM_TEXT \ "/tmp", OM_TEXT \
} }
@@ -186,7 +179,6 @@ typedef enum
ACTION_NONE, ACTION_NONE,
ACTION_START, ACTION_START,
ACTION_STOP, ACTION_STOP,
ACTION_STOP_WAIT,
ACTION_RESTART, ACTION_RESTART,
ACTION_RELOAD, ACTION_RELOAD,
ACTION_PROMOTE ACTION_PROMOTE
@@ -212,7 +204,6 @@ extern void check_93_config(void);
extern bool create_repmgr_extension(PGconn *conn); extern bool create_repmgr_extension(PGconn *conn);
extern int test_ssh_connection(char *host, char *remote_user); extern int test_ssh_connection(char *host, char *remote_user);
extern bool local_command(const char *command, PQExpBufferData *outputbuf); extern bool local_command(const char *command, PQExpBufferData *outputbuf);
extern bool local_command_simple(const char *command, PQExpBufferData *outputbuf);
extern standy_clone_mode get_standby_clone_mode(void); extern standy_clone_mode get_standby_clone_mode(void);
@@ -233,9 +224,7 @@ extern void print_help_header(void);
/* server control functions */ /* server control functions */
extern void get_server_action(t_server_action action, char *script, char *data_dir); extern void get_server_action(t_server_action action, char *script, char *data_dir);
extern bool data_dir_required_for_action(t_server_action action); extern bool data_dir_required_for_action(t_server_action action);
extern void get_node_config_directory(char *config_dir_buf);
extern void get_node_data_directory(char *data_dir_buf); extern void get_node_data_directory(char *data_dir_buf);
extern void init_node_record(t_node_info *node_record); extern void init_node_record(t_node_info *node_record);
extern bool can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *reason);
#endif /* _REPMGR_CLIENT_GLOBAL_H_ */ #endif /* _REPMGR_CLIENT_GLOBAL_H_ */
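The struct hunks and the initializer hunk above have to change together: T_RUNTIME_OPTIONS_INITIALIZER is a positional initializer, so each field that differs between the two sides (for example "no_wait" or "remote_node_id") has a matching entry that differs in the macro. A minimal sketch of that coupling, using a cut-down struct; the names demo_options, DEMO_OPTIONS_INITIALIZER, DEMO_UNKNOWN_NODE_ID and demo_options_reset are hypothetical and not part of repmgr:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAXLEN 64
#define DEMO_UNKNOWN_NODE_ID (-1)	/* illustrative sentinel value */

/* Hypothetical cut-down options struct; the field order matters below. */
typedef struct
{
	/* general configuration options */
	bool		force;
	char		pg_bindir[DEMO_MAXLEN];
	bool		wait;
	bool		no_wait;
	/* node options */
	int			node_id;
} demo_options;

/*
 * Positional initializer: exactly one entry per field, in declaration
 * order. Dropping "no_wait" from the struct without dropping one "false"
 * here would shift every later value by one field.
 */
#define DEMO_OPTIONS_INITIALIZER { \
	/* general configuration options */ \
	false, "", false, false, \
	/* node options */ \
	DEMO_UNKNOWN_NODE_ID \
}

/* Reset an options struct to the defaults defined by the macro. */
static void
demo_options_reset(demo_options *opts)
{
	static const demo_options defaults = DEMO_OPTIONS_INITIALIZER;

	assert(opts != NULL);
	*opts = defaults;
}
```

Designated initializers (".node_id = DEMO_UNKNOWN_NODE_ID") would remove the ordering dependency, at the cost of a longer macro; the sketch keeps the positional form only because that is the form the diff uses.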


@@ -53,7 +53,6 @@
#include "repmgr.h" #include "repmgr.h"
#include "compat.h" #include "compat.h"
#include "controldata.h"
#include "repmgr-client.h" #include "repmgr-client.h"
#include "repmgr-client-global.h" #include "repmgr-client-global.h"
#include "repmgr-action-primary.h" #include "repmgr-action-primary.h"
@@ -61,6 +60,7 @@
#include "repmgr-action-witness.h" #include "repmgr-action-witness.h"
#include "repmgr-action-bdr.h" #include "repmgr-action-bdr.h"
#include "repmgr-action-node.h" #include "repmgr-action-node.h"
#include "repmgr-action-cluster.h" #include "repmgr-action-cluster.h"
#include <storage/fd.h> /* for PG_TEMP_FILE_PREFIX */ #include <storage/fd.h> /* for PG_TEMP_FILE_PREFIX */
@@ -73,7 +73,7 @@ t_runtime_options runtime_options = T_RUNTIME_OPTIONS_INITIALIZER;
t_configuration_options config_file_options = T_CONFIGURATION_OPTIONS_INITIALIZER; t_configuration_options config_file_options = T_CONFIGURATION_OPTIONS_INITIALIZER;
/* conninfo params for the node we're operating on */ /* conninfo params for the node we're operating on */
t_conninfo_param_list source_conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER; t_conninfo_param_list source_conninfo;
bool config_file_required = true; bool config_file_required = true;
char pg_bindir[MAXLEN] = ""; char pg_bindir[MAXLEN] = "";
@@ -91,14 +91,13 @@ t_node_info target_node_info = T_NODE_INFO_INITIALIZER;
static ItemList cli_errors = {NULL, NULL}; static ItemList cli_errors = {NULL, NULL};
static ItemList cli_warnings = {NULL, NULL}; static ItemList cli_warnings = {NULL, NULL};
static bool _local_command(const char *command, PQExpBufferData *outputbuf, bool simple);
int int
main(int argc, char **argv) main(int argc, char **argv)
{ {
t_conninfo_param_list default_conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER; t_conninfo_param_list default_conninfo;
int optindex = 0; int optindex;
int c; int c;
char *repmgr_command = NULL; char *repmgr_command = NULL;
@@ -108,7 +107,6 @@ main(int argc, char **argv)
char *dummy_action = ""; char *dummy_action = "";
bool help_option = false; bool help_option = false;
bool option_error_found = false;
set_progname(argv[0]); set_progname(argv[0]);
@@ -179,10 +177,7 @@ main(int argc, char **argv)
strncpy(runtime_options.username, pw->pw_name, MAXLEN); strncpy(runtime_options.username, pw->pw_name, MAXLEN);
} }
/* Make getopt emitting errors */ while ((c = getopt_long(argc, argv, "?Vb:f:FWd:h:p:U:R:S:D:ckL:tvC:", long_options,
opterr = 1;
while ((c = getopt_long(argc, argv, "?Vb:f:FwWd:h:p:U:R:S:D:ck:L:qtvC:", long_options,
&optindex)) != -1) &optindex)) != -1)
{ {
/* /*
@@ -200,7 +195,13 @@ main(int argc, char **argv)
case OPT_HELP: /* --help */ case OPT_HELP: /* --help */
help_option = true; help_option = true;
break; break;
case '?':
/* Actual help option given */
if (strcmp(argv[optind - 1], "-?") == 0)
{
help_option = true;
}
break;
case 'V': case 'V':
/* /*
@@ -241,14 +242,9 @@ main(int argc, char **argv)
strncpy(runtime_options.replication_user, optarg, MAXLEN); strncpy(runtime_options.replication_user, optarg, MAXLEN);
break; break;
/* -w/--wait */ /* -W/--wait */
case 'w':
runtime_options.wait = true;
break;
/* -W/--no-wait */
case 'W': case 'W':
runtime_options.no_wait = true; runtime_options.wait = true;
break; break;
/*---------------------------- /*----------------------------
@@ -333,11 +329,6 @@ main(int argc, char **argv)
strncpy(runtime_options.node_name, optarg, MAXLEN); strncpy(runtime_options.node_name, optarg, MAXLEN);
break; break;
/* --remote-node-id */
case OPT_REMOTE_NODE_ID:
runtime_options.remote_node_id = repmgr_atoi(optarg, "--remote-node-id", &cli_errors, false);
break;
/* /*
* standby options * --------------- * standby options * ---------------
*/ */
@@ -393,11 +384,6 @@ main(int argc, char **argv)
runtime_options.without_barman = true; runtime_options.without_barman = true;
break; break;
case OPT_RECOVERY_CONF_ONLY:
runtime_options.recovery_conf_only = true;
break;
/*--------------------------- /*---------------------------
* "standby register" options * "standby register" options
*--------------------------- *---------------------------
@@ -425,13 +411,7 @@ main(int argc, char **argv)
break; break;
case OPT_FORCE_REWIND: case OPT_FORCE_REWIND:
runtime_options.force_rewind_used = true; runtime_options.force_rewind = true;
if (optarg != NULL)
{
strncpy(runtime_options.force_rewind_path, optarg, MAXPGPATH);
}
break; break;
case OPT_SIBLINGS_FOLLOW: case OPT_SIBLINGS_FOLLOW:
@@ -471,18 +451,10 @@ main(int argc, char **argv)
runtime_options.slots = true; runtime_options.slots = true;
break; break;
case OPT_MISSING_SLOTS:
runtime_options.missing_slots = true;
break;
case OPT_HAS_PASSFILE: case OPT_HAS_PASSFILE:
runtime_options.has_passfile = true; runtime_options.has_passfile = true;
break; break;
case OPT_REPL_CONN:
runtime_options.replication_connection = true;
break;
/*-------------------- /*--------------------
* "node rejoin" options * "node rejoin" options
*-------------------- *--------------------
@@ -574,12 +546,6 @@ main(int argc, char **argv)
logger_output_mode = OM_DAEMON; logger_output_mode = OM_DAEMON;
break; break;
/* --quiet */
case 'q':
runtime_options.quiet = true;
break;
/* --terse */ /* --terse */
case 't': case 't':
runtime_options.terse = true; runtime_options.terse = true;
@@ -635,29 +601,14 @@ main(int argc, char **argv)
_("--recovery-min-apply-delay is now a configuration file parameter, \"recovery_min_apply_delay\"")); _("--recovery-min-apply-delay is now a configuration file parameter, \"recovery_min_apply_delay\""));
break; break;
case ':': /* missing option argument */
option_error_found = true;
break;
case '?':
/* Actual help option given? */
if (strcmp(argv[optind - 1], "-?") == 0)
{
help_option = true;
break;
}
/* otherwise fall through to default */
default: /* invalid option */
option_error_found = true;
break;
} }
} }
/* /*
* If -d/--dbname appears to be a conninfo string, validate by attempting * If -d/--dbname appears to be a conninfo string, validate by attempting
* to parse it (and if successful, store the parsed parameters) * to parse it (and if successful, store the parsed parameters)
*/ */
if (runtime_options.dbname[0]) if (runtime_options.dbname)
{ {
if (strncmp(runtime_options.dbname, "postgresql://", 13) == 0 || if (strncmp(runtime_options.dbname, "postgresql://", 13) == 0 ||
strncmp(runtime_options.dbname, "postgres://", 11) == 0 || strncmp(runtime_options.dbname, "postgres://", 11) == 0 ||
@@ -753,10 +704,9 @@ main(int argc, char **argv)
if (cli_errors.head != NULL) if (cli_errors.head != NULL)
{ {
free_conninfo_params(&source_conninfo); free_conninfo_params(&source_conninfo);
exit_with_cli_errors(&cli_errors, NULL); exit_with_cli_errors(&cli_errors);
} }
/*---------- /*----------
* Determine the node type and action; following are valid: * Determine the node type and action; following are valid:
* *
@@ -787,6 +737,7 @@ main(int argc, char **argv)
if (repmgr_command != NULL) if (repmgr_command != NULL)
{ {
#ifndef BDR_ONLY
if (strcasecmp(repmgr_command, "PRIMARY") == 0 || strcasecmp(repmgr_command, "MASTER") == 0) if (strcasecmp(repmgr_command, "PRIMARY") == 0 || strcasecmp(repmgr_command, "MASTER") == 0)
{ {
if (help_option == true) if (help_option == true)
@@ -843,6 +794,9 @@ main(int argc, char **argv)
action = WITNESS_UNREGISTER; action = WITNESS_UNREGISTER;
} }
else if (strcasecmp(repmgr_command, "BDR") == 0) else if (strcasecmp(repmgr_command, "BDR") == 0)
#else
if (strcasecmp(repmgr_command, "BDR") == 0)
#endif
{ {
if (help_option == true) if (help_option == true)
{ {
@@ -1003,30 +957,9 @@ main(int argc, char **argv)
if (cli_errors.head != NULL) if (cli_errors.head != NULL)
{ {
free_conninfo_params(&source_conninfo); free_conninfo_params(&source_conninfo);
exit_with_cli_errors(&cli_errors);
exit_with_cli_errors(&cli_errors, valid_repmgr_command_found == true ? repmgr_command : NULL);
} }
/* no errors detected by repmgr, but getopt might have */
if (option_error_found == true)
{
if (valid_repmgr_command_found == true)
{
printf(_("Try \"%s --help\" or \"%s %s --help\" for more information.\n"),
progname(),
progname(),
repmgr_command);
}
else
{
printf(_("Try \"repmgr --help\" for more information.\n"));
}
free_conninfo_params(&source_conninfo);
exit(ERR_BAD_CONFIG);
}
/* /*
* Print any warnings about inappropriate command line options, unless * Print any warnings about inappropriate command line options, unless
* -t/--terse set * -t/--terse set
@@ -1055,10 +988,32 @@ main(int argc, char **argv)
runtime_options.output_mode = OM_OPTFORMAT; runtime_options.output_mode = OM_OPTFORMAT;
} }
/* check for conflicts between runtime options and configuration file */
/* ================================================================== */
if (action == STANDBY_CLONE)
{
standy_clone_mode mode = get_standby_clone_mode();
if (mode == barman && runtime_options.without_barman == false
&& config_file_options.use_replication_slots == true)
{
log_error(_("STANDBY CLONE in Barman mode is incompatible with configuration option \"use_replication_slots\""));
log_hint(_("set \"use_replication_slots\" to \"no\" in repmgr.conf, or use --without-barman fo clone directly from the upstream server"));
exit(ERR_BAD_CONFIG);
}
}
/* /*
* Check for configuration file items which can be overriden by runtime * Check for configuration file items which can be overriden by runtime
* options * options
* ===================================================================== */
/*
* ============================================================================
*/ */
/* /*
@@ -1112,28 +1067,6 @@ main(int argc, char **argv)
if (runtime_options.terse) if (runtime_options.terse)
logger_set_terse(); logger_set_terse();
/*
* If --dry-run specified, ensure log_level is at least LOG_INFO, regardless
* of what's in the configuration file or -L/--log-level paremeter, otherwise
* some or output might not be displayed.
*/
if (runtime_options.dry_run == true)
{
logger_set_min_level(LOG_INFO);
}
/*
* If -q/--quiet supplied, suppress any non-ERROR log output.
* This overrides everything else; we'll leave it up to the user to deal with the
* consequences of e.g. running --dry-run together with -q/--quiet.
*/
if (runtime_options.quiet == true)
{
logger_set_level(LOG_ERROR);
}
/* /*
* Node configuration information is not needed for all actions, with * Node configuration information is not needed for all actions, with
* STANDBY CLONE being the main exception. * STANDBY CLONE being the main exception.
@@ -1224,6 +1157,7 @@ main(int argc, char **argv)
switch (action) switch (action)
{ {
#ifndef BDR_ONLY
/* PRIMARY */ /* PRIMARY */
case PRIMARY_REGISTER: case PRIMARY_REGISTER:
do_primary_register(); do_primary_register();
@@ -1259,6 +1193,21 @@ main(int argc, char **argv)
case WITNESS_UNREGISTER: case WITNESS_UNREGISTER:
do_witness_unregister(); do_witness_unregister();
break; break;
#else
/* we won't ever reach here, but stop the compiler complaining */
case PRIMARY_REGISTER:
case PRIMARY_UNREGISTER:
case STANDBY_CLONE:
case STANDBY_REGISTER:
case STANDBY_UNREGISTER:
case STANDBY_PROMOTE:
case STANDBY_FOLLOW:
case STANDBY_SWITCHOVER:
case WITNESS_REGISTER:
case WITNESS_UNREGISTER:
break;
#endif
/* BDR */ /* BDR */
case BDR_REGISTER: case BDR_REGISTER:
do_bdr_register(); do_bdr_register();
@@ -1394,15 +1343,6 @@ check_cli_parameters(const int action)
_("--no-upstream-connection only effective in Barman mode")); _("--no-upstream-connection only effective in Barman mode"));
} }
} }
if (strlen(config_file_options.config_directory))
{
if (runtime_options.copy_external_config_files == false)
{
item_list_append(&cli_warnings,
_("\"config_directory\" set in repmgr.conf, but --copy-external-config-files not provided"));
}
}
} }
break; break;
@@ -1519,7 +1459,6 @@ check_cli_parameters(const int action)
{ {
case PRIMARY_UNREGISTER: case PRIMARY_UNREGISTER:
case STANDBY_UNREGISTER: case STANDBY_UNREGISTER:
case WITNESS_UNREGISTER:
case CLUSTER_EVENT: case CLUSTER_EVENT:
case CLUSTER_MATRIX: case CLUSTER_MATRIX:
case CLUSTER_CROSSCHECK: case CLUSTER_CROSSCHECK:
@@ -1560,7 +1499,6 @@ check_cli_parameters(const int action)
case STANDBY_CLONE: case STANDBY_CLONE:
case STANDBY_REGISTER: case STANDBY_REGISTER:
case STANDBY_FOLLOW: case STANDBY_FOLLOW:
case BDR_REGISTER:
break; break;
default: default:
item_list_append_format(&cli_warnings, item_list_append_format(&cli_warnings,
@@ -1569,39 +1507,6 @@ check_cli_parameters(const int action)
} }
} }
if (runtime_options.replication_user[0])
{
switch (action)
{
case PRIMARY_REGISTER:
case STANDBY_REGISTER:
case STANDBY_CLONE:
break;
case STANDBY_FOLLOW:
item_list_append_format(&cli_warnings,
_("--replication-user ignored when executing %s"),
action_name(action));
default:
item_list_append_format(&cli_warnings,
_("--replication-user not required when executing %s"),
action_name(action));
}
}
if (runtime_options.recovery_conf_only == true)
{
switch (action)
{
case STANDBY_CLONE:
break;
default:
item_list_append_format(&cli_warnings,
_("--create-recovery-conf will be ignored when executing %s"),
action_name(action));
}
}
if (runtime_options.event[0]) if (runtime_options.event[0])
{ {
switch (action) switch (action)
@@ -1615,6 +1520,25 @@ check_cli_parameters(const int action)
} }
} }
if (runtime_options.replication_user[0])
{
switch (action)
{
case PRIMARY_REGISTER:
case STANDBY_REGISTER:
break;
case STANDBY_CLONE:
case STANDBY_FOLLOW:
item_list_append_format(&cli_warnings,
_("--replication-user ignored when executing %s)"),
action_name(action));
default:
item_list_append_format(&cli_warnings,
_("--replication-user not required when executing %s"),
action_name(action));
}
}
if (runtime_options.limit_provided) if (runtime_options.limit_provided)
{ {
switch (action) switch (action)
@@ -1653,41 +1577,6 @@ check_cli_parameters(const int action)
} }
} }
/* --wait/--no-wait */
if (runtime_options.wait == true && runtime_options.no_wait == true)
{
item_list_append_format(&cli_errors,
_("both --wait and --no-wait options provided"));
}
else
{
if (runtime_options.wait)
{
switch (action)
{
case STANDBY_FOLLOW:
break;
default:
item_list_append_format(&cli_warnings,
_("--wait will be ignored when executing %s"),
action_name(action));
}
}
else if (runtime_options.wait)
{
switch (action)
{
case NODE_REJOIN:
break;
default:
item_list_append_format(&cli_warnings,
_("--no-wait will be ignored when executing %s"),
action_name(action));
}
}
}
/* repmgr node service --action */ /* repmgr node service --action */
if (runtime_options.action[0] != '\0') if (runtime_options.action[0] != '\0')
{ {
@@ -1710,7 +1599,8 @@ check_cli_parameters(const int action)
case NODE_STATUS: case NODE_STATUS:
break; break;
default: default:
item_list_append_format(&cli_warnings, item_list_append_format(
&cli_warnings,
_("--is-shutdown-cleanly will be ignored when executing %s"), _("--is-shutdown-cleanly will be ignored when executing %s"),
action_name(action)); action_name(action));
} }
@@ -1723,13 +1613,14 @@ check_cli_parameters(const int action)
case STANDBY_SWITCHOVER: case STANDBY_SWITCHOVER:
break; break;
default: default:
item_list_append_format(&cli_warnings, item_list_append_format(
&cli_warnings,
_("--always-promote will be ignored when executing %s"), _("--always-promote will be ignored when executing %s"),
action_name(action)); action_name(action));
} }
} }
if (runtime_options.force_rewind_used == true) if (runtime_options.force_rewind == true)
{ {
switch (action) switch (action)
{ {
@@ -1737,7 +1628,8 @@ check_cli_parameters(const int action)
case NODE_REJOIN: case NODE_REJOIN:
break; break;
default: default:
item_list_append_format(&cli_warnings, item_list_append_format(
&cli_warnings,
_("--force-rewind will be ignored when executing %s"), _("--force-rewind will be ignored when executing %s"),
action_name(action)); action_name(action));
} }
@@ -1751,7 +1643,8 @@ check_cli_parameters(const int action)
case NODE_REJOIN: case NODE_REJOIN:
break; break;
default: default:
item_list_append_format(&cli_warnings, item_list_append_format(
&cli_warnings,
_("--config-files will be ignored when executing %s"), _("--config-files will be ignored when executing %s"),
action_name(action)); action_name(action));
} }
@@ -1765,7 +1658,6 @@ check_cli_parameters(const int action)
case PRIMARY_UNREGISTER: case PRIMARY_UNREGISTER:
case STANDBY_CLONE: case STANDBY_CLONE:
case STANDBY_REGISTER: case STANDBY_REGISTER:
case STANDBY_FOLLOW:
case STANDBY_SWITCHOVER: case STANDBY_SWITCHOVER:
case WITNESS_REGISTER: case WITNESS_REGISTER:
case WITNESS_UNREGISTER: case WITNESS_UNREGISTER:
@@ -1773,7 +1665,8 @@ check_cli_parameters(const int action)
case NODE_SERVICE: case NODE_SERVICE:
break; break;
default: default:
item_list_append_format(&cli_warnings, item_list_append_format(
&cli_warnings,
_("--dry-run is not effective when executing %s"), _("--dry-run is not effective when executing %s"),
action_name(action)); action_name(action));
} }
@@ -1795,7 +1688,8 @@ check_cli_parameters(const int action)
if (used_options > 1) if (used_options > 1)
{ {
/* TODO: list which options were used */ /* TODO: list which options were used */
item_list_append(&cli_errors, item_list_append(
&cli_errors,
"only one of --csv, --nagios and --optformat can be used"); "only one of --csv, --nagios and --optformat can be used");
} }
} }
@@ -1899,12 +1793,13 @@ do_help(void)
print_help_header(); print_help_header();
printf(_("Usage:\n")); printf(_("Usage:\n"));
#ifndef BDR_ONLY
printf(_(" %s [OPTIONS] primary {register|unregister}\n"), progname()); printf(_(" %s [OPTIONS] primary {register|unregister}\n"), progname());
printf(_(" %s [OPTIONS] standby {register|unregister|clone|promote|follow|switchover}\n"), progname()); printf(_(" %s [OPTIONS] standby {register|unregister|clone|promote|follow}\n"), progname());
#endif
printf(_(" %s [OPTIONS] bdr {register|unregister}\n"), progname()); printf(_(" %s [OPTIONS] bdr {register|unregister}\n"), progname());
printf(_(" %s [OPTIONS] node {status|check|rejoin|service}\n"), progname()); printf(_(" %s [OPTIONS] node status\n"), progname());
printf(_(" %s [OPTIONS] cluster {show|event|matrix|crosscheck|cleanup}\n"), progname()); printf(_(" %s [OPTIONS] cluster {show|event|matrix|crosscheck}\n"), progname());
printf(_(" %s [OPTIONS] witness {register|unregister}\n"), progname());
puts(""); puts("");
@@ -1952,7 +1847,6 @@ do_help(void)
printf(_(" --dry-run show what would happen for action, but don't execute it\n")); printf(_(" --dry-run show what would happen for action, but don't execute it\n"));
printf(_(" -L, --log-level set log level (overrides configuration file; default: NOTICE)\n")); printf(_(" -L, --log-level set log level (overrides configuration file; default: NOTICE)\n"));
printf(_(" --log-to-file log to file (or logging facility) defined in repmgr.conf\n")); printf(_(" --log-to-file log to file (or logging facility) defined in repmgr.conf\n"));
printf(_(" -q, --quiet suppress all log output apart from errors\n"));
printf(_(" -t, --terse don't display detail, hints and other non-critical output\n")); printf(_(" -t, --terse don't display detail, hints and other non-critical output\n"));
printf(_(" -v, --verbose display additional log output (useful for debugging)\n")); printf(_(" -v, --verbose display additional log output (useful for debugging)\n"));
@@ -2222,8 +2116,6 @@ test_ssh_connection(char *host, char *remote_user)
} }
/* /*
* Execute a command locally. "outputbuf" should either be an * Execute a command locally. "outputbuf" should either be an
* initialised PQexpbuffer, or NULL * initialised PQexpbuffer, or NULL
@@ -2231,26 +2123,9 @@ test_ssh_connection(char *host, char *remote_user)
bool bool
local_command(const char *command, PQExpBufferData *outputbuf) local_command(const char *command, PQExpBufferData *outputbuf)
{ {
return _local_command(command, outputbuf, false); FILE *fp;
}
bool
local_command_simple(const char *command, PQExpBufferData *outputbuf)
{
return _local_command(command, outputbuf, true);
}
static bool
_local_command(const char *command, PQExpBufferData *outputbuf, bool simple)
{
FILE *fp = NULL;
char output[MAXLEN]; char output[MAXLEN];
int retval = 0; int retval = 0;
bool success;
log_verbose(LOG_DEBUG, "executing:\n %s", command);
if (outputbuf == NULL) if (outputbuf == NULL)
{ {
@@ -2266,46 +2141,27 @@ _local_command(const char *command, PQExpBufferData *outputbuf, bool simple)
return false; return false;
} }
/* TODO: better error handling */
while (fgets(output, MAXLEN, fp) != NULL) while (fgets(output, MAXLEN, fp) != NULL)
{ {
appendPQExpBuffer(outputbuf, "%s", output); appendPQExpBuffer(outputbuf, "%s", output);
if (!feof(fp) && simple == false)
{
break;
}
} }
retval = pclose(fp); pclose(fp);
/* */
success = (WEXITSTATUS(retval) == 0 || WEXITSTATUS(retval) == 141) ? true : false;
log_verbose(LOG_DEBUG, "result of command was %i (%i)", WEXITSTATUS(retval), retval);
if (outputbuf->data != NULL) if (outputbuf->data != NULL)
log_verbose(LOG_DEBUG, "local_command(): output returned was:\n%s", outputbuf->data); log_verbose(LOG_DEBUG, "local_command(): output returned was:\n%s", outputbuf->data);
else else
log_verbose(LOG_DEBUG, "local_command(): no output returned"); log_verbose(LOG_DEBUG, "local_command(): no output returned");
return success; return true;
} }
/*
* get_superuser_connection()
*
* Check if provided connection "conn" is a superuser connection, if not attempt to
* make a superuser connection "superuser_conn" with the provided --superuser parameter.
*
* "privileged_conn" is set to whichever connection is the superuser connection.
*/
void void
get_superuser_connection(PGconn **conn, PGconn **superuser_conn, PGconn **privileged_conn) get_superuser_connection(PGconn **conn, PGconn **superuser_conn, PGconn **privileged_conn)
{ {
t_connection_user userinfo = T_CONNECTION_USER_INITIALIZER; t_connection_user userinfo = T_CONNECTION_USER_INITIALIZER;
t_conninfo_param_list conninfo_params = T_CONNINFO_PARAM_LIST_INITIALIZER;
bool is_superuser = false; bool is_superuser = false;
/* this should never happen */ /* this should never happen */
@@ -2314,7 +2170,6 @@ get_superuser_connection(PGconn **conn, PGconn **superuser_conn, PGconn **privil
log_error(_("no database connection available")); log_error(_("no database connection available"));
exit(ERR_INTERNAL); exit(ERR_INTERNAL);
} }
is_superuser = is_superuser_connection(*conn, &userinfo); is_superuser = is_superuser_connection(*conn, &userinfo);
if (is_superuser == true) if (is_superuser == true)
@@ -2332,11 +2187,9 @@ get_superuser_connection(PGconn **conn, PGconn **superuser_conn, PGconn **privil
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
initialize_conninfo_params(&conninfo_params, false); *superuser_conn = establish_db_connection_as_user(config_file_options.conninfo,
conn_to_param_list(*conn, &conninfo_params); runtime_options.superuser,
param_set(&conninfo_params, "user", runtime_options.superuser); false);
*superuser_conn = establish_db_connection_by_params(&conninfo_params, false);
if (PQstatus(*superuser_conn) != CONNECTION_OK) if (PQstatus(*superuser_conn) != CONNECTION_OK)
{ {
@@ -2356,8 +2209,6 @@ get_superuser_connection(PGconn **conn, PGconn **superuser_conn, PGconn **privil
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
log_debug("established superuser connection as \"%s\"", runtime_options.superuser);
*privileged_conn = *superuser_conn; *privileged_conn = *superuser_conn;
return; return;
} }
@@ -2499,6 +2350,9 @@ copy_remote_files(char *host, char *remote_user, char *remote_path,
} }
/* /*
* Execute a command via ssh on the remote host. * Execute a command via ssh on the remote host.
* *
@@ -2562,12 +2416,7 @@ remote_command(const char *host, const char *user, const char *command, PQExpBuf
pclose(fp); pclose(fp);
if (outputbuf != NULL) if (outputbuf != NULL)
{ log_verbose(LOG_DEBUG, "remote_command(): output returned was:\n %s", outputbuf->data);
if (strlen(outputbuf->data))
log_verbose(LOG_DEBUG, "remote_command(): output returned was:\n%s", outputbuf->data);
else
log_verbose(LOG_DEBUG, "remote_command(): no output returned");
}
return true; return true;
} }
@@ -2613,15 +2462,18 @@ get_server_action(t_server_action action, char *script, char *data_dir)
{ {
initPQExpBuffer(&command); initPQExpBuffer(&command);
appendPQExpBuffer(&command, appendPQExpBuffer(
&command,
"%s %s -w -D ", "%s %s -w -D ",
make_pg_path("pg_ctl"), make_pg_path("pg_ctl"),
config_file_options.pg_ctl_options); config_file_options.pg_ctl_options);
appendShellString(&command, appendShellString(
&command,
data_dir); data_dir);
appendPQExpBuffer(&command, appendPQExpBuffer(
&command,
" start"); " start");
strncpy(script, command.data, MAXLEN); strncpy(script, command.data, MAXLEN);
@@ -2633,7 +2485,6 @@ get_server_action(t_server_action action, char *script, char *data_dir)
} }
case ACTION_STOP: case ACTION_STOP:
case ACTION_STOP_WAIT:
{ {
if (config_file_options.service_stop_command[0] != '\0') if (config_file_options.service_stop_command[0] != '\0')
{ {
@@ -2643,23 +2494,19 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else else
{ {
initPQExpBuffer(&command); initPQExpBuffer(&command);
appendPQExpBuffer(&command, appendPQExpBuffer(
&command,
"%s %s -D ", "%s %s -D ",
make_pg_path("pg_ctl"), make_pg_path("pg_ctl"),
config_file_options.pg_ctl_options); config_file_options.pg_ctl_options);
appendShellString(&command, appendShellString(
&command,
data_dir); data_dir);
if (action == ACTION_STOP_WAIT) appendPQExpBuffer(
appendPQExpBuffer(&command, &command,
" -w"); " -m fast -W stop");
else
appendPQExpBuffer(&command,
" -W");
appendPQExpBuffer(&command,
" -m fast stop");
strncpy(script, command.data, MAXLEN); strncpy(script, command.data, MAXLEN);
@@ -2678,15 +2525,18 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else else
{ {
initPQExpBuffer(&command); initPQExpBuffer(&command);
appendPQExpBuffer(&command, appendPQExpBuffer(
&command,
"%s %s -w -D ", "%s %s -w -D ",
make_pg_path("pg_ctl"), make_pg_path("pg_ctl"),
config_file_options.pg_ctl_options); config_file_options.pg_ctl_options);
appendShellString(&command, appendShellString(
&command,
data_dir); data_dir);
appendPQExpBuffer(&command, appendPQExpBuffer(
&command,
" restart"); " restart");
strncpy(script, command.data, MAXLEN); strncpy(script, command.data, MAXLEN);
@@ -2706,15 +2556,18 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else else
{ {
initPQExpBuffer(&command); initPQExpBuffer(&command);
appendPQExpBuffer(&command, appendPQExpBuffer(
&command,
"%s %s -w -D ", "%s %s -w -D ",
make_pg_path("pg_ctl"), make_pg_path("pg_ctl"),
config_file_options.pg_ctl_options); config_file_options.pg_ctl_options);
appendShellString(&command, appendShellString(
&command,
data_dir); data_dir);
appendPQExpBuffer(&command, appendPQExpBuffer(
&command,
" reload"); " reload");
strncpy(script, command.data, MAXLEN); strncpy(script, command.data, MAXLEN);
@@ -2735,15 +2588,18 @@ get_server_action(t_server_action action, char *script, char *data_dir)
else else
{ {
initPQExpBuffer(&command); initPQExpBuffer(&command);
appendPQExpBuffer(&command, appendPQExpBuffer(
&command,
"%s %s -w -D ", "%s %s -w -D ",
make_pg_path("pg_ctl"), make_pg_path("pg_ctl"),
config_file_options.pg_ctl_options); config_file_options.pg_ctl_options);
appendShellString(&command, appendShellString(
&command,
data_dir); data_dir);
appendPQExpBuffer(&command, appendPQExpBuffer(
&command,
" promote"); " promote");
strncpy(script, command.data, MAXLEN); strncpy(script, command.data, MAXLEN);
@@ -2777,7 +2633,6 @@ data_dir_required_for_action(t_server_action action)
return true; return true;
case ACTION_STOP: case ACTION_STOP:
case ACTION_STOP_WAIT:
if (config_file_options.service_stop_command[0] != '\0') if (config_file_options.service_stop_command[0] != '\0')
{ {
return false; return false;
@@ -2813,33 +2668,6 @@ data_dir_required_for_action(t_server_action action)
} }
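Review note: in the left-hand column, the `ACTION_STOP` / `ACTION_STOP_WAIT` branch of `get_server_action()` differs only in whether `pg_ctl` receives `-w` (wait) or `-W` (do not wait) before `-m fast stop`. A rough sketch of that assembly follows. `build_stop_command()` is a hypothetical helper, the `pg_ctl_options` value is left out, and the quoting is deliberately simplistic (the real code uses `appendShellString()`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of repmgr): assemble a "pg_ctl ... stop"
 * command line. Returns 0 on success, -1 if the data directory cannot be
 * quoted by this simplified scheme or the buffer is too small.
 */
int
build_stop_command(const char *pg_ctl, const char *data_dir, bool wait,
                   char *buf, size_t buflen)
{
    int written;

    assert(pg_ctl != NULL && data_dir != NULL && buf != NULL && buflen > 0);

    /* assumption: refuse paths containing a single quote rather than escape them */
    if (strchr(data_dir, '\'') != NULL)
        return -1;

    written = snprintf(buf, buflen, "%s -D '%s' %s -m fast stop",
                       pg_ctl, data_dir, wait ? "-w" : "-W");

    return (written < 0 || (size_t) written >= buflen) ? -1 : 0;
}
```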
/*
* Copy the location of the configuration file directory into the
* provided buffer; if "config_directory" provided, use that, otherwise
* default to the data directory.
*
* This is primarily intended for use with "pg_ctl" (which itself shouldn't
* be used outside of development environments).
*/
void
get_node_config_directory(char *config_dir_buf)
{
if (config_file_options.config_directory[0] != '\0')
{
strncpy(config_dir_buf, config_file_options.config_directory, MAXPGPATH);
return;
}
if (config_file_options.data_directory[0] != '\0')
{
strncpy(config_dir_buf, config_file_options.data_directory, MAXPGPATH);
return;
}
return;
}
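Review note: `get_node_config_directory()` prefers `config_directory` and falls back to `data_directory`. Since `strncpy()` does not NUL-terminate when the source fills the buffer, a stand-alone variant might check the length explicitly. The sketch below assumes the two settings are passed in as plain strings. `pick_config_directory()` is a hypothetical name, not a repmgr function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-alone version of the fallback: prefer
 * config_directory, otherwise use data_directory. Returns 0 on success,
 * -1 if neither value is set or the result would not fit in the buffer.
 */
int
pick_config_directory(const char *config_directory,
                      const char *data_directory,
                      char *buf, size_t buflen)
{
    const char *chosen = NULL;
    size_t      len;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';

    if (config_directory != NULL && config_directory[0] != '\0')
        chosen = config_directory;
    else if (data_directory != NULL && data_directory[0] != '\0')
        chosen = data_directory;

    if (chosen == NULL)
        return -1;

    /* copy only if the string and its terminating NUL both fit */
    len = strlen(chosen);
    if (len >= buflen)
        return -1;

    memcpy(buf, chosen, len + 1);
    return 0;
}
```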
void void
get_node_data_directory(char *data_dir_buf) get_node_data_directory(char *data_dir_buf)
{ {
@@ -2887,7 +2715,7 @@ init_node_record(t_node_info *node_record)
if (config_file_options.replication_user[0] != '\0') if (config_file_options.replication_user[0] != '\0')
{ {
/* replication user explicitly provided in configuration file */ /* replication user explicitly provided */
strncpy(node_record->repluser, config_file_options.replication_user, NAMEDATALEN); strncpy(node_record->repluser, config_file_options.replication_user, NAMEDATALEN);
} }
else else
@@ -2904,77 +2732,3 @@ init_node_record(t_node_info *node_record)
create_slot_name(node_record->slot_name, config_file_options.node_id); create_slot_name(node_record->slot_name, config_file_options.node_id);
} }
} }
bool
can_use_pg_rewind(PGconn *conn, const char *data_directory, PQExpBufferData *reason)
{
bool can_use = true;
int server_version_num = get_server_version(conn, NULL);
/* wal_log_hints not available in 9.3, so just determine if data checksums enabled */
if (server_version_num < 90400)
{
int data_checksum_version = get_data_checksum_version(data_directory);
if (data_checksum_version < 0)
{
appendPQExpBuffer(reason,
_("unable to determine data checksum version"));
can_use = false;
}
else if (data_checksum_version == 0)
{
appendPQExpBuffer(reason,
_("this cluster was initialised without data checksums"));
can_use = false;
}
return can_use;
}
/* "full_page_writes" must be on in any case */
if (guc_set(conn, "full_page_writes", "=", "off"))
{
if (can_use == false)
appendPQExpBuffer(reason, "; ");
appendPQExpBuffer(reason,
_("\"full_page_writes\" must be set to \"on\""));
can_use = false;
}
/*
* "wal_log_hints" off - are data checksums available? Note: we're
* checking the local pg_control file here as the value will be the same
* throughout the cluster and saves a round-trip to the demotion
* candidate.
*/
if (guc_set(conn, "wal_log_hints", "=", "on") == false)
{
int data_checksum_version = get_data_checksum_version(data_directory);
if (data_checksum_version < 0)
{
if (can_use == false)
appendPQExpBuffer(reason, "; ");
appendPQExpBuffer(reason,
_("\"wal_log_hints\" is set to \"off\" but unable to determine data checksum version"));
can_use = false;
}
else if (data_checksum_version == 0)
{
if (can_use == false)
appendPQExpBuffer(reason, "; ");
appendPQExpBuffer(reason,
_("\"wal_log_hints\" is set to \"off\" and data checksums are disabled"));
can_use = false;
}
}
return can_use;
}
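Review note: for PostgreSQL 9.4 and later, `can_use_pg_rewind()` requires `full_page_writes` to be on, plus either `wal_log_hints` on or data checksums enabled. The same decision can be restated without a database connection, which may help when reasoning about the cases. In the sketch below, `rewind_possible()` and its parameters are hypothetical; the inputs stand in for the GUC lookups and the pg_control read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical, connection-free restatement of the 9.4+ checks.
 * data_checksum_version < 0 means "could not be determined",
 * 0 means checksums are disabled.
 */
bool
rewind_possible(bool full_page_writes, bool wal_log_hints,
                int data_checksum_version,
                char *reason, size_t reason_len)
{
    bool can_use = true;

    assert(reason != NULL && reason_len > 0);
    reason[0] = '\0';

    /* "full_page_writes" must be on in any case */
    if (!full_page_writes)
    {
        snprintf(reason, reason_len,
                 "\"full_page_writes\" must be set to \"on\"");
        can_use = false;
    }

    /* with "wal_log_hints" off, data checksums must be known and enabled */
    if (!wal_log_hints && data_checksum_version <= 0)
    {
        size_t used = strlen(reason);

        snprintf(reason + used, reason_len - used, "%s%s",
                 can_use ? "" : "; ",
                 data_checksum_version < 0
                 ? "\"wal_log_hints\" is off and checksum version is unknown"
                 : "\"wal_log_hints\" is off and data checksums are disabled");
        can_use = false;
    }

    return can_use;
}
```

The pre-9.4 branch, which looks only at data checksums, is not covered by this sketch.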


@@ -83,11 +83,6 @@
#define OPT_CONFIG_ARCHIVE_DIR 1034 #define OPT_CONFIG_ARCHIVE_DIR 1034
#define OPT_HAS_PASSFILE 1035 #define OPT_HAS_PASSFILE 1035
#define OPT_WAIT_START 1036 #define OPT_WAIT_START 1036
#define OPT_REPL_CONN 1037
#define OPT_REMOTE_NODE_ID 1038
#define OPT_RECOVERY_CONF_ONLY 1039
#define OPT_NO_WAIT 1040
#define OPT_MISSING_SLOTS 1041
/* deprecated since 3.3 */ /* deprecated since 3.3 */
#define OPT_DATA_DIR 999 #define OPT_DATA_DIR 999
@@ -106,8 +101,7 @@ static struct option long_options[] =
{"dry-run", no_argument, NULL, OPT_DRY_RUN}, {"dry-run", no_argument, NULL, OPT_DRY_RUN},
{"force", no_argument, NULL, 'F'}, {"force", no_argument, NULL, 'F'},
{"pg_bindir", required_argument, NULL, 'b'}, {"pg_bindir", required_argument, NULL, 'b'},
{"wait", no_argument, NULL, 'w'}, {"wait", no_argument, NULL, 'W'},
{"no-wait", no_argument, NULL, 'W'},
/* connection options */ /* connection options */
{"dbname", required_argument, NULL, 'd'}, {"dbname", required_argument, NULL, 'd'},
@@ -121,12 +115,10 @@ static struct option long_options[] =
{"pgdata", required_argument, NULL, 'D'}, {"pgdata", required_argument, NULL, 'D'},
{"node-id", required_argument, NULL, OPT_NODE_ID}, {"node-id", required_argument, NULL, OPT_NODE_ID},
{"node-name", required_argument, NULL, OPT_NODE_NAME}, {"node-name", required_argument, NULL, OPT_NODE_NAME},
{"remote-node-id", required_argument, NULL, OPT_REMOTE_NODE_ID},
/* logging options */ /* logging options */
{"log-level", required_argument, NULL, 'L'}, {"log-level", required_argument, NULL, 'L'},
{"log-to-file", no_argument, NULL, OPT_LOG_TO_FILE}, {"log-to-file", no_argument, NULL, OPT_LOG_TO_FILE},
{"quiet", no_argument, NULL, 'q'},
{"terse", no_argument, NULL, 't'}, {"terse", no_argument, NULL, 't'},
{"verbose", no_argument, NULL, 'v'}, {"verbose", no_argument, NULL, 'v'},
@@ -144,7 +136,6 @@ static struct option long_options[] =
{"upstream-conninfo", required_argument, NULL, OPT_UPSTREAM_CONNINFO}, {"upstream-conninfo", required_argument, NULL, OPT_UPSTREAM_CONNINFO},
{"upstream-node-id", required_argument, NULL, OPT_UPSTREAM_NODE_ID}, {"upstream-node-id", required_argument, NULL, OPT_UPSTREAM_NODE_ID},
{"without-barman", no_argument, NULL, OPT_WITHOUT_BARMAN}, {"without-barman", no_argument, NULL, OPT_WITHOUT_BARMAN},
{"recovery-conf-only", no_argument, NULL, OPT_RECOVERY_CONF_ONLY},
/* "standby register" options */ /* "standby register" options */
{"wait-start", required_argument, NULL, OPT_WAIT_START}, {"wait-start", required_argument, NULL, OPT_WAIT_START},
@@ -166,14 +157,12 @@ static struct option long_options[] =
{"replication-lag", no_argument, NULL, OPT_REPLICATION_LAG}, {"replication-lag", no_argument, NULL, OPT_REPLICATION_LAG},
{"role", no_argument, NULL, OPT_ROLE}, {"role", no_argument, NULL, OPT_ROLE},
{"slots", no_argument, NULL, OPT_SLOTS}, {"slots", no_argument, NULL, OPT_SLOTS},
{"missing-slots", no_argument, NULL, OPT_MISSING_SLOTS},
{"has-passfile", no_argument, NULL, OPT_HAS_PASSFILE}, {"has-passfile", no_argument, NULL, OPT_HAS_PASSFILE},
{"replication-connection", no_argument, NULL, OPT_REPL_CONN},
/* "node rejoin" options */ /* "node rejoin" options */
{"config-files", required_argument, NULL, OPT_CONFIG_FILES}, {"config-files", required_argument, NULL, OPT_CONFIG_FILES},
{"config-archive-dir", required_argument, NULL, OPT_CONFIG_ARCHIVE_DIR}, {"config-archive-dir", required_argument, NULL, OPT_CONFIG_ARCHIVE_DIR},
{"force-rewind", optional_argument, NULL, OPT_FORCE_REWIND}, {"force-rewind", no_argument, NULL, OPT_FORCE_REWIND},
/* "node service" options */ /* "node service" options */
{"action", required_argument, NULL, OPT_ACTION}, {"action", required_argument, NULL, OPT_ACTION},
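Review note: the left-hand column maps `--wait` to `'w'` and `--no-wait` to `'W'`, and declares `--force-rewind` with `optional_argument`. With `getopt_long()`, an optional argument to a long option is only recognised in the `--option=value` form. Below is a small sketch of how such a table is consumed. The `demo_` names, the option value 1000 and the short-option string are assumptions made for this example.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_OPT_FORCE_REWIND 1000  /* hypothetical value for this sketch */

typedef struct
{
    bool        wait;
    bool        no_wait;
    bool        force_rewind;
    const char *force_rewind_path;  /* NULL if no "=value" was supplied */
} demo_options;

static struct option demo_long_options[] = {
    {"wait", no_argument, NULL, 'w'},
    {"no-wait", no_argument, NULL, 'W'},
    {"force-rewind", optional_argument, NULL, DEMO_OPT_FORCE_REWIND},
    {NULL, 0, NULL, 0}
};

/* Returns 0 on success, -1 if an unknown option was encountered. */
int
parse_demo_options(int argc, char **argv, demo_options *opts)
{
    int c;

    assert(opts != NULL);
    *opts = (demo_options) {false, false, false, NULL};

    optind = 1;  /* restart scanning so the function can be called again */

    while ((c = getopt_long(argc, argv, "wW", demo_long_options, NULL)) != -1)
    {
        switch (c)
        {
            case 'w':
                opts->wait = true;
                break;
            case 'W':
                opts->no_wait = true;
                break;
            case DEMO_OPT_FORCE_REWIND:
                opts->force_rewind = true;
                opts->force_rewind_path = optarg;  /* NULL without "=value" */
                break;
            default:
                return -1;
        }
    }

    return 0;
}
```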


@@ -288,6 +288,7 @@ standby_get_last_updated(PG_FUNCTION_ARGS)
Datum Datum
notify_follow_primary(PG_FUNCTION_ARGS) notify_follow_primary(PG_FUNCTION_ARGS)
{ {
#ifndef BDR_ONLY
int primary_node_id = UNKNOWN_NODE_ID; int primary_node_id = UNKNOWN_NODE_ID;
if (!shared_state) if (!shared_state)
@@ -315,7 +316,7 @@ notify_follow_primary(PG_FUNCTION_ARGS)
} }
LWLockRelease(shared_state->lock); LWLockRelease(shared_state->lock);
#endif
PG_RETURN_VOID(); PG_RETURN_VOID();
} }
@@ -328,12 +329,14 @@ get_new_primary(PG_FUNCTION_ARGS)
if (!shared_state) if (!shared_state)
PG_RETURN_NULL(); PG_RETURN_NULL();
#ifndef BDR_ONLY
LWLockAcquire(shared_state->lock, LW_SHARED); LWLockAcquire(shared_state->lock, LW_SHARED);
if (shared_state->follow_new_primary == true) if (shared_state->follow_new_primary == true)
new_primary_node_id = shared_state->candidate_node_id; new_primary_node_id = shared_state->candidate_node_id;
LWLockRelease(shared_state->lock); LWLockRelease(shared_state->lock);
#endif
if (new_primary_node_id == UNKNOWN_NODE_ID) if (new_primary_node_id == UNKNOWN_NODE_ID)
PG_RETURN_NULL(); PG_RETURN_NULL();
@@ -345,6 +348,7 @@ get_new_primary(PG_FUNCTION_ARGS)
Datum Datum
reset_voting_status(PG_FUNCTION_ARGS) reset_voting_status(PG_FUNCTION_ARGS)
{ {
#ifndef BDR_ONLY
if (!shared_state) if (!shared_state)
PG_RETURN_NULL(); PG_RETURN_NULL();
@@ -362,7 +366,7 @@ reset_voting_status(PG_FUNCTION_ARGS)
} }
LWLockRelease(shared_state->lock); LWLockRelease(shared_state->lock);
#endif
PG_RETURN_VOID(); PG_RETURN_VOID();
} }


@@ -40,28 +40,18 @@
# is not running and there's no other way of determining # is not running and there's no other way of determining
# the data directory. # the data directory.
#replication_user='repmgr' # User to make replication connections with, if not set defaults
# to the user defined in "conninfo".
# ============================================================================= # =============================================================================
# Optional configuration items # Optional configuration items
# ============================================================================= # =============================================================================
#------------------------------------------------------------------------------
# Server settings
#------------------------------------------------------------------------------
#config_directory='' # If configuration files are located outside the data
# directory, specify the directory where the main
# postgresql.conf file is located.
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# Replication settings # Replication settings
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
#replication_user='repmgr' # User to make replication connections with, if not set defaults
# to the user defined in "conninfo".
#replication_type=physical # Must be one of 'physical' or 'bdr'. #replication_type=physical # Must be one of 'physical' or 'bdr'.
#location=default # arbitrary string defining the location of the node; this #location=default # arbitrary string defining the location of the node; this
@@ -75,6 +65,9 @@
# at least the number of standbys which will connect # at least the number of standbys which will connect
# to the primary. # to the primary.
#recovery_min_apply_delay= # If provided, "recovery_min_apply_delay" in recovery.conf
# will be set to this value.
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# Witness server settings # Witness server settings
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
@@ -98,7 +91,7 @@
#log_facility=STDERR # Logging facility: possible values are STDERR, or for #log_facility=STDERR # Logging facility: possible values are STDERR, or for
# syslog integration, one of LOCAL0, LOCAL1, ..., LOCAL7, USER # syslog integration, one of LOCAL0, LOCAL1, ..., LOCAL7, USER
#log_file='' # STDERR can be redirected to an arbitrary file #log_file='' # stderr can be redirected to an arbitrary file:
#log_status_interval=300 # interval (in seconds) for repmgrd to log a status message #log_status_interval=300 # interval (in seconds) for repmgrd to log a status message
@@ -168,7 +161,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# "standby clone" settings # Standby clone settings
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# #
# These settings apply when cloning a standby ("repmgr standby clone"). # These settings apply when cloning a standby ("repmgr standby clone").
@@ -182,65 +175,20 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# file system location to another. This # file system location to another. This
# parameter can be provided multiple times. # parameter can be provided multiple times.
#restore_command='' # This will be placed in the recovery.conf file generated #restore_command='' # This will be placed in the recovery.conf
# by repmgr. # file generated by repmgr
#archive_cleanup_command='' # This will be placed in the recovery.conf file generated
# by repmgr. Note we recommend using Barman for managing
# WAL archives (see: https://www.pgbarman.org )
#recovery_min_apply_delay= # If provided, "recovery_min_apply_delay" in recovery.conf
# will be set to this value (PostgreSQL 9.4 and later).
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# "standby promote" settings # Standby follow settings
#------------------------------------------------------------------------------
# These settings apply when instructing a standby to promote itself to the
# new primary ("repmgr standby promote").
#promote_check_timeout=60 # The length of time (in seconds) to wait
# for the new primary to finish promoting
#promote_check_interval=1 # The interval (in seconds) to check whether
# the new primary has finished promoting
#------------------------------------------------------------------------------
# "standby follow" settings
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# These settings apply when instructing a standby to follow the new primary # These settings apply when instructing a standby to follow the new primary
# ("repmgr standby follow"). # ("repmgr standby follow").
#primary_follow_timeout=60 # The max length of time (in seconds) to wait #primary_follow_timeout=60 # The length of time (in seconds) to wait
# for the new primary to become available # for the new primary to become available
#standby_follow_timeout=15 # The max length of time (in seconds) to wait
# for the standby to connect to the primary
#------------------------------------------------------------------------------
# "standby switchover" settings
#------------------------------------------------------------------------------
# These settings apply when switching roles between a primary and a standby
# ("repmgr standby switchover").
#standby_reconnect_timeout=60 # The max length of time (in seconds) to wait
# for the demoted standby to reconnect to the promoted
# primary (note: this value should be equal to or greater
# than that set for "node_rejoin_timeout")
#------------------------------------------------------------------------------
# "node rejoin" settings
#------------------------------------------------------------------------------
# These settings apply when reintegrating a node into a replication cluster
# with "repmgrd_node_rejoin"
#node_rejoin_timeout=60 # The maximum length of time (in seconds) to wait for
# the node to reconnect to the replication cluster
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# Barman options # Barman options
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
@@ -258,11 +206,6 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# These settings are only applied when repmgrd is running. Values shown # These settings are only applied when repmgrd is running. Values shown
# are defaults. # are defaults.
#repmgrd_pid_file= # Path of PID file to use for repmgrd; if not set, a PID file will
# be generated in a temporary directory specified by the environment
# variable $TMPDIR, or if not set, in "/tmp". This value can be overridden
# by the command line option "-p/--pid-file"; the command line option
# "--no-pid-file" will force PID file creation to be skipped.
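Review note: the `repmgrd_pid_file` comment describes a default location under `$TMPDIR`, falling back to `/tmp`. That lookup could be sketched as follows. `default_pid_file_path()` and the file name passed to it are hypothetical; the diff does not show what name repmgrd actually uses.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of repmgr): build "<dir>/<file_name>",
 * where <dir> is $TMPDIR if set and non-empty, otherwise "/tmp".
 * Returns 0 on success, -1 if the result would not fit in the buffer.
 */
int
default_pid_file_path(const char *file_name, char *buf, size_t buflen)
{
    const char *dir = getenv("TMPDIR");
    int         written;

    assert(file_name != NULL && buf != NULL && buflen > 0);

    if (dir == NULL || dir[0] == '\0')
        dir = "/tmp";

    written = snprintf(buf, buflen, "%s/%s", dir, file_name);

    return (written < 0 || (size_t) written >= buflen) ? -1 : 0;
}
```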
#failover=manual # one of 'automatic', 'manual'. #failover=manual # one of 'automatic', 'manual'.
# determines what action to take in the event of upstream failure # determines what action to take in the event of upstream failure
# #
@@ -272,7 +215,7 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# manual attention to reattach it to replication # manual attention to reattach it to replication
# (does not apply to BDR mode) # (does not apply to BDR mode)
#priority=100 # indicate a preferred priority for promoting nodes; #priority=100 # indicate a preferred priorty for promoting nodes;
# a value of zero prevents the node being promoted to primary # a value of zero prevents the node being promoted to primary
# (default: 100) # (default: 100)
@@ -280,11 +223,11 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# primary (or other upstream node) # primary (or other upstream node)
#reconnect_interval=10 # Interval between attempts to reconnect to an unreachable #reconnect_interval=10 # Interval between attempts to reconnect to an unreachable
# primary (or other upstream node) # primary (or other upstream node)
#promote_command= # command repmgrd executes when promoting a new primary; use something like: #promote_command= # command to execute when promoting a new primary; use something like:
# #
# repmgr standby promote -f /etc/repmgr.conf # repmgr standby promote -f /etc/repmgr.conf
# #
#follow_command= # command repmgrd executes when instructing a standby to follow a new primary; #follow_command= # command to execute when instructing a standby to follow a new primary;
# use something like: # use something like:
# #
# repmgr standby follow -f /etc/repmgr.conf -W --upstream-node-id=%n # repmgr standby follow -f /etc/repmgr.conf -W --upstream-node-id=%n
@@ -292,12 +235,8 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
#primary_notification_timeout=60 # Interval (in seconds) which repmgrd on a standby #primary_notification_timeout=60 # Interval (in seconds) which repmgrd on a standby
# will wait for a notification from the new primary, # will wait for a notification from the new primary,
# before falling back to degraded monitoring # before falling back to degraded monitoring
#repmgrd_standby_startup_timeout=60 # Interval (in seconds) which repmgrd on a standby will wait #monitoring_history=no
# for the the local node to restart and become ready to accept connections after
# executing "follow_command" (defaults to the value set in "standby_reconnect_timeout")
#monitoring_history=no # Whether to write monitoring data to the "montoring_history" table
#monitor_interval_secs=2 # Interval (in seconds) at which to write monitoring data
#degraded_monitoring_timeout=-1 # Interval (in seconds) after which repmgrd will terminate if the #degraded_monitoring_timeout=-1 # Interval (in seconds) after which repmgrd will terminate if the
# server being monitored is no longer available. -1 (default) # server being monitored is no longer available. -1 (default)
# disables the timeout completely. # disables the timeout completely.
@@ -330,19 +269,16 @@ ssh_options='-q -o ConnectTimeout=10' # Options to append to "ssh"
# /usr/bin/systemctl start postgresql-9.6, \ # /usr/bin/systemctl start postgresql-9.6, \
# /usr/bin/systemctl restart postgresql-9.6 # /usr/bin/systemctl restart postgresql-9.6
# #
# Debian/Ubuntu users: use "sudo pg_ctlcluster" to execute service control commands.
#
# For more details, see: https://repmgr.org/docs/4.0/configuration-service-commands.html
#service_start_command = '' #service_start_command = ''
#service_stop_command = '' #service_stop_command = ''
#service_restart_command = '' #service_restart_command = ''
#service_reload_command = '' #service_reload_command = ''
#service_promote_command = '' # This parameter is intended for systems which provide a #service_promote_command = '' # Note: this overrides any value contained in the setting
# package-level promote command, such as Debian's # "promote_command". This is intended for systems which
# "pg_ctlcluster". *IMPORTANT*: it is *not* a substitute # provide a package-level promote command, such as Debian's
# for "promote_command"; do not use "repmgr standby promote" # "pg_ctlcluster"
# (or a script which executes "repmgr standby promote") here.
#------------------------------------------------------------------------------ #------------------------------------------------------------------------------
# Status check thresholds # Status check thresholds


@@ -1,6 +1,6 @@
# repmgr extension # repmgr extension
comment = 'Replication manager for PostgreSQL' comment = 'Replication manager for PostgreSQL'
default_version = '4.1' default_version = '4.0'
module_pathname = '$libdir/repmgr' module_pathname = '$libdir/repmgr'
relocatable = false relocatable = false
schema = repmgr schema = repmgr


@@ -49,8 +49,6 @@
#define REPLICATION_TYPE_BDR 2 #define REPLICATION_TYPE_BDR 2
#define UNKNOWN_SERVER_VERSION_NUM -1 #define UNKNOWN_SERVER_VERSION_NUM -1
#define UNKNOWN_BDR_VERSION_NUM -1
#define UNKNOWN_TIMELINE_ID -1 #define UNKNOWN_TIMELINE_ID -1
#define UNKNOWN_SYSTEM_IDENTIFIER 0 #define UNKNOWN_SYSTEM_IDENTIFIER 0
@@ -60,8 +58,6 @@
#define VOTING_TERM_NOT_SET -1 #define VOTING_TERM_NOT_SET -1
#define BDR2_REPLICATION_SET_NAME "repmgr"
/* /*
* various default values - ensure repmgr.conf.sample is update * various default values - ensure repmgr.conf.sample is update
* if any of these are changed * if any of these are changed
@@ -74,7 +70,6 @@
#define DEFAULT_ASYNC_QUERY_TIMEOUT 60 /* seconds */ #define DEFAULT_ASYNC_QUERY_TIMEOUT 60 /* seconds */
#define DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT 60 /* seconds */ #define DEFAULT_PRIMARY_NOTIFICATION_TIMEOUT 60 /* seconds */
#define DEFAULT_PRIMARY_FOLLOW_TIMEOUT 60 /* seconds */ #define DEFAULT_PRIMARY_FOLLOW_TIMEOUT 60 /* seconds */
#define DEFAULT_STANDBY_FOLLOW_TIMEOUT 30 /* seconds */
#define DEFAULT_BDR_RECOVERY_TIMEOUT 30 /* seconds */ #define DEFAULT_BDR_RECOVERY_TIMEOUT 30 /* seconds */
#define DEFAULT_ARCHIVE_READY_WARNING 16 /* WAL files */ #define DEFAULT_ARCHIVE_READY_WARNING 16 /* WAL files */
#define DEFAULT_ARCHIVE_READY_CRITICAL 128 /* WAL files */ #define DEFAULT_ARCHIVE_READY_CRITICAL 128 /* WAL files */
@@ -82,10 +77,6 @@
#define DEFAULT_REPLICATION_LAG_CRITICAL 600 /* seconds */ #define DEFAULT_REPLICATION_LAG_CRITICAL 600 /* seconds */
#define DEFAULT_WITNESS_SYNC_INTERVAL 15 /* seconds */ #define DEFAULT_WITNESS_SYNC_INTERVAL 15 /* seconds */
#define DEFAULT_WAIT_START 30 /* seconds */ #define DEFAULT_WAIT_START 30 /* seconds */
#define DEFAULT_PROMOTE_CHECK_TIMEOUT 60 /* seconds */
#define DEFAULT_PROMOTE_CHECK_INTERVAL 1 /* seconds */
#define DEFAULT_STANDBY_RECONNECT_TIMEOUT 60 /* seconds */
#define DEFAULT_NODE_REJOIN_TIMEOUT 60 /* seconds */
#ifndef RECOVERY_COMMAND_FILE #ifndef RECOVERY_COMMAND_FILE
#define RECOVERY_COMMAND_FILE "recovery.conf" #define RECOVERY_COMMAND_FILE "recovery.conf"


@@ -1,2 +1,3 @@
#define REPMGR_VERSION_DATE "" #define REPMGR_VERSION_DATE ""
#define REPMGR_VERSION "4.1.0" #define REPMGR_VERSION "4.0.2"


@@ -35,29 +35,6 @@ do_bdr_node_check(void)
/* nothing to do at the moment */ /* nothing to do at the moment */
} }
void
handle_sigint_bdr(SIGNAL_ARGS)
{
PQExpBufferData event_details;
initPQExpBuffer(&event_details);
appendPQExpBuffer(&event_details,
"%s signal received",
postgres_signal_arg == SIGTERM
? "TERM" : "INT");
create_event_notification(local_conn,
&config_file_options,
config_file_options.node_id,
"repmgrd_shutdown",
true,
event_details.data);
termPQExpBuffer(&event_details);
terminate(SUCCESS);
}
void void
monitor_bdr(void) monitor_bdr(void)
@@ -121,6 +98,23 @@ monitor_bdr(void)
exit(ERR_BAD_CONFIG); exit(ERR_BAD_CONFIG);
} }
/* Retrieve record for this node from the local database */
record_status = get_node_record(local_conn, config_file_options.node_id, &local_node_info);
/*
* Terminate if we can't find the local node record. This is a
* "fix-the-config" situation, not a lot else we can do.
*/
if (record_status != RECORD_FOUND)
{
log_error(_("unable to retrieve record for local node (ID: %i), terminating"),
local_node_info.node_id);
log_hint(_("check that \"repmgr bdr register\" was executed for this node"));
PQfinish(local_conn);
exit(ERR_BAD_CONFIG);
}
if (local_node_info.active == false) if (local_node_info.active == false)
{ {
log_error(_("local node (ID: %i) is marked as inactive in repmgr"), log_error(_("local node (ID: %i) is marked as inactive in repmgr"),
@@ -158,16 +152,15 @@ monitor_bdr(void)
cell->node_info->node_status = NODE_STATUS_UP; cell->node_info->node_status = NODE_STATUS_UP;
} }
log_info(_("starting continuous BDR node monitoring on node %i"), log_debug("main_loop_bdr() monitoring local node %i", config_file_options.node_id);
config_file_options.node_id);
INSTR_TIME_SET_CURRENT(log_status_interval_start); log_info(_("starting continuous BDR node monitoring"));
while (true) while (true)
{ {
/* monitoring loop */ /* monitoring loop */
log_verbose(LOG_DEBUG, "BDR check loop - checking %i nodes", nodes.node_count); log_verbose(LOG_DEBUG, "BDR check loop...");
for (cell = nodes.head; cell; cell = cell->next) for (cell = nodes.head; cell; cell = cell->next)
{ {
@@ -269,6 +262,7 @@ loop:
if (config_file_options.log_status_interval > 0) if (config_file_options.log_status_interval > 0)
{ {
int log_status_interval_elapsed = calculate_elapsed(log_status_interval_start); int log_status_interval_elapsed = calculate_elapsed(log_status_interval_start);
if (log_status_interval_elapsed >= config_file_options.log_status_interval) if (log_status_interval_elapsed >= config_file_options.log_status_interval)
{ {
log_info(_("monitoring BDR replication status on node \"%s\" (ID: %i)"), log_info(_("monitoring BDR replication status on node \"%s\" (ID: %i)"),
@@ -279,7 +273,8 @@ loop:
{ {
if (cell->node_info->monitoring_state == MS_DEGRADED) if (cell->node_info->monitoring_state == MS_DEGRADED)
{ {
log_detail(_("monitoring node \"%s\" (ID: %i) in degraded mode"), log_detail(
_("monitoring node \"%s\" (ID: %i) in degraded mode"),
cell->node_info->node_name, cell->node_info->node_name,
cell->node_info->node_id); cell->node_info->node_id);
} }

@@ -22,5 +22,4 @@
extern void do_bdr_node_check(void); extern void do_bdr_node_check(void);
extern void monitor_bdr(void); extern void monitor_bdr(void);
extern void handle_sigint_bdr(SIGNAL_ARGS);
#endif /* _REPMGRD_BDR_H_ */ #endif /* _REPMGRD_BDR_H_ */

File diff suppressed because it is too large

@@ -24,7 +24,6 @@ void do_physical_node_check(void);
void monitor_streaming_primary(void); void monitor_streaming_primary(void);
void monitor_streaming_standby(void); void monitor_streaming_standby(void);
void monitor_streaming_witness(void); void monitor_streaming_witness(void);
void close_connections_physical(void);
void handle_sigint_physical(SIGNAL_ARGS);
#endif /* _REPMGRD_PHYSICAL_H_ */ #endif /* _REPMGRD_PHYSICAL_H_ */

230
repmgrd.c
@@ -35,10 +35,8 @@
static char *config_file = NULL; static char *config_file = NULL;
static bool verbose = false; static bool verbose = false;
static char pid_file[MAXPGPATH]; static char *pid_file = NULL;
static bool daemonize = true; static bool daemonize = false;
static bool show_pid_file = false;
static bool no_pid_file = false;
t_configuration_options config_file_options = T_CONFIGURATION_OPTIONS_INITIALIZER; t_configuration_options config_file_options = T_CONFIGURATION_OPTIONS_INITIALIZER;
@@ -55,6 +53,9 @@ bool startup_event_logged = false;
MonitoringState monitoring_state = MS_NORMAL; MonitoringState monitoring_state = MS_NORMAL;
instr_time degraded_monitoring_start; instr_time degraded_monitoring_start;
static void close_connections(void);
void (*_close_connections) (void) = NULL;
/* /*
* Record receipt of SIGHUP; will cause configuration file to be reread * Record receipt of SIGHUP; will cause configuration file to be reread
* at the appropriate point in the main loop. * at the appropriate point in the main loop.
@@ -72,6 +73,7 @@ static void start_monitoring(void);
#ifndef WIN32 #ifndef WIN32
static void setup_event_handlers(void); static void setup_event_handlers(void);
static void handle_sighup(SIGNAL_ARGS); static void handle_sighup(SIGNAL_ARGS);
static void handle_sigint(SIGNAL_ARGS);
#endif #endif
int calculate_elapsed(instr_time start_time); int calculate_elapsed(instr_time start_time);
@@ -87,7 +89,6 @@ main(int argc, char **argv)
bool cli_monitoring_history = false; bool cli_monitoring_history = false;
RecordStatus record_status; RecordStatus record_status;
ExtensionStatus extension_status = REPMGR_UNKNOWN;
FILE *fd; FILE *fd;
@@ -101,10 +102,8 @@ main(int argc, char **argv)
{"config-file", required_argument, NULL, 'f'}, {"config-file", required_argument, NULL, 'f'},
/* daemon options */ /* daemon options */
{"daemonize", optional_argument, NULL, 'd'}, {"daemonize", no_argument, NULL, 'd'},
{"pid-file", required_argument, NULL, 'p'}, {"pid-file", required_argument, NULL, 'p'},
{"show-pid-file", no_argument, NULL, 's'},
{"no-pid-file", no_argument, NULL, OPT_NO_PID_FILE},
/* logging options */ /* logging options */
{"log-level", required_argument, NULL, 'L'}, {"log-level", required_argument, NULL, 'L'},
@@ -117,6 +116,8 @@ main(int argc, char **argv)
set_progname(argv[0]); set_progname(argv[0]);
srand(time(NULL));
/* Disallow running as root */ /* Disallow running as root */
if (geteuid() == 0) if (geteuid() == 0)
{ {
@@ -130,10 +131,6 @@ main(int argc, char **argv)
exit(1); exit(1);
} }
srand(time(NULL));
memset(pid_file, 0, MAXPGPATH);
while ((c = getopt_long(argc, argv, "?Vf:L:vdp:m", long_options, &optindex)) != -1) while ((c = getopt_long(argc, argv, "?Vf:L:vdp:m", long_options, &optindex)) != -1)
{ {
switch (c) switch (c)
@@ -175,22 +172,11 @@ main(int argc, char **argv)
/* daemon options */ /* daemon options */
case 'd': case 'd':
if (optarg != NULL) daemonize = true;
{
daemonize = parse_bool(optarg, "-d/--daemonize", &cli_errors);
}
break; break;
case 'p': case 'p':
strncpy(pid_file, optarg, MAXPGPATH); pid_file = optarg;
break;
case 's':
show_pid_file = true;
break;
case OPT_NO_PID_FILE:
no_pid_file = true;
break; break;
/* logging options */ /* logging options */
@@ -237,7 +223,7 @@ main(int argc, char **argv)
/* Exit here already if errors in command line options found */ /* Exit here already if errors in command line options found */
if (cli_errors.head != NULL) if (cli_errors.head != NULL)
{ {
exit_with_cli_errors(&cli_errors, NULL); exit_with_cli_errors(&cli_errors);
} }
startup_event_logged = false; startup_event_logged = false;
@@ -256,58 +242,6 @@ main(int argc, char **argv)
*/ */
load_config(config_file, verbose, false, &config_file_options, argv[0]); load_config(config_file, verbose, false, &config_file_options, argv[0]);
/* Determine pid file location, unless --no-pid-file supplied */
if (no_pid_file == false)
{
if (config_file_options.repmgrd_pid_file[0] != '\0')
{
if (pid_file[0] != '\0')
{
log_warning(_("\"repmgrd_pid_file\" will be overridden by --pid-file"));
}
else
{
strncpy(pid_file, config_file_options.repmgrd_pid_file, MAXPGPATH);
}
}
/* no pid file provided - determine location */
if (pid_file[0] == '\0')
{
/* packagers: if feasible, patch PID file path into "package_pid_file" */
char package_pid_file[MAXPGPATH] = "";
if (package_pid_file[0] != '\0')
{
maxpath_snprintf(pid_file, "%s", package_pid_file);
}
else
{
const char *tmpdir = getenv("TMPDIR");
if (!tmpdir)
tmpdir = "/tmp";
maxpath_snprintf(pid_file, "%s/repmgrd.pid", tmpdir);
}
}
}
else
{
/* --no-pid-file supplied - overwrite any value provided with --pid-file ... */
memset(pid_file, 0, MAXPGPATH);
}
/* If --show-pid-file supplied, output the location (if set) and exit */
if (show_pid_file == true)
{
printf("%s\n", pid_file);
exit(SUCCESS);
}
/* Some configuration file items can be overriden by command line options */ /* Some configuration file items can be overriden by command line options */
@@ -320,8 +254,6 @@ main(int argc, char **argv)
strncpy(config_file_options.log_level, cli_log_level, MAXLEN); strncpy(config_file_options.log_level, cli_log_level, MAXLEN);
} }
log_notice(_("repmgrd (repmgr %s) starting up"), REPMGR_VERSION);
/* /*
* -m/--monitoring-history, if provided, will override repmgr.conf's * -m/--monitoring-history, if provided, will override repmgr.conf's
* monitoring_history; this is for backwards compatibility as it's * monitoring_history; this is for backwards compatibility as it's
@@ -386,60 +318,15 @@ main(int argc, char **argv)
* repmgr has not been properly configured. * repmgr has not been properly configured.
*/ */
/* Check "repmgr" the extension is installed */
extension_status = get_repmgr_extension_status(local_conn);
if (extension_status != REPMGR_INSTALLED)
{
/* this is unlikely to happen */
if (extension_status == REPMGR_UNKNOWN)
{
log_error(_("unable to determine status of \"repmgr\" extension"));
log_detail("%s", PQerrorMessage(local_conn));
close_connection(&local_conn);
exit(ERR_DB_QUERY);
}
log_error(_("repmgr extension not found on this node"));
if (extension_status == REPMGR_AVAILABLE)
{
log_detail(_("repmgr extension is available but not installed in database \"%s\""),
PQdb(local_conn));
}
else if (extension_status == REPMGR_UNAVAILABLE)
{
log_detail(_("repmgr extension is not available on this node"));
}
log_hint(_("check that this node is part of a repmgr cluster"));
close_connection(&local_conn);
exit(ERR_BAD_CONFIG);
}
/* Retrieve record for this node from the local database */ /* Retrieve record for this node from the local database */
record_status = get_node_record(local_conn, config_file_options.node_id, &local_node_info); record_status = get_node_record(local_conn, config_file_options.node_id, &local_node_info);
/*
* Terminate if we can't find the local node record. This is a
* "fix-the-config" situation, not a lot else we can do.
*/
if (record_status != RECORD_FOUND) if (record_status != RECORD_FOUND)
{ {
log_error(_("no metadata record found for this node - terminating")); log_error(_("no metadata record found for this node - terminating"));
log_hint(_("check that 'repmgr (primary|standby) register' was executed for this node"));
switch (config_file_options.replication_type) PQfinish(local_conn);
{
case REPLICATION_TYPE_PHYSICAL:
log_hint(_("check that 'repmgr (primary|standby) register' was executed for this node"));
break;
case REPLICATION_TYPE_BDR:
log_hint(_("check that 'repmgr bdr register' was executed for this node"));
break;
}
close_connection(&local_conn);
terminate(ERR_BAD_CONFIG); terminate(ERR_BAD_CONFIG);
} }
@@ -458,7 +345,7 @@ main(int argc, char **argv)
{ {
log_error(_("unable to write to shared memory")); log_error(_("unable to write to shared memory"));
log_hint(_("ensure \"shared_preload_libraries\" includes \"repmgr\"")); log_hint(_("ensure \"shared_preload_libraries\" includes \"repmgr\""));
close_connection(&local_conn); PQfinish(local_conn);
terminate(ERR_BAD_CONFIG); terminate(ERR_BAD_CONFIG);
} }
} }
@@ -470,6 +357,7 @@ main(int argc, char **argv)
} }
else else
{ {
_close_connections = close_connections_physical;
log_debug("node id is %i, upstream node id is %i", log_debug("node id is %i, upstream node id is %i",
local_node_info.node_id, local_node_info.node_id,
local_node_info.upstream_node_id); local_node_info.upstream_node_id);
@@ -483,7 +371,7 @@ main(int argc, char **argv)
daemonize_process(); daemonize_process();
} }
if (pid_file[0] != '\0') if (pid_file != NULL)
{ {
check_and_create_pid_file(pid_file); check_and_create_pid_file(pid_file);
} }
@@ -512,6 +400,7 @@ start_monitoring(void)
{ {
switch (local_node_info.type) switch (local_node_info.type)
{ {
#ifndef BDR_ONLY
case PRIMARY: case PRIMARY:
monitor_streaming_primary(); monitor_streaming_primary();
break; break;
@@ -521,6 +410,11 @@ start_monitoring(void)
case WITNESS: case WITNESS:
monitor_streaming_witness(); monitor_streaming_witness();
break; break;
#else
case PRIMARY:
case STANDBY:
return;
#endif
case BDR: case BDR:
monitor_bdr(); monitor_bdr();
return; return;
@@ -693,6 +587,11 @@ check_and_create_pid_file(const char *pid_file)
#ifndef WIN32 #ifndef WIN32
static void
handle_sigint(SIGNAL_ARGS)
{
terminate(SUCCESS);
}
/* SIGHUP: set flag to re-read config file at next convenient time */ /* SIGHUP: set flag to re-read config file at next convenient time */
static void static void
@@ -705,23 +604,8 @@ static void
setup_event_handlers(void) setup_event_handlers(void)
{ {
pqsignal(SIGHUP, handle_sighup); pqsignal(SIGHUP, handle_sighup);
pqsignal(SIGINT, handle_sigint);
/* pqsignal(SIGTERM, handle_sigint);
* we want to be able to write a "repmgrd_shutdown" event, so delegate
* signal handling to the respective replication type handler, as it
* will know best which database connection to use
*/
switch (config_file_options.replication_type)
{
case REPLICATION_TYPE_BDR:
pqsignal(SIGINT, handle_sigint_bdr);
pqsignal(SIGTERM, handle_sigint_bdr);
break;
case REPLICATION_TYPE_PHYSICAL:
pqsignal(SIGINT, handle_sigint_physical);
pqsignal(SIGTERM, handle_sigint_physical);
break;
}
} }
#endif #endif
@@ -738,8 +622,6 @@ show_help(void)
{ {
printf(_("%s: replication management daemon for PostgreSQL\n"), progname()); printf(_("%s: replication management daemon for PostgreSQL\n"), progname());
puts(""); puts("");
printf(_("%s monitors a cluster of servers and optionally performs failover.\n"), progname());
puts("");
printf(_("Usage:\n")); printf(_("Usage:\n"));
printf(_(" %s [OPTIONS]\n"), progname()); printf(_(" %s [OPTIONS]\n"), progname());
@@ -759,14 +641,12 @@ show_help(void)
puts(""); puts("");
printf(_("Daemon configuration options:\n")); printf(_("General configuration options:\n"));
printf(_(" -d, --daemonize[=true/false]\n")); printf(_(" -d, --daemonize detach process from foreground\n"));
printf(_(" detach process from foreground (default: true)\n")); printf(_(" -p, --pid-file=PATH write a PID file\n"));
printf(_(" -p, --pid-file=PATH use the specified PID file\n"));
printf(_(" -s, --show-pid-file show PID file which would be used by the current configuration\n"));
printf(_(" --no-pid-file don't write a PID file\n"));
puts(""); puts("");
printf(_("%s monitors a cluster of servers and optionally performs failover.\n"), progname());
} }
@@ -774,29 +654,17 @@ PGconn *
try_reconnect(t_node_info *node_info) try_reconnect(t_node_info *node_info)
{ {
PGconn *conn; PGconn *conn;
t_conninfo_param_list conninfo_params = T_CONNINFO_PARAM_LIST_INITIALIZER;
int i; int i;
int max_attempts = config_file_options.reconnect_attempts; int max_attempts = config_file_options.reconnect_attempts;
initialize_conninfo_params(&conninfo_params, false);
/* we assume by now the conninfo string is parseable */
(void) parse_conninfo_string(node_info->conninfo, &conninfo_params, NULL, false);
/* set some default values if not explicitly provided */
param_set_ine(&conninfo_params, "connect_timeout", "2");
param_set_ine(&conninfo_params, "fallback_application_name", "repmgr");
for (i = 0; i < max_attempts; i++) for (i = 0; i < max_attempts; i++)
{ {
log_info(_("checking state of node %i, %i of %i attempts"), log_info(_("checking state of node %i, %i of %i attempts"),
node_info->node_id, i + 1, max_attempts); node_info->node_id, i + 1, max_attempts);
if (is_server_available_params(&conninfo_params) == true) if (is_server_available(node_info->conninfo) == true)
{ {
log_notice(_("node has recovered, reconnecting")); log_notice(_("node has recovered, reconnecting"));
/* /*
@@ -804,18 +672,14 @@ try_reconnect(t_node_info *node_info)
* connection denied due to connection exhaustion - fall back to * connection denied due to connection exhaustion - fall back to
* degraded monitoring? - make that configurable * degraded monitoring? - make that configurable
*/ */
conn = establish_db_connection(node_info->conninfo, false);
conn = establish_db_connection_by_params(&conninfo_params, false);
if (PQstatus(conn) == CONNECTION_OK) if (PQstatus(conn) == CONNECTION_OK)
{ {
free_conninfo_params(&conninfo_params);
node_info->node_status = NODE_STATUS_UP; node_info->node_status = NODE_STATUS_UP;
return conn; return conn;
} }
close_connection(&conn); PQfinish(conn);
log_notice(_("unable to reconnect to node")); log_notice(_("unable to reconnect to node"));
} }
@@ -827,14 +691,13 @@ try_reconnect(t_node_info *node_info)
} }
} }
log_warning(_("unable to reconnect to node %i after %i attempts"), log_warning(_("unable to reconnect to node %i after %i attempts"),
node_info->node_id, node_info->node_id,
max_attempts); max_attempts);
node_info->node_status = NODE_STATUS_DOWN; node_info->node_status = NODE_STATUS_DOWN;
free_conninfo_params(&conninfo_params);
return NULL; return NULL;
} }
@@ -870,12 +733,27 @@ print_monitoring_state(MonitoringState monitoring_state)
} }
static void
close_connections()
{
if (_close_connections != NULL)
_close_connections();
if (local_conn != NULL && PQstatus(local_conn) == CONNECTION_OK)
{
PQfinish(local_conn);
local_conn = NULL;
}
}
void void
terminate(int retval) terminate(int retval)
{ {
close_connections();
logger_shutdown(); logger_shutdown();
if (pid_file[0] != '\0') if (pid_file)
{ {
unlink(pid_file); unlink(pid_file);
} }

@@ -10,8 +10,6 @@
#include <time.h> #include <time.h>
#include "portability/instr_time.h" #include "portability/instr_time.h"
#define OPT_NO_PID_FILE 1000
extern volatile sig_atomic_t got_SIGHUP; extern volatile sig_atomic_t got_SIGHUP;
extern MonitoringState monitoring_state; extern MonitoringState monitoring_state;
extern instr_time degraded_monitoring_start; extern instr_time degraded_monitoring_start;
@@ -28,6 +26,4 @@ const char *print_monitoring_state(MonitoringState monitoring_state);
void update_registration(PGconn *conn); void update_registration(PGconn *conn);
void terminate(int retval); void terminate(int retval);
#endif /* _REPMGRD_H_ */ #endif /* _REPMGRD_H_ */