Commit Graph

1056 Commits

Author SHA1 Message Date
Ian Barwick
8d636690bd repmgrd: create pid file by default
Traditionally repmgrd will only write a pidfile if explicitly requested with
-p/--pid-file. However it's normally desirable to have a pidfile, and it's
preferable to have one used by default to prevent accidentally starting a second
repmgrd instance.

Following changes made:

 - add configuration file parameter "repmgrd_pid_file" (initially overridden by
   -p/--pid-file for backwards compatibility, though eventually we'll want to
   drop -p/--pid-file altogether)
 - add command line option --no-pid-file
 - if neither "repmgrd_pid_file" nor -p/--pid-file is set, create the pid file
   in a temporary directory

Implements GitHub #457.
2018-06-29 14:36:24 +09:00
Ian Barwick
b2081dca52 De-overload configuration file parameter "standby_reconnect_timeout"
Currently the (very generic sounding) "standby_reconnect_timeout" configuration
file parameter is used in several different contexts and it would be useful
to have more granular control over the different timeouts it's used to configure.

This patch introduces "node_rejoin_timeout", used in place of "standby_reconnect_timeout"
(which wasn't documented) when "repmgr node rejoin" is executed, to determine
how long to wait for the node to rejoin the replication cluster.

Additionally "repmgrd_standby_startup_timeout" is introduced as a timeout for
failover situations, when repmgrd executes "repmgr standby follow" to follow
a new primary, and waits for the standby to restart and become available
for connections.

"standby_reconnect_timeout" is now only relevant for "repmgr standby switchover".

Implements GitHub #454.
2018-06-28 18:00:55 +09:00
Ian Barwick
080a29c33b node check: add --missing-slots check
This enables an explicit check for slots which should exist (according
to the repmgr metadata) but which aren't present.
2018-06-22 17:21:40 +09:00
Ian Barwick
dd7a4068d2 node check: implement CSV output
This is advertised in the --help output and placeholder code was in
place, but it wasn't actually implemented.
2018-06-22 13:14:57 +09:00
Ian Barwick
fcf237fe31 node status: improve output and documentation
In the default text output mode, list inactive slots.

In CSV output mode, list inactive slots as additional information;
add output line with number of missing slots and a list thereof.

Also document --csv output mode.
2018-06-22 11:46:50 +09:00
Ian Barwick
4d70a667fb node check: clarify status information for witness server
Previously the output gave the impression the server was a primary,
which is technically the case, but it's not the actual cluster primary.

Also output an error if the node is in recovery, which is unlikely but
you never know.
2018-06-22 10:15:45 +09:00
Ian Barwick
c5ba72c2c5 standby switchover: fix behaviour if witness node is a sibling
The witness node is not a streaming replication standby, so executing
"repmgr standby follow" will fail. Instead, execute "repmgr witness
register --force" to update the witness node record on the primary and
its local copy of all node records.

Addresses GitHub #453.
2018-06-21 16:48:58 +09:00
Ian Barwick
0f97a98f28 repmgr: don't count witness node as a standby when running "node status"
Addresses GitHub #451.
2018-06-21 13:06:18 +09:00
Ian Barwick
269e3242c8 "repmgr node ...": update comments and formatting 2018-06-21 12:12:07 +09:00
Ian Barwick
b0ed87832b repmgr: don't count witness node as a standby when running "node check"
Addresses GitHub #451.
2018-06-21 11:13:46 +09:00
Ian Barwick
836d2125fe Improve BDR3 node query
We can get everything we need from bdr.node_summary
2018-06-15 14:30:06 +09:00
Ian Barwick
bf0d67c60a Add repmgr.nodes to the BDR replication set 2018-06-15 14:29:08 +09:00
Ian Barwick
e1d807188d Add extension upgrade files 2018-06-15 14:27:42 +09:00
Ian Barwick
108c3a36fb Enable creation of repmgr extension on BDR3 node 2018-06-15 14:26:47 +09:00
Ian Barwick
8377704596 Convert BDR query functions to handle BDR2/BDR3 2018-06-15 14:26:07 +09:00
Ian Barwick
4f642f8332 Detect and store BDR major version number when executing "is_bdr_db()"
BDR3 metadata structure is very different to BDR1/2, so we'll need to
generate queries according to version.
2018-06-15 14:25:55 +09:00
Ian Barwick
029ba46470 doc: remove info about old RPM package repository 2018-06-15 13:27:19 +09:00
Ian Barwick
098f8eaf2a doc: finalize release notes 2018-06-15 13:27:14 +09:00
Ian Barwick
d60bd232f0 Enable "recovery_min_apply_delay" to be zero.
Addresses GitHub #448.
2018-06-14 11:11:33 +09:00
Ian Barwick
eca1943026 doc: emphasize that repmgrd should not be running during a switchover 2018-06-12 10:30:35 +09:00
Ian Barwick
bcab4bc391 _create_event(): log event and node ID for debugging 2018-06-12 10:30:30 +09:00
Ian Barwick
bb320a64f5 repmgr: consolidate code in "standby switchover"
Commit 41274f5525 left us with two if statements
in sequence with exactly the same condition, so consolidate both into a single
statement. Clarify code comments while we're at it.
2018-06-12 10:30:24 +09:00
Ian Barwick
3b0cde2846 repmgr: cluster check commands - non-zero exit code if node(s) unavailable
Return ERR_CLUSTER_CHECK if one or nodes was not reachable.

Implements GitHub #447.
2018-06-12 10:30:11 +09:00
Ian Barwick
00704913a6 doc: 4.0.6 release notes 2018-06-12 10:29:35 +09:00
Ian Barwick
efc388065e standby follow: check node has connect to new primary
After restarting the standby, poll pg_stat_replication on the upstream
until the standby connects, and exit with an error if it doesn't by the
timeout defined in "standby_follow_timeout".

Implments GitHub #444.
2018-06-07 15:04:45 +09:00
Ian Barwick
e12fbb7b4d doc: update release notes 2018-06-07 15:04:38 +09:00
Ian Barwick
0108fb2e72 standby follow: add hint about using "node rejoin"
If "repmgr standby follow" is executed on a node which isn't running,
point out "repmgr node rejoin" should probably be used instead.
2018-06-07 15:04:30 +09:00
Ian Barwick
e408351697 doc: fix typos 2018-06-07 15:04:25 +09:00
Ian Barwick
f904cd2573 witness_register: check for existing node with same name 2018-06-07 15:04:18 +09:00
Ian Barwick
95fe7ea621 repmgrd: ensure local node is counted as quorum member
Rename "standby_nodes" to "sibling_nodes" to make it clearer in the
code what total is actually provided by the struct.

Addresses GitHub #439.
2018-06-07 15:04:12 +09:00
Ian Barwick
a50ac039da doc: fix typo 2018-06-07 15:04:06 +09:00
Ian Barwick
535fba43d3 standby clone: improve external configuration file copying
If --copy-external-config-files was provided, check that we can copy
the files *before* cloning the standby, and abort if an error is
encountered. This will give the user the opportunity to fix any issues
before running the entire (and potentially lengthy) clone.

Previously errors were logged but no action taken, and the final
message indicated the clone operation was successful.

Addresses GitHub #443.
2018-06-07 15:04:01 +09:00
Ian Barwick
043a6c5bea repmgrd: ensue degraded monitoring timeout works on standby
Parameter "degraded_monitoring_timeout" was not being acted on when
monitoring a streaming replication standby.

Addresses GitHub #439.
2018-06-07 15:03:52 +09:00
Ian Barwick
8da26f1c6c If --dry-run specified, ensure minimum log level is INFO
When executed with --dry-run, repmgr outputs detail about what would
happen using log level INFO. If the log_level is configured to
NOTICE or higher, it's possible some or all of the --dry-run output
might not be displayed.

Addresses GitHub #441.
2018-06-07 15:03:43 +09:00
Ian Barwick
7861392450 node rejoin: avoid outputting empty DETAIL message 2018-06-07 15:03:36 +09:00
Ian Barwick
b297e40d77 node rejoin: improve handling of --config-file parameter
Fixes bug when parsing --config-file values (GitHub #442).

Also improves handling in --dry-run mode, as some checks for the
provided files were being skipped if --dry-run supplied, even though
they are intended to work with --dry-run.
2018-06-07 15:03:30 +09:00
Ian Barwick
7613b1769c standby clone: --recovery-conf-only expects the standby to be registered
Note this in the documentation, and add a HINT about registering it
if the standby record is not available.

Related to GitHub #438.
2018-05-31 09:42:53 +09:00
Ian Barwick
b1b49748a7 "config_file" is MAXPGPATH, not MAXLEN
The two values are the same anyway, so change is more for consistency.
2018-05-24 15:52:57 +09:00
Ian Barwick
276239422b standby clone: don't assume existence of "user" in upstream conninfo
Usually a seperate user (typically "repmgr") is set up specifically to manage
the repmgr metadata, however there's no compelling requirement to do this, and
it's possible the database owner (usually: "postgres") will be used, in which
case it's possible the username will be left out of the conninfo string.

Addresses GitHub #437.
2018-05-24 15:52:51 +09:00
Martín Marqués
49418e096e Fix typo in a code comment 2018-05-19 12:30:03 -03:00
Ian Barwick
6c518f1403 "standby clone": log actual connection string used to connect to upstream
Useful for diagnostic purposes.
2018-05-10 12:03:13 +09:00
Ian Barwick
b365765bc8 Fix check for -d/--dbname parameter
Not a bug per-se, just meant some unnecessary processing was done on
an empty string.

Per note from petere.
2018-05-10 12:03:09 +09:00
Ian Barwick
bd63948937 Include "arpa/inet.h" in dbutils.c
Needed for htonl() on FreeBSD.
2018-05-10 12:03:04 +09:00
Ian Barwick
69c1f147ea doc: update 2ndQuadrant repository information
Canonical link for each repository should not include any directories.
2018-05-10 10:39:31 +09:00
Ian Barwick
ce8d3cf0b0 doc: update repository information 2018-05-10 10:39:27 +09:00
Ian Barwick
14134f8e70 doc: update package installation information
Document the new public 2ndQuadrant apt repository
2018-05-10 10:39:23 +09:00
Ian Barwick
be8448ddcb doc: update package installation information
Document the new, public 2ndQuadrant RPM repository.
2018-05-10 10:39:18 +09:00
Ian Barwick
a2ff1536ad doc: add notes about package compatibility
We need to emphasise that the repmgr packages are only compatible
with packages based on the PGDG filesystem layout; 3rd party vendor
packages often put application and data directories elsewhere.
See e.g. GitHub #427.
2018-05-10 10:38:54 +09:00
Ian Barwick
9c0c1b663e Minor documentation fixes 2018-05-10 10:25:29 +09:00
Ian Barwick
2d43feb34b doc: update HISTORY and add 4.0.5 release notes 2018-05-01 10:21:40 +09:00