Commit Graph

1298 Commits

Author SHA1 Message Date
Ian Barwick
e4cb6d7130 repmgr: simplify LSN parsing
Also silences a compiler warning.
2016-10-10 22:56:49 +09:00
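For readers unfamiliar with LSNs, here is a rough illustration of what parsing one involves. It is not the code from this commit. A PostgreSQL LSN is printed as two hexadecimal numbers separated by a slash, the high and low 32 bits of a 64-bit WAL position. The helper name `parse_lsn` is hypothetical.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN string such as '16/B374D848' to a 64-bit integer.

    Hypothetical helper for illustration only; not repmgr's implementation.
    """
    high, sep, low = text.strip().partition("/")
    if not sep or not high or not low:
        raise ValueError(f"invalid LSN: {text!r}")
    # Both halves are hexadecimal; int() raises ValueError on bad digits.
    high_bits = int(high, 16)
    low_bits = int(low, 16)
    # Each half must fit in an unsigned 32-bit value.
    for part in (high_bits, low_bits):
        if not 0 <= part <= 0xFFFFFFFF:
            raise ValueError(f"LSN component out of range: {text!r}")
    return (high_bits << 32) | low_bits
```

Representing the position as a single integer makes comparing two LSNs, or computing the byte distance between them, a plain arithmetic operation.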
Ian Barwick
871ec47ff5 Fix repmgr cluster crosscheck output
Show actual node ID rather than incremental number.

Fixes GitHub #244.
2016-10-10 16:20:17 +09:00
Ian Barwick
f435abb3ec Update README
Also reorder HISTORY entries.
2016-10-10 15:10:06 +09:00
Ian Barwick
a217b4d0a9 repmgr: standardize SSH-related error messages
2016-10-07 07:42:15 +09:00
Ian Barwick
2dcb75f889 Add 'cluster crosscheck' to help output detail
Per GitHub #243.
2016-10-06 07:38:45 +09:00
Ian Barwick
b509ce6382 Minor README fix
2016-10-05 16:47:37 +09:00
Ian Barwick
1150bf272a Update README
`--ignore-external-config-files` deprecated
2016-10-05 15:09:07 +09:00
Ian Barwick
09ac6cd145 Update history
2016-10-05 13:57:10 +09:00
Ian Barwick
2fae788bc4 Add documentation for repmgrd failover process and failed node fencing
Addresses GitHub #200.
2016-10-05 11:25:36 +09:00
Ian Barwick
eb90f864c9 repmgr: consistent error message style
2016-10-05 10:31:25 +09:00
Ian Barwick
ba89758366 Update barman-wal-restore documentation
Barman 2.0 provides this in a separate, more convenient `barman-cli` package;
document this and add note about previous `barman-wal-restore.py` script.
2016-10-03 15:59:03 +09:00
Ian Barwick
84595fe711 Tweak repmgr.conf.sample
Put `monitor_interval_secs` at the start of the `repmgrd` section, as it's
a very fundamental configuration item.
2016-10-03 15:57:33 +09:00
Ian Barwick
9523894808 Bump dev version number
3.3dev
2016-09-30 15:14:35 +09:00
Ian Barwick
df09af4d57 Update README
`repmgr cluster show --csv` now only reflects node availability,
and no longer overloads this with node type (master/standby) information.
2016-09-30 15:07:34 +09:00
Ian Barwick
2c1cbc6bf9 Fix witness server initialisation
2016-09-30 13:43:47 +09:00
Ian Barwick
ed22fe326e Document and expand pg_ctl override configuration options
These are now prefixed with "service_" to emphasize that they're
OS-level commands, not repmgr ones; also added reload and promote
commands:

    service_start_command
    service_stop_command
    service_restart_command
    service_reload_command
    service_promote_command

GitHub #169
2016-09-30 11:58:45 +09:00
Ian Barwick
46500e1408 Documentation update and miscellaneous code cleanup
2016-09-30 09:30:22 +09:00
Ian Barwick
c3971513b6 Refactor show diagnose to handle node IDs correctly.
See previous comments for `show matrix`.
2016-09-30 01:46:01 +09:00
Ian Barwick
a2910eded9 Refactor show matrix to handle node IDs correctly.
Previously the code assumed repmgr node IDs to be sequential,
which is not guaranteed to be the case. With a non-sequential
list of node IDs, an incorrect node ID would be displayed
and memory would be accessed beyond the bounds of the matrix array.

The refactored code is considerably less elegant than the original
but will correctly handle a non-sequential list of node IDs.
2016-09-29 18:51:55 +09:00
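The problem described in a2910eded9 can be sketched as follows: when node IDs are not sequential, using an ID directly as an array index reads the wrong cell or runs past the end of the array. One common remedy, shown here in Python rather than repmgr's C, is to map each ID to a dense index first. The function name `build_matrix` and the `reachable` input are assumptions made for this sketch, not repmgr's data structures.

```python
def build_matrix(node_ids, reachable):
    """Build a connectivity matrix for arbitrary (non-sequential) node IDs.

    node_ids:  iterable of integer node IDs, e.g. [2, 7, 40]
    reachable: iterable of (from_id, to_id) pairs (hypothetical input)

    Returns the sorted ID list and a square matrix of booleans whose
    rows and columns follow the order of that list.
    """
    ordered = sorted(node_ids)
    # Map each node ID to its position, so the ID itself is never an index.
    index_of = {node_id: i for i, node_id in enumerate(ordered)}
    size = len(ordered)
    matrix = [[False] * size for _ in range(size)]
    for src, dst in reachable:
        # An unknown ID raises KeyError instead of touching a wrong cell.
        matrix[index_of[src]][index_of[dst]] = True
    return ordered, matrix
```

The returned ID list doubles as the row and column labels, so the displayed node ID is always the real one rather than an incremental counter.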
Ian Barwick
dc70e2d804 Remove superfluous PQfinish() call
The connection was previously closed; if this error condition is triggered,
a segfault will occur.
2016-09-29 11:53:13 +09:00
Ian Barwick
ea45158f50 Refactor cluster diagnose
- use the remote user setting, like other SSH-based remote operations
  (avoid hardcoding the user name)
- enable `repmgr cluster matrix` to accept the cluster name, node ID
  and the database connection information instead of requiring repmgr.conf;
  this means we don't have to assume that repmgr.conf is in one
  of the default locations
2016-09-29 11:41:14 +09:00
Gianni Ciolli
84d1e16edd Factor out "build_cluster_diagnose" from "do_cluster_diagnose"
We separate the code that builds the cube from the code that displays
it, in preparation for reusing the cube somewhere else, e.g. for
automatic failover detection.
2016-09-29 01:23:09 +09:00
Gianni Ciolli
57815af3ac Factor out "build_cluster_matrix" from "do_cluster_matrix"
We separate the code that builds the matrix from the code that
displays it, in preparation for reusing the matrix somewhere else,
e.g. for automatic failover detection.
2016-09-29 01:22:54 +09:00
Gianni Ciolli
a4a2e48ab4 README rewording
2016-09-29 01:21:06 +09:00
Gianni Ciolli
5189488b92 Add "cluster diagnose" mode
This mode merges the output of "cluster matrix" from each node to
improve node state knowledge.
2016-09-29 01:19:36 +09:00
Gianni Ciolli
263128a740 Bug fix
2016-09-29 00:59:26 +09:00
Ian Barwick
f775750334 cluster matrix: make remote execution of cluster show more configurable
- use the remote user setting, like other SSH-based remote operations
  (avoid hardcoding the user name)
- enable `repmgr cluster show` to accept the cluster name and the
  database connection information instead of requiring repmgr.conf;
  this means we don't have to assume that repmgr.conf is in one
  of the default locations
2016-09-29 00:54:33 +09:00
Ian Barwick
41ec45a4cc Remove ssh_hostname support
Currently repmgr assumes the SSH hostname will be the same as the
database hostname, and it's easy enough now to extract this
from the node's conninfo string.

We can consider re-adding this in the next release if required.
2016-09-29 00:24:04 +09:00
Gianni Ciolli
9b5b9acb82 Add "cluster matrix" mode and "ssh_hostname" parameter
- The "cluster matrix" command supports CSV mode via the --csv
  switch.
- Add the optional ssh_hostname configuration parameter, which is
  required by "cluster matrix".
- A corresponding ssh_hostname column has been added to the repl_nodes
  table and to the repl_show_nodes view.
2016-09-28 23:40:35 +09:00
Ian Barwick
77de5dbeeb Update HISTORY
2016-09-28 23:37:42 +09:00
Gunnlaugur Thor Briem
465f1a73a5 Fix inconsistent repmgr.conf path in failover cfg
The examples of `promote_command` and `follow_command` reference the
`repmgr.conf` file under a different path from the rest of the README.
This makes them consistent with the rest of the README.
2016-09-28 23:14:49 +09:00
Ian Barwick
c4f84bd777 Update README
Document new `--copy-external-config-files` option.
2016-09-28 16:40:13 +09:00
Ian Barwick
da4dc26505 repmgr: various minor logging fixes
2016-09-28 15:51:26 +09:00
Ian Barwick
19670db1d4 repmgr: add missing 'break' statements in switch structure
2016-09-28 15:41:29 +09:00
Ian Barwick
b9f52e74eb repmgr: add --help output and documentation for --wait-sync
2016-09-28 14:22:26 +09:00
Ian Barwick
fa10fd8493 repmgr: add option --wait-sync for standby register
Causes repmgr to wait for the updated node record to propagate
to the standby before exiting. This can be used to ensure that
actions which depend on the standby's node record being synchronised
(such as starting repmgrd) are not carried out prematurely.

Addresses GitHub #103
2016-09-28 14:08:42 +09:00
Ian Barwick
b7f20ee1f7 repmgrd: don't start if node is inactive and failover=automatic
If failover=automatic, it would be reasonable to expect repmgrd to
consider this node as a promotion candidate; however, this will not
happen if it is marked inactive. This often happens when a failed
primary is recloned as a standby but not re-registered, and if
repmgrd were to run, it would give the incorrect impression that
failover capability is available.

Addresses GitHub #153.
2016-09-28 10:59:20 +09:00
Ian Barwick
bbb2e2f017 Initial implementation of improved configuration file copying
GitHub #210.
2016-09-27 22:19:45 +09:00
Ian Barwick
52328b8f33 repmgr: update README
2016-09-27 10:27:27 +09:00
Ian Barwick
65c2be3441 Remove redundant code
2016-09-26 10:44:34 +09:00
Ian Barwick
b17593ff4d repmgr: improve handling application name during standby clone
Addresses issues noticed while investigating GitHub #238
2016-09-26 10:41:01 +09:00
Ian Barwick
7c1776655b repmgr: correctly set application name during standby follow
Addresses issue mentioned in GitHub #238
2016-09-25 21:33:11 +09:00
Ian Barwick
789470b227 repmgr: rename 'upstream_conninfo' etc. to 'recovery_conninfo' etc.
These variables are used only for generating `primary_conninfo` in recovery.conf;
rename them to make their purpose clearer.
2016-09-25 19:18:54 +09:00
Martin
5d3c0d6163 Make it clear in the error message that the type of connection that
failed was an SSH connection, not a libpq one.
2016-09-23 09:15:09 -03:00
Ian Barwick
44d4ca46b0 repmgr: place write_primary_conninfo() next to the other recovery.conf functions
Also terminate the buffer created, for tidiness.
2016-09-22 10:10:26 +09:00
Ian Barwick
114c1bddcb repmgr: only require 'wal_keep_segments' to be set in certain corner cases
Now that repmgr uses pg_basebackup's `--xlog-method=stream` setting by
default, and enables provision of `restore_command`, there's no reason
to require `wal_keep_segments` to be set in the default use-case.

`repmgr standby clone` will now only fail with an error if `wal_keep_segments`
is zero and one of the following cases applies:

* `--rsync-only` clone with no `restore_command` set
* clone with pg_basebackup and `--xlog-method=fetch`
* -w/--wal-keep-segments specified on the command line

If, for whatever reason, it's necessary to perform a standby clone
with `wal_keep_segments=0` in one of the above cases, specifying
`-w/--wal-keep-segments=0` on the command line will effectively
override the check.

GitHub #204
2016-09-21 16:42:14 +09:00
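The rules listed in 114c1bddcb can be read as a small decision function. The Python sketch below encodes one interpretation of the commit message; the function and parameter names are hypothetical, and the actual C implementation may structure the check differently (for example by comparing the server value against the requested minimum).

```python
def clone_should_fail(server_wal_keep_segments, *, rsync_only=False,
                      restore_command_set=False, xlog_method="stream",
                      cli_wal_keep_segments=None):
    """Return True if `standby clone` should abort per the commit message.

    All names are assumptions for illustration:
      server_wal_keep_segments: value of wal_keep_segments on the upstream
      rsync_only:               clone was requested with --rsync-only
      restore_command_set:      a restore_command is configured
      xlog_method:              pg_basebackup --xlog-method ("stream"/"fetch")
      cli_wal_keep_segments:    value passed with -w, or None if not given
    """
    # Passing -w/--wal-keep-segments=0 effectively overrides the check.
    if cli_wal_keep_segments == 0:
        return False
    # The check only matters when the server setting is zero.
    if server_wal_keep_segments != 0:
        return False
    if rsync_only and not restore_command_set:
        return True
    if not rsync_only and xlog_method == "fetch":
        return True
    # -w given on the command line with a non-zero value.
    if cli_wal_keep_segments is not None:
        return True
    return False
```

Under this reading, a default clone (pg_basebackup with streaming) against a server with `wal_keep_segments=0` passes the check.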
Ian Barwick
5090b8cab1 Update HISTORY
2016-09-21 13:39:42 +09:00
Ian Barwick
5e338473f7 repmgr: before cloning a standby, verify sufficient free walsenders available
Previously repmgr only checked that 'max_wal_senders' is a positive value.
It will now additionally verify that the requisite number of replication
connections can actually be made before commencing a cloning operation
using pg_basebackup.

GitHub #214
2016-09-21 12:36:17 +09:00
Ian Barwick
e043d5c9a9 Clarify requirements for passwordless SSH access
This is already effectively optional; in 3.2 we will ensure it becomes
fully optional (mainly by deprecating --ignore-external-config-files
and replacing it with --copy-external-config-files).
2016-09-21 08:43:45 +09:00
Ian Barwick
03911488aa repmgr: in standby switchover, quote file paths in remotely executed commands
Per suggestion from GitHub user sebasmannem (#229)
2016-09-21 08:09:44 +09:00
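To illustrate why 03911488aa quotes file paths, consider a command string that is handed to a remote shell over SSH: an unquoted path containing a space is split into two arguments. The sketch below uses Python's standard `shlex.quote` to show the idea; repmgr itself is written in C, and the helper name `remote_command_string` is hypothetical.

```python
import shlex


def remote_command_string(args):
    """Join arguments into one string that is safe to pass to a remote shell.

    Hypothetical helper; each argument is quoted individually so that
    spaces and shell metacharacters in file paths are preserved.
    """
    return " ".join(shlex.quote(arg) for arg in args)


# Example: a data directory path containing a space.
command = remote_command_string(["ls", "/var/lib/pg data/postgresql.conf"])
ssh_argv = ["ssh", "node2", command]  # "node2" is a placeholder host name
```

Splitting the resulting string with `shlex.split` yields the original argument list again, which is a convenient way to check the quoting.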