This suppresses log output below log level ERROR. This is useful mainly
when repmgr is being executed programmatically, e.g. in a cronjob,
where it's only useful to receive output if something goes wrong.
Note we advise against using this option when executing repmgr
commands which operate on PostgreSQL nodes (standby follow,
standby promote, standby switchover, node rejoin), particularly when
executed by repmgrd, as the log output will provide valuable
troubleshooting information.
Implements suggestion in GitHub #468.
Previously repmgr was relying on whatever command was configured to
start PostgreSQL to determine whether the node being rejoined had
started correctly. However it's preferable to actively poll the upstream
to confirm it has restarted and actually attached as a standby before
confirming success of the "node rejoin" action.
This can be overridden with the -W/--no-wait option.
(Note that for consistency with other PostgreSQL utilities, the
short form of the --wait option is now "-w"; this is currently
only used in "repmgr standby follow".)
Also update "repmgr node rejoin" documentation with a list of supported
options, and add some useful index entries for "pg_rewind".
Implements GitHub #415.
pg_rewind is not part of the core distribution for those, but we
provided support in repmgr 3.3 so should extend it to repmgr 4.
Note that there is no check in place whether the pg_rewind binary
exists, so it's up to the user to ensure it's present.
Addresses GitHub #413.
This will generate "recovery.conf" for an existing standby.
Typical use-case is a standby cloned manually from an external data
source (e.g. Barman), where "recovery.conf" needs to be created
(and if required a replication slot).
The --dry-run option will check the pre-requisites but not actually
create "recovery.conf" or a replication slot.
This requires that the upstream node is running, a replication connection
can be made and if required a replication slot can be created.
Implements GitHub #382.
Check it's actually possible for the demotion candidate to attach to
the promotion candidate before executing the switchover.
As with other checks of this nature, there's a faint possibility the
situation could change between the time the check is carried out and
the demotion candidate is restarted to connect to the promotion candidate,
but there's not a lot we can do about that. The main purpose is to
be able to catch existing misconfigurations before anything gets changed.
Implements GitHub #370.
As that's what we really want to know. Also return "UNCLEAN_SHUTDOWN"
if that's the case, rather than "RUNNING" which is confusing, even
though it's a command for internal use.
This is a remnant of the early repmgr days when there were no alternative
mechanisms for ensuring sufficient WAL remains available while cloning a
standby.
The purpose of this setting was to override a check for an (arbitrary)
minimum setting for "wal_keep_segments". As there's no reliable way
of determining a sensible value for this, and improvements in
pg_basebackup mean WALs can be streamed (possibly using a replication
slot) while the backup is in progress, there's no point in keeping
this around.
We will however still emit a warning about setting "wal_keep_segments"
if the configuration doesn't appear to provide any other way of
ensuring WAL is available during/after the cloning process and
"wal_keep_segments" is not set.
If the current primary (demotion candidate) still has any files to archive,
it will delay the shutdown until all files are archived. If there is a
substantial number of files, and/or the archive command executes slowly,
this will probably lead to an unwelcome delay in the switchover process.
The repmgr3 implementation required the promotion candidate (standby)
to directly work with the demotion candidate's data directory,
directly execute server control commands etc.
Here we delegated a lot more of that work to the repmgr on the
demotion candidate, which reduces the amount of back-and-forth
over SSH and generally makes things cleaner and smoother.
In particular the repmgr on the demotion candidate will carry
out a thorough check that the node is shut down and report
the last checkpoint LSN to the promotion candidate; this
can then be used to determine whether pg_rewind needs to be
executed on the demoted primary before reintegrating it back
into the cluster (todo).
Also implement "--dry-run" for this action, which will sanity-check the
nodes as far as possible without executing the switchover.
Additionally some of the new repmgr node commands (or command options)
introduced for this can be also executed by the user to obtain
additional information about the status of each node.
This is needed for better switchover control, so we can instruct
the remote repmgr to issue the appropriate server command rather
than trying to work out what it should be from the local node.
This is only relevant when cloning a standby and the node's upstream
can change after failover/switchover etc., so no point keeping the
original value around in the configuration file.