After restarting the standby, poll pg_stat_replication on the upstream
until the standby connects, and exit with an error if it doesn't by the
timeout defined in "standby_follow_timeout".
Implments GitHub #444.
This enables explicit provision of an external configuration file
directory, which if set will be passed to "pg_ctl" as the -D
parameter. Otherwise "pg_ctl" will default to using the data directory,
which will cause some operations to fail if the configuration files
are not present there.
Note this is implemented primarily for feature completeness and for
development/testing purposes. Users who have installed "repmgr" from
a package should not rely on "pg_ctl" to stop/start/restart PostgreSQL,
instead they should set the appropriate "service_..._command" for their
operating system. For more details see:
https://repmgr.org/docs/4.0/configuration-service-commands.html
Note: in a future release, the presence of "config_directory" in repmgr.conf
will be used to implictly set "--copy-external-config-files=samepath" when
cloning a standby; this is a behaviour change so will be implemented in the
next major realease (repmgr 4.1).
Implements GitHub #424.
If "archive_cleanup_command" is defined in "repmgr.conf", a corresponding
entry will be made in the node's "recovery.conf" file after cloning a
standby.
Note that we recommend using PgBarman to manage WAL archives, but are
providing this facility to help repmgr to be integrated in existing environments.
Implements GitHub #416.
This introduces following new configuration file parameters, which
were previously hard-coded values:
- promote_check_timeout
- promote_check_interval
Implements GitHub #387.
This is used for determining a timeout when reconnecting to the standby
after executing the "follow_command". This will normally not need to be
set explicitly, but maybe useful in cases where the standby's startup
phase can last longer than usual.
"repmgr node check --archive-ready" is correct, however abbreviated
versions will be accepted by getopt_long() if they don't match
or partially match any other options.
Per report by "chaintng" in GitHub #355.
This makes comments stay aligned in most cases the conf file is
modified, and when indentation changes, it's easy to re-align
(by removing or adding a tab)
Signed-off-by: Martín Marqués <martin.marques@2ndquadrant.com>
It must never contain "repmgr standby promote", as it is intended
to enable use of package-level promote commands such as Debian's
"pg_ctlcluster promote".
Addresses GitHub #336.
Now we are more explicit on what we recommend for the various
service_X_command settings when using sudo.
Signed-off-by: Gianni Ciolli <gianni.ciolli@2ndQuadrant.com>
The repmgr3 implementation required the promotion candidate (standby)
to directly work with the demotion candidate's data directory,
directly execute server control commands etc.
Here we delegated a lot more of that work to the repmgr on the
demotion candidate, which reduces the amount of back-and-forth
over SSH and generally makes things cleaner and smoother.
In particular the repmgr on the demotion candidate will carry
out a thorough check that the node is shut down and report
the last checkpoint LSN to the promotion candidate; this
can then be used to determine whether pg_rewind needs to be
executed on the demoted primary before reintegrating it back
into the cluster (todo).
Also implement "--dry-run" for this action, which will sanity-check the
nodes as far as possible without executing the switchover.
Additionally some of the new repmgr node commands (or command options)
introduced for this can be also executed by the user to obtain
additional information about the status of each node.
There are some circumstances, e.g. during switchover operations,
where repmgr may need to operate on a data directory while the
server isn't running, in which case there's no way to retrieve
that information.
Also add configuration file option "pgdata" for hard-coding the
node's data directory - if the "repmgr" DB user isn't a superuser
or doesn't have permission to extract the data directory, we'll
need another way of finding out.
In previous versions of repmgr, some options had ambiguous meanings,
and/or were used for slightly different purposes. This way we end
up with a couple more options (most of which probably won't need
adjusting) but greater clarity and flexibility.
Removed:
master_reponse_timeout:
renamed to "async_query_timeout", as this was its main usage
retry_promote_interval_secs:
replaced by "primary_notification_timeout"
Added:
async_query_timeout:
timeout (in seconds) when executing asynchronous queries
primary_notification_timeout:
number of seconds to wait for notification from the new primary
after a failover
primary_follow_timeout:
number of seconds to wait for the new primary to become available
when executing "repmgr standby follow"
In some cases, the monitored upstream may not be available for a while
(e.g. network split), in which case it makes sense to have repmgrd
keep running and trying to reconnect. Previously it would just keel
over and quit.
This is more consistent with other parameters and conforms to
the pattern used by PostgreSQL itself, which uses the prefix "log_"
for logging parameters.
A warning will be emitted if the old version of the parameter name
is detected.
Normally (outside of log level DEBUG) repmgrd doesn't generate any
kind of log output, so examining the log file may give the impression
it's not working. Output an informational message at regular intervals
to show it's up and running.
This is more consistent with other parameters and conforms to
the pattern used by PostgreSQL itself, which uses the prefix "log_"
for logging parameters.
A warning will be emitted if the old version of the parameter name
is detected.