There is no need to check for empty parameters; this check is left over
from the original configuration file parsing. Since the conversion
to the PostgreSQL-based configuration file parser, it is
redundant.
Usually repmgrd requires the parameters "promote_command" and
"follow_command" to be present in the configuration file. These are
not required if "failover=manual", but the configuration sanity check
following receipt of SIGHUP was not checking that.
Addresses issue reported in GitHub #614.
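The corrected sanity check can be pictured roughly as follows. This is a
minimal Python sketch, not repmgr's C implementation; the function name and
the dict-based configuration are assumptions made for illustration, as is
treating an unset "failover" as "manual".

```python
REQUIRED_FOR_AUTOMATIC_FAILOVER = ("promote_command", "follow_command")


def missing_failover_commands(config):
    """Return the required parameters that are absent or empty in `config`.

    `config` is a plain dict of parsed configuration values (an assumption
    for this sketch, not repmgr's internal representation).
    """
    # The commands are not required when failover is manual.
    if config.get("failover", "manual") == "manual":
        return []
    return [name for name in REQUIRED_FOR_AUTOMATIC_FAILOVER
            if not config.get(name)]
```

An empty result means the configuration passes this particular check; the
same function would be called both at startup and after SIGHUP.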
Previously, repmgr was using a very simple ad-hoc string-based parser,
which had various limitations and allowed configuration files to be
created in a way which could cause confusion and/or unexpected
behaviour.
For example, it accepted strings enclosed in single quotes, but treated
strings enclosed in double quotes literally. A node_name defined like this:
    node_name="somenode"
would result in the literal value '"somenode"' being used, which could
lead to non-obvious errors along the lines of:
    no record found for ""somenode""
The configuration file parser has been adapted from the one used by
PostgreSQL itself, so behaves more-or-less identically (though some
functions such as file inclusion are not supported in repmgr).
This makes configuration parsing more robust and consistent;
additionally, error reporting will be more precise.
Note this does mean that some repmgr.conf items previously accepted
as valid by repmgr will now be rejected; in particular this includes
strings containing spaces which are not enclosed in single quotes.
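To illustrate the stricter rules, here is a minimal Python sketch of a
single-line parser loosely modelled on postgresql.conf syntax. It is not the
parser adopted by repmgr: the grammar is heavily simplified, the function
name is hypothetical, and rejecting double-quoted values as a syntax error
is an assumption of this sketch.

```python
import re

# Simplified grammar (assumption): name = 'single-quoted value' or
# name = bare_token, optionally followed by a "#" comment. Inside a
# quoted value, '' stands for one embedded single quote.
_LINE = re.compile(
    r"^\s*(?P<name>[A-Za-z_][A-Za-z0-9_.]*)\s*=\s*"
    r"(?:'(?P<quoted>(?:[^']|'')*)'|(?P<bare>[^\s'\"#]+))"
    r"\s*(?:#.*)?$"
)


def parse_config_line(line):
    """Return (name, value) or raise ValueError on a syntax error."""
    match = _LINE.match(line)
    if match is None:
        raise ValueError(f"syntax error in configuration line: {line!r}")
    if match.group("quoted") is not None:
        return match.group("name"), match.group("quoted").replace("''", "'")
    return match.group("name"), match.group("bare")
```

Under these rules, node_name='some node' parses to the value "some node",
while node_name=some node is rejected instead of being silently accepted.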
This reverts commit c6ca183247.
Backing out this patch for now as the Debian build system doesn't
seem to like it, even though it builds just fine on Debian itself.
Now that the "xlog nomenclature" PostgreSQL versions are fading into the past,
rename things related to handling pg_basebackup's -X option
(was: --xlog-method, now: --wal-method) to start with "wal_"
rather than "xlog_".
This is a cosmetic change for code clarity.
This functionality enables repmgrd (when running on the primary) to
monitor connected child nodes. It will log connections and disconnections
and generate events.
Additionally, repmgrd can execute a custom script if the number of connected
child nodes falls below a configurable threshold. This script can be used
e.g. to "fence" the primary following a failover situation where a new primary
has been promoted and all standbys are now child nodes of that primary.
In "recovery.conf", the configuration parameter "node_name" is used
as the "application_name" value, which will be truncated by PostgreSQL
to 63 characters (NAMEDATALEN - 1).
repmgr sometimes needs to be able to extract the application name from
pg_stat_replication to determine if a node is connected (e.g. when
executing "repmgr standby register"), so the comparison will fail
if "node_name" exceeds 63 characters.
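A length check along these lines would catch the problem before it causes a
failed comparison. This is a minimal Python sketch; the function name is
hypothetical, and it assumes PostgreSQL was built with the default
NAMEDATALEN of 64 and that the limit applies to the encoded byte length.

```python
NAMEDATALEN = 64  # PostgreSQL's default compile-time value (assumption)


def node_name_fits_application_name(node_name):
    """True if `node_name` can be used as application_name untruncated."""
    # Compare the encoded byte length, so multi-byte characters are
    # counted at their full size.
    return len(node_name.encode("utf-8")) <= NAMEDATALEN - 1
```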
If "failover_validation_command" is set, and the command returns an error,
rerun the election.
There is a pause between reruns to avoid "churn"; the length of this pause
is controlled by the configuration parameter "election_rerun_interval".
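The rerun behaviour can be sketched as a simple loop. This Python sketch is
illustrative only: `run_election` and `validate` are hypothetical callables
standing in for repmgrd's election logic and for running
"failover_validation_command", and the real election involves more steps
than shown here.

```python
import time


def elect_with_validation(run_election, validate, election_rerun_interval,
                          sleep=time.sleep):
    """Repeat the election until the chosen candidate passes validation.

    `validate` is None when no validation command is configured; otherwise
    it returns True when the command succeeded. Returns the candidate and
    the number of elections that were held.
    """
    elections = 0
    while True:
        candidate = run_election()
        elections += 1
        if validate is None or validate(candidate):
            return candidate, elections
        # Pause before rerunning so repeated failures do not cause churn.
        sleep(election_rerun_interval)
```

Passing `sleep` as a parameter is a design choice for the sketch: it lets
the pause be replaced with a no-op when testing.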
This controls the maximum length of time in seconds that repmgrd will
wait for other standbys to disconnect their WAL receivers in a failover
situation.
This setting is only used when "standby_disconnect_on_failover" is set to "true".
This is intended to ensure that all nodes have a constant LSN while
making the failover decision.
This feature is experimental and needs to be explicitly enabled with the
configuration file option "standby_disconnect_on_failover".
Note enabling this option will result in a delay in the failover decision
until the WAL receiver is disconnected on all nodes.
This enables selection of the method repmgrd uses to check whether the upstream
node is available. Possible values are:
- "ping" (default): uses PQping() to check server availability
- "connection": executes a query on the connection to check server
  availability (similar to repmgr 3.x).
Immediately after the demotion candidate (primary) has shut down, we can't
be absolutely sure that the walreceiver has flushed all WAL to disk, so
checking pg_last_wal_receive_lsn() at that point might not reflect
the actual last available WAL location.
To handle this, we'll loop for a while (timeout controlled by configuration
parameter "wal_receive_check_timeout") before finally deciding whether
the standby is still behind the shut-down primary.
Addresses issue raised in GitHub #518.
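The wait loop described above might look like the following. This is a
Python sketch under stated assumptions: LSNs are represented as plain
integers, `get_receive_lsn` is a hypothetical callable wrapping a
pg_last_wal_receive_lsn() query, and the one-second polling interval is
arbitrary.

```python
import time


def standby_caught_up(get_receive_lsn, primary_shutdown_lsn,
                      wal_receive_check_timeout,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the standby has received the primary's final LSN.

    Returns True as soon as the received LSN reaches
    `primary_shutdown_lsn`, or False once `wal_receive_check_timeout`
    seconds have elapsed without that happening.
    """
    deadline = clock() + wal_receive_check_timeout
    while True:
        if get_receive_lsn() >= primary_shutdown_lsn:
            return True
        if clock() >= deadline:
            return False
        sleep(1)  # polling interval is an assumption of this sketch
```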
In some places we were still providing "false" from the original implementation,
which was intended to indicate whether a negative value was allowed.
This has not been a problem, as it merely means we have been providing "0",
which is the same thing; however, we can fine-tune some of the calls
(e.g. node ID must be 1 or greater).
This is to facilitate remote invocation of repmgr when the repmgr
binary is located somewhere other than the PostgreSQL binary directory, as it
cannot be assumed all package maintainers will install repmgr there.
This parameter is optional; if not set (the default), repmgr will fall back
to "pg_bindir" (if set).
Addresses GitHub #246.
Previously, "repmgr standby switchover" used the configuration file parameters
"reconnect_interval" and "reconnect_attempts" to define a timeout to determine
whether the current primary (demotion candidate) has shut down.
However, these parameters are intended for primary failure detection and are
generally lower in value, while a controlled shutdown may take longer, resulting
in the switchover being aborted as repmgr was not waiting long enough.
To prevent this happening, parameter "shutdown_check_timeout" has been added.
This complements the existing "standby_reconnect_timeout" parameter used
by "repmgr standby switchover".
Implements GitHub #504.
This matches the behaviour of other PostgreSQL utilities such as psql, though
repmgr will only abort once all command line options are parsed, so as many
errors as possible are found and displayed. If a repmgr "command" (e.g.
"repmgr primary ...") was provided, a hint about the relevant command
help section (e.g. "repmgr primary --help") will be provided alongside
the generic help command (i.e. "repmgr --help").
Addresses GitHub #464, with further improvements.
It's hard to imagine a use case where this isn't desirable, but
in case, for whatever reason, the user does not wish to daemonize the
process, the command line option "--daemonize=false" can be provided.
Implements GitHub #458.
Traditionally repmgrd will only write a pidfile if explicitly requested with
-p/--pid-file. However it's normally desirable to have a pidfile, and it's
preferable to have one used by default to prevent accidentally starting a second
repmgrd instance.
The following changes have been made:
- add configuration file parameter "repmgrd_pid_file" (initially overridden by
-p/--pid-file for backwards compatibility, though eventually we'll want to
drop -p/--pid-file altogether)
- add command line option --no-pid-file
- if neither "repmgrd_pid_file" nor -p/--pid-file is set, create the pid file
in a temporary directory
Implements GitHub #457.
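The resulting precedence can be summarised with a short Python sketch. The
function name, the "repmgrd.pid" default file name, and the choice to let
--no-pid-file win over every other setting are assumptions made for
illustration; only the ordering of -p/--pid-file over "repmgrd_pid_file"
and the temporary-directory fallback come from the description above.

```python
import os
import tempfile


def resolve_pid_file(cli_pid_file=None, config_pid_file=None,
                     no_pid_file=False):
    """Return the pid file path to use, or None if none should be written."""
    if no_pid_file:
        # --no-pid-file disables the pid file (assumed to take precedence).
        return None
    if cli_pid_file:
        # -p/--pid-file overrides the configuration file setting.
        return cli_pid_file
    if config_pid_file:
        return config_pid_file
    # Neither is set: fall back to a file in a temporary directory.
    return os.path.join(tempfile.gettempdir(), "repmgrd.pid")
```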