This functionality is intended for those cases where a cascading replication
cluster is being automatically provisioned and it might be necessary to
clone multiple levels in parallel.
As always, use of `--force` implies you know what you are doing.
When executing `repmgr standby clone`, this enables the primary_conninfo
string set in recovery.conf to be explictly defined (rather than generated
from the upstream node's conninfo string and connection parameters).
This is primarily intended for those cases when an L2 cascaded standby
is being cloned from the cluster primary, and its intended upstream
might not yet be available.
Previously providing a parameter which requires a value (e.g. -f/--config-file)
would result in a misleading error like "Unknown option -f".
Rather than suppress getopt's error messages, we'll rely on these to inform
about missing required values or unknown options (as other PostgreSQL tools
do), though we will still provide our own list of command line errors and
warnings countered above and beyond getopt's sanity checks.
Remove any non-repmgrd specific items.
parse_config() already sanity-checks the values so no need to
recheck. Refactor parse_config() so when called by reload_config()
it won't exit if errors are encountered.
When checking the new standby's record in pg_stat_replication, keep
polling until the expected status is reported, and only give up
after a timeout was exceeded.
Previously repmgr would report an error if status was "startup",
even though this is not a problem.
In Barman mode the data directory is created early containing a temporary
directory needed to hold temporary files while cloning from the Barman
server. In other modes we might not know the data directory location until
connecting to the source server, so its creation happens later. In Barman
mode ensure that step is skipped.
Place elements in a sensible order and split the associated initializer
macro over multiple lines for easier editing.
Also move a few related global variables into to the structure to keep
everything in the same place.
Barman 2.0 provides this in a separate, more convenient `barman-cli` package;
document this and add note about previous `barman-wal-restore.py` script.
These are now prefixed with "service_" to emphasize that they're
OS-level commands, not repmgr ones; also added reload and promote
commands:
service_start_command
service_stop_command
service_restart_command
service_reload_command
service_promote_command
GitHub #169
Previously the code assumed repmgr node IDs to be sequential,
which is not guaranteed to be the case. With a non-sequential
list of node IDs, an incorrect node id would be displayed,
and memory accessed beyond the bounds of the matrix array.
The refactored code is considerably less elegant than the original
but will correctly handle a non-sequential sequence of node IDs.
- use the remote user setting, like other SSH-based remote operations
(avoid hardcoding the user name)
- enable `repmgr cluster matrix' to accept the cluster name, node id
and the database connection information instead of requiring repmgr.conf;
this means we don't have to assume that repmgr.conf is in one
of the default locations
We separate the code that builds the cube from the code that displays
it, in preparation for reusing the cube somewhere else, e.g. for
automatic failover detection.
We separate the code that builds the matrix from the code that
displays it, in preparation for reusing the matrix somewhere else,
e.g. for automatic failover detection.
- use the remote user setting, like other SSH-based remote operations
(avoid hardcoding the user name)
- enable `repmgr cluster show` to accept the cluster name and the
database connection information instead of requiring repmgr.conf;
this means we don't have to assume that repmgr.conf is in one
of the default locations
Currently repmgr assumes the SSH hostname will be the same as the
database hostname, and it's easy enough now to extract this
from the node's conninfo string.
We can consider re-adding this in the next release if required.
- The "cluster matrix" command supports CSV mode via the --csv
switch.
- Add the optional ssh_hostname configuration parameter, which is
required by "cluster matrix".
- A corresponding ssh_hostname column has been added to the repl_nodes
table and to the repl_show_nodes view.
The examples of `promote_command` and `follow_command` reference the
`repmgr.conf` file under a different path from the rest of the README.
This makes them consistent with the rest of the README.
Causes repmgr to wait for the updated node record to propagate
to the standby before exiting. This can be used to ensure that
actions which depend on the standby's node record being synchronised
(such as starting repmgrd) are not carried out prematurely.
Addresses GitHub #103
If failover=automatic, it would be reasonable to expect repmgrd to
consider this node as a promotion candidate, however this will not
happen if it is marked inactive. This often happens when a failed
primary is recloned as a standby but not re-registered, and if
repmgrd would run it would give the incorrect impression that
failover capability is available.
Addresses GitHub #153.
Now that repmgr uses pg_basebackup's `--xlog-method=stream` setting by
default, and enables provision of `restore_command`, there's no reason
to require `wal_keep_segments` to be set in the default use-case.
`repmgr standby clone` will now only fail with an error if `wal_keep_segments`
is zero and one of the following cases applies:
* `--rsync-only` clone with no `restore_command` set
* clone with pg_basebackup and `--xlog-method=fetch`
* -w/--wal-keep-segments specified on the command line
If, for whatever reason, it's necessary to perform a standby clone
with `wal_keep_segments=0` in one of the above cases, specifying
`-w/--wal-keep-segments=0` on the command line will effectively
override the check.
GitHub #204
Previously repmgr only checked that 'max_wal_senders' is a positive value.
It will now additionally verify that the requisite number of replication
connections can actually be made before commencing with a cloning operation
using pg_basebackup.
GitHub #214
This is already effectively optional; in 3.2 we will ensure it becomes
fully optional (mainly by deprecating --ignore-external-config-files
and replacing it with --copy-external-config-files).
In switchover mode, if no remote repmgr config file is provided with `-C`,
repmgr attempts to look for a file with the same path as the local
file (provided with `-f/--config-file`). However if this was not specified,
repmgr would execute `ls` with an empty filepath on the remote host, which
appeared to succeed, causing subsequent remote repmgr command executions
to fail as a blank value was provided for `-f/--config-file`.
Fixes GitHub #229.
Previously, if e.g. a non-superuser connection is used to get a value
like `data_directory`, which is available to superusers only, it
would return true.
Refactor recovery.conf generation to take into account the node being
cloned from might not be the intended upstream node, e.g. "grandchild"
node being cloned direct from the master ("grandparent") rather than
the intended parent node.
This extends functionality introduced with Barman support and ensures
that behaviour of cascaded standby cloning is consistent, regardless
of cloning method.
After introducing Barman mode, it is no longer true that STANDBY CLONE
can derive primary_conninfo from the connection to the master. Now we
ask Barman how to connect to a valid cluster node, and then we fetch
the conninfo for the current master from repmgr metadata.
The LSN reported by the shared memory function defaults to "0/0"
(InvalidXLogRecPtr) - this indicates that the repmgrd on that node
hasn't been able to update it yet. However during failover several
places in the code assumed this is an error, which would cause
an endless loop waiting for updates which would never come.
To get around this without changing function definitions, we can
store an explicit message in the shared memory location field so the
caller can tell whether the other node hasn't yet updated the field,
or encountered situation which means it should not be considered
as a promotion candidate (which in most cases will be because
`failover` is set to `manual`.
Resolves GitHub #222.
* Version set to 3.2dev
* Binaries are placed in PGBINDIR and then linked from /usr/bin,
instead of being placed into /usr/bin directly. This is necessary
for the switchover command, because it requires pg_rewind, which is
placed in PGBINDIR too.
- properly distinguish between the command line option -? and getopt's
unknown option marker '?'
- remove deprecated command line options --initdb-no-pwprompt and
-l/--local-port
- add witness command summary in help output
This prevents connection error messages being mixed in
with `repmgr cluster show` output. Error message output can
still be enabled with the --verbose flag.
Fixes GitHub #215
This is to ensure that when repmgr executes pg_basebackup it doesn't
add any options which would conflict with user-supplied options.
This is related to GitHub #206, where the -S/--slot option has been
added for 9.6 - it's important to check this doesn't conflict with
-X/--xlog-method.
While we're at it, rename the ErrorList handling code to ItemList
etc. so we can use it for generic non-error-related lists.
This is for consistency with the PostgreSQL source code (see:
src/backend/access/transam/xlog.c ), but as it's not exported
we need to define it ourselves anyway.
From 3.1.4 `repmgr` will behave like other PostgreSQL utilities
when handling database connection parameters, in particular
accepting a conninfo string and honouring libpq connection defaults.
Removed the existing keyword array which has a fixed, limited number
of parameters and replace it with a dynamic array which can be
used to store as many parameters as reported by libpq.
repmgr disallows socket connections anyway (the whole point of providing
the host is to connect to a remote machine) so don't show that as
a fallback default in the -?/--help output.
This matches the behaviour of other PostgreSQL utilities such as pg_basebackup,
psql et al.
Note that unlike psql, but like pg_basebackup, repmgr does not accept a
"left-over" parameter as a conninfo string; this could be added later.
Parameters specified in the conninfo string will override any parameters
supplied correcly (e.g. `-d "host=foo"` will override `-h bar`).
Having successfully connected to the primary, we can use the actual parameters
reported by libpq to create "primary_conninfo", rather than the limited
subset previously defined by repmgr. Assuming that the user can
pass a conninfo string to repmgr (see following commit), this makes it
possible to provide other connection parameters, e.g. related to
SSL usage.
This commit introduces three new options:
- start_command
- stop_command
- restart_command
If these are set, repmgr will issue the specified command instead
of the default pg_ctl commands
If the pidfile is still there after apparent shutdown, or we're
unable to access the server at all, something has gone wrong and
the switchover should be aborted.
If a connection attempt fails, keep pinging the server until it
finally away, or the timeout kicks in.
Addresses issue reported in GitHub #188 and previously noted in
repmgr.c
This enables the switchover operation to function if the remote server
(current primary) has a different binary directory to the current
server, and addresses the issue reported in GitHub #172.
Otherwise the monitoring table's 'last_wal_standby_location' will stay at
the location of the last streaming WAL received.
This complements the bugfix applied in e814c1120e.
9.4+, as there is no ALTER CONSTRAINT in 9.3.
This new ALTER TABLE does the same in two hops by removing the foreign
key and creating it again in the same ALTER TABLE.
This fixes#183
This can be used so that repmgr standby clone adds the string
specified in repmgr.conf as a restore_command in recovery.conf.
We can use this option for integration with barman by setting the
parameter to an appropriate get-wal call.
all the WALs needed without needing to set wal_keep_segments to
a ridiculously high value.
This is not necessary on 9.6 if we are using replication slots,
as all WAL segments needed will be kept on the primary until
they are consumed by the slot.
from the postgresql code so we use that instead of issuing system calls
with rm -rf ....
I also eliminated the rm -rf for pg_xlog.
Will later do the same with the other system call to remove files
in pg_replslot/
issue with rsync returning non-zero status on vanishing files on commit
83e5f98171.
Alvaro Herrera gave me some tips which pointed me in the correct
direction.
This was reported by sungjae lee <sj860908@gmail.com>
If the connection to the primary is lost, roll back to the previously
known state.
TRUNCATE is of course not MVCC-friendly, but that shouldn't matter here
as only one process should ever be looking at this table.
The configured values are either the defaults, or examples which
may not work in a real environment. If this file is being used as
a template, the onus is on the user to uncomment and check all
desired parameters.
Although the witness server will resync the repl_nodes table following
a failover, other operations (e.g. removing or cloning a standby)
were previously not reflected in the witness server's copy of this
table.
As a short-term workaround, automatically resync the table at regular
intervals (defined by the configuration file parameter
"witness_repl_nodes_sync_interval_secs", default 30 seconds).
This fixes a bug introduced into the previous commit, where the
witness node was registered last to prevent a spurious node record
being created even if witness server creation failed.
Ensure witness is only registered after all steps for creation
have been successfully completed.
Also write an event record if connection could not be made to
the witness server after initial creation.
This addresses GitHub issue #146.
99.9% of the time they'll be the same as the primary connection, but
it's more consistent to use the provided local conninfo string
(from which the port is already extracted).
Optionally prompt for superuser and repmgr user when creating a witness.
This ensures a password can be provided if the primary's pg_hba.conf
mandates it.
This deprecates '--initdb-no-pwprompt'; and changes the default behaviour of
"repmgr create witness", which previously required a superuser password
unless '--initdb-no-pwprompt' was supplied.
This behaviour is more consistent with other PostgreSQL utilities such
as createuser.
Partial fix for GitHub issue #145.
Make the code match the documentation.
As pointed out by GitHub user phyber (#142).
Also various other minor improvements to error reporting during
config file parsing.
The difference between this and establish_db_connection() is that
it outputs any connection failure as a [NOTICE] rather than an
[ERROR]; it's intended for use in e.g. polling a server to wait
for it to come up/go down, while preventing [ERROR] log lines
which may cause confusion.
There's no compelling reason to require "archive_mode" to be enabled
for streaming replication. It is of course a good idea to archive WAL
using e.g. barman ( http://www.pgbarman.org/ ) as part of a comprehensive
backup strategy, but repmgr and streaming replication work fine without
it.
Per GitHub #141.
Also revise the configuration check for "archive_command" to be
triggered only when "archive_mode" is not "off", as from PostgreSQL
9.5 onwards "archive_mode" can also be "on" or "always".
Updated makefile for deb creation
There's still a bug in the Version field of the Control file (it shouldn't have a 'v' in front of the version).
Will fix that immediately after.
A fix for this was introduced with commit ee9270fe8d
and removed in 4f1c67a1bf.
Refactor the original fix to simply omit attempting to write an invalid entry
to the monitoring table.
Calculate the width of the "Name" and "Upstream" columns dynamically.
Based on pull request #135 by sengaya, edited and modified by myself
to include a psql-like separator line.
removing it.
Basically, on startup the standby will start receiving again from the
begining of the WAL and so received will be lower then applied.
A proper code is needed to make sure the standby is still following the
correct master (as per node information)
used in the cluster.
Main issue was that if the local repmgrd was not able to connect locally,
it would set the local node as failed (active = false). This is fine, because
we actually don't know if the node is active (actually, it's not active ATM)
so it's best to keep it out of the cluster.
The problem is that if the postgres service comes back up, and is able to
recover by it self, then we should ack that fact and set it as active.
There was another issue related with repmgrd being terminated if the postgres
service was downs. This is not the correct thing to do: we should keep
trying to connect to the local standby.
used in the cluster.
Main issue was that if the local repmgrd was not able to connect locally,
it would set the local node as failed (active = false). This is fine, because
we actually don't know if the node is active (actually, it's not active ATM)
so it's best to keep it out of the cluster.
The problem is that if the postgres service comes back up, and is able to
recover by it self, then we should ack that fact and set it as active.
There was another issue related with repmgrd being terminated if the postgres
service was downs. This is not the correct thing to do: we should keep
trying to connect to the local standby.
Perform a switchover by:
- stopping current primary node
- promoting this standby node to primary
- forcing previous primary node to follow this node
Caveats:
- repmgrd must not be running, otherwise it may
attempt a failover
(TODO: find some way of notifying repmgrd of planned
activity like this)
- currently only set up for two-node operation; any other
standbys will probably become downstream cascaded standbys
of the old primary once it's restarted
- as we're executing repmgr remotely (on the old primary),
we'll need the location of its configuration file; this
can be provided explicitly with -C/--remote-config-file,
otherwise repmgr will look in default locations on the
remote server
- this does not yet support "rewinding" stopped nodes
which will be unable to catch up with the primary
TODO:
- update help, docs
- make connection test timeouts/intervals configurable
Anyone needing them, particularly in a replication context, should
know what they're doing anyway.
See also: http://www.postgresql.org/docs/current/interactive/sql-createindex.html#AEN74175
"Also, changes to hash indexes are not replicated over streaming or file-based
replication after the initial base backup, so they give wrong answers to
queries that subsequently use them. For these reasons, hash index use is presently
discouraged."
Also refactor configuration file handling while we're at it.
Previously a configuration file would be ignored if it couldn't
be opened, however that is now treated as an error.
When parsing command line arguments in check_parameters_for_action(),
create warnings for paramters supplied but not required (e.g. -D/--data-dir
for MASTER REGISTER), rather than fail with error(s), as the
presence of the parameters won't cause any problems.
Errors will still be raised for required-but-missing parameters, of course.
* repmgr: add explicit --log-level flag, repurpose --verbose flag to
show extra detailed/repetitive output only (see item below too)
-> e0cbdd5b31
* debug output: show some repetitive output only if --verbose flag set to prevent
excessive log growth
-> 8ab1901a93
This should make it more likely that the actual primary is first
in the retrieved list, reducing the number of connections to
other nodes in the cluster which need to be made.
repmgr and particularly repmgrd currently produce substantial
amounts of log output. Much of this is only useful when troubleshooting
or debugging.
Previously the -v/--verbose option just forced the log level to
INFO. With repmgrd this is pretty pointless - just set the log
level in the configuration file. With repmgr the configuration
file can be overriden by the new -L/--log-level option.
-v/--verbose now provides an additional, chattier/pedantic level
of logging ("Opening *this* logfile", "Executing *this* query",
"running in *this* loop") which is helpful for understanding
repmgr/repmgrd's behaviour, particularly for troubleshooting.
What additional verbose logging is generated will of course a
also depends on the log level set, so e.g. someone trying to
work out which configuration file is actually being opened
can use '--log-level=INFO --verbose' without being bothered
by an avalanche of extra verbose debugging output.
-t/--terse option will silence certain non-essential output, at
the moment any HINTs.
Note that -v/--verbose and -t/--terse are not mutually exclusive
(suggestions for better names welcome).
When cloning a server without this option, and pg_start_backup() takes time
to complete, repmgr appears to hang and give no indication of what may
or may not be happening. The hint provides an explanation for any
delay and possible action which could be taken to mitigate it.
There are a few places where additional hints are written as log
output, usually LOG_NOTICE. Create an explicit function to provide
hints in a standardized manner; by storing the log level of the
previous logger call, we can ensure the hint is only displayed when
the log message itself would be.
Part of an ongoing effort to better control repmgr's logging output.
Related to Github #127.
- use the previously introduced repmgr_atoi() function to parse
integers better
- collate all detected errors and output as a list, rather than
failing on the first error.
Also change "master" to "primary" in the comments for consistency
with main PostgreSQL terminology. We'll need to add aliases
for the configuration parameters at some point...
is the number of the current node (1, 2 or 3 in the current example).
**node_name**
is an identifier for every node.
**conninfo**
is used to connect to the local PostgreSQL server (where the configuration file is) from any node. In the witness server configuration you need to add a 'port=5499' to the conninfo.
**master_response_timeout**
is the maximum amount of time we are going to wait before deciding the master has died and start the failover procedure.
**reconnect_attempts**
is the number of times we will try to reconnect to master after a failure has been detected and before start the failover procedure.
**reconnect_interval**
is the amount of time between retries to reconnect to master after a failure has been detected and before start the failover procedure.
**failover**
configure behavior: *manual* or *automatic*.
**promote_command**
the command executed to do the failover (including the PostgreSQL failover itself). The command must return 0 on success.
**follow_command**
the command executed to address the current standby to another Master. The command must return 0 on success.
[2015-03-03 18:24:34] [INFO] repmgr connecting to standby database
[2015-03-03 18:24:34] [INFO] repmgr connecting to master database
[2015-03-03 18:24:34] [INFO] finding node list for cluster 'test'
[2015-03-03 18:24:34] [INFO] checking role of cluster node '1'
[2015-03-03 18:24:34] [INFO] repmgr connected to master, checking its state
[2015-03-03 18:24:34] [INFO] repmgr registering the standby
[2015-03-03 18:24:34] [INFO] repmgr registering the standby complete
[2015-03-03 18:24:34] [NOTICE] Standby node correctly registered for cluster test with id 2 (conninfo: host=localhost user=repmgr_usr dbname=repmgr_db)
This concludes the basic `repmgr` setup of master and standby. The records
created in the `repl_nodes` table should look something like this:
repmgr_db=# SELECT * from repmgr_test.repl_nodes;
id | type | upstream_node_id | cluster | name | conninfo | slot_name | priority | active
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.