If the current primary (demotion candidate) still has any files to archive,
it will delay the shutdown until all files are archived. If there is a
substantial number of files, and/or the archive command executes slowly,
this will probably lead to an unwelcome delay in the switchover process.
pg_rewind will only be executed on a demoted primary if explictly
requested, to prevent transactions on the primary, which
were never replicated, from being automatically overwritten.
If --force-rewind is provided, we'll need to check pg_rewind
is actually useable before we need to use it.
The repmgr3 implementation required the promotion candidate (standby)
to directly work with the demotion candidate's data directory,
directly execute server control commands etc.
Here we delegated a lot more of that work to the repmgr on the
demotion candidate, which reduces the amount of back-and-forth
over SSH and generally makes things cleaner and smoother.
In particular the repmgr on the demotion candidate will carry
out a thorough check that the node is shut down and report
the last checkpoint LSN to the promotion candidate; this
can then be used to determine whether pg_rewind needs to be
executed on the demoted primary before reintegrating it back
into the cluster (todo).
Also implement "--dry-run" for this action, which will sanity-check the
nodes as far as possible without executing the switchover.
Additionally some of the new repmgr node commands (or command options)
introduced for this can be also executed by the user to obtain
additional information about the status of each node.
There are some circumstances, e.g. during switchover operations,
where repmgr may need to operate on a data directory while the
server isn't running, in which case there's no way to retrieve
that information.
Also add configuration file option "pgdata" for hard-coding the
node's data directory - if the "repmgr" DB user isn't a superuser
or doesn't have permission to extract the data directory, we'll
need another way of finding out.
This is needed for better switchover control, so we can instruct
the remote repmgr to issue the appropriate server command rather
than trying to work out what it should be from the local node.
In previous versions of repmgr, some options had ambiguous meanings,
and/or were used for slightly different purposes. This way we end
up with a couple more options (most of which probably won't need
adjusting) but greater clarity and flexibility.
Removed:
master_reponse_timeout:
renamed to "async_query_timeout", as this was its main usage
retry_promote_interval_secs:
replaced by "primary_notification_timeout"
Added:
async_query_timeout:
timeout (in seconds) when executing asynchronous queries
primary_notification_timeout:
number of seconds to wait for notification from the new primary
after a failover
primary_follow_timeout:
number of seconds to wait for the new primary to become available
when executing "repmgr standby follow"
This will cover both the case when an entire node including
repmgrd goes down, and when one PostgreSQL instance goes down
but repmgrd is still up (in which case only one of the repmgrds
will handle the failover).
For logging an event to the event table without generating an external
event notification.
Rename existing create_event_record*() functions to create_event_notification*()
as this describes their function better.
This is only relevant when cloning a standby and the node's upstream
can change after failover/switchover etc., so no point keeping the
original value around in the configuration file.
When creating recovery.conf outside of "repmgr standby clone",
there was no way of knowing if a replication user had been
explicitly provided with --replication-user, meaning the value
of "primary_conninfo" would be set to the "conninfo" field of the
node's upstream node record.
We'll add an extra column to store the replication user for each
node so it can be referenced at any time.
- word hint about registering depending on whether record exists or not
- when checking for existing records with same name, check node id
is different