The LSN reported by the shared memory function defaults to "0/0"
(InvalidXLogRecPtr) - this indicates that the repmgrd on that node
hasn't been able to update it yet. However, during failover several
places in the code treated this as an error, causing an endless loop
waiting for updates which would never come.
To get around this without changing function definitions, we can
store an explicit message in the shared memory location field so the
caller can tell whether the other node hasn't yet updated the field,
or has encountered a situation which means it should not be considered
as a promotion candidate (in most cases because `failover` is set to
`manual`).
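Since the field is a string, a sentinel value is enough. A minimal
sketch of the idea (identifiers here are illustrative, not the actual
repmgr ones):

    #include <stdio.h>
    #include <string.h>

    /* hypothetical marker written by a node which has opted out */
    #define CANDIDATE_INELIGIBLE "FAILOVER_DISABLED"

    typedef enum
    {
        LSN_PENDING,     /* still "0/0": node hasn't updated it yet */
        LSN_INELIGIBLE,  /* node opted out, e.g. failover = manual  */
        LSN_AVAILABLE    /* field contains a usable LSN             */
    } LsnFieldState;

    static LsnFieldState
    check_lsn_field(const char *location)
    {
        if (strcmp(location, "0/0") == 0)
            return LSN_PENDING;

        if (strcmp(location, CANDIDATE_INELIGIBLE) == 0)
            return LSN_INELIGIBLE;

        return LSN_AVAILABLE;
    }

    int
    main(void)
    {
        /* the caller keeps polling on LSN_PENDING, but skips the
         * node entirely on LSN_INELIGIBLE instead of waiting forever */
        printf("%d\n", check_lsn_field(CANDIDATE_INELIGIBLE));
        return 0;
    }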
Resolves GitHub #222.
This is to ensure that when repmgr executes pg_basebackup it doesn't
add any options which would conflict with user-supplied options.
This is related to GitHub #206, where the -S/--slot option has been
added for 9.6; it's important to check this doesn't conflict with
-X/--xlog-method.
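A sketch of the kind of check involved, assuming a hypothetical
option_supplied() helper (the user options and slot name are examples):

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    static bool
    option_supplied(const char *user_options, const char *short_opt,
                    const char *long_opt)
    {
        return strstr(user_options, short_opt) != NULL ||
               strstr(user_options, long_opt) != NULL;
    }

    int
    main(void)
    {
        const char *user_options = "--xlog-method=fetch";
        char        command[1024];

        snprintf(command, sizeof(command),
                 "pg_basebackup -D /path/to/data %s", user_options);

        /* only append -S if the user hasn't already supplied a
         * potentially conflicting -X/--xlog-method option */
        if (!option_supplied(user_options, "-X", "--xlog-method"))
            strncat(command, " -S repmgr_slot",
                    sizeof(command) - strlen(command) - 1);

        puts(command);
        return 0;
    }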
While we're at it, rename the ErrorList handling code to ItemList
etc. so we can use it for generic non-error-related lists.
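The renamed structure might look roughly like this (a sketch, not the
exact definition):

    #include <stdlib.h>
    #include <string.h>

    typedef struct ItemListCell
    {
        struct ItemListCell *next;
        char                *string;
    } ItemListCell;

    typedef struct ItemList
    {
        ItemListCell *head;
        ItemListCell *tail;
    } ItemList;

    /* append a copy of "string"; nothing about the list implies its
     * contents are error messages any more */
    static void
    item_list_append(ItemList *list, const char *string)
    {
        ItemListCell *cell = malloc(sizeof(ItemListCell));

        cell->string = strdup(string);
        cell->next = NULL;

        if (list->tail != NULL)
            list->tail->next = cell;
        else
            list->head = cell;

        list->tail = cell;
    }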
Otherwise the monitoring table's 'last_wal_standby_location' will
remain stuck at the location of the last WAL received via streaming
replication.
This complements the bugfix applied in e814c1120e.
Although the witness server will resync the repl_nodes table following
a failover, other operations (e.g. removing or cloning a standby)
were previously not reflected in the witness server's copy of this
table.
As a short-term workaround, automatically resync the table at regular
intervals (defined by the configuration file parameter
"witness_repl_nodes_sync_interval_secs", default 30 seconds).
A fix for this was introduced with commit ee9270fe8d
and removed in 4f1c67a1bf.
Refactor the original fix to simply omit attempting to write an
invalid entry to the monitoring table, rather than removing it.
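In outline (using PostgreSQL's InvalidXLogRecPtr convention and a
hypothetical insert helper), the refactored fix amounts to:

    #include <stdint.h>

    typedef uint64_t XLogRecPtr;               /* as in xlogdefs.h */
    #define InvalidXLogRecPtr ((XLogRecPtr) 0)

    /* hypothetical wrapper around the actual monitoring INSERT */
    extern void insert_monitoring_record(XLogRecPtr receive_location,
                                         XLogRecPtr apply_location);

    void
    update_monitoring_table(XLogRecPtr receive_location,
                            XLogRecPtr apply_location)
    {
        /* nothing sensible to record this cycle: skip the write
         * rather than storing an invalid entry */
        if (receive_location == InvalidXLogRecPtr)
            return;

        insert_monitoring_record(receive_location, apply_location);
    }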
Basically, on startup the standby will start receiving again from the
beginning of the WAL, so the received location will be lower than the
applied location.
A proper check is needed to make sure the standby is still following
the correct master (as per the node information used in the cluster).
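A sketch of such a check, with hypothetical helpers standing in for
the actual queries:

    #include <stdbool.h>

    /* hypothetical helpers: the node this standby is actually
     * streaming from, and the upstream recorded in the cluster's
     * node information */
    extern int get_streaming_upstream_node_id(void);
    extern int get_registered_upstream_node_id(void);

    static bool
    standby_follows_correct_master(void)
    {
        /* received vs. applied locations can't be compared reliably
         * after a restart, so compare upstream node identities */
        return get_streaming_upstream_node_id() ==
               get_registered_upstream_node_id();
    }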
The main issue was that if the local repmgrd was not able to connect
locally, it would set the local node as failed (active = false). This
is fine, because we don't actually know whether the node is active (at
that moment it isn't), so it's best to keep it out of the cluster.
The problem is that if the postgres service comes back up and is able
to recover by itself, we should acknowledge that fact and set the node
as active again.
There was another issue related to repmgrd being terminated if the
postgres service was down. This is not the correct thing to do: we
should keep trying to connect to the local standby.
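A sketch of the intended retry-and-reactivate behaviour (helper names
are hypothetical):

    #include <stdbool.h>
    #include <unistd.h>

    extern bool connect_to_local_node(void);        /* hypothetical */
    extern void set_local_node_active(bool active); /* hypothetical */

    void
    monitor_local_node(void)
    {
        bool marked_failed = false;

        for (;;)
        {
            if (connect_to_local_node())
            {
                /* the node recovered by itself: acknowledge that
                 * and set it as active again */
                if (marked_failed)
                {
                    set_local_node_active(true);
                    marked_failed = false;
                }
            }
            else if (!marked_failed)
            {
                /* we can't tell whether the node is usable, so keep
                 * it out of the cluster - but don't terminate, keep
                 * retrying the connection */
                set_local_node_active(false);
                marked_failed = true;
            }

            sleep(5); /* retry interval; configurable in practice */
        }
    }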
Perform a switchover by:
- stopping current primary node
- promoting this standby node to primary
- forcing previous primary node to follow this node
Caveats:
- repmgrd must not be running, otherwise it may
attempt a failover
(TODO: find some way of notifying repmgrd of planned
activity like this)
- currently only set up for two-node operation; any other
standbys will probably become downstream cascaded standbys
of the old primary once it's restarted
- as we're executing repmgr remotely (on the old primary),
we'll need the location of its configuration file; this
can be provided explicitly with -C/--remote-config-file,
otherwise repmgr will look in default locations on the
remote server
- this does not yet support "rewinding" stopped nodes
which will be unable to catch up with the primary
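Usage might look something like this (paths are examples; run on the
standby to be promoted):

    repmgr -f /etc/repmgr.conf standby switchover \
        -C /etc/repmgr.conf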
TODO:
- update help, docs
- make connection test timeouts/intervals configurable