Improve sample PostgreSQL replication configuration, including
links to the PostgreSQL documentation for each configuration item.
Also set "max_replication_slots" to the same value as "max_wal_senders"
to ensure the sample configuration will work regardless of whether
replication slots are in use, though we do still encourage careful
reading of the comments in the sample configuration and the documentation
in general.
In "recovery.conf", the configuration parameter "node_name" is used
as the "application_name" value, which will be truncated by PostgreSQL
to 63 characters (NAMEDATALEN - 1).
repmgr sometimes needs to be able to extract the application name from
pg_stat_replication to determine if a node is connected (e.g. when
executing "repmgr standby register"), so the comparison will fail
if "node_name" exceeds 63 characters.
Unless the PQExpBuffer is required for the duration of the function,
ensure it's always a variable local to the relevant code block. This
mitigates the risk of accidentally accessing a generically named
PQExpBuffer which hasn't been initialised or was previously terminated.
Previously, repmgrd assumed that during a failover, there would not
already be another primary node. However it's possible a node was
promoted manually. While this is not a desirable situation, it's
conceivable this could happen in the wild, so we should check for
it and react accordingly.
Also sanity-check that the follow target can actually be followed.
Addresses issue raised in GitHub #420.
In some corner cases (e.g. immediately after a switchover) where
the current primary has not yet been determined, the provided connection
might not be writeable. This prevents error messages such as
"cannot execute INSERT in a read-only transaction" generating unnecessary
noise in the logs.
Log the output of PQerrorStatus() in a couple of places where it was missing.
Additionally, always log the output of PQerrorStatus() starting with a blank
line, otherwise the first line looks like it was emitted by repmgr, and
it's harder to scan the error message.
Before:
[2019-03-20 11:24:15] [DETAIL] could not connect to server: Connection refused
Is the server running on host "localhost" (::1) and accepting
TCP/IP connections on port 5501?
could not connect to server: Connection refused
Is the server running on host "localhost" (127.0.0.1) and accepting
TCP/IP connections on port 5501?
After:
[2019-03-20 11:27:21] [DETAIL]
could not connect to server: Connection refused
Is the server running on host "localhost" (::1) and accepting
TCP/IP connections on port 5501?
could not connect to server: Connection refused
Is the server running on host "localhost" (127.0.0.1) and accepting
TCP/IP connections on port 5501?
1. The target additional-maintainer-clean was misspelled as
maintainer-additional-clean.
2. Add add missing clean targets, in particular sysutils.o, config.h,
repmgr_version.h, and Makefile.global. While at it, use a wildcard
for obj files.
3. Don't delete configure.
4. Remove generated file doc/version.sgml from the repo.
5. Have maintainer-clean recurse to the doc directory.