Since b5934bfd6071 in postgresql.git the flag
`-Wshadow=compatible-local` is activated. This commit fixes any
duplicated declaration made in the same function.
References: HL-40
* Added check for pg_checkpoint role presence
This commit provides the needed infrastructure in `repmgr` so if the `repmgr` database
user is a member of the `pg_checkpoint` role, and inherits its privileges, there is no
need for such a user to be a superuser.
Co-authored-by: Martín Marqués <martin.marques@enterprisedb.com>
As it's possible to specify the connection information for any available
node, but currently not possible to rejoin to a node other than the
primary, explicitly mention what the rejoin target will be.
When cloning a standby, it's possible to do a "raw" clone by providing
-D/--data-directory but no repmgr.conf file. However the code which
creates "standby.signal" was assuming the presence of a valid
repmgr.conf complete with "data_directory" configuration.
This is very much a niche-use case.
As it seems redirecting stderr to stdin (2>&1) when executing
system commands results in a SIGPIPE (141) return code, making
it impossible to determine the actual return code, redirect
stderr to a temporary file and collate the output from that.
There are possibly better ways of doing this which could
be revisited at a future date.
From PostgreSQL 13, pg_rewind will automatically handle an unclean
shutdown itself, so as long as --force-rewind was provided, so there
is no need to fail with an error.
Note that pg_rewind handles the unclean shutdown by starting PostgreSQL
in single user mode, which it does before performing any checks as
to whether a rewind is actually necessary.
However pg_rewind doesn't take into account the possible presence
of a standby.signal file, so we remove that and recreate it after
pg_rewind was executed.
Previously the check verifying that a node has connected to its upstream
merely assumed the presence of a record in pg_stat_replication indicates
a successful replication connection. However the record may contain a
state other than "streaming", typically "startup" (which will occur when
a node has diverged from its upstream and will therefore never
transition to "streaming"), which needs to be taken into account when
considering the state of the replication connection to avoid false
positives.
Explicitly log if a database connection failure caused the check
to fail.
It's unlikely this situation will be encountered, as the data directory
check will already have run and checked for connection failure, however
there's a small chance the connection could fail between checks.
It's possible that the remote data directory check will fail if e.g.
connection configuration is not consistent across all nodes. This
modification ensures a database error connection is reported, rather
than a spurios issue with the data directory configuration.
The --optformat option is intended for use when repmgr is being invoked
remotely by another repmgr instance, typically during a switchover
operation.
Previously no output was returned if the local repmgr was unable to
connect to its local PostgreSQL instance, which made diagnosing
various corner-case problems trickier than it should be.
We have a --downstream option to check for attached nodes, but it
would be useful to have a corresponding --upstream option too.
A following patch will adapt the behaviour of this option when executed
on the primary node.
This is mainly useful for the --data-directory-config option, which
requires permission to read pg_settings to verify that the data
directory configured in "repmgr.conf" matches the data directory
actually in use.
If pg_settings read permission is not available, repmgr will fall
back to a simple check that the data directory configured in
"repmgr.conf" is a valid PostgreSQL directory. This is not entirely
foolproof, as it's possible PostgreSQL could be using a different
data directory.
In a few places, replication connections are generated from the
parameters used by existing connections. This has resulted in a
number of similar blocks of code which do more-or-less the same
thing almost but not quite identically. In two cases, the code
omitted to set "dbname=replication", which can cause problems
in some contexts.
These code blocks have now been consolidated into standardized
functions.
This also resolves the issue addressed by GitHub #619.
Within a PostgreSQL data directory, all files should have the same
ownership as the data directory itself. PostgreSQL itself expects
this, and ownership of files by another user is likely to cause
problems.
In PostgreSQL 11 or earlier, if "recovery.conf" cannot be moved
by PostgreSQL (because e.g. it is owned by root), it will not be
possible to promote the standby to primary.
In PostgreSQL 12 and later, if "postgresql.auto.conf" on the demotion
candidate (current primary) has incorrect ownership (e.g. owned by
root), repmgr will very likely not be able to modify this file and
write the replication configuration required for the node to rejoin
the cluster as a standby.
Checks added to catch both cases before a switchover is executed.
From PostgreSQL 10, a member of the default roles "pg_monitor" and/or
"pg_read_all_settings" can read pg_settings without requiring superuser
privileges.
Previously, a hint was being emitted about making the repmgr user a
member of one of those groups, but no check for membership was being
made, meaning the check could only be run by a superuser.