standby switchover: check replication configuration file ownership

Within a PostgreSQL data directory, all files should have the same
ownership as the data directory itself. PostgreSQL itself expects
this, and ownership of files by another user is likely to cause
problems.

In PostgreSQL 11 or earlier, if "recovery.conf" cannot be moved
by PostgreSQL (because e.g. it is owned by root), it will not be
possible to promote the standby to primary.

In PostgreSQL 12 and later, if "postgresql.auto.conf" on the demotion
candidate (current primary) has incorrect ownership (e.g. owned by
root), repmgr will very likely not be able to modify this file and
write the replication configuration required for the node to rejoin
the cluster as a standby.

Checks added to catch both cases before a switchover is executed.
This commit is contained in:
Ian Barwick
2020-03-04 11:35:52 +09:00
parent 194b6d0948
commit 8f6058c676
5 changed files with 287 additions and 3 deletions

View File

@@ -118,6 +118,7 @@ typedef struct
bool has_passfile;
bool replication_connection;
bool data_directory_config;
bool replication_config_owner;
/* "node rejoin" options */
char config_files[MAXLEN];
@@ -170,7 +171,7 @@ typedef struct
/* "node status" options */ \
false, \
/* "node check" options */ \
false, false, false, false, false, false, false, false, false, \
false, false, false, false, false, false, false, false, false, false, \
/* "node rejoin" options */ \
"", \
/* "node service" options */ \
@@ -281,6 +282,7 @@ extern bool drop_replication_slot_if_exists(PGconn *conn, int node_id, char *slo
extern standy_join_status check_standby_join(PGconn *primary_conn, t_node_info *primary_node_record, t_node_info *standby_node_record);
extern bool check_replication_slots_available(int node_id, PGconn* conn);
extern bool check_node_can_attach(TimeLineID local_tli, XLogRecPtr local_xlogpos, PGconn *follow_target_conn, t_node_info *follow_target_node_record, bool is_rejoin);
extern bool check_replication_config_owner(int pg_version, const char *data_directory, PQExpBufferData *error_msg, PQExpBufferData *detail_msg);
extern void check_shared_library(PGconn *conn);
extern bool is_repmgrd_running(PGconn *conn);