node check: accept -S/--superuser option

This is mainly useful for the --data-directory-config option, which
requires permission to read pg_settings to verify that the data
directory configured in "repmgr.conf" matches the data directory
actually in use.

If pg_settings read permission is not available, repmgr will fall
back to a simple check that the data directory configured in
"repmgr.conf" is a valid PostgreSQL directory. This is not entirely
foolproof, as it's possible PostgreSQL could be using a different
data directory.
This commit is contained in:
Ian Barwick
2020-03-23 17:07:42 +09:00
parent 06f0e5e94f
commit e561ddc8d3
5 changed files with 93 additions and 23 deletions

View File

@@ -14,6 +14,7 @@
repmgr: consolidate replication connection code (Ian)
repmgr: check permissions for "pg_promote()" and fall back to pg_ctl
if necessary (Ian)
repmgr: accept option -S/--superuser for "node check"; GitHub #612 (Ian)
repmgr: enable "service_promote_command" in PostgreSQL 12 (Ian)
5.0 2019-10-15

View File

@@ -64,6 +64,12 @@
data directory.
</para>
</listitem>
<listitem>
<para>
<link linkend="repmgr-node-check"><command>repmgr node check</command></link>:
accept option <option>-S</option>/<option>--superuser</option> GitHub #621.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>

View File

@@ -57,20 +57,20 @@
<listitem>
<simpara>
<literal>--role</literal>: checks if the node has the expected role
<option>--role</option>: checks if the node has the expected role
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--replication-lag</literal>: checks if the node is lagging by more than
<option>--replication-lag</option>: checks if the node is lagging by more than
<varname>replication_lag_warning</varname> or <varname>replication_lag_critical</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--archive-ready</literal>: checks for WAL files which have not yet been archived,
<option>--archive-ready</option>: checks for WAL files which have not yet been archived,
and returns <literal>WARNING</literal> or <literal>CRITICAL</literal> if the number
exceeds <varname>archive_ready_warning</varname> or <varname>archive_ready_critical</varname> respectively.
</simpara>
@@ -78,25 +78,25 @@
<listitem>
<simpara>
<literal>--downstream</literal>: checks that the expected downstream nodes are attached
<option>--downstream</option>: checks that the expected downstream nodes are attached
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--slots</literal>: checks there are no inactive physical replication slots
<option>--slots</option>: checks there are no inactive physical replication slots
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--missing-slots</literal>: checks there are no missing physical replication slots
<option>--missing-slots</option>: checks there are no missing physical replication slots
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--data-directory-config</literal>: checks the data directory configured in
<option>--data-directory-config</option>: checks the data directory configured in
<filename>repmgr.conf</filename> matches the actual data directory.
This check is not directly related to replication, but is useful to verify &repmgr;
is correctly configured.
@@ -108,6 +108,22 @@
</para>
</refsect1>
<refsect1>
<title>Connection options</title>
<para>
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<simpara>
<option>-S</option>/<option>--superuser</option>: connect as the
named superuser instead of the &repmgr; user
</simpara>
</listitem>
</itemizedlist>
</para>
</refsect1>
<refsect1>
<title>Output format</title>
<para>
@@ -115,14 +131,14 @@
<listitem>
<simpara>
<literal>--csv</literal>: generate output in CSV format (not available
<option>--csv</option>: generate output in CSV format (not available
for individual checks)
</simpara>
</listitem>
<listitem>
<simpara>
<literal>--nagios</literal>: generate output in a Nagios-compatible format
<option>--nagios</option>: generate output in a Nagios-compatible format
(for individual checks only)
</simpara>
</listitem>
@@ -130,13 +146,15 @@
</para>
</refsect1>
<refsect1>
<title>Exit codes</title>
<para>
When executing <command>repmgr node check</command> with one of the individual
checks listed above, &repmgr; will emit one of the following Nagios-style exit codes
(even if <literal>--nagios</literal> is not supplied):
(even if <option>--nagios</option> is not supplied):
<itemizedlist spacing="compact" mark="bullet">

View File

@@ -716,10 +716,45 @@ do_node_check(void)
exit(SUCCESS);
}
if (strlen(config_file_options.conninfo))
conn = establish_db_connection(config_file_options.conninfo, true);
if (config_file_options.conninfo[0] != '\0')
{
t_conninfo_param_list node_conninfo = T_CONNINFO_PARAM_LIST_INITIALIZER;
char *errmsg = NULL;
bool parse_success = false;
initialize_conninfo_params(&node_conninfo, false);
parse_success = parse_conninfo_string(config_file_options.conninfo,
&node_conninfo,
&errmsg, false);
if (parse_success == false)
{
log_error(_("unable to parse conninfo string \"%s\" for local node"),
config_file_options.conninfo);
log_detail("%s", errmsg);
exit(ERR_BAD_CONFIG);
}
/*
* If --superuser option provided, attempt to connect as the specified user
*/
if (runtime_options.superuser[0] != '\0')
{
param_set(&node_conninfo,
"user",
runtime_options.superuser);
}
conn = establish_db_connection_by_params(&node_conninfo, true);
}
else
{
conn = establish_db_connection_by_params(&source_conninfo, true);
}
if (get_node_record(conn, config_file_options.node_id, &node_info) != RECORD_FOUND)
{
@@ -1846,10 +1881,9 @@ do_node_check_data_directory(PGconn *conn, OutputMode mode, t_node_info *node_in
{
log_info(_("connection is not a superuser connection, falling back to simple check"));
/* XXX add -S/--superuser option */
if (PQserverVersion(conn) >= 100000)
{
log_hint(_("add the \"%s\" user to group \"pg_read_all_settings\" or \"pg_monitor\""),
log_hint(_("provide a superuser with -S/--superuser, or add the \"%s\" user to role \"pg_read_all_settings\" or \"pg_monitor\""),
PQuser(conn));
}
}
@@ -1870,6 +1904,12 @@ do_node_check_data_directory(PGconn *conn, OutputMode mode, t_node_info *node_in
status = CHECK_STATUS_CRITICAL;
}
else
{
appendPQExpBuffer(&details,
_("configured \"data_directory\" is \"%s\""),
config_file_options.data_directory);
}
}
switch (mode)
@@ -3164,17 +3204,21 @@ do_node_help(void)
puts("");
printf(_(" Configuration file required, runs on local node only.\n"));
puts("");
printf(_(" --csv emit output as CSV (not available for individual check output)\n"));
printf(_(" --nagios emit output in Nagios format (individual check output only)\n"));
printf(_(" Connection options:\n"));
printf(_(" -S, --superuser=USERNAME superuser to use, if repmgr user is not superuser\n"));
puts("");
printf(_(" Output options:\n"));
printf(_(" --csv emit output as CSV (not available for individual check output)\n"));
printf(_(" --nagios emit output in Nagios format (individual check output only)\n"));
puts("");
printf(_(" Following options check an individual status:\n"));
printf(_(" --archive-ready number of WAL files ready for archiving\n"));
printf(_(" --downstream whether all downstream nodes are connected\n"));
printf(_(" --replication-lag replication lag in seconds (standbys only)\n"));
printf(_(" --role check node has expected role\n"));
printf(_(" --slots check for inactive replication slots\n"));
printf(_(" --missing-slots check for missing replication slots\n"));
printf(_(" --data-directory-config check repmgr's data directory configuration\n"));
printf(_(" --archive-ready number of WAL files ready for archiving\n"));
printf(_(" --downstream whether all downstream nodes are connected\n"));
printf(_(" --replication-lag replication lag in seconds (standbys only)\n"));
printf(_(" --role check node has expected role\n"));
printf(_(" --slots check for inactive replication slots\n"));
printf(_(" --missing-slots check for missing replication slots\n"));
printf(_(" --data-directory-config check repmgr's data directory configuration\n"));
puts("");

View File

@@ -1691,6 +1691,7 @@ check_cli_parameters(const int action)
switch (action)
{
case STANDBY_CLONE:
case NODE_CHECK:
break;
default:
item_list_append_format(&cli_warnings,