Document and expand pg_ctl override configuration options

These are now prefixed with "service_" to emphasize that they're
OS-level commands, not repmgr ones; also added reload and promote
commands:

    service_start_command
    service_stop_command
    service_restart_command
    service_reload_command
    service_promote_command

GitHub #169
Ian Barwick
2016-09-30 11:57:02 +09:00
parent 46500e1408
commit ed22fe326e
6 changed files with 124 additions and 59 deletions
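
The override-or-fallback behaviour described in the commit message can be sketched in isolation. This is a minimal illustration, not the code from the commit: `build_promote_command` and its parameters are hypothetical names, and plain `snprintf` stands in for repmgr's `maxlen_snprintf` helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the command used to promote a standby into buf.
 * If an override (e.g. the value of service_promote_command) is set, it is
 * used verbatim; otherwise fall back to "<pg_ctl> -D <data_dir> promote".
 * Returns the number of characters snprintf() would have written.
 */
int
build_promote_command(char *buf, size_t buflen,
                      const char *service_promote_command,
                      const char *pg_ctl_path,
                      const char *data_dir)
{
    assert(buf != NULL && buflen > 0);

    /* A non-empty override wins over the built-in pg_ctl invocation */
    if (service_promote_command != NULL && *service_promote_command != '\0')
        return snprintf(buf, buflen, "%s", service_promote_command);

    return snprintf(buf, buflen, "%s -D %s promote", pg_ctl_path, data_dir);
}
```

Under these assumptions, an empty override yields the `pg_ctl` form, while a value such as `pg_ctlcluster 9.5 main promote` is passed through unchanged.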


@@ -1,6 +1,6 @@
 3.2 2016-
 repmgr: add support for cloning from a Barman backup (Gianni)
-repmgr: add commands `standby matrix` and `standby diagnose` (Gianni)
+repmgr: add commands `standby matrix` and `standby crosscheck` (Gianni)
 repmgr: suppress connection error display in `repmgr cluster show`
 unless `--verbose` supplied (Ian)
 repmgr: add commands `witness register` and `witness unregister` (Ian)
@@ -15,6 +15,8 @@
 the standby (Ian)
 repmgr: add option `--copy-external-config-files` for files outside
 of the data directory (Ian)
+repmgr: add configuration options to override the default pg_ctl
+commands (Jarkko Oranen)
 packaging: improve "repmgr-auto" Debian package (Gianni)
 3.1.5 2016-08-15


@@ -154,7 +154,7 @@ is not required, but is necessary in the following cases:
 data directory
 * when using `rsync` to clone a standby
 * to perform switchover operations
-* when executing `repmgr cluster matrix` and `repmgr cluster diagnose`
+* when executing `repmgr cluster matrix` and `repmgr cluster crosscheck`
 In these cases `rsync` is required on all servers too.
@@ -475,8 +475,8 @@ settings which have not been activated, the file will not be copied.
 * * *
-> *TIP*: for reliable configuration file management we recommend using a configuration
-> management tool such as Ansible, Chef, Puppet or Salt.
+> *TIP*: for reliable configuration file management we recommend using a
+> configuration management tool such as Ansible, Chef, Puppet or Salt.
 * * *
@@ -1423,6 +1423,42 @@ In general `repmgr` can be upgraded as-is without any further action required,
 however feature releases may require the `repmgr` database to be upgraded.
 An SQL script will be provided - please check the release notes for details.
+Distribution-specific configuration
+-----------------------------------
+
+`repmgr` is largely OS-agnostic and can be run on any UNIX-like environment
+including various Linux distributions, Solaris, macOS and the various BSDs.
+However, often OS-specific configuration is required, particularly when
+dealing with system service management (e.g. stopping and starting the
+PostgreSQL server), file paths and configuration file locations.
+
+### PostgreSQL server control
+
+By default, `repmgr` will use PostgreSQL's standard `pg_ctl` utility to control
+a running PostgreSQL server. However it may be better to use the operating
+system's service management system, e.g. `systemd`. To specify which service
+control commands are used, the following `repmgr.conf` configuration settings
+are available:
+
+    service_start_command
+    service_stop_command
+    service_restart_command
+    service_reload_command
+    service_promote_command
+
+See `repmgr.conf.sample` for further details.
+
+### Binary directory
+
+Some PostgreSQL system packages, such as those provided for Debian/Ubuntu, like
+to hide some PostgreSQL utility programs outside of the default path. To ensure
+`repmgr` finds all required executables, explicitly set `pg_bindir` to the
+appropriate location, e.g. for PostgreSQL 9.6 on Debian/Ubuntu this would be
+`/usr/lib/postgresql/9.6/bin/`.
+
 Reference
 ---------
@@ -1602,7 +1638,7 @@ which contains connection details for the local database.
 The first column is the node's ID, and the second column represents the
 node's status (0 = master, 1 = standby, -1 = failed).
-* `cluster matrix` and `cluster diagnose`
+* `cluster matrix` and `cluster crosscheck`
 These commands display connection information for each pair of
 nodes in the replication cluster.
@@ -1610,8 +1646,9 @@ which contains connection details for the local database.
 - `cluster matrix` runs a `cluster show` on each node and arranges
 the results in a matrix, recording success or failure;
-- `cluster diagnose` runs a `cluster matrix` on each node and
-combines the results in a single matrix.
+- `cluster crosscheck` runs a `cluster matrix` on each node and
+combines the results in a single matrix, providing a full
+overview of connections between all databases in the cluster.
 These commands require a valid `repmgr.conf` file on each node.
 Additionally password-less `ssh` connections are required between
@@ -1653,7 +1690,7 @@ which contains connection details for the local database.
 node1 and node2, meaning that inbound connections to these nodes
 have succeeded.
-In this case, `cluster diagnose` gives the same result as `cluster
+In this case, `cluster crosscheck` gives the same result as `cluster
 matrix`, because from any functioning node we can observe the same
 state: `node1` and `node2` are up, `node3` is down.
@@ -1680,9 +1717,9 @@ which contains connection details for the local database.
 and that (therefore) we don't know the state of any outbound
 connection from node3.
-In this case, the `cluster diagnose` command is more informative:
+In this case, the `cluster crosscheck` command is more informative:
-$ repmgr -f /etc/repmgr.conf cluster diagnose
+$ repmgr -f /etc/repmgr.conf cluster crosscheck
 Name | Id | 1 | 2 | 3
 -------+----+----+----+----
@@ -1690,7 +1727,7 @@ which contains connection details for the local database.
 node2 | 2 | * | * | *
 node3 | 3 | * | * | *
-What happened is that `cluster diagnose` merged its own `cluster
+What happened is that `cluster crosscheck` merged its own `cluster
 matrix` with the `cluster matrix` output from `node2`; the latter is
 able to connect to `node3` and therefore determine the state of
 outbound connections from that node.
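
The merging step described in the README text above can be illustrated with a small sketch. It is an assumption-laden illustration rather than repmgr's implementation: `merge_views`, the fixed `N_NODES` and the numeric status encoding are all hypothetical, and the rule "the highest-ranked report wins" is one plausible way to combine per-node matrices.

```c
#include <assert.h>
#include <stddef.h>

#define N_NODES 3

/* Status encoding is an assumption made for this sketch only */
enum { STATUS_UNKNOWN = -2, STATUS_FAILED = -1, STATUS_OK = 0 };

/*
 * views[k][i][j] is the status of the connection from node i to node j as
 * reported by observer node k (STATUS_UNKNOWN if k could not reach i).
 * merged[i][j] receives the highest-ranked report across all observers,
 * so any known result replaces "unknown".
 */
void
merge_views(int views[N_NODES][N_NODES][N_NODES],
            int merged[N_NODES][N_NODES])
{
    int i, j, k;

    assert(views != NULL && merged != NULL);

    for (i = 0; i < N_NODES; i++)
    {
        for (j = 0; j < N_NODES; j++)
        {
            int best = STATUS_UNKNOWN;

            for (k = 0; k < N_NODES; k++)
            {
                if (views[k][i][j] > best)
                    best = views[k][i][j];
            }
            merged[i][j] = best;
        }
    }
}
```

In the scenario above, the observer that cannot reach `node3` contributes only unknown entries for that row, and the row is filled in from the view of an observer that can.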


@@ -222,9 +222,11 @@ parse_config(t_configuration_options *options)
 memset(options->node_name, 0, sizeof(options->node_name));
 memset(options->promote_command, 0, sizeof(options->promote_command));
 memset(options->follow_command, 0, sizeof(options->follow_command));
-memset(options->stop_command, 0, sizeof(options->stop_command));
-memset(options->start_command, 0, sizeof(options->start_command));
-memset(options->restart_command, 0, sizeof(options->restart_command));
+memset(options->service_stop_command, 0, sizeof(options->service_stop_command));
+memset(options->service_start_command, 0, sizeof(options->service_start_command));
+memset(options->service_restart_command, 0, sizeof(options->service_restart_command));
+memset(options->service_reload_command, 0, sizeof(options->service_reload_command));
+memset(options->service_promote_command, 0, sizeof(options->service_promote_command));
 memset(options->rsync_options, 0, sizeof(options->rsync_options));
 memset(options->ssh_options, 0, sizeof(options->ssh_options));
 memset(options->pg_bindir, 0, sizeof(options->pg_bindir));
@@ -351,12 +353,16 @@ parse_config(t_configuration_options *options)
     strncpy(options->promote_command, value, MAXLEN);
 else if (strcmp(name, "follow_command") == 0)
     strncpy(options->follow_command, value, MAXLEN);
-else if (strcmp(name, "stop_command") == 0)
-    strncpy(options->stop_command, value, MAXLEN);
-else if (strcmp(name, "start_command") == 0)
-    strncpy(options->start_command, value, MAXLEN);
-else if (strcmp(name, "restart_command") == 0)
-    strncpy(options->restart_command, value, MAXLEN);
+else if (strcmp(name, "service_stop_command") == 0)
+    strncpy(options->service_stop_command, value, MAXLEN);
+else if (strcmp(name, "service_start_command") == 0)
+    strncpy(options->service_start_command, value, MAXLEN);
+else if (strcmp(name, "service_restart_command") == 0)
+    strncpy(options->service_restart_command, value, MAXLEN);
+else if (strcmp(name, "service_reload_command") == 0)
+    strncpy(options->service_reload_command, value, MAXLEN);
+else if (strcmp(name, "service_promote_command") == 0)
+    strncpy(options->service_promote_command, value, MAXLEN);
 else if (strcmp(name, "master_response_timeout") == 0)
     options->master_response_timeout = repmgr_atoi(value, "master_response_timeout", &config_errors, false);
 /*


@@ -63,11 +63,15 @@ typedef struct
 int failover;
 int priority;
 char node_name[MAXLEN];
+/* commands executed by repmgrd */
 char promote_command[MAXLEN];
 char follow_command[MAXLEN];
-char stop_command[MAXLEN];
-char start_command[MAXLEN];
-char restart_command[MAXLEN];
+/* Overrides for pg_ctl commands */
+char service_stop_command[MAXLEN];
+char service_start_command[MAXLEN];
+char service_restart_command[MAXLEN];
+char service_reload_command[MAXLEN];
+char service_promote_command[MAXLEN];
 char loglevel[MAXLEN];
 char logfacility[MAXLEN];
 char rsync_options[QUERY_STR_LEN];
@@ -93,7 +97,7 @@ typedef struct
  * The following will initialize the structure with a minimal set of options;
  * actual defaults are set in parse_config() before parsing the configuration file
  */
-#define T_CONFIGURATION_OPTIONS_INITIALIZER { "", -1, NO_UPSTREAM_NODE, "", "", "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", "", 0, 0, 0, 0, "", { NULL, NULL }, { NULL, NULL } }
+#define T_CONFIGURATION_OPTIONS_INITIALIZER { "", -1, NO_UPSTREAM_NODE, "", "", "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", "", 0, 0, 0, 0, "", { NULL, NULL }, { NULL, NULL } }
 typedef struct ItemListCell
 {


@@ -21,7 +21,7 @@
  * WITNESS REGISTER
  * WITNESS UNREGISTER
  *
- * CLUSTER DIAGNOSE
+ * CLUSTER CROSSCHECK
  * CLUSTER MATRIX
  * CLUSTER SHOW
  * CLUSTER CLEANUP
@@ -91,7 +91,7 @@
 #define CLUSTER_SHOW 13
 #define CLUSTER_CLEANUP 14
 #define CLUSTER_MATRIX 15
-#define CLUSTER_DIAGNOSE 16
+#define CLUSTER_CROSSCHECK 16
 static int test_ssh_connection(char *host, char *remote_user);
 static int copy_remote_files(char *host, char *remote_user, char *remote_path,
@@ -114,7 +114,7 @@ static void get_barman_property(char *dst, char *name, char *local_repmgr_direct
 static char *string_skip_prefix(const char *prefix, char *string);
 static char *string_remove_trailing_newlines(char *string);
 static int build_cluster_matrix(t_node_matrix_rec ***matrix_rec_dest, int *name_length);
-static int build_cluster_diagnose(t_node_status_cube ***cube_dest, int *name_length);
+static int build_cluster_crosscheck(t_node_status_cube ***cube_dest, int *name_length);
 static char *make_pg_path(char *file);
 static char *make_barman_ssh_command(void);
@@ -136,7 +136,7 @@ static void do_witness_unregister(void);
 static void do_cluster_show(void);
 static void do_cluster_matrix(void);
-static void do_cluster_diagnose(void);
+static void do_cluster_crosscheck(void);
 static void do_cluster_cleanup(void);
 static void do_check_upstream_config(void);
 static void do_help(void);
@@ -665,7 +665,7 @@ main(int argc, char **argv)
  * { MASTER | PRIMARY } REGISTER |
  * STANDBY {REGISTER | UNREGISTER | CLONE [node] | PROMOTE | FOLLOW [node] | SWITCHOVER | REWIND} |
  * WITNESS { CREATE | REGISTER | UNREGISTER } |
- * CLUSTER { DIAGNOSE | MATRIX | SHOW | CLEANUP}
+ * CLUSTER { CROSSCHECK | MATRIX | SHOW | CLEANUP}
  *
  * the node part is optional, if we receive it then we shouldn't have
  * received a -h option
@@ -721,8 +721,8 @@ main(int argc, char **argv)
     action = CLUSTER_SHOW;
 else if (strcasecmp(server_cmd, "CLEANUP") == 0)
     action = CLUSTER_CLEANUP;
-else if (strcasecmp(server_cmd, "DIAGNOSE") == 0)
-    action = CLUSTER_DIAGNOSE;
+else if (strcasecmp(server_cmd, "CROSSCHECK") == 0)
+    action = CLUSTER_CROSSCHECK;
 else if (strcasecmp(server_cmd, "MATRIX") == 0)
     action = CLUSTER_MATRIX;
 }
@@ -965,8 +965,8 @@ main(int argc, char **argv)
 case WITNESS_UNREGISTER:
     do_witness_unregister();
     break;
-case CLUSTER_DIAGNOSE:
-    do_cluster_diagnose();
+case CLUSTER_CROSSCHECK:
+    do_cluster_crosscheck();
     break;
 case CLUSTER_MATRIX:
     do_cluster_matrix();
@@ -1435,7 +1435,7 @@ cube_set_node_status(t_node_status_cube **cube, int n, int execute_node_id, int
 static int
-build_cluster_diagnose(t_node_status_cube ***dest_cube, int *name_length)
+build_cluster_crosscheck(t_node_status_cube ***dest_cube, int *name_length)
 {
     PGconn *conn;
     PGresult *res;
@@ -1455,7 +1455,7 @@ build_cluster_diagnose(t_node_status_cube ***dest_cube, int *name_length)
     " FROM %s.repl_show_nodes ORDER BY id",
     get_repmgr_schema_quoted(conn));
-log_verbose(LOG_DEBUG, "build_cluster_diagnose(): \n%s\n",sqlquery );
+log_verbose(LOG_DEBUG, "build_cluster_crosscheck(): \n%s\n",sqlquery );
 res = PQexec(conn, sqlquery);
@@ -1582,7 +1582,7 @@ build_cluster_diagnose(t_node_status_cube ***dest_cube, int *name_length)
 host = param_get(&remote_conninfo, "host");
-log_verbose(LOG_DEBUG, "build_cluster_diagnose(): executing\n%s\n", quoted_command.data);
+log_verbose(LOG_DEBUG, "build_cluster_crosscheck(): executing\n%s\n", quoted_command.data);
 (void)remote_command(
     host,
@@ -1633,7 +1633,7 @@ build_cluster_diagnose(t_node_status_cube ***dest_cube, int *name_length)
 static void
-do_cluster_diagnose(void)
+do_cluster_crosscheck(void)
 {
     int i, n;
     char c;
@@ -1642,7 +1642,7 @@ do_cluster_diagnose(void)
 t_node_status_cube **cube;
-n = build_cluster_diagnose(&cube, &name_length);
+n = build_cluster_crosscheck(&cube, &name_length);
 printf("%*s | Id ", name_length, node_header);
 for (i = 0; i < n; i++)
@@ -4138,8 +4138,17 @@ do_standby_promote(void)
  * can't be sure when or if the promotion completes.
  * For now we'll poll the server until the default timeout (60 seconds)
  */
-maxlen_snprintf(script, "%s -D %s promote",
-                make_pg_path("pg_ctl"), data_dir);
+if (*options.service_promote_command)
+{
+    maxlen_snprintf(script, "%s", options.service_promote_command);
+}
+else
+{
+    maxlen_snprintf(script, "%s -D %s promote",
+                    make_pg_path("pg_ctl"), data_dir);
+}
 log_notice(_("promoting server using '%s'\n"),
            script);
@@ -4417,9 +4426,9 @@ do_standby_follow(void)
 exit(ERR_BAD_CONFIG);
 /* Finally, restart the service */
-if (*options.restart_command)
+if (*options.service_restart_command)
 {
-    maxlen_snprintf(script, "%s", options.restart_command);
+    maxlen_snprintf(script, "%s", options.service_restart_command);
 }
 else
 {
@@ -4922,10 +4931,6 @@ do_standby_switchover(void)
  * We'll issue the pg_ctl command but not force it not to wait; we'll check
  * the connection from here - and error out if no shutdown is detected
  * after a certain time.
- *
- * XXX currently we assume the same Postgres binary path on the primary
- * as configured on the local standby; we may need to add a command
- * line option to provide an explicit path (--remote-pg-bindir)?
  */
 /*
@@ -4936,9 +4941,9 @@ do_standby_switchover(void)
 initPQExpBuffer(&remote_command_str);
-if (*options.stop_command)
+if (*options.service_stop_command)
 {
-    appendPQExpBuffer(&remote_command_str, "%s", options.stop_command);
+    appendPQExpBuffer(&remote_command_str, "%s", options.service_stop_command);
 }
 else
 {
@@ -5655,6 +5660,7 @@ do_witness_create(void)
 if (!runtime_options.superuser[0])
     strncpy(runtime_options.superuser, "postgres", MAXLEN);
+/* TODO: possibly allow the user to override this with a custom command? */
 maxlen_snprintf(script, "%s %s -D %s init -o \"%s-U %s\"",
                 make_pg_path("pg_ctl"),
                 options.pg_ctl_options, runtime_options.dest_dir,
@@ -5732,9 +5738,9 @@ do_witness_create(void)
 /* start new instance */
-if (*options.start_command)
+if (*options.service_start_command)
 {
-    maxlen_snprintf(script, "%s", options.start_command);
+    maxlen_snprintf(script, "%s", options.service_start_command);
 }
 else
 {
@@ -5858,9 +5864,17 @@ do_witness_create(void)
 }
 /* reload witness server to activate the copied pg_hba.conf */
-maxlen_snprintf(script, "%s %s -w -D %s reload",
-                make_pg_path("pg_ctl"),
-                options.pg_ctl_options, runtime_options.dest_dir);
+if (*options.service_reload_command)
+{
+    maxlen_snprintf(script, "%s", options.service_reload_command);
+}
+else
+{
+    maxlen_snprintf(script, "%s %s -w -D %s reload",
+                    make_pg_path("pg_ctl"),
+                    options.pg_ctl_options, runtime_options.dest_dir);
+}
 log_info(_("reloading witness server configuration: %s"), script);
 r = system(script);
 if (r != 0)
@@ -6169,7 +6183,7 @@ do_help(void)
 printf(_(" %s [OPTIONS] standby {register|unregister|clone|promote|follow|switchover}\n"),
        progname());
 printf(_(" %s [OPTIONS] witness {create|register|unregister}\n"), progname());
-printf(_(" %s [OPTIONS] cluster {show|cleanup}\n"), progname());
+printf(_(" %s [OPTIONS] cluster {show|matrix|crosscheck|cleanup}\n"), progname());
 printf(_("\n"));
 printf(_("General options:\n"));
 printf(_(" -?, --help show this help, then exit\n"));


@@ -106,8 +106,8 @@
 # service control commands
 #
-# repmgr provides options to to override the default pg_ctl commands
-# used to stop, start and restart the PostgreSQL cluster
+# repmgr provides options to override the default pg_ctl commands
+# used to stop, start, restart, reload and promote the PostgreSQL cluster
 #
 # NOTE: These commands must be runnable on remote nodes as well for switchover
 # to function correctly.
@@ -123,9 +123,11 @@
 # /usr/bin/systemctl start postgresql-9.5, \
 # /usr/bin/systemctl restart postgresql-9.5
 #
-# start_command = systemctl start postgresql-9.5
-# stop_command = systemctl stop postgresql-9.5
-# restart_command = systemctl restart postgresql-9.5
+# service_start_command = systemctl start postgresql-9.5
+# service_stop_command = systemctl stop postgresql-9.5
+# service_restart_command = systemctl restart postgresql-9.5
+# service_reload_command = pg_ctlcluster 9.5 main reload
+# service_promote_command = pg_ctlcluster 9.5 main promote
 # external command options
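
For illustration, an uncommented `repmgr.conf` fragment using the renamed settings might look like the sketch below. The service name `postgresql-9.5` is taken from the sample above; the use of `sudo`, the binary path and the data directory are assumptions that depend on the distribution and packaging, and passwordless `sudo` for the user running `repmgr` is presumed to be configured separately.

```
# repmgr.conf (fragment); paths and sudo usage are assumptions for this sketch
service_start_command = sudo systemctl start postgresql-9.5
service_stop_command = sudo systemctl stop postgresql-9.5
service_restart_command = sudo systemctl restart postgresql-9.5
service_reload_command = sudo systemctl reload postgresql-9.5
# systemctl has no "promote" verb, so this one still calls pg_ctl directly
service_promote_command = /usr/pgsql-9.5/bin/pg_ctl -D /var/lib/pgsql/9.5/data promote
```

Any setting left unset falls back to the built-in `pg_ctl` invocation, so the overrides can be adopted one at a time.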