Compare commits

21 Commits

Author SHA1 Message Date
Jaime Casanova
e04ba8bea5 Add "--checksum" in rsync when using "--force"
If the user doesn't put that option in rsync_options, using "--force"
could be unsafe.
While the probability of failure because of this is low, it isn't
zero.
2015-02-10 20:46:06 -05:00
Jaime Casanova
031f9aedcc Options -F, -I and -v don't accept arguments, which means that in the
getopt_long option string they shouldn't be marked with the colon (:) character.

This has been wrong since day one, so backpatching all the way until
1.1
2013-01-13 16:42:04 -05:00
Jaime Casanova
8ee715b657 Make repmgr compatible with FreeBSD.
We need to #include <sys/wait.h> to get WEXITSTATUS()
2012-09-15 17:45:38 -05:00
Jaime Casanova
c2344fe843 When we have more command-line arguments than we should have we
need to show that last value and we should use only optind for that
instead of optind+1
2012-09-15 17:41:49 -05:00
Jaime Casanova
30b124e91f STANDBY CLONE should be run by a SUPERUSER, otherwise we won't be able
to retrieve data_directory and the other parameters we need by
querying the database.
2012-06-12 09:40:38 -05:00
Jaime Casanova
c00fa9f9ba Fix a switch in which a "break" was missing, so that whenever the --force option
was used execution ended up in the default section and raised an error.
2012-04-19 12:18:21 -05:00
Jaime Casanova
d36ee899dc Complete HISTORY information in preparation for release of v1.1.1 2012-04-18 09:49:38 -05:00
Jaime Casanova
d790ef740b Add a paragraph in the docs describing how to clean history 2012-04-11 10:54:22 -05:00
Jaime Casanova
aa6633b027 Complete the lists of error codes that repmgr can return in the README.rst 2012-04-11 10:38:22 -05:00
Jaime Casanova
c3bffce379 Run astyle to format code before tagging the release 2012-04-11 10:35:37 -05:00
Jaime Casanova
78aea00a6d Avoid showing which segments are needed for this backup if the rsync failed 2012-04-11 10:34:38 -05:00
Jaime Casanova
91601204b5 Remove last argument from log_err, left in commit 9b8fb7e960.
Also rephrase the sentence

Reported by Jeroen Dekkers
2011-11-28 17:26:19 -05:00
Jaime Casanova
c91ddc2f5e Fix a wrong message.
It was saying the problem is the version of the PostgreSQL server while
it actually is because the MASTER REGISTER command was running on a
standby node
2011-11-10 09:30:42 -05:00
Jaime Casanova
72f74dd7a7 Fix a typo introduced in commit 94c9c3a5c6 2011-11-03 12:54:55 -05:00
Jaime Casanova
901d07fa92 Improve performance of the repl_status view 2011-10-20 23:20:03 -05:00
Greg Smith
f0e609bcd4 Add strnlen on platforms that don't have it, such as OS X 2011-10-20 17:04:29 -05:00
Jaime Casanova
94c9c3a5c6 Let the clone happen in a session with synchronous_commit off. This
is because in pg 9.1 the default configuration can easily allow sync
rep to be activated even if no standby is present and will block
pg_start_backup() and pg_stop_backup() in that case.

Also remove a second connection we were opening to execute
pg_stop_backup(), i'm not sure why that was there but now it was
a problem because it was another session and not the one we set here.
2011-10-03 13:56:31 -05:00
Cédric Villemain
3af5243bcc Fix rsync return code test 2011-08-24 09:14:22 -05:00
Cédric Villemain
85bbae462a Add --ignore-rsync-warning to README 2011-08-22 00:34:01 -05:00
Cédric Villemain
14e49d41c2 Add --ignore-rsync-warning command line option
This fixes the handling of the rsync return code when there are vanished files.

Common situations are DROPped tables and TEMPorary object deletion, which
are handled by PostgreSQL.
But as there may be situations where an external process deletes files in
the PGDATA, the flag is off by default.

XXX 2 items :

 * is -I a good choice ? maybe we need to prevent future --ignore-foo and
   add something like : --ignore=rsync_warning -I rsync_warning
 * the warning message is not explicit enough about the risk involved in
   --force usage
2011-08-22 00:32:40 -05:00
Cédric Villemain
1bd8a703c8 Fix getopt for ignore-rsync-warning
The change was lost during merge and not checked in master/
2011-06-06 20:56:45 -04:00
7 changed files with 123 additions and 15 deletions

@@ -31,3 +31,10 @@
1.1.0 2011-03-09
Make options -U, -R and -p not mandatory (Jaime)
1.1.1 2012-04-18
Add --ignore-rsync-warning (Cédric)
Add strnlen for compatibility with OS X (Greg)
Improve performance of repl_status view (Jaime)
Remove last argument from log_err (Jaime, Reported by Jeroen Dekkers)
Complete documentation about possible error conditions (Jaime)
Document how to clean history (Jaime)

@@ -814,6 +814,23 @@ and on "prime."
The servers are now again acting as primary on "prime" and standby on "standby".
Maintenance of monitor history
-------------------------------
Once you have changed roles (with a failover or to restore original roles)
you would end up with records saying that node1 is primary and other records
saying that node2 is the primary, which could be confusing.
Also, if you don't do anything about it the monitor history will keep growing.
For both of those reasons you will sometimes want to do some maintenance on the
``repl_monitor`` table.
If you want to clean the history after a few days you can execute a
truncate/delete (depending on whether you want to completely clean the history
or keep a few days of it) in a cron job. For example, to keep just one day of
history you can put this in your crontab::
0 1 * * * psql -c "DELETE FROM repmgr_schema.repl_monitor where now() - last_monitor_time >= '1 day'::interval;" postgres
Configuration and command reference
===================================
@@ -863,6 +880,7 @@ The output from this program looks like this::
-R, --remote-user=USERNAME database server username for rsync
-w, --wal-keep-segments=VALUE minimum value for the GUC wal_keep_segments (default: 5000)
-F, --force force potentially dangerous operations to happen
-I, --ignore-rsync-warning Ignore partial transfert warning
repmgr performs some tasks like clone a node, promote it or making follow another node and then exits.
COMMANDS:
@@ -1023,6 +1041,7 @@ following
* ERR_DB_QUERY 7: Error executing a database query.
* ERR_PROMOTED 8: Exiting program because the node has been promoted to master.
* ERR_BAD_PASSWORD 9: Password used to connect to a database was rejected.
* ERR_STR_OVERFLOW 10: A string was larger than expected.
License and Contributions
=========================

@@ -20,6 +20,8 @@
#ifndef _REPMGR_DBUTILS_H_
#define _REPMGR_DBUTILS_H_
#include "strutil.h"
PGconn *establishDBConnection(const char *conninfo, const bool exit_on_error);
PGconn *establishDBConnectionByParams(const char *keywords[],
const char *values[],

@@ -28,6 +28,7 @@
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>
@@ -71,7 +72,7 @@ bool need_a_node = true;
bool require_password = false;
/* Initialization of runtime options */
t_runtime_options runtime_options = { "", "", "", "", "", "", DEFAULT_WAL_KEEP_SEGMENTS, false, false, "" };
t_runtime_options runtime_options = { "", "", "", "", "", "", DEFAULT_WAL_KEEP_SEGMENTS, false, false, false, "" };
t_configuration_options options = { "", -1, "", "", "" };
static char *server_mode = NULL;
@@ -91,6 +92,7 @@ main(int argc, char **argv)
{"remote-user", required_argument, NULL, 'R'},
{"wal-keep-segments", required_argument, NULL, 'w'},
{"force", no_argument, NULL, 'F'},
{"ignore-rsync-warning", no_argument, NULL, 'I'},
{"verbose", no_argument, NULL, 'v'},
{NULL, 0, NULL, 0}
};
@@ -116,7 +118,7 @@ main(int argc, char **argv)
}
while ((c = getopt_long(argc, argv, "d:h:p:U:D:f:R:w:F:v", long_options,
while ((c = getopt_long(argc, argv, "d:h:p:U:D:f:R:w:FIv", long_options,
&optindex)) != -1)
{
switch (c)
@@ -150,6 +152,9 @@ main(int argc, char **argv)
case 'F':
runtime_options.force = true;
break;
case 'I':
runtime_options.ignore_rsync_warn = true;
break;
case 'v':
runtime_options.verbose = true;
break;
@@ -228,7 +233,7 @@ main(int argc, char **argv)
break;
default:
log_err(_("%s: too many command-line arguments (first extra is \"%s\")\n"),
progname, argv[optind + 1]);
progname, argv[optind]);
usage();
exit(ERR_BAD_CONFIG);
}
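
Note on the two getopt hunks above: a colon after a letter in the short option string tells getopt_long() that the option takes an argument, so with "F:v" a "-F" would swallow the next word on the command line. Once parsing finishes, optind already indexes the first non-option argument, which is why the message should print argv[optind] and not argv[optind + 1]. The following is only a minimal sketch of that behaviour, not repmgr's code: the struct demo_flags and the function parse_demo_flags() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the three argument-less flags. */
struct demo_flags
{
	bool force;
	bool ignore_rsync_warn;
	bool verbose;
};

/*
 * Parse -F, -I and -v (and their long forms). None of them takes an
 * argument, so none is followed by ':' in the short option string.
 * Returns the index of the first non-option argument (optind),
 * or -1 if an unknown option was seen.
 */
static int
parse_demo_flags(int argc, char **argv, struct demo_flags *flags)
{
	static const struct option long_options[] = {
		{"force", no_argument, NULL, 'F'},
		{"ignore-rsync-warning", no_argument, NULL, 'I'},
		{"verbose", no_argument, NULL, 'v'},
		{NULL, 0, NULL, 0}
	};
	int c;

	assert(flags != NULL);

	optind = 1;		/* start scanning at the first real argument */
	opterr = 0;		/* keep getopt quiet; we report errors ourselves */

	while ((c = getopt_long(argc, argv, "FIv", long_options, NULL)) != -1)
	{
		switch (c)
		{
			case 'F':
				flags->force = true;
				break;
			case 'I':
				flags->ignore_rsync_warn = true;
				break;
			case 'v':
				flags->verbose = true;
				break;
			default:
				return -1;
		}
	}

	/* optind now points at the first argument that is not an option */
	return optind;
}
```

Called with the arguments "-F -v standby clone extra", this sets force and verbose and returns 3, the index of "standby". With the old "F:" string, "-v" would have been consumed as the argument of -F instead.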
@@ -346,7 +351,7 @@ do_master_register(void)
log_info(_("%s connected to master, checking its state\n"), progname);
if (is_standby(conn))
{
log_err(_("%s needs master to be PostgreSQL 9.0 or better\n"), progname);
log_err(_("Trying to register a standby node as a master\n"));
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
@@ -438,14 +443,13 @@ do_master_register(void)
/* and the view */
sqlquery_snprintf(sqlquery, "CREATE VIEW %s.repl_status AS "
" WITH monitor_info AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY primary_node, standby_node "
" ORDER BY last_monitor_time desc) "
" FROM %s.repl_monitor) "
" SELECT primary_node, standby_node, last_monitor_time, last_wal_primary_location, "
" last_wal_standby_location, pg_size_pretty(replication_lag) replication_lag, "
" pg_size_pretty(apply_lag) apply_lag, age(now(), last_monitor_time) AS time_lag "
" FROM monitor_info a "
" WHERE row_number = 1", repmgr_schema, repmgr_schema);
" FROM %s.repl_monitor "
" WHERE (standby_node, last_monitor_time) IN (SELECT standby_node, MAX(last_monitor_time) "
" FROM %s.repl_monitor GROUP BY 1)",
repmgr_schema, repmgr_schema, repmgr_schema);
log_debug("master register: %s\n", sqlquery);
if (!PQexec(conn, sqlquery))
{
@@ -454,6 +458,19 @@ do_master_register(void)
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
/* an index to improve performance of the view */
sqlquery_snprintf(sqlquery, "CREATE INDEX idx_repl_status_sort "
" ON %s.repl_monitor (last_monitor_time, standby_node) ",
repmgr_schema);
log_debug(_("master register: %s\n"), sqlquery);
if (!PQexec(conn, sqlquery))
{
log_err(_("Cannot indexing table %s.repl_monitor: %s\n"),
repmgr_schema, PQerrorMessage(conn));
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
}
else
{
@@ -845,6 +862,7 @@ do_standby_clone(void)
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
break;
default:
/* Trouble accessing directory */
log_err(_("%s: could not access directory \"%s\": %s\n"),
@@ -871,6 +889,16 @@ do_standby_clone(void)
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
/* We need all 4 parameters, and they can be retrieved only by superusers */
if (PQntuples(res) != 4)
{
log_err("%s: STANDBY CLONE should be run by a SUPERUSER\n", progname);
PQclear(res);
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
for (i = 0; i < PQntuples(res); i++)
{
if (strcmp(PQgetvalue(res, i, 0), "data_directory") == 0)
@@ -886,6 +914,20 @@ do_standby_clone(void)
}
PQclear(res);
/*
* in pg 9.1 default is to wait for a sync standby to ack,
* avoid that by turning off sync rep for this session
*/
sqlquery_snprintf(sqlquery, "SET synchronous_commit TO OFF");
res = PQexec(conn, sqlquery);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
{
log_err("Can't set synchronous_commit: %s\n", PQerrorMessage(conn));
PQclear(res);
PQfinish(conn);
exit(ERR_BAD_CONFIG);
}
/*
* inform the master we will start a backup and get the first XLog filename
* so we can say to the user we need those files
@@ -1022,9 +1064,6 @@ stop_backup:
* Don't have this one exit if it fails, so that a more informative
* error message will also appear about the backup not being stopped.
*/
log_info(_("%s connecting to master database to stop backup\n"), progname);
conn=establishDBConnectionByParams(keywords,values,false);
log_notice("Finishing backup...\n");
sqlquery_snprintf(sqlquery, "SELECT pg_xlogfile_name(pg_stop_backup())");
log_debug("standby clone: %s\n", sqlquery);
@@ -1039,8 +1078,10 @@ stop_backup:
}
last_wal_segment = PQgetvalue(res, 0, 0);
log_info(_("%s requires primary to keep WAL files %s until at least %s\n"),
progname, first_wal_segment, last_wal_segment);
/* don't show this message if rsync failed */
if (r == 0)
log_info(_("%s requires primary to keep WAL files %s until at least %s\n"),
progname, first_wal_segment, last_wal_segment);
/* Finished with the database connection now */
PQclear(res);
@@ -1337,6 +1378,7 @@ void help(const char *progname)
printf(_(" -R, --remote-user=USERNAME database server username for rsync\n"));
printf(_(" -w, --wal-keep-segments=VALUE minimum value for the GUC wal_keep_segments (default: 5000)\n"));
printf(_(" -F, --force force potentially dangerous operations to happen\n"));
printf(_(" -I, --ignore-rsync-warning Ignore partial transfert warning\n"));
printf(_("\n%s performs some tasks like clone a node, promote it "), progname);
printf(_("or making follow another node and then exits.\n"));
@@ -1446,7 +1488,7 @@ copy_remote_files(char *host, char *remote_user, char *remote_path,
maxlen_snprintf(rsync_flags, "%s", options.rsync_options);
if (runtime_options.force)
strcat(rsync_flags, " --delete");
strcat(rsync_flags, " --delete --checksum");
if (!remote_user[0])
{
@@ -1473,6 +1515,29 @@ copy_remote_files(char *host, char *remote_user, char *remote_path,
r = system(script);
/*
* If we are transfering a directory (ie: data directory, tablespace directories)
* then we can ignore some rsync warning, so if we get some of those errors we
* treat them as 0 if we have --ignore-rsync-warning commandline option set
* List of ignorable rsync errors:
* 24 Partial transfer due to vanished source files
*/
if ((WEXITSTATUS(r) == 24) && is_directory)
{
if (!runtime_options.ignore_rsync_warn)
{
log_warning( _("\nrsync completed with return code 24 "
"\"Partial transfer due to vanished source files\".\n"
"This can happen because of normal operation "
"on the master server, but it may indicate an "
"issue during cloning. If you are certain no "
"changes were made to the master, try cloning "
"again using \"repmgr --force --ignore-rsync-warning\"."));
exit(ERR_BAD_RSYNC);
}
else
r = 0;
}
if (r != 0)
log_err(_("Can't rsync from remote file or directory (%s:%s)\n"),
host_string, remote_path);
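
Note on the rsync hunk above: system() returns a wait status, not the command's exit code, so the code has to go through WEXITSTATUS() (from <sys/wait.h>, hence the FreeBSD include fix) before comparing against rsync's exit value 24. A minimal sketch of that decision, written as a standalone helper, could look as follows. The function effective_rsync_code() and the constant RSYNC_VANISHED_FILES are hypothetical names for illustration, and the extra WIFEXITED() check is an assumption of this sketch that is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/wait.h>

/* rsync exit value: "Partial transfer due to vanished source files" */
#define RSYNC_VANISHED_FILES 24

/*
 * Turn the wait status returned by system() into the code the caller
 * should act on:
 *   -1  the command could not be run or did not exit normally
 *    0  success, or exit value 24 on a directory copy that the user
 *       asked to ignore
 *   >0  the rsync exit value, unchanged
 */
static int
effective_rsync_code(int wait_status, bool is_directory, bool ignore_warn)
{
	int code;

	if (wait_status == -1 || !WIFEXITED(wait_status))
		return -1;

	code = WEXITSTATUS(wait_status);

	if (code == RSYNC_VANISHED_FILES && is_directory && ignore_warn)
		return 0;

	return code;
}
```

For instance, effective_rsync_code(system("exit 24"), true, true) yields 0, while the same status without the ignore flag, or for a single file, is passed through as 24.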

@@ -55,6 +55,7 @@ typedef struct
char wal_keep_segments[MAXLEN];
bool verbose;
bool force;
bool ignore_rsync_warn;
char masterport[MAXLEN];

@@ -27,6 +27,15 @@
static int xvsnprintf(char *str, size_t size, const char *format, va_list ap);
/* Add strnlen on platforms that don't have it, like OS X */
#ifndef strnlen
size_t
strnlen(const char *s, size_t n)
{
const char *end = (const char *) memchr(s, '\0', n);
return(end ? end - s : n);
}
#endif
static int
xvsnprintf(char *str, size_t size, const char *format, va_list ap)

@@ -35,4 +35,9 @@ extern int xsnprintf(char *str, size_t size, const char *format, ...);
extern int sqlquery_snprintf(char *str, const char *format, ...);
extern int maxlen_snprintf(char *str, const char *format, ...);
/* Add strnlen on platforms that don't have it, like OS X */
#ifndef strnlen
extern size_t strnlen(const char *s, size_t n);
#endif
#endif /* _STRUTIL_H_ */
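
Note on the strnlen fallback: "#ifndef strnlen" only detects a preprocessor macro of that name, so on platforms where strnlen is an ordinary library function the guard does not skip the fallback. The logic of the fallback itself can be tried in isolation under a different name. The sketch below uses the hypothetical name compat_strnlen() so that it cannot collide with a libc definition; it is an illustration of the same memchr-based approach, not a proposed replacement for the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Length of s, but never more than n: look for the terminating NUL
 * within the first n bytes and fall back to n if there is none.
 */
static size_t
compat_strnlen(const char *s, size_t n)
{
	const char *end;

	assert(s != NULL);

	end = memchr(s, '\0', n);
	return end ? (size_t) (end - s) : n;
}
```

For example, compat_strnlen("repmgr", 32) is 6, and compat_strnlen("repmgr", 3) is capped at 3.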