Commit Graph

963 Commits

Author SHA1 Message Date
Ian Barwick
94579b5f2e Clean up whitespace and comments 2016-01-04 14:41:15 +09:00
Ian Barwick
e9a25c367a Prevent invalid replication_lag values being written to the monitoring table
A fix for this was introduced with commit ee9270fe8d
and removed in 4f1c67a1bf.

Refactor the original fix to simply omit attempting to write an invalid entry
to the monitoring table.
2016-01-04 14:37:22 +09:00
Ian Barwick
3088096318 No need to manually create repmgr schema. 2016-01-04 14:33:41 +09:00
Ian Barwick
3bbd32c73c Add note about why 'hot_standby=on' is currently required 2016-01-04 14:30:15 +09:00
Martin
ac17033d61 This doesn't really mean the standby is following a new master, so we are
removing it.
Basically, on startup the standby will start receiving again from the
beginning of the WAL and so received will be lower than applied.

Proper code is needed to make sure the standby is still following the
correct master (as per node information)
2016-01-04 14:29:56 +09:00
Martín Marqués
711ad0a76c Change where we reactivate a standby node that had failed.
We will do it where we send the message saying that the
standby has recovered, eliminating some complexity
2016-01-04 14:28:39 +09:00
Martín Marqués
ad988dccce Fix bug discovered last week which prevents recovered standby from being
used in the cluster.
Main issue was that if the local repmgrd was not able to connect locally,
it would set the local node as failed (active = false). This is fine, because
we actually don't know if the node is active (actually, it's not active ATM)
so it's best to keep it out of the cluster.
The problem is that if the postgres service comes back up, and is able to
recover by itself, then we should acknowledge that fact and set it as active.
There was another issue related to repmgrd being terminated if the postgres
service was down. This is not the correct thing to do: we should keep
trying to connect to the local standby.
2016-01-04 14:28:33 +09:00
Martín Marqués
53fe3c7e5a Fix bug discovered last week which prevents recovered standby from being
used in the cluster.
Main issue was that if the local repmgrd was not able to connect locally,
it would set the local node as failed (active = false). This is fine, because
we actually don't know if the node is active (actually, it's not active ATM)
so it's best to keep it out of the cluster.
The problem is that if the postgres service comes back up, and is able to
recover by itself, then we should acknowledge that fact and set it as active.
There was another issue related to repmgrd being terminated if the postgres
service was down. This is not the correct thing to do: we should keep
trying to connect to the local standby.
2016-01-04 14:28:26 +09:00
József Kószó
7a439c90d0 Debian init script repmgrd process stop fix 2016-01-04 14:28:20 +09:00
Ian Barwick
87e5257cb8 Short option -c does not take a value 2015-12-22 12:37:38 +09:00
Ian Barwick
1f240ff9b3 Update HISTORY 2015-11-30 16:58:13 +09:00
Ian Barwick
9d6cff0d40 Bump version to 3.0.3 2015-11-30 16:30:50 +09:00
Ian Barwick
f86e251430 Backport drop_replication_slot() from HEAD 2015-11-30 16:30:29 +09:00
Ian Barwick
085b7cb8b4 pg_replslot will only exist in 9.4 and later
We need to clean this up regardless of whether "use_replication_slots"
is set.
2015-11-30 16:19:37 +09:00
Ian Barwick
5ccf89ad9b Ensure pg_replslot directory is cleaned up after "standby clone" with rsync
This ensures the directory is in the same state as it would be
after cloning the standby with pg_basebackup, i.e. empty.
2015-11-30 16:19:31 +09:00
Ian Barwick
6ae5401df0 Update TODO 2015-11-30 16:19:25 +09:00
Ian Barwick
4bd8190d02 Drop a previously created replication slot if base backup fails for any reason
Per GitHub #129
2015-11-30 16:19:19 +09:00
Ian Barwick
efdc2355a7 Ensure all failures encountered during a base backup jump to the stop_backup label 2015-11-30 16:19:05 +09:00
Ian Barwick
61b1f72a0e Put "starting backup" notice after any slot creation 2015-11-30 16:19:00 +09:00
Abhijit Menon-Sen
882bfd9d8e If we're using replication slots, we need to create them earlier
Otherwise, if the backup takes a long time, we might lose WAL we need
long before we create the slot.
2015-11-30 16:18:52 +09:00
Ian Barwick
c93f717305 Ensure 'master register --force' can't create more than one active primary node record 2015-11-30 16:18:46 +09:00
Ian Barwick
85be96a0be Remove unusable setting
Not a configuration item or command line option;
variable is always false.
2015-11-30 16:18:41 +09:00
Ian Barwick
ce2d4fb86f Make t_node_info generally available
And have it include all the fields from the repl_nodes table.
2015-11-30 16:18:35 +09:00
Ian Barwick
40354e1d62 Add item about hash indexes. 2015-11-30 16:18:29 +09:00
Ian Barwick
3e1655f241 Remove hint about hash indexes entirely.
Anyone needing them, particularly in a replication context, should
know what they're doing anyway.

See also: http://www.postgresql.org/docs/current/interactive/sql-createindex.html#AEN74175

"Also, changes to hash indexes are not replicated over streaming or file-based
 replication after the initial base backup, so they give wrong answers to
 queries that subsequently use them. For these reasons, hash index use is presently
 discouraged."
2015-11-30 16:18:24 +09:00
Ian Barwick
8387e7f65e Add missing 'break' 2015-11-30 16:18:17 +09:00
Ian Barwick
aa4dd155b2 Remove unused variable 2015-11-30 16:17:59 +09:00
Ian Barwick
a171a501ab Shift some common but not terribly informative log messages to verbose mode only 2015-11-30 16:17:52 +09:00
Ian Barwick
f42f771ff4 Logging fixes 2015-11-30 16:17:46 +09:00
Ian Barwick
88cfcf358e Update TODO 2015-11-30 16:17:41 +09:00
Ian Barwick
ce3594d52d Add /etc/repmgr.conf as a default configuration file location
Also refactor configuration file handling while we're at it.

Previously a configuration file would be ignored if it couldn't
be opened, however that is now treated as an error.
2015-11-30 16:17:23 +09:00
Ian Barwick
f64c42a514 Simplify logger_init() parameters
We're passing the t_configuration_options structure anyway, no need to
pass items it contains as separate parameters.
2015-11-30 16:17:17 +09:00
Ian Barwick
3072139d06 Update code comments 2015-11-30 16:17:10 +09:00
Ian Barwick
3b7185fd39 Update TODO 2015-11-30 16:17:06 +09:00
Ian Barwick
819f980e76 Don't display warnings about unused command line parameters in --terse mode 2015-11-30 16:16:58 +09:00
Ian Barwick
49316fb8fb repmgr: don't error out on superfluous command line options
When parsing command line arguments in check_parameters_for_action(),
create warnings for parameters supplied but not required (e.g. -D/--data-dir
for MASTER REGISTER), rather than fail with error(s), as the
presence of the parameters won't cause any problems.

Errors will still be raised for required-but-missing parameters, of course.
2015-11-30 16:16:53 +09:00
Ian Barwick
fa4ff73b87 Remove implemented TODO item 2015-11-30 16:16:46 +09:00
Ian Barwick
29842f0e0d Metadata update also handled by repmgr 2015-11-30 16:16:37 +09:00
Ian Barwick
25db1ba737 When following a new primary, have repmgr (not repmgrd) create the new slot 2015-11-30 16:16:26 +09:00
Ian Barwick
7b9f6f5352 Minor log message fixes 2015-11-30 16:16:03 +09:00
Ian Barwick
53b8f99217 Add a TODO item 2015-11-30 16:15:57 +09:00
Ian Barwick
95cdaac91d Update TODO 2015-11-30 16:15:52 +09:00
Ian Barwick
e7dd0f690c Remove implemented items from TODO list
* repmgr: add explicit --log-level flag, repurpose --verbose flag to
  show extra detailed/repetitive output only (see item below too)

  -> e0cbdd5b31

* debug output: show some repetitive output only if --verbose flag set to prevent
  excessive log growth

  -> 8ab1901a93
2015-11-30 16:15:46 +09:00
Ian Barwick
e0c5bb8d31 Refactor get_master_connection() and update description
Use 'remote_conn' instead of 'master_conn', as the connection
handle can potentially be used for any node.
2015-11-30 16:15:36 +09:00
Ian Barwick
df3e55fa35 get_master_connection(): order node list by node type and priority
This should make it more likely that the actual primary is first
in the retrieved list, reducing the number of connections to
other nodes in the cluster which need to be made.
2015-11-30 16:15:30 +09:00
Ian Barwick
0ee2a1e6ba Code formatting 2015-11-30 16:15:25 +09:00
Ian Barwick
df05214970 Fix variable argument handling with log_hint()/log_verbose() 2015-11-30 16:15:19 +09:00
Ian Barwick
bd1314d232 get_master_connection(): possible to use is_standby() now 2015-11-30 16:15:14 +09:00
Ian Barwick
745566605d Tidy up logging output in dbutils.c
Log all executed SQL if verbose mode is enabled.
2015-11-30 16:15:09 +09:00
Ian Barwick
807dcc1038 Repurpose -v/--verbose; add -t/--terse option (repmgr only)
repmgr and particularly repmgrd currently produce substantial
amounts of log output. Much of this is only useful when troubleshooting
or debugging.

Previously the -v/--verbose option just forced the log level to
INFO. With repmgrd this is pretty pointless - just set the log
level in the configuration file. With repmgr the configuration
file can be overridden by the new -L/--log-level option.

-v/--verbose now provides an additional, chattier/pedantic level
of logging ("Opening *this* logfile", "Executing *this* query",
"running in *this* loop") which is helpful for understanding
repmgr/repmgrd's behaviour, particularly for troubleshooting.
What additional verbose logging is generated will of course
also depend on the log level set, so e.g. someone trying to
work out which configuration file is actually being opened
can use '--log-level=INFO --verbose' without being bothered
by an avalanche of extra verbose debugging output.

-t/--terse option will silence certain non-essential output, at
the moment any HINTs.

Note that -v/--verbose and -t/--terse are not mutually exclusive
(suggestions for better names welcome).
2015-11-30 16:15:03 +09:00