repmgrd: improve reconnection handling

Previously, if the server being monitored was not available, repmgrd
would always close the existing connection handle and open a new one.

However, in some cases, e.g. a brief network outage, the existing
connection handle is still good and does not need to be reopened.

This could be particularly problematic if monitoring_history is on,
as this risks leaving orphan sessions on the primary which (given
a sufficiently unstable network) could lead to all available backends
being occupied.

Instead, during an outage we now use a new connection to verify
the server is accessible; if the old connection is still available
(e.g. following a short network interruption) we continue using that;
if not (e.g. the server was restarted), we use the new one.
Ian Barwick
2018-08-30 10:24:06 +09:00
parent 216326f316
commit 0468e47ef3
8 changed files with 59 additions and 25 deletions

@@ -4074,17 +4074,20 @@ is_server_available_params(t_conninfo_param_list *param_list)
 /*
- * Simple throw-away query to stop a connection handle going stale
+ * Simple throw-away query to stop a connection handle going stale.
  */
-void
+ExecStatusType
 connection_ping(PGconn *conn)
 {
 	PGresult   *res = PQexec(conn, "SELECT TRUE");
+	ExecStatusType ping_result;
 	log_verbose(LOG_DEBUG, "connection_ping(): result is %s", PQresStatus(PQresultStatus(res)));
+	ping_result = PQresultStatus(res);
 	PQclear(res);
-	return;
+	return ping_result;
 }
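
The decision described in the commit message (keep the old handle if it still answers a ping, otherwise switch to the new one) might look roughly like the sketch below. It is illustrative only and is not taken from the repmgr source: `mock_conn`, `ping_status`, `mock_ping()` and `choose_connection()` are hypothetical stand-ins for `PGconn`, `ExecStatusType`, `connection_ping()` and the repmgrd caller, so that the logic can be compiled without libpq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a libpq connection handle (PGconn in the real code). */
typedef struct
{
	bool		link_up;	/* can this handle currently reach the server? */
} mock_conn;

/* Hypothetical stand-in for ExecStatusType; only two outcomes are modelled. */
typedef enum
{
	PING_OK,
	PING_FAILED
} ping_status;

/* Mirrors the idea of connection_ping(): try the handle and report the outcome. */
static ping_status
mock_ping(const mock_conn *conn)
{
	return (conn != NULL && conn->link_up) ? PING_OK : PING_FAILED;
}

/*
 * Pick the handle to keep using during an outage check.
 *
 * Returns NULL if the fresh connection cannot reach the server (the
 * outage is still ongoing), the old handle if it still responds, and
 * the fresh handle otherwise. The caller is assumed to close whichever
 * handle is not returned.
 */
static mock_conn *
choose_connection(mock_conn *old_conn, mock_conn *fresh_conn)
{
	assert(old_conn != NULL);

	/* Server not reachable at all: nothing to switch to yet. */
	if (mock_ping(fresh_conn) != PING_OK)
		return NULL;

	/* Old handle survived (e.g. short network interruption): keep it. */
	if (mock_ping(old_conn) == PING_OK)
		return old_conn;

	/* Old handle is dead (e.g. server was restarted): use the new one. */
	return fresh_conn;
}
```

Returning a status from the ping, rather than `void`, is what lets a caller make this choice; the exact status values the real caller checks are not shown in this hunk.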