repmgr

mirror of https://github.com/EnterpriseDB/repmgr.git synced 2026-03-23 15:16:29 +00:00

Author	SHA1	Message	Date
Ian Barwick	389c0ab9c0	Clarify use of function parameter	2020-05-11 13:37:42 +09:00
Ian Barwick	be8e5b45fa	Add utility function validate_conninfo_string()	2020-05-05 13:41:18 +09:00
Ian Barwick	cb2fb53556	Fix debug logging Per GitHub #630.	2020-04-20 11:07:51 +09:00
Ian Barwick	599bab590a	Create temporary pg.auto.conf file with the same permissions as the original Commit `0574279` set the file permissions to 0600 rather than the user's umask, but if initdb was executed with -g/--allow-group-access, the file is maintained with 0640, so we'll just maintain the existing permissions.	2020-04-07 13:29:59 +09:00
Ian Barwick	cd80f265ac	standby clone: warn about missing pg_rewind prerequisites These are not essential for cloning a standby, but useful to warn as early as possible in case the user is intending to use pg_rewind.	2020-04-06 15:37:37 +09:00
Ian Barwick	f3258c5002	cluster cleanup: explicitly log vacuum operation	2020-03-26 11:38:51 +09:00
Ian Barwick	2e9bc31c8c	Consolidate code for establishing a superuser connection	2020-03-25 11:02:31 +09:00
Ian Barwick	fb5ce720f3	standby promote: fall back to "pg_ctl promote" if necessary From PostgreSQL 12, the SQL-level function "pg_promote()" can be used to promote a PostgreSQL instance, however usage is restricted to superusers and users to whom explicit execution permission for this function has been granted. Therefore, if execution permission is not available, fall back to "pg_ctl promote".	2020-03-06 12:53:37 +09:00
Ian Barwick	7c96afc6fb	Move user information database functions into their own section Code reorganisation for easier location of functions.	2020-03-06 12:53:34 +09:00
Ian Barwick	9de31428f1	Consolidate replication connection code In a few places, replication connections are generated from the parameters used by existing connections. This has resulted in a number of similar blocks of code which do more-or-less the same thing almost but not quite identically. In two cases, the code omitted to set "dbname=replication", which can cause problems in some contexts. These code blocks have now been consolidated into standardized functions. This also resolves the issue addressed by GitHub #619.	2020-03-05 17:21:37 +09:00
Ian Barwick	76af2d9e08	cluster show: don't display witness node timeline ID The witness node is not part of the replication cluster, so its timeline ID is not of any relevance.	2020-02-25 10:33:54 +09:00
Ian Barwick	cd7f36a6fd	Add general check function "check_replication_slots_available()" Make the code previously only used by "standby follow" generally available - we'll want to use this from "node rejoin" as well. While we're at it, when reporting failure due to lack of free replication slots, report the current value of "max_replication_slots".	2020-02-03 16:43:55 +09:00
Ian Barwick	4d4ed3bcd6	Remove BDR 2.x support The BDR 2.x support was conceptual only and was never used in production. As BDR 2.x will be EOL'd shortly, there is no risk it will be needed.	2020-01-16 09:52:42 +09:00
Ian Barwick	7fdf2f1778	Update copyright notices to 2020	2020-01-13 14:06:20 +09:00
Ian Barwick	220ec7fc96	Minimize user permissions requirements for replication slots Enable operations which create or drop replication slots to be carried out with the minimum necessary user permissions, i.e. a user with the REPLICATION attribute. This can be the repmgr user, or a dedicated replication user. In the latter case, if the dedicated replication user is only permitted to make replication connections, the streaming replication protocol is used to create/drop slots. Implements part of GitHub #536.	2019-10-30 15:51:15 +09:00
Ian Barwick	f0693271d3	Clean up replication slot creation/deletion functions	2019-10-25 14:31:11 +09:00
Ian Barwick	45b9002e5b	Modify function "is_replication_role()" to reference "pg_roles" "pg_authid" is restricted to superusers.	2019-10-24 16:57:50 +09:00
Ian Barwick	cc540a54e5	Add functions for slot creation via the streaming replication protocol	2019-10-24 10:05:32 +09:00
Ian Barwick	9083f26990	Clarify usage of log_db_error() function	2019-10-24 10:04:15 +09:00
Ian Barwick	63ddc2d39e	Let _establish_db_connection() work with replication connections Previously this was limited to establish_db_connection_by_params().	2019-10-23 14:45:44 +09:00
Ian Barwick	dc11330d58	Rename replication slot create/drop functions Append "_sql" to the respective function names, as we'll later be creating equivalent functions which use the replication protocol so need a way to distinguish between them.	2019-10-23 13:43:09 +09:00
Ian Barwick	be494f0d5f	standby clone: minimize requirement to check upstream data directory location repmgr has always insisted on determining the upstream's data directory location, which requires superuser permissions (or from PostgreSQL 10, membership of the default role "pg_read_all_settings"). Knowledge of the data directory location was required to implement rsync cloning (now deprecated), but with pg_basebackup the minimum permission requirement is now only a normal user with access to the repmgr metadata and a user with replication permissions. The ability to determine the data directory location is only required if the user specifies the --copy-external-config-files option, which needs to be able to determine the data directory to work out which configuration files are located outside it. This patch makes it possible to clone a standby with minimum permissions, with appropriate checks for available permissions if --copy-external-config-files is provided. Implements part of GitHub #536 and addresses issue raised in #586.	2019-10-23 10:46:40 +09:00
Ian Barwick	52abe309df	Add function is_replication_role()	2019-10-17 17:13:18 +09:00
Ian Barwick	507b27c05d	Mark set_repmgrd_pid() as "CALLED ON NULL INPUT" When unsetting the PID, we'll want to set the pidfile to NULL rather than an empty string.	2019-08-19 20:15:30 +09:00
Ian Barwick	28f4536372	repmgrd: fix pidfile handling at shutdown	2019-08-19 17:55:18 +09:00
Ian Barwick	3df65d0eb3	Simplify pg_has_role() call Specifying CURRENT_USER is superfluous here.	2019-08-07 14:43:56 +09:00
Ian Barwick	38b373e6df	"node check": check role membership when trying to read pg_settings From PostgreSQL 10, a member of the default roles "pg_monitor" and/or "pg_read_all_settings" can read pg_settings without requiring superuser privileges. Previously, a hint was being emitted about making the repmgr user a member of one of those groups, but no check for membership was being made, meaning the check could only be run by a superuser.	2019-08-07 14:26:48 +09:00
Ian Barwick	10870503d1	Add missing field in init_replication_info() "upstream_node_id" was not being initialised.	2019-08-06 21:24:37 +09:00
Ian Barwick	6aca764d5e	Fix extension version number query	2019-06-06 12:46:12 +09:00
Ian Barwick	c153e2fc02	standby clone: improve --dry-run output Log positive check results as an additional confirmation that the upstream configuration appears to be correct.	2019-05-28 00:54:39 +09:00
Ian Barwick	c560dfbbce	cluster show: display timeline ID This helps provide a better picture of the state of the cluster, i.e. making it more obvious whether there's been a timeline divergence. This also provides infrastructure for further improvements in cluster status display and diagnosis. Note this is only available in PostgreSQL 9.6 and later as it relies on the SQL functions for interrogating pg_control, which can be executed remotely. As PostgreSQL 9.5 will shortly be the only community-supported version without these functions, it's not worth the effort of trying to duplicate their functionality.	2019-05-27 09:39:19 +09:00
Ian Barwick	c9e85996f5	repmgr: prevent a standby being cloned from a witness server Previously repmgr would happily clone from whatever server it found at the provided source server address. We should ensure that a standby can only be cloned from a node which is part of the main replication cluster. This check fetches a list of nodes from the source server, connects to the first non-witness server it finds, and compares the system identifiers of the source node and the node it has connected to. If there is a mismatch, then the source server is clearly not part of the main replication cluster, and is most likely the witness server.	2019-05-22 16:52:25 +09:00
Ian Barwick	dd78a16006	Change return type of is_downstream_node_attached() from bool to NodeAttached This enables us to better determine whether a node is definitively attached, definitively not attached, or if it was not possible to determine the attached state.	2019-05-14 15:57:20 +09:00
Ian Barwick	89a7261483	Always quote node names in log messages	2019-04-30 15:52:56 +09:00
Ian Barwick	9fe2fa2daf	daemon status: make output more like that of "cluster show" In particular make any issues with unexpected server state more obvious.	2019-04-25 14:45:41 +09:00
Ian Barwick	5a90513878	repmgrd: monitor standbys attached to primary This functionality enables repmgrd (when running on the primary) to monitor connected child nodes. It will log connections and disconnections and generate events. Additionally, repmgrd can execute a custom script if the number of connected child nodes falls below a configurable threshold. This script can be used e.g. to "fence" the primary following a failover situation where a new primary has been promoted and all standbys are now child nodes of that primary.	2019-04-22 16:18:52 +09:00
Ian Barwick	27803f93ff	repmgrd: always unset upstream node ID when monitoring a primary	2019-04-12 12:26:39 +09:00
Ian Barwick	cd6a55c7cb	repmgrd: improve primary visibility consensus check Exclude sibling nodes which report they're following a different node. This shouldn't happen, but could.	2019-04-11 15:46:14 +09:00
Ian Barwick	008bd00a59	repmgrd: store upstream node ID in shared memory	2019-04-11 15:46:09 +09:00
Ian Barwick	dd454a8374	Miscellaneous string handling cleanup This is mainly to prevent effectively spurious truncation warnings in recent GCC versions.	2019-04-10 16:18:56 +09:00
Ian Barwick	a564f365c1	Fix default return value in alter_system_int()	2019-04-01 14:50:19 +09:00
Ian Barwick	799ac6d453	Add is_server_available_quiet() For use in cases where the caller collates node availability information and doesn't want to prematurely emit log output.	2019-04-01 12:27:30 +09:00
Ian Barwick	57c0ccd477	Improve copying of strings from database results Where feasible, specify the maximum string length via sizeof(), and use snprintf() in place of strncpy().	2019-04-01 11:19:58 +09:00
Ian Barwick	ece20f4831	Cast "int" to "long long"	2019-03-28 11:02:25 +09:00
Ian Barwick	ba1f05ece9	Restrict "node_name" to maximum 63 characters In "recovery.conf", the configuration parameter "node_name" is used as the "application_name" value, which will be truncated by PostgreSQL to 63 characters (NAMEDATALEN - 1). repmgr sometimes needs to be able to extract the application name from pg_stat_replication to determine if a node is connected (e.g. when executing "repmgr standby register"), so the comparison will fail if "node_name" exceeds 63 characters.	2019-03-28 10:37:57 +09:00
Ian Barwick	e9ece34aeb	log_db_error(): fix formatted message handling	2019-03-27 11:00:31 +09:00
Ian Barwick	539861cb58	repmgrd: during failover, check if a node was already promoted Previously, repmgrd assumed that during a failover, there would not already be another primary node. However it's possible a node was promoted manually. While this is not a desirable situation, it's conceivable this could happen in the wild, so we should check for it and react accordingly. Also sanity-check that the follow target can actually be followed. Addresses issue raised in GitHub #420.	2019-03-22 14:06:41 +09:00
Ian Barwick	314a1e8f4f	use a constant to denote unknown replication lag	2019-03-20 17:26:04 +09:00
Ian Barwick	b84d98fe81	Explicitly log PQping() failures	2019-03-20 11:47:32 +09:00
Ian Barwick	46efe57cd0	Improve database connection failure logging Log the output of PQerrorMessage() in a couple of places where it was missing. Additionally, always log the output of PQerrorMessage() starting with a blank line, otherwise the first line looks like it was emitted by repmgr, and it's harder to scan the error message. Before: [2019-03-20 11:24:15] [DETAIL] could not connect to server: Connection refused Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5501? could not connect to server: Connection refused Is the server running on host "localhost" (127.0.0.1) and accepting TCP/IP connections on port 5501? After: [2019-03-20 11:27:21] [DETAIL] could not connect to server: Connection refused Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5501? could not connect to server: Connection refused Is the server running on host "localhost" (127.0.0.1) and accepting TCP/IP connections on port 5501?	2019-03-20 11:47:28 +09:00


299 Commits
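
Commit ba1f05ece9 restricts "node_name" to 63 characters because PostgreSQL truncates "application_name" to NAMEDATALEN - 1. The following is a minimal sketch of that reasoning, not the code from the commit: the `sketch_*` names are hypothetical, NAMEDATALEN is assumed to have its default value of 64, and truncation is modelled per byte (PostgreSQL's own truncation is multibyte-aware).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumption: default PostgreSQL build, where NAMEDATALEN is 64. */
#define SKETCH_NAMEDATALEN 64
#define SKETCH_MAX_NODE_NAME_LEN (SKETCH_NAMEDATALEN - 1)

/*
 * Hypothetical helper: returns true if node_name is non-empty and short
 * enough to be used as application_name without being truncated.
 */
bool
sketch_node_name_is_valid(const char *node_name)
{
	if (node_name == NULL || node_name[0] == '\0')
		return false;

	return strlen(node_name) <= SKETCH_MAX_NODE_NAME_LEN;
}

/*
 * Hypothetical helper: imitates the truncation to NAMEDATALEN - 1 bytes
 * and reports whether the truncated value still equals node_name, i.e.
 * whether a comparison against pg_stat_replication could succeed.
 */
bool
sketch_matches_after_truncation(const char *node_name)
{
	char		application_name[SKETCH_NAMEDATALEN];

	assert(node_name != NULL);

	/* snprintf() copies at most sizeof(buffer) - 1 bytes and always terminates */
	snprintf(application_name, sizeof(application_name), "%s", node_name);

	return strcmp(application_name, node_name) == 0;
}
```

Under these assumptions, a 63-character name passes both checks, while a 64-character name fails both, which is the mismatch the commit message describes for operations such as "repmgr standby register".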