repmgrd configuration

repmgrd is a daemon which runs on each PostgreSQL node, monitoring the local node and (unless it is the primary node) the upstream server it is connected to (the primary server or, with cascading replication, another standby).

repmgrd can be configured to provide failover capability in case the primary or upstream node becomes unreachable, and/or to provide monitoring data to the repmgr metadatabase.

repmgrd basic configuration

To use repmgrd, its associated function library must be included via postgresql.conf with:

    shared_preload_libraries = 'repmgr'

Changing this setting requires a restart of PostgreSQL; for more details see the PostgreSQL documentation.

Automatic failover configuration

If using automatic failover, the following repmgrd options *must* be set in repmgr.conf:

    failover=automatic
    promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
    follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'

Adjust file paths as appropriate; always specify the full path to the repmgr binary.

repmgr will not apply pg_bindir when executing promote_command or follow_command; these can be user-defined scripts, so they must always be specified with the full path.

Note that the --log-to-file option will cause output generated by the repmgr command, when executed by repmgrd, to be logged to the same destination configured to receive log output for repmgrd. See repmgr.conf.sample for further repmgrd-specific settings.
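Because promote_command and follow_command may point at user-defined scripts, a wrapper can perform site-specific steps and then hand over to repmgr. The following is a minimal sketch only: the function name follow_new_primary, the variables REPMGR_BIN and REPMGR_CONF, and the script location /usr/local/bin/repmgr_follow.sh are hypothetical names chosen for illustration, and the repmgr invocation simply mirrors the example settings above. It assumes the wrapper is configured as follow_command='/usr/local/bin/repmgr_follow.sh %n', so that the node ID substituted for %n arrives as the first argument.

```shell
#!/bin/sh
# Hypothetical wrapper script for follow_command (illustrative names only).
# Assumed configuration: follow_command='/usr/local/bin/repmgr_follow.sh %n'

# Full paths, as required for commands executed by repmgrd; adjust as appropriate.
REPMGR_BIN="${REPMGR_BIN:-/usr/bin/repmgr}"
REPMGR_CONF="${REPMGR_CONF:-/etc/repmgr.conf}"

follow_new_primary() {
    new_primary_id="$1"
    if [ -z "$new_primary_id" ]; then
        echo "usage: follow_new_primary <new primary node id>" >&2
        return 2
    fi

    # Site-specific actions (notifications, cleanup, ...) would go here.

    # Always finish by running repmgr itself, so that the repmgr metadata
    # is updated as part of the follow operation.
    "$REPMGR_BIN" standby follow -f "$REPMGR_CONF" --log-to-file \
        --upstream-node-id="$new_primary_id"
}

# Run only when the script is invoked with a node ID argument.
if [ "$#" -gt 0 ]; then
    follow_new_primary "$@"
fi
```

Keeping the repmgr call as the final step means the exit status of the wrapper is that of repmgr standby follow.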
When failover is set to automatic, upon detecting failure of the current primary, repmgrd will execute one of:

- promote_command (if the current server is to become the new primary)
- follow_command (if the current server needs to follow another server which has become the new primary)

These commands can be any valid shell script which results in one of these two actions happening. However, if repmgr's standby follow or standby promote commands are not executed (either directly, as in the settings shown above, or from a script which performs other actions), the repmgr metadata will not be updated and repmgr will no longer function reliably.

The follow_command should provide the --upstream-node-id=%n option to repmgr standby follow; repmgrd will replace %n with the ID of the new primary node. If this option is not provided, repmgr will attempt to determine the new primary by itself, but if the original primary comes back online after the new primary is promoted, there is a risk that repmgr standby follow will result in the node continuing to follow the original primary.

PostgreSQL service configuration

If using automatic failover, repmgrd currently needs to execute repmgr standby follow, which restarts PostgreSQL on standbys so that they follow the new primary.

To ensure this happens smoothly, it is essential to provide the service restart command appropriate to your operating system via service_restart_command in repmgr.conf. If you don't do this, repmgrd will default to using pg_ctl, which can result in unexpected problems, particularly on systemd-based systems.

For more details, see the documentation on service command settings.

Monitoring configuration

To enable monitoring, set:

    monitoring_history=yes

in repmgr.conf.

The default monitoring interval is 2 seconds; this value can be explicitly set using:

    monitor_interval_secs=<seconds>

in repmgr.conf.

For more details on monitoring, see the documentation on repmgrd monitoring.
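To confirm which monitoring settings a given repmgr.conf would supply, the values can be read back from the file, falling back to a default when a parameter is not set. The helper below is a rough sketch rather than a faithful parser: the function name conf_value is hypothetical, it assumes one key=value setting per line with optional single quotes, and it ignores trailing comments. The fallback of 2 for monitor_interval_secs is the default stated above; the fallback of "no" for monitoring_history is an assumption.

```shell
#!/bin/sh
# Print the value of KEY from a repmgr.conf-style file, or DEFAULT if unset.
# Simplified parsing (an assumption, not repmgr's own parser): one
# "key=value" per line, optional whitespace around "=", optional single
# quotes around the value; commented-out lines do not match.
conf_value() {
    file="$1"; key="$2"; default="$3"
    # Keep the last matching assignment in the file.
    value=$(sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1)
    # Strip trailing whitespace and surrounding single quotes.
    value=$(printf '%s' "$value" | sed "s/[[:space:]]*\$//; s/^'\(.*\)'\$/\1/")
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf '%s\n' "$default"
    fi
}

# Example use; the path is the one used in the examples above.
conf="${1:-/etc/repmgr.conf}"
if [ -r "$conf" ]; then
    echo "monitoring_history:    $(conf_value "$conf" monitoring_history no)"
    echo "monitor_interval_secs: $(conf_value "$conf" monitor_interval_secs 2)"
fi
```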
Applying configuration changes to repmgrd

To apply configuration file changes to a running repmgrd daemon, execute the operating system's repmgrd service reload command (see the appendix for examples), or for instances which were manually started, execute kill -HUP, e.g.:

    kill -HUP `cat /tmp/repmgrd.pid`

Check the repmgrd log to see what changes were applied, or if any issues were encountered when reloading the configuration.

Note that only the following subset of configuration file parameters can be changed on a running repmgrd daemon:

    async_query_timeout
    bdr_local_monitoring_only
    bdr_recovery_timeout
    conninfo
    degraded_monitoring_timeout
    event_notification_command
    event_notifications
    failover
    follow_command
    log_facility
    log_file
    log_level
    log_status_interval
    monitor_interval_secs
    monitoring_history
    primary_notification_timeout
    promote_command
    reconnect_attempts
    reconnect_interval
    repmgrd_standby_startup_timeout

The following set of configuration file parameters must be updated via repmgr standby register --force, as they require changes to the repmgr.nodes table so they are visible to all nodes in the replication cluster:

    node_id
    node_name
    data_directory
    location
    priority

After executing repmgr standby register --force, repmgrd must be restarted for the changes to take effect.

repmgrd daemon

If installed from a package, repmgrd can be started via the operating system's service command, e.g. in systemd using systemctl.

See the appendix for details of service commands for different distributions.

repmgrd can be started manually like this:

    repmgrd -f /etc/repmgr.conf --pid-file /tmp/repmgrd.pid

and stopped with kill `cat /tmp/repmgrd.pid`. Adjust paths as appropriate.

repmgrd's PID file

repmgrd will generate a PID file by default.
This is a behaviour change from previous versions (earlier than 4.1), where the PID file had to be explicitly specified with the command line parameter --pid-file.

The PID file can be specified in repmgr.conf with the configuration parameter repmgrd_pid_file.

It can also be specified on the command line (as in previous versions) with the command line parameter --pid-file. Note this will override any value set in repmgr.conf with repmgrd_pid_file. --pid-file may be deprecated in future releases.

If a PID file location was specified by the package maintainer, repmgrd will use that. This only applies if repmgr was installed from a package and the package maintainer has specified the PID file location.

If none of the above apply, repmgrd will create a PID file in the operating system's temporary directory (as determined by the environment variable TMPDIR, or if that is not set, /tmp).

To prevent a PID file being generated at all, provide the command line option --no-pid-file.

To see which PID file repmgrd would use, execute repmgrd with the option --show-pid-file. repmgrd will not start if this option is provided. Note that the value shown is the file repmgrd would use the next time it starts, and is not necessarily the PID file currently in use.

repmgrd daemon configuration on Debian/Ubuntu

If repmgr was installed from Debian/Ubuntu packages, additional configuration is required before repmgrd is started as a daemon.

This is done via the file /etc/default/repmgrd, which by default looks like this:

    # default settings for repmgrd. This file is sourced by /bin/sh from
    # /etc/init.d/repmgrd

    # disable repmgrd by default so it won't get started upon installation
    # valid values: yes/no
    REPMGRD_ENABLED=no

    # configuration file (required)
    #REPMGRD_CONF="/path/to/repmgr.conf"

    # additional options
    REPMGRD_OPTS="--daemonize=false"

    # user to run repmgrd as
    #REPMGRD_USER=postgres

    # repmgrd binary
    #REPMGRD_BIN=/usr/bin/repmgrd

    # pid file
    #REPMGRD_PIDFILE=/var/run/repmgrd.pid

Set REPMGRD_ENABLED to yes, and REPMGRD_CONF to the repmgr.conf file you are using.

See the appendix for details of the Debian/Ubuntu packages and typical file locations (including repmgr.conf).

From repmgrd 4.1, ensure REPMGRD_OPTS includes --daemonize=false, as daemonization is handled by the service command.

If using systemd, you may need to execute systemctl daemon-reload. Also, if you attempted to start repmgrd using systemctl start repmgrd, you'll need to execute systemctl stop repmgrd. Because that's how systemd rolls.

repmgrd connection settings

In addition to the repmgr configuration settings, parameters in the conninfo string influence how repmgr makes a network connection to PostgreSQL. In particular, if another server in the replication cluster is unreachable at network level, system network settings will influence the length of time it takes to determine that the connection is not possible.

Explicitly setting the connect_timeout parameter should therefore be considered; the effective minimum value of 2 (seconds) will ensure that a connection failure at network level is reported as soon as possible. Otherwise, depending on the system settings (e.g. tcp_syn_retries in Linux), a delay of a minute or more is possible.

For further details on conninfo network connection parameters, see the PostgreSQL documentation.

repmgrd log rotation

To ensure the current repmgrd logfile (specified in repmgr.conf with the parameter log_file) does not grow indefinitely, configure your system's logrotate to regularly rotate it.
Sample configuration to rotate logfiles weekly, with retention for up to 52 weeks and rotation forced if a file grows beyond 100MB:

    /var/log/repmgr/repmgrd.log {
        missingok
        compress
        rotate 52
        maxsize 100M
        weekly
        create 0600 postgres postgres
        postrotate
            /usr/bin/killall -HUP repmgrd
        endscript
    }
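Before relying on a rotation configuration like the sample above, it can be checked with logrotate's debug mode, which reports what would be done without modifying the logs or the logrotate state file. The following is a sketch under assumptions: the function name dry_run_rotation and the variables ROTATE_CONF and LOGROTATE_BIN are hypothetical, and /etc/logrotate.d/repmgrd is only an assumed location for the saved configuration.

```shell
#!/bin/sh
# Assumed location of the saved rotation configuration; adjust as appropriate.
ROTATE_CONF="${ROTATE_CONF:-/etc/logrotate.d/repmgrd}"
# Name or path of the logrotate binary.
LOGROTATE_BIN="${LOGROTATE_BIN:-logrotate}"

# Run logrotate in debug mode against one configuration file.
# In debug mode logrotate only describes what it would do.
dry_run_rotation() {
    "$LOGROTATE_BIN" --debug "$1"
}

# Only attempt the check if the configuration file actually exists.
if [ -r "$ROTATE_CONF" ]; then
    dry_run_rotation "$ROTATE_CONF"
fi
```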