Quick-start guide

This section gives a quick introduction to &repmgr;, including setting up a sample &repmgr; installation and a basic replication cluster. These instructions are for demonstration purposes and are not suitable for a production installation, as issues such as account security considerations and system administration best practices are omitted. To upgrade an existing &repmgr; 3.x installation, see the section on upgrading from &repmgr; 3.x.

Prerequisites for setting up a basic replication cluster with &repmgr;

The following section describes how to set up a basic replication cluster with a primary and a standby server using the repmgr command line tool. We'll assume the primary is called node1 with IP address 192.168.1.11, and the standby is called node2 with IP address 192.168.1.12.

The following software must be installed on both servers:

- PostgreSQL
- repmgr (matching the installed PostgreSQL major version)

At the network level, connections to the PostgreSQL port (default: 5432) must be possible in both directions.

If you want repmgr to copy configuration files which are located outside the PostgreSQL data directory, and/or to test switchover functionality, you will also need passwordless SSH connections between both servers, and rsync should be installed.

For testing repmgr, it's possible to use multiple PostgreSQL instances running on different ports on the same computer, with passwordless SSH access to localhost enabled.

PostgreSQL configuration

On the primary server, a PostgreSQL instance must be initialised and running. The following replication settings may need to be adjusted:

    # Enable replication connections; set this figure to at least one more
    # than the number of standbys which will connect to this server
    # (note that repmgr will execute `pg_basebackup` in WAL streaming mode,
    # which requires two free WAL senders)
    max_wal_senders = 10

    # Ensure WAL files contain enough information to enable read-only queries
    # on the standby.
    #
    # PostgreSQL 9.5 and earlier: one of 'hot_standby' or 'logical'
    # PostgreSQL 9.6 and later: one of 'replica' or 'logical'
    # ('hot_standby' will still be accepted as an alias for 'replica')
    #
    # See: https://www.postgresql.org/docs/current/static/runtime-config-wal.html#GUC-WAL-LEVEL
    wal_level = 'hot_standby'

    # Enable read-only queries on a standby
    # (Note: this will be ignored on a primary but we recommend including
    # it anyway)
    hot_standby = on

    # Enable WAL file archiving
    archive_mode = on

    # Set archive command to a script or application that will safely store
    # your WAL files in a secure place. /bin/true is an example of a command
    # which ignores archiving; use something more sensible.
    archive_command = '/bin/true'

    # If you have configured "pg_basebackup_options" in "repmgr.conf" to
    # include the setting "--xlog-method=fetch" (from PostgreSQL 10,
    # "--wal-method=fetch"), *and* you have not set "restore_command" in
    # "repmgr.conf" to fetch WAL files from another source such as Barman,
    # you'll need to set "wal_keep_segments" to a high enough value to
    # ensure that all WAL files generated while the standby is being cloned
    # are retained until the standby starts up.
    #
    # wal_keep_segments = 5000

Rather than editing these settings in the default postgresql.conf file, create a separate file such as postgresql.replication.conf and include it from the end of the main configuration file with:

    include 'postgresql.replication.conf'
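As a minimal sketch of this approach (assuming postgresql.conf lives in the data directory /var/lib/postgresql/data and PostgreSQL is managed by a systemd unit named "postgresql"; adjust paths and the service name to your environment), run as the postgres system user:

    # Write the replication settings to a separate file in the data directory
    cat > /var/lib/postgresql/data/postgresql.replication.conf <<'EOF'
    max_wal_senders = 10
    wal_level = 'hot_standby'
    hot_standby = on
    archive_mode = on
    archive_command = '/bin/true'
    EOF

    # Include it from the end of the main configuration file; a relative
    # path is resolved relative to the directory containing postgresql.conf
    echo "include 'postgresql.replication.conf'" >> /var/lib/postgresql/data/postgresql.conf

    # max_wal_senders, wal_level and archive_mode only take effect after a
    # full restart (a reload is not sufficient), e.g. with root privileges:
    #   systemctl restart postgresql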
Additionally, if you are intending to use pg_rewind, and the cluster was not initialised using data checksums, you may want to consider enabling wal_log_hints; for more details see the section on using pg_rewind.

Create the repmgr user and database

Create a dedicated PostgreSQL superuser account and a database for the &repmgr; metadata, e.g.

    createuser -s repmgr
    createdb repmgr -O repmgr

For the examples in this document, the name repmgr will be used for both user and database, but any names can be used.

For the sake of simplicity, the repmgr user is created as a superuser. If desired, it's possible to create the repmgr user as a normal user; however, for certain operations superuser permissions are required, in which case the command line option --superuser can be provided to specify a superuser.

It's also assumed that the repmgr user will be used to make the replication connection from the standby to the primary; again this can be overridden by specifying a separate replication user when registering each node.

&repmgr; will install the repmgr extension, which creates a repmgr schema containing the &repmgr; metadata tables as well as other functions and views. We also recommend that you set the repmgr user's search path to include this schema name, e.g.

    ALTER USER repmgr SET search_path TO repmgr, "$user", public;

Configuring authentication in pg_hba.conf

Ensure the repmgr user has appropriate permissions in pg_hba.conf and can connect in replication mode; pg_hba.conf should contain entries similar to the following:

    local   replication   repmgr                       trust
    host    replication   repmgr      127.0.0.1/32     trust
    host    replication   repmgr      192.168.1.0/24   trust

    local   repmgr        repmgr                       trust
    host    repmgr        repmgr      127.0.0.1/32     trust
    host    repmgr        repmgr      192.168.1.0/24   trust

Note that these are simple settings for testing purposes. Adjust according to your network environment and authentication requirements.

Preparing the standby

On the standby, do not create a PostgreSQL instance, but do ensure the destination data directory (and any other directories which you want PostgreSQL to use) exist and are owned by the postgres system user. Permissions must be set to 0700 (drwx------).

Check the primary database is reachable from the standby using psql:

    psql 'host=node1 user=repmgr dbname=repmgr connect_timeout=2'

&repmgr; stores connection information as libpq connection strings throughout. This documentation refers to them as conninfo strings; an alternative name is DSN (data source name). We'll use these in place of the -h hostname -d databasename -U username syntax.

repmgr configuration file

Create a repmgr.conf file on the primary server. The file must contain at least the following parameters:

    node_id=1
    node_name=node1
    conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/data'

repmgr.conf should not be stored inside the PostgreSQL data directory, as it could be overwritten when setting up or reinitialising the PostgreSQL server. See the configuration sections for further details about repmgr.conf.

For Debian-based distributions we recommend explicitly setting pg_bindir to the directory where pg_ctl and other binaries not in the standard path are located. For PostgreSQL 9.6 this would be /usr/lib/postgresql/9.6/bin/.

&repmgr; only uses pg_bindir when it executes PostgreSQL binaries directly. For user-defined scripts such as promote_command and the various service_*_command settings, you must always explicitly provide the full path to the binary or script being executed, even if it is &repmgr; itself. This is because these options can contain user-defined scripts in arbitrary locations, so prepending pg_bindir may break them.
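For example, an illustrative sketch for a Debian-style layout (the paths and service name are assumptions, and promote_command is only relevant if repmgrd is in use):

    pg_bindir='/usr/lib/postgresql/9.6/bin'

    # User-defined commands are never prefixed with pg_bindir, so spell out
    # the full path, even when the command being executed is repmgr itself
    promote_command='/usr/lib/postgresql/9.6/bin/repmgr standby promote -f /etc/repmgr.conf'
    service_start_command='sudo systemctl start postgresql@9.6-main'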
See the file repmgr.conf.sample for details of all available configuration parameters.

Register the primary server

To enable &repmgr; to support a replication cluster, the primary node must be registered with &repmgr;. This installs the repmgr extension and metadata objects, and adds a metadata record for the primary server:

    $ repmgr -f /etc/repmgr.conf primary register
    INFO: connecting to primary database...
    NOTICE: attempting to install extension "repmgr"
    NOTICE: "repmgr" extension successfully installed
    NOTICE: primary node record (id: 1) registered

Verify the status of the cluster like this:

    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Connection string
    ----+-------+---------+-----------+----------+--------------------------------------------------------
     1  | node1 | primary | * running |          | host=node1 dbname=repmgr user=repmgr connect_timeout=2

The record in the repmgr metadata table will look like this:

    repmgr=# SELECT * FROM repmgr.nodes;
    -[ RECORD 1 ]----+-------------------------------------------------------
    node_id          | 1
    upstream_node_id |
    active           | t
    node_name        | node1
    type             | primary
    location         | default
    priority         | 100
    conninfo         | host=node1 dbname=repmgr user=repmgr connect_timeout=2
    repluser         | repmgr
    slot_name        |
    config_file      | /etc/repmgr.conf

Each server in the replication cluster will have its own record. If repmgrd is in use, the fields upstream_node_id, active and type will be updated when the node's status or role changes.

Clone the standby server

Create a repmgr.conf file on the standby server. It must contain at least the same parameters as the primary's repmgr.conf, but with the mandatory values node_id, node_name, conninfo (and possibly data_directory) adjusted accordingly, e.g.:

    node_id=2
    node_name=node2
    conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/data'

Use the --dry-run option to check the standby can be cloned:

    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run
    NOTICE: using provided configuration file "/etc/repmgr.conf"
    NOTICE: destination directory "/var/lib/postgresql/data" provided
    INFO: connecting to source node
    NOTICE: checking for available walsenders on source node (2 required)
    INFO: sufficient walsenders available on source node (2 required)
    NOTICE: standby will attach to upstream node 1
    HINT: consider using the -c/--fast-checkpoint option
    INFO: all prerequisites for "standby clone" are met

If no problems are reported, the standby can then be cloned with:

    $ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone
    NOTICE: using configuration file "/etc/repmgr.conf"
    NOTICE: destination directory "/var/lib/postgresql/data" provided
    INFO: connecting to source node
    NOTICE: checking for available walsenders on source node (2 required)
    INFO: sufficient walsenders available on source node (2 required)
    INFO: creating directory "/var/lib/postgresql/data"...
    NOTICE: starting backup (using pg_basebackup)...
    HINT: this may take some time; consider using the -c/--fast-checkpoint option
    INFO: executing: pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/data -h node1 -U repmgr -X stream
    NOTICE: standby clone (using pg_basebackup) complete
    NOTICE: you can now start your PostgreSQL server
    HINT: for example: pg_ctl -D /var/lib/postgresql/data start

This has cloned the PostgreSQL data directory files from the primary node1 using PostgreSQL's pg_basebackup utility. A recovery.conf file containing the correct parameters to start streaming from this primary server will be created automatically.
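Purely as an illustration (the exact contents depend on your settings and &repmgr; version), the generated recovery.conf will contain entries along these lines:

    standby_mode = 'on'
    primary_conninfo = 'host=node1 user=repmgr connect_timeout=2 application_name=node2'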
By default, any configuration files in the primary's data directory will be copied to the standby. Typically these will be postgresql.conf, postgresql.auto.conf, pg_hba.conf and pg_ident.conf. These may require modification before the standby is started.

Make any adjustments to the standby's PostgreSQL configuration files now, then start the server.

For more details on repmgr standby clone, see the command reference. A more detailed overview of cloning options is available in the administration manual.

Verify replication is functioning

Connect to the primary server and execute:

    repmgr=# SELECT * FROM pg_stat_replication;
    -[ RECORD 1 ]----+------------------------------
    pid              | 19111
    usesysid         | 16384
    usename          | repmgr
    application_name | node2
    client_addr      | 192.168.1.12
    client_hostname  |
    client_port      | 50378
    backend_start    | 2017-08-28 15:14:19.851581+09
    backend_xmin     |
    state            | streaming
    sent_location    | 0/7000318
    write_location   | 0/7000318
    flush_location   | 0/7000318
    replay_location  | 0/7000318
    sync_priority    | 0
    sync_state       | async

This shows that the previously cloned standby (node2, shown in the field application_name) has connected to the primary from IP address 192.168.1.12.

From PostgreSQL 9.6 you can also use the view pg_stat_wal_receiver to check the replication status from the standby:

    repmgr=# SELECT * FROM pg_stat_wal_receiver;
    -[ RECORD 1 ]---------+--------------------------------------------------------------------------------
    pid                   | 18236
    status                | streaming
    receive_start_lsn     | 0/3000000
    receive_start_tli     | 1
    received_lsn          | 0/7000538
    received_tli          | 1
    last_msg_send_time    | 2017-08-28 15:21:26.465728+09
    last_msg_receipt_time | 2017-08-28 15:21:26.465774+09
    latest_end_lsn        | 0/7000538
    latest_end_time       | 2017-08-28 15:20:56.418735+09
    slot_name             |
    conninfo              | user=repmgr dbname=replication host=node1 application_name=node2

Note that the conninfo value is the one generated in recovery.conf and will differ slightly from the primary's conninfo as set in repmgr.conf; among other things it will contain the connecting node's name as application_name.

Register the standby

Register the standby server with:

    $ repmgr -f /etc/repmgr.conf standby register
    NOTICE: standby node "node2" (ID: 2) successfully registered

Check the node is registered by executing repmgr cluster show on the standby:

    $ repmgr -f /etc/repmgr.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr

Both nodes are now registered with &repmgr;, and the records have been copied to the standby server.
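As a final sanity check (a sketch only; since the &repmgr; metadata is replicated, the query can be run on either node), you can confirm that both records are present in the repmgr.nodes table, with output along these lines:

    $ psql -U repmgr -d repmgr -c 'SELECT node_id, node_name, type, active FROM repmgr.nodes'
     node_id | node_name |  type   | active
    ---------+-----------+---------+--------
           1 | node1     | primary | t
           2 | node2     | standby | t
    (2 rows)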