Compare commits

..

47 Commits

Author SHA1 Message Date
Lev Kokotov
28172cc1d5 Fix debug log 2022-08-11 22:47:22 -07:00
Lev Kokotov
a5db6881b8 Speed up CI a bit (#119)
* Sleep for 1s

* use premade image

* quicker

* revert shutdown timeout
2022-08-11 22:41:08 -07:00
zainkabani
f963b12821 Health check delay (#118)
* initial commit of server check delay implementation

* fmt

* spelling

* Update name to last_healthcheck and some comments

* Moved server tested stat to after require_healthcheck check

* Make health check delay configurable

* Rename to last_activity

* Fix typo

* Add debug log for healthcheck

* Add address to debug log
2022-08-11 14:42:40 -07:00
Lev Kokotov
a262337ba5 Update CONTRIBUTING.md 2022-08-10 09:51:56 -07:00
Nicholas Dujay
014628d6e0 fix docker compose port allocation for local dev (#117)
change docker compose port to right prometheus port
2022-08-09 14:15:34 -07:00
zainkabani
65c32ad9fb Validates pgcat is closed after shutdown python tests (#116)
* Validates pgcat is closed after shutdown python tests

* Fix pgrep logic

* Moves sigterm step to after cleanup to decouple

* Replace subprocess with os.system for running pgcat
2022-08-09 14:09:53 -07:00
Nicholas Dujay
1b166b462d create a prometheus exporter on a standard http port (#107)
* create a hyper server and add option to enable it in config

* move prometheus stuff to its own file; update format

* create metric type and help lookup table

* finish the metric help type map

* switch to a boolean and a standard port

* dont emit unimplemented metrics

* fail if curl returns a non 200

* resolve conflicts

* move log out of config.show and into main

* terminating new line

* upgrade curl

* include unimplemented stats
2022-08-09 12:19:11 -07:00
Mostafa Abdelraouf
7592339092 Prevent clients from sticking to old pools after config update (#113)
* Re-acquire pool at the beginning of Protocol loop

* Fix query router + add tests for recycling behavior
2022-08-09 12:18:27 -07:00
zainkabani
3719c22322 Implementing graceful shutdown (#105)
* Initial commit for graceful shutdown

* fmt

* Add .vscode to gitignore

* Updates shutdown logic to use channels

* fmt

* fmt

* Adds shutdown timeout

* Fmt and updates tomls

* Updates readme

* fmt and updates log levels

* Update python tests to test shutdown

* merge changes

* Rename listener rx and update bash to be in line with master

* Update python test bash script ordering

* Adds error response message before shutdown

* Add details on shutdown event loop

* Fixes response length for error

* Adds handler for sigterm

* Uses ready for query function and fixes number of bytes

* fmt
2022-08-08 16:01:24 -07:00
Mostafa Abdelraouf
106ebee71c Fix local dev (#112)
* Fix Dev env

* Update tests/sharding/query_routing_setup.sql

* Update tests/sharding/query_routing_setup.sql

* bring pgcat.toml on ci and local dev to parity

* more parity

* pool names

* pool names

* less diff

* fix tests

* fmt

* add other user to setup

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>
2022-08-08 13:15:48 -07:00
Mostafa Abdelraouf
b79f55abd6 Generate test coverage report in CircleCI (#110)
* coverage?

* generate_coverage

* +x

* 1.62.1

* 62

* ignore

* store

* quote
2022-08-08 07:51:36 -07:00
Mostafa Abdelraouf
b828e62408 Report banned addresses as disabled (#111) 2022-08-08 07:50:29 -07:00
Mostafa Abdelraouf
499612dd76 Add user to SHOW STATS query (#108)
* Add user to SHOW STATS query

* user_name => username
2022-08-03 18:16:53 -07:00
Mostafa Abdelraouf
5ac85eaadd Fix Python tests and remove CircleCI-specific path (#106)
* Remove CircleCI-specific path in tests

* ..?

* Fix testsP

* Fix python test

* remove pip

* Maybe fail?

* return code?

* no &

* Fix tests
2022-08-02 15:52:22 -07:00
Pradeep Chhetri
20e8f9d74c Sync pgcat config for docker-compose (#104) 2022-08-02 09:23:35 -07:00
Mostafa Abdelraouf
1b648ca00e Send proper server parameters to clients using admin db (#103)
* Send proper server parameters to clients using admin db

* clean up

* fix python test

* build

* Add python

* missing &

* debug ls

* fix tests

* fix tests

* fix

* Fix warning

* Address comments
2022-07-31 19:52:23 -07:00
Mostafa Abdelraouf
35381ba8fd Add test for config Serializer (#102)
* Add test for serializer

* fmt
2022-07-30 16:28:25 -07:00
Mostafa Abdelraouf
e591865d78 Avoid ValueAfterTable when serializing configs (#101) 2022-07-30 16:12:02 -07:00
Mostafa Abdelraouf
48cff1f955 Slightly more light weight health check (#100) 2022-07-29 11:58:25 -07:00
Mostafa Abdelraouf
8a06fc4047 Add Serialize trait to configs (#99) 2022-07-28 15:42:04 -07:00
Pradeep Chhetri
14d4dc45f5 Minor fix for some stats (#97) 2022-07-27 22:59:33 -07:00
Mostafa Abdelraouf
2ae4b438e3 Add support for multi-database / multi-user pools (#96)
* Add support for multi-database / multi-user pools

* Nothing

* cargo fmt

* CI

* remove test users

* rename pool

* Update tests to use admin user/pass

* more fixes

* Revert bad change

* Use PGDATABASE env var

* send server info in case of admin
2022-07-27 19:47:55 -07:00
Lev Kokotov
c5be5565a5 Update Dockerfile 2022-07-25 22:25:59 -07:00
dependabot[bot]
eff8e3e229 Bump activerecord from 7.0.2.2 to 7.0.3.1 in /tests/ruby (#94)
Bumps [activerecord](https://github.com/rails/rails) from 7.0.2.2 to 7.0.3.1.
- [Release notes](https://github.com/rails/rails/releases)
- [Changelog](https://github.com/rails/rails/blob/v7.0.3.1/activerecord/CHANGELOG.md)
- [Commits](https://github.com/rails/rails/compare/v7.0.2.2...v7.0.3.1)

---
updated-dependencies:
- dependency-name: activerecord
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-12 13:24:41 -07:00
Marco Montagna
ae3db111ac Merge pull request #91 from levkk/levkk-tls-2
Support for TLS
2022-06-27 17:12:50 -07:00
Lev
8bcfbed574 forgotten comment 2022-06-27 17:07:40 -07:00
Lev
773602dedf Im about to get a nasty email 2022-06-27 17:06:49 -07:00
Lev
21bf07258c lock em up 2022-06-27 17:05:45 -07:00
Lev
186f8be5b3 lint 2022-06-27 17:01:40 -07:00
Lev
7667fefead config check 2022-06-27 17:01:14 -07:00
Lev
c11d595ac7 bye 2022-06-27 16:46:03 -07:00
Lev
8f3202ed92 hmm 2022-06-27 16:45:41 -07:00
Lev
eb58920870 at least it compiles 2022-06-27 15:52:01 -07:00
Lev Kokotov
b974aacd71 check 2022-06-27 09:46:33 -07:00
Lev Kokotov
7dfe59a91a Fix stats dymanic reload (#87) 2022-06-25 12:22:46 -07:00
Lev Kokotov
5bcd3bf9c3 Automatically reload config every seconds (disabled by default) (#86)
* Automatically reload config every seconds (disabld by default)

* add that
2022-06-25 11:46:20 -07:00
Lev Kokotov
f06f64119c Fix panic & query router bug (#85)
* Fix query router bug

* Fix panic
2022-06-24 15:14:31 -07:00
Lev Kokotov
b93303eb83 Live reloading entire config and bug fixes (#84)
* Support reloading the entire config (including sharding logic) without restart.

* Fix bug incorrectly handing error reporting when the shard is set incorrectly via SET SHARD TO command.
selected wrong shard and the connection keep reporting fatal #80.

* Fix total_received and avg_recv admin database statistics.

* Enabling the query parser by default.

* More tests.
2022-06-24 14:52:38 -07:00
Lev Kokotov
d865d9f9d8 readme 2022-06-20 06:20:12 -07:00
Lev Kokotov
d3310a62c2 Client md5 auth and clean up scram (#77)
* client md5 auth and clean up scram

* add pw

* add user

* add user

* log
2022-06-20 06:15:54 -07:00
Lev Kokotov
d412238f47 Implement SCRAM-SHA-256 for server authentication (PG14) (#76)
* Implement SCRAM-SHA-256

* test it

* trace

* move to community for auth

* hmm
2022-06-18 18:36:00 -07:00
dependabot[bot]
7782933f59 Bump regex from 1.5.4 to 1.5.5 (#75)
Bumps [regex](https://github.com/rust-lang/regex) from 1.5.4 to 1.5.5.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.5.4...1.5.5)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-06 19:59:50 -07:00
Lev Kokotov
bac4e1f52c Only set application_name if it's different (#74)
* Only set application_name if it's different

* keep server named pgcat until something else changes
2022-06-05 09:48:06 -07:00
Lev Kokotov
37e3a86881 Pass application_name to server (#73)
* Pass application_name to server

* fmt
2022-06-03 00:15:50 -07:00
Lev Kokotov
61db13f614 Fix memory leak in client/server mapping (#71) 2022-05-18 16:24:03 -07:00
Lev Kokotov
fe32b5ef17 Reduce traffic on the stats channel (#69) 2022-05-17 13:05:25 -07:00
Lev Kokotov
54699222f8 Possible fix for clients waiting stat leak (#68) 2022-05-14 21:35:33 -07:00
39 changed files with 3762 additions and 942 deletions

@@ -9,16 +9,20 @@ jobs:
# Specify the execution environment. You can specify an image from Dockerhub or use one of our Convenience Images from CircleCI's Developer Hub.
# See: https://circleci.com/docs/2.0/configuration-reference/#docker-machine-macos-windows-executor
docker:
- image: cimg/rust:1.58.1
- image: levkk/pgcat-ci:latest
environment:
RUST_LOG: info
- image: cimg/postgres:14.0
auth:
username: mydockerhub-user
password: $DOCKERHUB_PASSWORD
RUSTFLAGS: "-C instrument-coverage"
LLVM_PROFILE_FILE: "pgcat-%m.profraw"
- image: postgres:14
# auth:
# username: mydockerhub-user
# password: $DOCKERHUB_PASSWORD
environment:
POSTGRES_USER: postgres
POSTGRES_DB: postgres
POSTGRES_PASSWORD: postgres
POSTGRES_HOST_AUTH_METHOD: scram-sha-256
# Add steps to the job
# See: https://circleci.com/docs/2.0/configuration-reference/#steps
steps:
@@ -30,16 +34,19 @@ jobs:
command: "cargo fmt --check"
- run:
name: "Install dependencies"
command: "sudo apt-get update && sudo apt-get install -y psmisc postgresql-contrib-12 postgresql-client-12 ruby ruby-dev libpq-dev"
command: "sudo apt-get update && sudo apt-get install -y psmisc postgresql-contrib-12 postgresql-client-12 ruby ruby-dev libpq-dev python3 python3-pip lcov llvm-11 && sudo apt-get upgrade curl"
- run:
name: "Build"
name: "Install rust tools"
command: "cargo install cargo-binutils rustfilt && rustup component add llvm-tools-preview"
- run:
name: "Build"
command: "cargo build"
- run:
name: "Test"
command: "cargo test"
- run:
name: "Test end-to-end"
command: "bash .circleci/run_tests.sh"
name: "Tests"
command: "cargo test && bash .circleci/run_tests.sh && .circleci/generate_coverage.sh"
- store_artifacts:
path: /tmp/cov
destination: coverage-data
- save_cache:
key: cargo-lock-2-{{ checksum "Cargo.lock" }}
paths:

7
.circleci/generate_coverage.sh Executable file
@@ -0,0 +1,7 @@
#!/bin/bash
rust-profdata merge -sparse pgcat-*.profraw -o pgcat.profdata
rust-cov export -ignore-filename-regex="rustc|registry" -Xdemangler=rustfilt -instr-profile=pgcat.profdata --object ./target/debug/pgcat --format lcov > ./lcov.info
genhtml lcov.info --output-directory /tmp/cov --prefix $(pwd)

@@ -5,74 +5,51 @@
#
# General pooler settings
[general]
# What IP to run on, 0.0.0.0 means accessible from everywhere.
host = "0.0.0.0"
# Port to run on, same as PgBouncer used in this example.
port = 6432
# How many connections to allocate per server.
pool_size = 15
# Pool mode (see PgBouncer docs for more).
# session: one server connection per connected client
# transaction: one server connection per client transaction
pool_mode = "transaction"
# enable prometheus exporter on port 9930
enable_prometheus_exporter = true
# How long to wait before aborting a server connection (ms).
connect_timeout = 100
# How much time to give `SELECT 1` health check query to return with a result (ms).
# How much time to give the health check query to return with a result (ms).
healthcheck_timeout = 100
# How long to keep connection available for immediate re-use, without running a healthcheck query on it
healthcheck_delay = 30000
# How much time to give clients during shutdown before forcibly killing client connections (ms).
shutdown_timeout = 5000
# For how long to ban a server if it fails a health check (seconds).
ban_time = 60 # Seconds
#
# User to use for authentication against the server.
[user]
name = "sharding_user"
password = "sharding_user"
# Reload config automatically if it changes.
autoreload = true
# TLS
tls_certificate = ".circleci/server.cert"
tls_private_key = ".circleci/server.key"
#
# Shards in the cluster
[shards]
# Credentials to access the virtual administrative database (pgbouncer or pgcat)
# Connecting to that database allows running commands like `SHOW POOLS`, `SHOW DATABASES`, etc..
admin_username = "admin_user"
admin_password = "admin_pass"
# Shard 0
[shards.0]
# [ host, port, role ]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5433, "replica" ],
# [ "127.0.1.1", 5432, "replica" ],
]
# Database name (e.g. "postgres")
database = "shard0"
[shards.1]
# [ host, port, role ]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5433, "replica" ],
# [ "127.0.1.1", 5432, "replica" ],
]
database = "shard1"
[shards.2]
# [ host, port, role ]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5433, "replica" ],
# [ "127.0.1.1", 5432, "replica" ],
]
database = "shard2"
# Settings for our query routing layer.
[query_router]
# pool
# configs are structured as pool.<pool_name>
# the pool_name is what clients use as database name when connecting
# For the example below a client can connect using "postgres://sharding_user:sharding_user@pgcat_host:pgcat_port/sharded_db"
[pools.sharded_db]
# Pool mode (see PgBouncer docs for more).
# session: one server connection per connected client
# transaction: one server connection per client transaction
pool_mode = "transaction"
# If the client doesn't specify, route traffic to
# this role by default.
@@ -82,12 +59,11 @@ database = "shard2"
# primary: all queries go to the primary unless otherwise specified.
default_role = "any"
# Query parser. If enabled, we'll attempt to parse
# every incoming query to determine if it's a read or a write.
# If it's a read query, we'll direct it to a replica. Otherwise, if it's a write,
# we'll direct it to the primary.
query_parser_enabled = false
query_parser_enabled = true
# If the query parser is enabled and this setting is enabled, the primary will be part of the pool of databases used for
# load balancing of read queries. Otherwise, the primary will only be used for write
@@ -103,3 +79,61 @@ primary_reads_enabled = true
# sha1: A hashing function based on SHA1
#
sharding_function = "pg_bigint_hash"
# Credentials for users that may connect to this cluster
[pools.sharded_db.users.0]
username = "sharding_user"
password = "sharding_user"
# Maximum number of server connections that can be established for this user
# The maximum number of connection from a single Pgcat process to any database in the cluster
# is the sum of pool_size across all users.
pool_size = 9
[pools.sharded_db.users.1]
username = "other_user"
password = "other_user"
pool_size = 21
# Shard 0
[pools.sharded_db.shards.0]
# [ host, port, role ]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ]
]
# Database name (e.g. "postgres")
database = "shard0"
[pools.sharded_db.shards.1]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ],
]
database = "shard1"
[pools.sharded_db.shards.2]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ],
]
database = "shard2"
[pools.simple_db]
pool_mode = "session"
default_role = "primary"
query_parser_enabled = true
primary_reads_enabled = true
sharding_function = "pg_bigint_hash"
[pools.simple_db.users.0]
username = "simple_user"
password = "simple_user"
pool_size = 5
[pools.simple_db.shards.0]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ]
]
database = "some_db"
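
The `pool_size` comment in this config says the maximum number of server connections from one PgCat process to any database is the sum of `pool_size` across all users. A minimal sketch of that arithmetic for the values above follows. The `sum_pool_size` helper and its line-based `awk` parsing are illustrative assumptions; this is not how PgCat reads its config.

```shell
#!/bin/bash
# Inline copy of the user tables from the config above.
config='
[pools.sharded_db.users.0]
username = "sharding_user"
password = "sharding_user"
pool_size = 9

[pools.sharded_db.users.1]
username = "other_user"
password = "other_user"
pool_size = 21

[pools.simple_db.users.0]
username = "simple_user"
password = "simple_user"
pool_size = 5
'

# sum_pool_size POOL_NAME: read TOML text on stdin and print the sum of
# pool_size over every [pools.POOL_NAME.users.N] table.
# Hypothetical helper with naive line-based parsing, for illustration only.
sum_pool_size() {
    awk -v pool="$1" '
        # A new table header decides whether we are inside a matching users table.
        /^\[/ { in_users = (index($0, "[pools." pool ".users.") == 1) }
        # Lines look like: pool_size = 9  (fields: name, "=", value)
        in_users && $1 == "pool_size" { total += $3 }
        END { print total + 0 }
    '
}

total=$(printf '%s\n' "$config" | sum_pool_size sharded_db)
echo "sharded_db: up to $total server connections per database"
```

For `sharded_db` this adds 9 and 21; `simple_db` has a single user, so its limit is that user's `pool_size`.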

@@ -12,7 +12,8 @@ function start_pgcat() {
}
# Setup the database with shards and user
psql -e -h 127.0.0.1 -p 5432 -U postgres -f tests/sharding/query_routing_setup.sql
PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 5432 -U postgres -f tests/sharding/query_routing_setup.sql
PGPASSWORD=sharding_user pgbench -h 127.0.0.1 -U sharding_user shard0 -i
PGPASSWORD=sharding_user pgbench -h 127.0.0.1 -U sharding_user shard1 -i
PGPASSWORD=sharding_user pgbench -h 127.0.0.1 -U sharding_user shard2 -i
@@ -30,56 +31,81 @@ toxiproxy-cli create -l 127.0.0.1:5433 -u 127.0.0.1:5432 postgres_replica
start_pgcat "info"
# Check that prometheus is running
curl --fail localhost:9930/metrics
export PGPASSWORD=sharding_user
export PGDATABASE=sharded_db
# pgbench test
pgbench -i -h 127.0.0.1 -p 6432
pgbench -h 127.0.0.1 -p 6432 -t 500 -c 2 --protocol simple -f tests/pgbench/simple.sql
pgbench -h 127.0.0.1 -p 6432 -t 500 -c 2 --protocol extended
pgbench -U sharding_user -i -h 127.0.0.1 -p 6432
pgbench -U sharding_user -h 127.0.0.1 -p 6432 -t 500 -c 2 --protocol simple -f tests/pgbench/simple.sql
pgbench -U sharding_user -h 127.0.0.1 -p 6432 -t 500 -c 2 --protocol extended
# COPY TO STDOUT test
psql -h 127.0.0.1 -p 6432 -c 'COPY (SELECT * FROM pgbench_accounts LIMIT 15) TO STDOUT;' > /dev/null
psql -U sharding_user -h 127.0.0.1 -p 6432 -c 'COPY (SELECT * FROM pgbench_accounts LIMIT 15) TO STDOUT;' > /dev/null
# Query cancellation test
(psql -h 127.0.0.1 -p 6432 -c 'SELECT pg_sleep(5)' || true) &
(psql -U sharding_user -h 127.0.0.1 -p 6432 -c 'SELECT pg_sleep(50)' || true) &
sleep 1
killall psql -s SIGINT
# Reload pool (closing unused server connections)
PGPASSWORD=admin_pass psql -U admin_user -h 127.0.0.1 -p 6432 -d pgbouncer -c 'RELOAD'
(psql -U sharding_user -h 127.0.0.1 -p 6432 -c 'SELECT pg_sleep(50)' || true) &
sleep 1
killall psql -s SIGINT
# Sharding insert
psql -e -h 127.0.0.1 -p 6432 -f tests/sharding/query_routing_test_insert.sql
psql -U sharding_user -e -h 127.0.0.1 -p 6432 -f tests/sharding/query_routing_test_insert.sql
# Sharding select
psql -e -h 127.0.0.1 -p 6432 -f tests/sharding/query_routing_test_select.sql > /dev/null
psql -U sharding_user -e -h 127.0.0.1 -p 6432 -f tests/sharding/query_routing_test_select.sql > /dev/null
# Replica/primary selection & more sharding tests
psql -e -h 127.0.0.1 -p 6432 -f tests/sharding/query_routing_test_primary_replica.sql > /dev/null
psql -U sharding_user -e -h 127.0.0.1 -p 6432 -f tests/sharding/query_routing_test_primary_replica.sql > /dev/null
#
# ActiveRecord tests
#
cd tests/ruby && \
sudo gem install bundler && \
bundle install && \
ruby tests.rb && \
cd tests/ruby
sudo gem install bundler
bundle install
ruby tests.rb
cd ../..
#
# Python tests
# These tests will start and stop the pgcat server so it will need to be restarted after the tests
#
pip3 install -r tests/python/requirements.txt
python3 tests/python/tests.py
start_pgcat "info"
# Admin tests
psql -e -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW STATS' > /dev/null
psql -h 127.0.0.1 -p 6432 -d pgbouncer -c 'RELOAD' > /dev/null
psql -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW CONFIG' > /dev/null
psql -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW DATABASES' > /dev/null
psql -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW LISTS' > /dev/null
psql -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW POOLS' > /dev/null
psql -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW VERSION' > /dev/null
psql -h 127.0.0.1 -p 6432 -d pgbouncer -c "SET client_encoding TO 'utf8'" > /dev/null # will ignore
(! psql -e -h 127.0.0.1 -p 6432 -d random_db -c 'SHOW STATS' > /dev/null)
export PGPASSWORD=admin_pass
psql -U admin_user -e -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW STATS' > /dev/null
psql -U admin_user -h 127.0.0.1 -p 6432 -d pgbouncer -c 'RELOAD' > /dev/null
psql -U admin_user -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW CONFIG' > /dev/null
psql -U admin_user -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW DATABASES' > /dev/null
psql -U admin_user -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW LISTS' > /dev/null
psql -U admin_user -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW POOLS' > /dev/null
psql -U admin_user -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW VERSION' > /dev/null
psql -U admin_user -h 127.0.0.1 -p 6432 -d pgbouncer -c "SET client_encoding TO 'utf8'" > /dev/null # will ignore
(! psql -U admin_user -e -h 127.0.0.1 -p 6432 -d random_db -c 'SHOW STATS' > /dev/null)
export PGPASSWORD=sharding_user
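
The `random_db` check in the admin tests above wraps its command in `(! ... )`. Assuming the script runs under `set -e` (not visible in this hunk), the subshell matters: bash exempts a command inverted with `!` from errexit, so a bare `! cmd` would never abort the script. The subshell's exit status is not exempt. The sketch below shows the difference; `run_case` is a hypothetical helper, not part of the test script.

```shell
#!/bin/bash
# run_case SNIPPET: run SNIPPET in a child bash with errexit enabled and
# report whether execution continued past it.
run_case() {
    if bash -ec "$1; echo reached-end" 2>/dev/null | grep -q reached-end; then
        echo "continued"
    else
        echo "aborted"
    fi
}

# Bare inversion: the status is 1, but `!` is exempt from errexit.
echo "bare '! true':      $(run_case '! true')"

# Subshell inversion: the subshell exits 1 and errexit fires in the parent.
echo "subshell (! true):  $(run_case '(! true)')"

# Subshell inversion of a failing command: the expected failure happened.
echo "subshell (! false): $(run_case '(! false)')"
```

Here `true` stands in for a `psql` call that unexpectedly succeeds, and `false` for one that fails as intended.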
# Start PgCat in debug to demonstrate failover better
start_pgcat "debug"
start_pgcat "trace"
# Add latency to the replica at port 5433 slightly above the healthcheck timeout
toxiproxy-cli toxic add -t latency -a latency=300 postgres_replica
sleep 1
# Note the failover in the logs
timeout 5 psql -e -h 127.0.0.1 -p 6432 <<-EOF
timeout 5 psql -U sharding_user -e -h 127.0.0.1 -p 6432 <<-EOF
SELECT 1;
SELECT 1;
SELECT 1;
@@ -91,13 +117,15 @@ toxiproxy-cli toxic remove --toxicName latency_downstream postgres_replica
start_pgcat "info"
# Test session mode (and config reload)
sed -i 's/pool_mode = "transaction"/pool_mode = "session"/' pgcat.toml
sed -i 's/pool_mode = "transaction"/pool_mode = "session"/' .circleci/pgcat.toml
# Reload config test
kill -SIGHUP $(pgrep pgcat)
sleep 1
# Prepared statements that will only work in session mode
pgbench -h 127.0.0.1 -p 6432 -t 500 -c 2 --protocol prepared
pgbench -U sharding_user -h 127.0.0.1 -p 6432 -t 500 -c 2 --protocol prepared
# Attempt clean shut down
killall pgcat -s SIGINT
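
The session-mode step above edits `pool_mode` with `sed` and then sends `SIGHUP` to reload the config. A minimal sketch of the rewrite step with a verification before signalling follows. `set_pool_mode` is a hypothetical helper and the demo file is a throwaway stand-in for `.circleci/pgcat.toml`; neither is part of the script in this diff.

```shell
#!/bin/bash
# set_pool_mode FILE FROM TO: rewrite every `pool_mode = "FROM"` line to TO,
# then succeed only if no FROM entries remain. Hypothetical helper.
set_pool_mode() {
    file="$1"; from="$2"; to="$3"
    tmp=$(mktemp) || return 1
    # Write to a temp file and move it into place instead of relying on `sed -i`.
    sed "s/pool_mode = \"$from\"/pool_mode = \"$to\"/" "$file" > "$tmp" || return 1
    mv "$tmp" "$file" || return 1
    ! grep -q "pool_mode = \"$from\"" "$file"
}

# Throwaway file holding the two pool_mode lines from the config in this diff.
demo=$(mktemp)
printf '%s\n' \
    '[pools.sharded_db]' 'pool_mode = "transaction"' \
    '[pools.simple_db]' 'pool_mode = "session"' > "$demo"

if set_pool_mode "$demo" transaction session; then
    # The test script follows the rewrite with: kill -SIGHUP $(pgrep pgcat)
    echo "pool_mode rewritten; a SIGHUP would now trigger the reload"
fi
cat "$demo"
```

Checking the file before signalling makes a silent no-op visible, for example when `sed` is pointed at the wrong config path.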

21
.circleci/server.cert Normal file
@@ -0,0 +1,21 @@
-----BEGIN CERTIFICATE-----
MIIDazCCAlOgAwIBAgIUChIvUGFJGJe5EDch32rchqoxER0wDQYJKoZIhvcNAQEL
BQAwRTELMAkGA1UEBhMCQVUxEzARBgNVBAgMClNvbWUtU3RhdGUxITAfBgNVBAoM
GEludGVybmV0IFdpZGdpdHMgUHR5IEx0ZDAeFw0yMjA2MjcyMjI2MDZaFw0yMjA3
MjcyMjI2MDZaMEUxCzAJBgNVBAYTAkFVMRMwEQYDVQQIDApTb21lLVN0YXRlMSEw
HwYDVQQKDBhJbnRlcm5ldCBXaWRnaXRzIFB0eSBMdGQwggEiMA0GCSqGSIb3DQEB
AQUAA4IBDwAwggEKAoIBAQDdTwrBzV1v79faVckFvIn/9V4fypYs4vDi3X+h3wGn
AjEh6mmizlKCwSwAam07D9Q5zKiXFrzNJqzSioOv5zsOAvObwrnzbtKSwfs3aP5g
eEh2clHCZYx9p06WszPcgSB5nTz1NeY4XAwvGn3A+SVCLyPMTNwnem48+ONh2F9u
FHtSuIsEVvTjMlH09O7LjwJlODxy3HNv2JHYM5Hx9tzc+NVYdERPtaVcX8ycw1Eh
9hgGSgfaNM52/JfRMIDhENrsn0S1omRUtcJe72loreiwrECUOLAnAfp9Xqc+rMPP
aLA6ElzmYef1+ZEC0p6isCHPhxY5ESVhKYhE9nQvksjnAgMBAAGjUzBRMB0GA1Ud
DgQWBBQLDtzexqjx7xPtUZuZB/angU9oSDAfBgNVHSMEGDAWgBQLDtzexqjx7xPt
UZuZB/angU9oSDAPBgNVHRMBAf8EBTADAQH/MA0GCSqGSIb3DQEBCwUAA4IBAQC/
mxY/a/WeLENVj2Gg9EUH0CKzfqeTey1mb6YfPGxzrD7oq1m0Vn2MmTbjZrJgh/Ob
QckO3ElF4kC9+6XP+iDPmabGpjeLgllBboT5l2aqnD1syMrf61WPLzgRzRfplYGy
cjBQDDKPu8Lu0QRMWU28tHYN0bMxJoCuXysGGX5WsuFnKCA6f/V+nycJJXxJH3eB
eLjTueD9/RE3OXhi6m8A29Q1E9AE5EF4uRxYXrr91BmYnk4aFvSmBxhUEzE12eSN
lHB/uSc0+Dp+UVmVr6wW8AQfd16UBA0BUf3kSW3aSvirYPYH0rXiOOpEJgOwOMnR
f5+XAbN1Y+3OsFz/ZmP9
-----END CERTIFICATE-----

28
.circleci/server.key Normal file
@@ -0,0 +1,28 @@
-----BEGIN RSA PRIVATE KEY-----
MIIEvwIBADANBgkqhkiG9w0BAQEFAASCBKkwggSlAgEAAoIBAQDdTwrBzV1v79fa
VckFvIn/9V4fypYs4vDi3X+h3wGnAjEh6mmizlKCwSwAam07D9Q5zKiXFrzNJqzS
ioOv5zsOAvObwrnzbtKSwfs3aP5geEh2clHCZYx9p06WszPcgSB5nTz1NeY4XAwv
Gn3A+SVCLyPMTNwnem48+ONh2F9uFHtSuIsEVvTjMlH09O7LjwJlODxy3HNv2JHY
M5Hx9tzc+NVYdERPtaVcX8ycw1Eh9hgGSgfaNM52/JfRMIDhENrsn0S1omRUtcJe
72loreiwrECUOLAnAfp9Xqc+rMPPaLA6ElzmYef1+ZEC0p6isCHPhxY5ESVhKYhE
9nQvksjnAgMBAAECggEAbnvddO9frFhivJ+DIhgEFQKcIOb0nigV9kx6QYehvYy8
lp/+aMb0Lk7d9r8rFQdL/icMK5GwZALg2KNKJvEbbF1Q3PwT9VHoUlgBYKJMDEFA
e9GKu7ASuVBjTZzdUUItwkkbe5eS/aQGeSWSjlpTnX0HNCFS72qRymK+scRhsAQf
ZoHyZHDslkvPR3Pos+sndWBYCDHag5/KoPhsMt1+5S9NQcOUHx9Ac0gLHjau3N+P
0FhODHFFGnnpyQvLvj6u3ZOR34ladMgoBglE0O3vPFhckn92EK4teeTWOsUMotiz
qM3QIJTOJjtiY6VDGY93bIa4pFvt7Zi4vIerenKt0QKBgQD/UMFqfevTAMrk10AC
bOa4+cM07ORY4ZwVj5ILhZn+8crDEEtBsUyuEU2FTINtnoEq1yGc/IXpsyS1BHjL
L1xSml5LN3jInbi8z5XQfY5Sj3VOMtwY6yD20jcdeDC44rz3nStXdkcMWxbTMapx
iOPsap5ciUKOMS7LyMidPEG/LQKBgQDd5vHgrLN0FBIIm+vZg6MEm4QyobstVp4l
7V/GZsdL+M8AQv1Rx+5wSUSWKomOIv5lglis7f6g0c9O7Qkr78/wzoyoKC2RRqPp
I90GjY2Iv22N4GIkRrDAgMZbkTitzIB6tbXEVeLAOh3frFJ8IwauRCOiXIjrZdJ4
FvV86+nU4wKBgQDdWTP2kWkMrBk7QOp7r9Jv+AmnLuHhtOdPQgOJ/bA++X2ik9PL
Bl3GY7XjpSwks1CkxZKcucmXjPp7/X6EGXFfI/owF82dkDADca0e7lufdERtIWb0
K5WOpz2lTPhgsiLGQfq7fw2lxqsJOnvcpqOD6gOVkmKjSDyb7F0RBJazmQKBgQDD
a8PQTcesjpBjLI3EfX1vbVY7ENu6zfFxDV+vZoxVh8UlQdm90AlYse3JIaUKnB7W
Xrihcucv0hZ0N6RAIW5LcFvHK7sVmdR4WbEpODhRGeTtcZJ8yBSZM898jKQRy2vK
pYRyaADNsWDlvujVkjMr/a40KrIaPQ3h3LZNUaYYaQKBgQD1x8A5S5SiE1cN1vFr
aACkmA2WqEDKKhUsUigJdwW6WB/B9kWlIlz/iV1H9uwBXtSIYG4VqCSTAvh0z4gX
Qu2SrdPm5PYnKzpdynpz78OnGdflD1RKWFGHItR6GN6tj/VmulO6mlFvT4jzBQ7j
+Hf8m2TcD4U3ksz3xw+YOD+cmA==
-----END RSA PRIVATE KEY-----

2
.gitignore vendored
@@ -1,2 +1,4 @@
.idea
/target
*.deb
.vscode

@@ -10,10 +10,4 @@ Happy hacking!
## TODOs
A non-exhaustive list of things that would be useful to implement:
#### Client authentication
MD5 is probably sufficient, but maybe others too.
#### Admin
Admin database for stats collection and pooler administration. PgBouncer gives us a nice example on how to do that, specifically how to implement `RowDescription` and `DataRow` messages, [example here](https://github.com/pgbouncer/pgbouncer/blob/4f9ced8e63d317a6ff45c8b0efa876b32161f6db/src/admin.c#L813).
See [Issues]([url](https://github.com/levkk/pgcat/issues)).

518
Cargo.lock generated
@@ -45,6 +45,12 @@ version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cdb031dd78e28731d87d56cc8ffef4a8f36ca26c38fe2de700543e627f8a464a"
[[package]]
name = "base64"
version = "0.13.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "904dfeac50f3cdaba28fc6f57fdcddb75f49ed61346676a78c4ffe55877802fd"
[[package]]
name = "bb8"
version = "0.7.1"
@@ -73,12 +79,24 @@ dependencies = [
"generic-array",
]
[[package]]
name = "bumpalo"
version = "3.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "37ccbd214614c6783386c1af30caf03192f17891059cecc394b4fb119e363de3"
[[package]]
name = "bytes"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4872d67bab6358e59559027aa3b9157c53d9358c51423c17554809a8858e0f8"
[[package]]
name = "cc"
version = "1.0.73"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2fff2a6927b3bb87f9595d67196a70493f627687a71d87a0d692242c33f58c11"
[[package]]
name = "cfg-if"
version = "1.0.0"
@@ -109,22 +127,23 @@ dependencies = [
[[package]]
name = "crypto-common"
version = "0.1.1"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "683d6b536309245c849479fba3da410962a43ed8e51c26b729208ec0ac2798d0"
checksum = "57952ca27b5e3606ff4dd79b0020231aaf9d6aa76dc05fd30137538c50bd3ce8"
dependencies = [
"generic-array",
"typenum",
]
[[package]]
name = "digest"
version = "0.10.1"
version = "0.10.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b697d66081d42af4fba142d56918a3cb21dc8eb63372c6b85d14f44fb9c5979b"
checksum = "f2fb860ca6fafa5552fb6d0e816a69c8e49f0908bf524e30a90d97c85892d506"
dependencies = [
"block-buffer",
"crypto-common",
"generic-array",
"subtle",
]
[[package]]
@@ -140,6 +159,12 @@ dependencies = [
"termcolor",
]
[[package]]
name = "fnv"
version = "1.0.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3f9eec918d3f24069decb9af1554cad7c880e2da24a9afd88aca000531ab82c1"
[[package]]
name = "futures-channel"
version = "0.3.19"
@@ -155,6 +180,12 @@ version = "0.3.19"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d0c8ff0461b82559810cdccfde3215c3f373807f5e5232b71479bff7bb2583d7"
[[package]]
name = "futures-sink"
version = "0.3.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "21163e139fa306126e6eedaf49ecdb4588f939600f0b1e770f4205ee4b7fa868"
[[package]]
name = "futures-task"
version = "0.3.19"
@@ -196,6 +227,31 @@ dependencies = [
"wasi",
]
[[package]]
name = "h2"
version = "0.3.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "37a82c6d637fc9515a4694bbf1cb2457b79d81ce52b3108bdeea58b07dd34a57"
dependencies = [
"bytes",
"fnv",
"futures-core",
"futures-sink",
"futures-util",
"http",
"indexmap",
"slab",
"tokio",
"tokio-util",
"tracing",
]
[[package]]
name = "hashbrown"
version = "0.12.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8a9ee70c43aaf417c914396645a0fa852624801b24ebb7ae78fe8272889ac888"
[[package]]
name = "hermit-abi"
version = "0.1.19"
@@ -205,12 +261,89 @@ dependencies = [
"libc",
]
[[package]]
name = "hmac"
version = "0.12.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6c49c37c09c17a53d937dfbb742eb3a961d65a994e6bcdcf37e7399d0cc8ab5e"
dependencies = [
"digest",
]
[[package]]
name = "http"
version = "0.2.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75f43d41e26995c17e71ee126451dd3941010b0514a81a9d11f3b341debc2399"
dependencies = [
"bytes",
"fnv",
"itoa",
]
[[package]]
name = "http-body"
version = "0.4.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d5f38f16d184e36f2408a55281cd658ecbd3ca05cce6d6510a176eca393e26d1"
dependencies = [
"bytes",
"http",
"pin-project-lite",
]
[[package]]
name = "httparse"
version = "1.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "496ce29bb5a52785b44e0f7ca2847ae0bb839c9bd28f69acac9b99d461c0c04c"
[[package]]
name = "httpdate"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4a1e36c821dbe04574f602848a19f742f4fb3c98d40449f11bcad18d6b17421"
[[package]]
name = "humantime"
version = "2.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9a3a5bfb195931eeb336b2a7b4d761daec841b97f947d34394601737a7bba5e4"
[[package]]
name = "hyper"
version = "0.14.20"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "02c929dc5c39e335a03c405292728118860721b10190d98c2a0f0efd5baafbac"
dependencies = [
"bytes",
"futures-channel",
"futures-core",
"futures-util",
"h2",
"http",
"http-body",
"httparse",
"httpdate",
"itoa",
"pin-project-lite",
"socket2",
"tokio",
"tower-service",
"tracing",
"want",
]
[[package]]
name = "indexmap"
version = "1.9.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "10a35a97730320ffe8e2d410b5d3b69279b98d2c14bdb8b70ea89ecf7888d41e"
dependencies = [
"autocfg",
"hashbrown",
]
[[package]]
name = "instant"
version = "0.1.12"
@@ -221,10 +354,31 @@ dependencies = [
]
[[package]]
name = "libc"
version = "0.2.117"
name = "itoa"
version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e74d72e0f9b65b5b4ca49a346af3976df0f9c61d550727f349ecd559f251a26c"
checksum = "6c8af84674fe1f223a982c933a0ee1086ac4d4052aa0fb8060c12c6ad838e754"
[[package]]
name = "js-sys"
version = "0.3.58"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c3fac17f7123a73ca62df411b1bf727ccc805daa070338fda671c86dac1bdc27"
dependencies = [
"wasm-bindgen",
]
[[package]]
name = "lazy_static"
version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
[[package]]
name = "libc"
version = "0.2.126"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "349d5a591cd28b49e1d1037471617a32ddcda5731b99419008085f72d5a53836"
[[package]]
name = "lock_api"
@@ -352,29 +506,81 @@ dependencies = [
[[package]]
name = "pgcat"
version = "0.1.0-beta2"
version = "0.6.0-alpha1"
dependencies = [
"arc-swap",
"async-trait",
"base64",
"bb8",
"bytes",
"chrono",
"env_logger",
"hmac",
"hyper",
"log",
"md-5",
"num_cpus",
"once_cell",
"parking_lot",
"phf",
"rand",
"regex",
"rustls-pemfile",
"serde",
"serde_derive",
"sha-1",
"sha2",
"sqlparser",
"stringprep",
"tokio",
"tokio-rustls",
"toml",
]
[[package]]
name = "phf"
version = "0.10.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fabbf1ead8a5bcbc20f5f8b939ee3f5b0f6f281b6ad3468b84656b658b455259"
dependencies = [
"phf_macros",
"phf_shared",
"proc-macro-hack",
]
[[package]]
name = "phf_generator"
version = "0.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d5285893bb5eb82e6aaf5d59ee909a06a16737a8970984dd7746ba9283498d6"
dependencies = [
"phf_shared",
"rand",
]
[[package]]
name = "phf_macros"
version = "0.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "58fdf3184dd560f160dd73922bea2d5cd6e8f064bf4b13110abd81b03697b4e0"
dependencies = [
"phf_generator",
"phf_shared",
"proc-macro-hack",
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "phf_shared"
version = "0.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b6796ad771acdc0123d2a88dc428b5e38ef24456743ddb1744ed628f9815c096"
dependencies = [
"siphasher",
]
[[package]]
name = "pin-project-lite"
version = "0.2.8"
@@ -393,6 +599,12 @@ version = "0.2.16"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eb9f9e6e233e5c4a35559a617bf40a4ec447db2e84c20b55a6f83167b7e57872"
[[package]]
name = "proc-macro-hack"
version = "0.5.19"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dbf0c48bc1d91375ae5c3cd81e3722dff1abcf81a30960240640d223f59fe0e5"
[[package]]
name = "proc-macro2"
version = "1.0.36"
@@ -462,9 +674,9 @@ dependencies = [
[[package]]
name = "regex"
version = "1.5.4"
version = "1.5.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d07a8629359eb56f1e2fb1652bb04212c072a87ba68546a04065d525673ac461"
checksum = "1a11647b6b25ff05a515cb92c365cec08801e83423a235b51e231e1808747286"
dependencies = [
"aho-corasick",
"memchr",
@@ -477,12 +689,58 @@ version = "0.6.25"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f497285884f3fcff424ffc933e56d7cbca511def0c9831a7f9b5f6153e3cc89b"
[[package]]
name = "ring"
version = "0.16.20"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3053cf52e236a3ed746dfc745aa9cacf1b791d846bdaf412f60a8d7d6e17c8fc"
dependencies = [
"cc",
"libc",
"once_cell",
"spin",
"untrusted",
"web-sys",
"winapi",
]
[[package]]
name = "rustls"
version = "0.20.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5aab8ee6c7097ed6057f43c187a62418d0c05a4bd5f18b3571db50ee0f9ce033"
dependencies = [
"log",
"ring",
"sct",
"webpki",
]
[[package]]
name = "rustls-pemfile"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e7522c9de787ff061458fe9a829dc790a3f5b22dc571694fc5883f448b94d9a9"
dependencies = [
"base64",
]
[[package]]
name = "scopeguard"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d29ab0c6d3fc0ee92fe66e2d99f700eab17a8d57d1c1d3b748380fb20baa78cd"
[[package]]
name = "sct"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d53dcdb7c9f8158937a7981b48accfd39a43af418591a5d008c7b22b5e1b7ca4"
dependencies = [
"ring",
"untrusted",
]
[[package]]
name = "serde"
version = "1.0.136"
@@ -511,6 +769,17 @@ dependencies = [
"digest",
]
[[package]]
name = "sha2"
version = "0.10.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "55deaec60f81eefe3cce0dc50bda92d6d8e88f2a27df7c5033b42afeb1ed2676"
dependencies = [
"cfg-if",
"cpufeatures",
"digest",
]
[[package]]
name = "signal-hook-registry"
version = "1.4.0"
@@ -520,6 +789,12 @@ dependencies = [
"libc",
]
[[package]]
name = "siphasher"
version = "0.3.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7bd3e3206899af3f8b12af284fafc038cc1dc2b41d1b89dd17297221c5d225de"
[[package]]
name = "slab"
version = "0.4.5"
@@ -532,6 +807,22 @@ version = "1.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f2dd574626839106c320a323308629dcb1acfc96e32a8cba364ddc61ac23ee83"
[[package]]
name = "socket2"
version = "0.4.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "66d72b759436ae32898a2af0a14218dbf55efde3feeb170eb623637db85ee1e0"
dependencies = [
"libc",
"winapi",
]
[[package]]
name = "spin"
version = "0.5.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6e63cff320ae2c57904679ba7cb63280a3dc4613885beafb148ee7bf9aa9042d"
[[package]]
name = "sqlparser"
version = "0.14.0"
@@ -541,6 +832,22 @@ dependencies = [
"log",
]
[[package]]
name = "stringprep"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8ee348cb74b87454fff4b551cbf727025810a004f88aeacae7f85b87f4e9a1c1"
dependencies = [
"unicode-bidi",
"unicode-normalization",
]
[[package]]
name = "subtle"
version = "2.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6bdef32e8150c2a081110b42772ffe7d7c9032b606bc226c8260fd97e0976601"
[[package]]
name = "syn"
version = "1.0.86"
@@ -572,6 +879,21 @@ dependencies = [
"winapi",
]
[[package]]
name = "tinyvec"
version = "1.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "87cc5ceb3875bb20c2890005a4e226a4651264a5c75edb2421b52861a0a0cb50"
dependencies = [
"tinyvec_macros",
]
[[package]]
name = "tinyvec_macros"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cda74da7e1a664f795bb1f8a87ec406fb89a02522cf6e50620d016add6dbbf5c"
[[package]]
name = "tokio"
version = "1.16.1"
@@ -602,6 +924,31 @@ dependencies = [
"syn",
]
[[package]]
name = "tokio-rustls"
version = "0.23.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c43ee83903113e03984cb9e5cebe6c04a5116269e900e3ddba8f068a62adda59"
dependencies = [
"rustls",
"tokio",
"webpki",
]
[[package]]
name = "tokio-util"
version = "0.7.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f988a1a1adc2fb21f9c12aa96441da33a1728193ae0b95d2be22dbd17fcb4e5c"
dependencies = [
"bytes",
"futures-core",
"futures-sink",
"pin-project-lite",
"tokio",
"tracing",
]
[[package]]
name = "toml"
version = "0.5.8"
@@ -611,30 +958,179 @@ dependencies = [
"serde",
]
[[package]]
name = "tower-service"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b6bc1c9ce2b5135ac7f93c72918fc37feb872bdc6a5533a8b85eb4b86bfdae52"
[[package]]
name = "tracing"
version = "0.1.34"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d0ecdcb44a79f0fe9844f0c4f33a342cbcbb5117de8001e6ba0dc2351327d09"
dependencies = [
"cfg-if",
"pin-project-lite",
"tracing-attributes",
"tracing-core",
]
[[package]]
name = "tracing-attributes"
version = "0.1.22"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "11c75893af559bc8e10716548bdef5cb2b983f8e637db9d0e15126b61b484ee2"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "tracing-core"
version = "0.1.26"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f54c8ca710e81886d498c2fd3331b56c93aa248d49de2222ad2742247c60072f"
dependencies = [
"lazy_static",
]
[[package]]
name = "try-lock"
version = "0.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "59547bce71d9c38b83d9c0e92b6066c4253371f15005def0c30d9657f50c7642"
[[package]]
name = "typenum"
version = "1.15.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dcf81ac59edc17cc8697ff311e8f5ef2d99fcbd9817b34cec66f90b6c3dfd987"
[[package]]
name = "unicode-bidi"
version = "0.3.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "099b7128301d285f79ddd55b9a83d5e6b9e97c92e0ea0daebee7263e932de992"
[[package]]
name = "unicode-normalization"
version = "0.1.19"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d54590932941a9e9266f0832deed84ebe1bf2e4c9e4a3554d393d18f5e854bf9"
dependencies = [
"tinyvec",
]
[[package]]
name = "unicode-xid"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8ccb82d61f80a663efe1f787a51b16b5a51e3314d6ac365b08639f52387b33f3"
[[package]]
name = "untrusted"
version = "0.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a156c684c91ea7d62626509bce3cb4e1d9ed5c4d978f7b4352658f96a4c26b4a"
[[package]]
name = "version_check"
version = "0.9.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "49874b5167b65d7193b8aba1567f5c7d93d001cafc34600cee003eda787e483f"
[[package]]
name = "want"
version = "0.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1ce8a968cb1cd110d136ff8b819a556d6fb6d919363c61534f6860c7eb172ba0"
dependencies = [
"log",
"try-lock",
]
[[package]]
name = "wasi"
version = "0.10.0+wasi-snapshot-preview1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1a143597ca7c7793eff794def352d41792a93c481eb1042423ff7ff72ba2c31f"
[[package]]
name = "wasm-bindgen"
version = "0.2.81"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7c53b543413a17a202f4be280a7e5c62a1c69345f5de525ee64f8cfdbc954994"
dependencies = [
"cfg-if",
"wasm-bindgen-macro",
]
[[package]]
name = "wasm-bindgen-backend"
version = "0.2.81"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5491a68ab4500fa6b4d726bd67408630c3dbe9c4fe7bda16d5c82a1fd8c7340a"
dependencies = [
"bumpalo",
"lazy_static",
"log",
"proc-macro2",
"quote",
"syn",
"wasm-bindgen-shared",
]
[[package]]
name = "wasm-bindgen-macro"
version = "0.2.81"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c441e177922bc58f1e12c022624b6216378e5febc2f0533e41ba443d505b80aa"
dependencies = [
"quote",
"wasm-bindgen-macro-support",
]
[[package]]
name = "wasm-bindgen-macro-support"
version = "0.2.81"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7d94ac45fcf608c1f45ef53e748d35660f168490c10b23704c7779ab8f5c3048"
dependencies = [
"proc-macro2",
"quote",
"syn",
"wasm-bindgen-backend",
"wasm-bindgen-shared",
]
[[package]]
name = "wasm-bindgen-shared"
version = "0.2.81"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6a89911bd99e5f3659ec4acf9c4d93b0a90fe4a2a11f15328472058edc5261be"
[[package]]
name = "web-sys"
version = "0.3.58"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2fed94beee57daf8dd7d51f2b15dc2bcde92d7a72304cdf662a4371008b71b90"
dependencies = [
"js-sys",
"wasm-bindgen",
]
[[package]]
name = "webpki"
version = "0.22.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f095d78192e208183081cc07bc5515ef55216397af48b873e5edcd72637fa1bd"
dependencies = [
"ring",
"untrusted",
]
[[package]]
name = "winapi"
version = "0.3.9"


@@ -1,6 +1,6 @@
[package]
name = "pgcat"
version = "0.1.0-beta2"
version = "0.6.0-alpha1"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
@@ -25,3 +25,11 @@ log = "0.4"
arc-swap = "1"
env_logger = "0.9"
parking_lot = "0.11"
hmac = "0.12"
sha2 = "0.10"
base64 = "0.13"
stringprep = "0.1"
tokio-rustls = "0.23"
rustls-pemfile = "1"
hyper = { version = "0.14", features = ["full"] }
phf = { version = "0.10", features = ["macros"] }


@@ -1,9 +1,9 @@
FROM rust:1.58-slim-buster AS builder
FROM rust:1 AS builder
COPY . /app
WORKDIR /app
RUN cargo build --release
FROM debian:buster-slim
FROM debian:bullseye-slim
COPY --from=builder /app/target/release/pgcat /usr/bin/pgcat
COPY --from=builder /app/pgcat.toml /etc/pgcat/pgcat.toml
WORKDIR /etc/pgcat

Dockerfile.ci

@@ -0,0 +1,8 @@
FROM cimg/rust:1.62.0
RUN sudo apt-get update && \
sudo apt-get install -y psmisc postgresql-contrib-12 postgresql-client-12 ruby ruby-dev libpq-dev python3 python3-pip lcov llvm-11 && \
sudo apt-get upgrade curl
RUN cargo install cargo-binutils rustfilt && \
rustup component add llvm-tools-preview
RUN pip3 install psycopg2 && \
sudo gem install bundler


@@ -9,19 +9,19 @@ PostgreSQL pooler (like PgBouncer) with sharding, load balancing and failover su
**Beta**: looking for beta testers, see [#35](https://github.com/levkk/pgcat/issues/35).
## Features
| **Feature** | **Status** | **Comments** |
|--------------------------------|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
| Transaction pooling | :white_check_mark: | Identical to PgBouncer. |
| Session pooling | :white_check_mark: | Identical to PgBouncer. |
| `COPY` support | :white_check_mark: | Both `COPY TO` and `COPY FROM` are supported. |
| Query cancellation | :white_check_mark: | Supported both in transaction and session pooling modes. |
| Load balancing of read queries | :white_check_mark: | Using round-robin between replicas. Primary is included when `primary_reads_enabled` is enabled (default). |
| Sharding | :white_check_mark: | Transactions are sharded using `SET SHARD TO` and `SET SHARDING KEY TO` syntax extensions; see examples below. |
| Failover | :white_check_mark: | Replicas are tested with a health check. If a health check fails, remaining replicas are attempted; see below for algorithm description and examples. |
| Statistics | :white_check_mark: | Statistics available in the admin database (`pgcat` and `pgbouncer`) with `SHOW STATS`, `SHOW POOLS` and others. |
| Live configuration reloading | :white_check_mark: | Reload supported settings with a `SIGHUP` to the process, e.g. `kill -s SIGHUP $(pgrep pgcat)` or `RELOAD` query issued to the admin database. |
| Client authentication | :x: :wrench: | On the roadmap; currently all clients are allowed to connect and one user is used to connect to Postgres. |
| Admin database | :white_check_mark: | The admin database, similar to PgBouncer's, allows to query for statistics and reload the configuration. |
| **Feature** | **Status** | **Comments** |
|--------------------------------|-----------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
| Transaction pooling | :white_check_mark: | Identical to PgBouncer. |
| Session pooling | :white_check_mark: | Identical to PgBouncer. |
| `COPY` support | :white_check_mark: | Both `COPY TO` and `COPY FROM` are supported. |
| Query cancellation | :white_check_mark: | Supported both in transaction and session pooling modes. |
| Load balancing of read queries | :white_check_mark: | Using round-robin between replicas. Primary is included when `primary_reads_enabled` is enabled (default). |
| Sharding | :white_check_mark: | Transactions are sharded using `SET SHARD TO` and `SET SHARDING KEY TO` syntax extensions; see examples below. |
| Failover | :white_check_mark: | Replicas are tested with a health check. If a health check fails, remaining replicas are attempted; see below for algorithm description and examples. |
| Statistics | :white_check_mark: | Statistics available in the admin database (`pgcat` and `pgbouncer`) with `SHOW STATS`, `SHOW POOLS` and others. |
| Live configuration reloading | :white_check_mark: | Reload supported settings with a `SIGHUP` to the process, e.g. `kill -s SIGHUP $(pgrep pgcat)` or `RELOAD` query issued to the admin database. |
| Client authentication | :white_check_mark: :wrench: | MD5 password authentication is supported, SCRAM is on the roadmap; one user is used to connect to Postgres with both SCRAM and MD5 supported. |
| Admin database | :white_check_mark: | The admin database, similar to PgBouncer's, allows to query for statistics and reload the configuration. |
## Deployment
@@ -47,6 +47,8 @@ psql -h 127.0.0.1 -p 6432 -c 'SELECT 1'
| `pool_mode` | The pool mode to use, i.e. `session` or `transaction`. | `transaction` |
| `connect_timeout` | Maximum time to establish a connection to a server (milliseconds). If reached, the server is banned and the next target is attempted. | `5000` |
| `healthcheck_timeout` | Maximum time to pass a health check (`SELECT 1`, milliseconds). If reached, the server is banned and the next target is attempted. | `1000` |
| `shutdown_timeout` | Maximum time to give clients during shutdown before forcibly killing client connections (ms). | `60000` |
| `healthcheck_delay` | How long to keep connection available for immediate re-use, without running a healthcheck query on it | `30000` |
| `ban_time` | Ban time for a server (seconds). It won't be allowed to serve transactions until the ban expires; failover targets will be used instead. | `60` |
| | | |
| **`user`** | | |
@@ -250,6 +252,8 @@ The config can be reloaded by sending a `kill -s SIGHUP` to the process or by qu
| `pool_mode` | no |
| `connect_timeout` | yes |
| `healthcheck_timeout` | no |
| `shutdown_timeout` | no |
| `healthcheck_delay` | no |
| `ban_time` | no |
| `user` | yes |
| `shards` | yes |
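
The `healthcheck_delay` setting described above can be read as a simple time comparison: a connection that was active recently is handed out without a health check query. The following is a minimal sketch of that decision, not PgCat's actual implementation. The function name `needs_healthcheck` and its parameters are hypothetical, and the strict `>` comparison at the boundary is an assumption.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: decide whether a pooled connection should be
/// health-checked before being reused.
///
/// `last_activity` is when the connection last did any work, `now` is the
/// time of checkout, and `healthcheck_delay` mirrors the config setting
/// (given in milliseconds in the config file, e.g. 30000).
fn needs_healthcheck(last_activity: Instant, now: Instant, healthcheck_delay: Duration) -> bool {
    // `saturating_duration_since` yields zero if `now` is earlier than
    // `last_activity`, so a clock ordering surprise never panics.
    let idle_for = now.saturating_duration_since(last_activity);

    // Assumption: only connections idle for strictly longer than the delay
    // are checked; recently used ones are reused immediately.
    idle_for > healthcheck_delay
}
```

With the default of `30000`, a connection that was used one second ago would skip the check, while one idle for just over 30 seconds would be checked before reuse.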


@@ -14,3 +14,4 @@ services:
- "${PWD}/examples/docker/pgcat.toml:/etc/pgcat/pgcat.toml"
ports:
- "6432:6432"
- "9930:9930"


@@ -5,20 +5,14 @@
#
# General pooler settings
[general]
# What IP to run on, 0.0.0.0 means accessible from everywhere.
host = "0.0.0.0"
# Port to run on, same as PgBouncer used in this example.
port = 6432
# How many connections to allocate per server.
pool_size = 15
# Pool mode (see PgBouncer docs for more).
# session: one server connection per connected client
# transaction: one server connection per client transaction
pool_mode = "transaction"
# enable prometheus exporter on port 9930
enable_prometheus_exporter = true
# How long to wait before aborting a server connection (ms).
connect_timeout = 5000
@@ -26,53 +20,36 @@ connect_timeout = 5000
# How much time to give `SELECT 1` health check query to return with a result (ms).
healthcheck_timeout = 1000
# How long to keep connection available for immediate re-use, without running a healthcheck query on it
healthcheck_delay = 30000
# How much time to give clients during shutdown before forcibly killing client connections (ms).
shutdown_timeout = 60000
# For how long to ban a server if it fails a health check (seconds).
ban_time = 60 # Seconds
ban_time = 60 # seconds
#
# User to use for authentication against the server.
[user]
name = "postgres"
password = "postgres"
# Reload config automatically if it changes.
autoreload = false
# TLS
# tls_certificate = "server.cert"
# tls_private_key = "server.key"
#
# Shards in the cluster
[shards]
# Credentials to access the virtual administrative database (pgbouncer or pgcat)
# Connecting to that database allows running commands like `SHOW POOLS`, `SHOW DATABASES`, etc..
admin_username = "postgres"
admin_password = "postgres"
# Shard 0
[shards.0]
# [ host, port, role ]
servers = [
[ "postgres", 5432, "primary" ],
[ "postgres", 5432, "replica" ],
# [ "127.0.1.1", 5432, "replica" ],
]
# Database name (e.g. "postgres")
database = "postgres"
[shards.1]
# [ host, port, role ]
servers = [
[ "postgres", 5432, "primary" ],
[ "postgres", 5432, "replica" ],
# [ "127.0.1.1", 5432, "replica" ],
]
database = "postgres"
[shards.2]
# [ host, port, role ]
servers = [
[ "postgres", 5432, "primary" ],
[ "postgres", 5432, "replica" ],
# [ "127.0.1.1", 5432, "replica" ],
]
database = "postgres"
# Settings for our query routing layer.
[query_router]
# pool
# configs are structured as pool.<pool_name>
# the pool_name is what clients use as database name when connecting
# For the example below a client can connect using "postgres://sharding_user:sharding_user@pgcat_host:pgcat_port/sharded"
[pools.sharded]
# Pool mode (see PgBouncer docs for more).
# session: one server connection per connected client
# transaction: one server connection per client transaction
pool_mode = "transaction"
# If the client doesn't specify, route traffic to
# this role by default.
@@ -82,12 +59,11 @@ database = "postgres"
# primary: all queries go to the primary unless otherwise specified.
default_role = "any"
# Query parser. If enabled, we'll attempt to parse
# every incoming query to determine if it's a read or a write.
# If it's a read query, we'll direct it to a replica. Otherwise, if it's a write,
# we'll direct it to the primary.
query_parser_enabled = false
query_parser_enabled = true
# If the query parser is enabled and this setting is enabled, the primary will be part of the pool of databases used for
# load balancing of read queries. Otherwise, the primary will only be used for write
@@ -103,3 +79,61 @@ primary_reads_enabled = true
# sha1: A hashing function based on SHA1
#
sharding_function = "pg_bigint_hash"
# Credentials for users that may connect to this cluster
[pools.sharded.users.0]
username = "postgres"
password = "postgres"
# Maximum number of server connections that can be established for this user
# The maximum number of connection from a single Pgcat process to any database in the cluster
# is the sum of pool_size across all users.
pool_size = 9
[pools.sharded.users.1]
username = "postgres"
password = "postgres"
pool_size = 21
# Shard 0
[pools.sharded.shards.0]
# [ host, port, role ]
servers = [
[ "postgres", 5432, "primary" ],
[ "postgres", 5432, "replica" ]
]
# Database name (e.g. "postgres")
database = "postgres"
[pools.sharded.shards.1]
servers = [
[ "postgres", 5432, "primary" ],
[ "postgres", 5432, "replica" ],
]
database = "postgres"
[pools.sharded.shards.2]
servers = [
[ "postgres", 5432, "primary" ],
[ "postgres", 5432, "replica" ],
]
database = "postgres"
[pools.simple_db]
pool_mode = "session"
default_role = "primary"
query_parser_enabled = true
primary_reads_enabled = true
sharding_function = "pg_bigint_hash"
[pools.simple_db.users.0]
username = "postgres"
password = "postgres"
pool_size = 5
[pools.simple_db.shards.0]
servers = [
[ "postgres", 5432, "primary" ],
[ "postgres", 5432, "replica" ]
]
database = "postgres"
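
In the new layout the pool name is what clients pass as the database name. As a rough illustration, the sketch below builds a client connection string from a user entry and a pool name. `connection_url` is a hypothetical helper, not part of PgCat, and it assumes the credentials need no percent-encoding.

```rust
/// Hypothetical helper: build the URL a client would use to reach a pool.
/// `pool_name` is the `<pool_name>` part of a `[pools.<pool_name>]` section.
/// Assumption: user and password contain no characters that need escaping.
fn connection_url(user: &str, password: &str, host: &str, port: u16, pool_name: &str) -> String {
    format!("postgres://{}:{}@{}:{}/{}", user, password, host, port, pool_name)
}
```

For the config above, `connection_url("postgres", "postgres", "127.0.0.1", 6432, "sharded")` targets the `sharded` pool, and swapping the last argument for `"simple_db"` targets the session-mode pool instead. The host `127.0.0.1` is a placeholder for wherever PgCat is listening.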


@@ -5,74 +5,51 @@
#
# General pooler settings
[general]
# What IP to run on, 0.0.0.0 means accessible from everywhere.
host = "0.0.0.0"
# Port to run on, same as PgBouncer used in this example.
port = 6432
# How many connections to allocate per server.
pool_size = 15
# Pool mode (see PgBouncer docs for more).
# session: one server connection per connected client
# transaction: one server connection per client transaction
pool_mode = "transaction"
# enable prometheus exporter on port 9930
enable_prometheus_exporter = true
# How long to wait before aborting a server connection (ms).
connect_timeout = 5000
# How much time to give `SELECT 1` health check query to return with a result (ms).
# How much time to give the health check query to return with a result (ms).
healthcheck_timeout = 1000
# How long to keep connection available for immediate re-use, without running a healthcheck query on it
healthcheck_delay = 30000
# How much time to give clients during shutdown before forcibly killing client connections (ms).
shutdown_timeout = 60000
# For how long to ban a server if it fails a health check (seconds).
ban_time = 60 # Seconds
ban_time = 60 # seconds
#
# User to use for authentication against the server.
[user]
name = "sharding_user"
password = "sharding_user"
# Reload config automatically if it changes.
autoreload = false
# TLS
# tls_certificate = "server.cert"
# tls_private_key = "server.key"
#
# Shards in the cluster
[shards]
# Credentials to access the virtual administrative database (pgbouncer or pgcat)
# Connecting to that database allows running commands like `SHOW POOLS`, `SHOW DATABASES`, etc..
admin_username = "admin_user"
admin_password = "admin_pass"
# Shard 0
[shards.0]
# [ host, port, role ]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ],
# [ "127.0.1.1", 5432, "replica" ],
]
# Database name (e.g. "postgres")
database = "shard0"
[shards.1]
# [ host, port, role ]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ],
# [ "127.0.1.1", 5432, "replica" ],
]
database = "shard1"
[shards.2]
# [ host, port, role ]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ],
# [ "127.0.1.1", 5432, "replica" ],
]
database = "shard2"
# Settings for our query routing layer.
[query_router]
# pool
# configs are structured as pool.<pool_name>
# the pool_name is what clients use as database name when connecting
# For the example below a client can connect using "postgres://sharding_user:sharding_user@pgcat_host:pgcat_port/sharded_db"
[pools.sharded_db]
# Pool mode (see PgBouncer docs for more).
# session: one server connection per connected client
# transaction: one server connection per client transaction
pool_mode = "transaction"
# If the client doesn't specify, route traffic to
# this role by default.
@@ -82,16 +59,15 @@ database = "shard2"
# primary: all queries go to the primary unless otherwise specified.
default_role = "any"
# Query parser. If enabled, we'll attempt to parse
# every incoming query to determine if it's a read or a write.
# If it's a read query, we'll direct it to a replica. Otherwise, if it's a write,
# we'll direct it to the primary.
query_parser_enabled = false
query_parser_enabled = true
# If the query parser is enabled and this setting is enabled, the primary will be part of the pool of databases used for
# load balancing of read queries. Otherwise, the primary will only be used for write
# queries. The primary can always be explicitely selected with our custom protocol.
# queries. The primary can always be explicitly selected with our custom protocol.
primary_reads_enabled = true
# So what if you wanted to implement a different hashing function,
@@ -103,3 +79,61 @@ primary_reads_enabled = true
# sha1: A hashing function based on SHA1
#
sharding_function = "pg_bigint_hash"
# Credentials for users that may connect to this cluster
[pools.sharded_db.users.0]
username = "sharding_user"
password = "sharding_user"
# Maximum number of server connections that can be established for this user
# The maximum number of connection from a single Pgcat process to any database in the cluster
# is the sum of pool_size across all users.
pool_size = 9
[pools.sharded_db.users.1]
username = "other_user"
password = "other_user"
pool_size = 21
# Shard 0
[pools.sharded_db.shards.0]
# [ host, port, role ]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ]
]
# Database name (e.g. "postgres")
database = "shard0"
[pools.sharded_db.shards.1]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ],
]
database = "shard1"
[pools.sharded_db.shards.2]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ],
]
database = "shard2"
[pools.simple_db]
pool_mode = "session"
default_role = "primary"
query_parser_enabled = true
primary_reads_enabled = true
sharding_function = "pg_bigint_hash"
[pools.simple_db.users.0]
username = "simple_user"
password = "simple_user"
pool_size = 5
[pools.simple_db.shards.0]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ]
]
database = "some_db"
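
The config comments state that the maximum number of connections from one PgCat process to any database is the sum of `pool_size` across a pool's users. A minimal sketch of that arithmetic follows. `max_server_connections` is a hypothetical name used only for illustration.

```rust
/// Hypothetical helper: upper bound on server connections that one pool can
/// open to a single database, given each user's `pool_size`.
fn max_server_connections(pool_sizes: &[u32]) -> u32 {
    // Each user gets its own allotment, so the bound is the plain sum.
    pool_sizes.iter().sum()
}
```

For `sharded_db` above, the two users have `pool_size` 9 and 21, so `max_server_connections(&[9, 21])` gives 30. For `simple_db`, with a single user at 5, the bound is 5.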


@@ -2,20 +2,35 @@
use bytes::{Buf, BufMut, BytesMut};
use log::{info, trace};
use std::collections::HashMap;
use tokio::net::tcp::OwnedWriteHalf;
use crate::config::{get_config, parse};
use crate::config::{get_config, reload_config, VERSION};
use crate::errors::Error;
use crate::messages::*;
use crate::pool::ConnectionPool;
use crate::pool::get_all_pools;
use crate::stats::get_stats;
use crate::ClientServerMap;
pub fn generate_server_info_for_admin() -> BytesMut {
let mut server_info = BytesMut::new();
server_info.put(server_paramater_message("application_name", ""));
server_info.put(server_paramater_message("client_encoding", "UTF8"));
server_info.put(server_paramater_message("server_encoding", "UTF8"));
server_info.put(server_paramater_message("server_version", VERSION));
server_info.put(server_paramater_message("DateStyle", "ISO, MDY"));
return server_info;
}
/// Handle admin client.
pub async fn handle_admin(
stream: &mut OwnedWriteHalf,
pub async fn handle_admin<T>(
stream: &mut T,
mut query: BytesMut,
pool: ConnectionPool,
) -> Result<(), Error> {
client_server_map: ClientServerMap,
) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
let code = query.get_u8() as char;
if code != 'Q' {
@@ -31,22 +46,22 @@ pub async fn handle_admin(
if query.starts_with("SHOW STATS") {
trace!("SHOW STATS");
show_stats(stream, &pool).await
show_stats(stream).await
} else if query.starts_with("RELOAD") {
trace!("RELOAD");
reload(stream).await
reload(stream, client_server_map).await
} else if query.starts_with("SHOW CONFIG") {
trace!("SHOW CONFIG");
show_config(stream).await
} else if query.starts_with("SHOW DATABASES") {
trace!("SHOW DATABASES");
show_databases(stream, &pool).await
show_databases(stream).await
} else if query.starts_with("SHOW POOLS") {
trace!("SHOW POOLS");
show_pools(stream, &pool).await
show_pools(stream).await
} else if query.starts_with("SHOW LISTS") {
trace!("SHOW LISTS");
show_lists(stream, &pool).await
show_lists(stream).await
} else if query.starts_with("SHOW VERSION") {
trace!("SHOW VERSION");
show_version(stream).await
@@ -59,22 +74,28 @@ pub async fn handle_admin(
}
/// Column-oriented statistics.
async fn show_lists(stream: &mut OwnedWriteHalf, pool: &ConnectionPool) -> Result<(), Error> {
async fn show_lists<T>(stream: &mut T) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
let stats = get_stats();
let columns = vec![("list", DataType::Text), ("items", DataType::Int4)];
let mut users = 1;
let mut databases = 1;
for (_, pool) in get_all_pools() {
databases += pool.databases();
users += 1; // One user per pool
}
let mut res = BytesMut::new();
res.put(row_description(&columns));
res.put(data_row(&vec![
"databases".to_string(),
(pool.databases() + 1).to_string(), // see comment below
databases.to_string(),
]));
res.put(data_row(&vec!["users".to_string(), "1".to_string()]));
res.put(data_row(&vec![
"pools".to_string(),
(pool.databases() + 1).to_string(), // +1 for the pgbouncer admin db pool which isn't real
])); // but admin tools that work with pgbouncer want this
res.put(data_row(&vec!["users".to_string(), users.to_string()]));
res.put(data_row(&vec!["pools".to_string(), databases.to_string()]));
res.put(data_row(&vec![
"free_clients".to_string(),
stats
@@ -126,11 +147,14 @@ async fn show_lists(stream: &mut OwnedWriteHalf, pool: &ConnectionPool) -> Resul
}
/// Show PgCat version.
async fn show_version(stream: &mut OwnedWriteHalf) -> Result<(), Error> {
async fn show_version<T>(stream: &mut T) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
let mut res = BytesMut::new();
res.put(row_description(&vec![("version", DataType::Text)]));
res.put(data_row(&vec!["PgCat 0.1.0".to_string()]));
res.put(data_row(&vec![format!("PgCat {}", VERSION).to_string()]));
res.put(command_complete("SHOW"));
res.put_u8(b'Z');
@@ -141,12 +165,11 @@ async fn show_version(stream: &mut OwnedWriteHalf) -> Result<(), Error> {
}
/// Show utilization of connection pools for each shard and replicas.
async fn show_pools(stream: &mut OwnedWriteHalf, pool: &ConnectionPool) -> Result<(), Error> {
async fn show_pools<T>(stream: &mut T) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
let stats = get_stats();
let config = {
let guard = get_config();
&*guard.clone()
};
let columns = vec![
("database", DataType::Text),
@@ -166,24 +189,26 @@ async fn show_pools(stream: &mut OwnedWriteHalf, pool: &ConnectionPool) -> Resul
let mut res = BytesMut::new();
res.put(row_description(&columns));
for (_, pool) in get_all_pools() {
let pool_config = &pool.settings;
for shard in 0..pool.shards() {
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
let stats = match stats.get(&address.id) {
Some(stats) => stats.clone(),
None => HashMap::new(),
};
for shard in 0..pool.shards() {
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
let stats = match stats.get(&address.id) {
Some(stats) => stats.clone(),
None => HashMap::new(),
};
let mut row = vec![address.name(), pool_config.user.username.clone()];
let mut row = vec![address.name(), config.user.name.clone()];
for column in &columns[2..columns.len() - 1] {
let value = stats.get(column.0).unwrap_or(&0).to_string();
row.push(value);
}
for column in &columns[2..columns.len() - 1] {
let value = stats.get(column.0).unwrap_or(&0).to_string();
row.push(value);
row.push(pool_config.pool_mode.to_string());
res.put(data_row(&row));
}
row.push(config.general.pool_mode.to_string());
res.put(data_row(&row));
}
}
@@ -198,11 +223,10 @@ async fn show_pools(stream: &mut OwnedWriteHalf, pool: &ConnectionPool) -> Resul
}
/// Show shards and replicas.
async fn show_databases(stream: &mut OwnedWriteHalf, pool: &ConnectionPool) -> Result<(), Error> {
let guard = get_config();
let config = &*guard.clone();
drop(guard);
async fn show_databases<T>(stream: &mut T) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
// Columns
let columns = vec![
("name", DataType::Text),
@@ -224,31 +248,37 @@ async fn show_databases(stream: &mut OwnedWriteHalf, pool: &ConnectionPool) -> R
res.put(row_description(&columns));
for shard in 0..pool.shards() {
let database_name = &config.shards[&shard.to_string()].database;
for (_, pool) in get_all_pools() {
let pool_config = pool.settings.clone();
for shard in 0..pool.shards() {
let database_name = &pool_config.shards[&shard.to_string()].database;
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
let pool_state = pool.pool_state(shard, server);
let banned = pool.is_banned(address, shard, Some(address.role));
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
let pool_state = pool.pool_state(shard, server);
res.put(data_row(&vec![
address.name(), // name
address.host.to_string(), // host
address.port.to_string(), // port
database_name.to_string(), // database
config.user.name.to_string(), // force_user
config.general.pool_size.to_string(), // pool_size
"0".to_string(), // min_pool_size
"0".to_string(), // reserve_pool
config.general.pool_mode.to_string(), // pool_mode
config.general.pool_size.to_string(), // max_connections
pool_state.connections.to_string(), // current_connections
"0".to_string(), // paused
"0".to_string(), // disabled
]));
res.put(data_row(&vec![
address.name(), // name
address.host.to_string(), // host
address.port.to_string(), // port
database_name.to_string(), // database
pool_config.user.username.to_string(), // force_user
pool_config.user.pool_size.to_string(), // pool_size
"0".to_string(), // min_pool_size
"0".to_string(), // reserve_pool
pool_config.pool_mode.to_string(), // pool_mode
pool_config.user.pool_size.to_string(), // max_connections
pool_state.connections.to_string(), // current_connections
"0".to_string(), // paused
match banned {
// disabled
true => "1".to_string(),
false => "0".to_string(),
},
]));
}
}
}
res.put(command_complete("SHOW"));
// ReadyForQuery
@@ -261,22 +291,23 @@ async fn show_databases(stream: &mut OwnedWriteHalf, pool: &ConnectionPool) -> R
/// Ignore any SET commands the client sends.
/// This is common initialization done by ORMs.
async fn ignore_set(stream: &mut OwnedWriteHalf) -> Result<(), Error> {
async fn ignore_set<T>(stream: &mut T) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
custom_protocol_response_ok(stream, "SET").await
}
/// Reload the configuration file without restarting the process.
async fn reload(stream: &mut OwnedWriteHalf) -> Result<(), Error> {
async fn reload<T>(stream: &mut T, client_server_map: ClientServerMap) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
info!("Reloading config");
let config = get_config();
let path = config.path.clone().unwrap();
reload_config(client_server_map).await?;
parse(&path).await?;
let config = get_config();
config.show();
get_config().show();
let mut res = BytesMut::new();
@@ -291,11 +322,12 @@ async fn reload(stream: &mut OwnedWriteHalf) -> Result<(), Error> {
}
/// Shows current configuration.
async fn show_config(stream: &mut OwnedWriteHalf) -> Result<(), Error> {
let guard = get_config();
let config = &*guard.clone();
async fn show_config<T>(stream: &mut T) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
let config = &get_config();
let config: HashMap<String, String> = config.into();
drop(guard);
// Configs that cannot be changed without restarting.
let immutables = ["host", "port", "connect_timeout"];
@@ -336,9 +368,13 @@ async fn show_config(stream: &mut OwnedWriteHalf) -> Result<(), Error> {
}
/// Show shard and replicas statistics.
async fn show_stats(stream: &mut OwnedWriteHalf, pool: &ConnectionPool) -> Result<(), Error> {
async fn show_stats<T>(stream: &mut T) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
let columns = vec![
("database", DataType::Text),
("user", DataType::Text),
("total_xact_count", DataType::Numeric),
("total_query_count", DataType::Numeric),
("total_received", DataType::Numeric),
@@ -359,21 +395,24 @@ async fn show_stats(stream: &mut OwnedWriteHalf, pool: &ConnectionPool) -> Resul
let mut res = BytesMut::new();
res.put(row_description(&columns));
for shard in 0..pool.shards() {
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
let stats = match stats.get(&address.id) {
Some(stats) => stats.clone(),
None => HashMap::new(),
};
for ((_db_name, username), pool) in get_all_pools() {
for shard in 0..pool.shards() {
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
let stats = match stats.get(&address.id) {
Some(stats) => stats.clone(),
None => HashMap::new(),
};
let mut row = vec![address.name()];
let mut row = vec![address.name()];
row.push(username.clone());
for column in &columns[1..] {
row.push(stats.get(column.0).unwrap_or(&0).to_string());
for column in &columns[2..] {
row.push(stats.get(column.0).unwrap_or(&0).to_string());
}
res.put(data_row(&row));
}
res.put(data_row(&row));
}
}
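
The `SHOW POOLS` and `SHOW STATS` handlers above build each row from two fixed label cells (database and user) followed by one counter per remaining column, defaulting to 0 when a statistic is missing, and then hand the row to `data_row`. A minimal, std-only sketch of that pattern follows. The names `stats_row` and `encode_data_row` are hypothetical and are not pgcat's helpers; the byte layout follows the PostgreSQL `DataRow` message format, and `Vec<u8>` stands in for the `BytesMut` buffer used in the diff.

```rust
use std::collections::HashMap;

/// Build one text row: the first two cells are fixed labels, every remaining
/// column is looked up by name in `stats` and defaults to 0 when absent.
/// (Hypothetical helper, shaped after the loops in the diff above.)
fn stats_row(
    columns: &[&str],
    database: &str,
    user: &str,
    stats: &HashMap<String, i64>,
) -> Vec<String> {
    let mut row = vec![database.to_string(), user.to_string()];
    // Skip the two label columns; `skip` avoids a panic on short column lists.
    for column in columns.iter().skip(2) {
        row.push(stats.get(*column).copied().unwrap_or(0).to_string());
    }
    row
}

/// Encode a row as a PostgreSQL `DataRow` message:
/// b'D', i32 length (counts itself, not the tag), i16 column count,
/// then for each column an i32 byte length followed by the value bytes.
/// All integers are big-endian. NULL values are not handled in this sketch.
fn encode_data_row(row: &[String]) -> Vec<u8> {
    let mut body = Vec::new();
    body.extend_from_slice(&(row.len() as i16).to_be_bytes());
    for cell in row {
        body.extend_from_slice(&(cell.len() as i32).to_be_bytes());
        body.extend_from_slice(cell.as_bytes());
    }

    let mut message = vec![b'D'];
    message.extend_from_slice(&((body.len() + 4) as i32).to_be_bytes());
    message.extend_from_slice(&body);
    message
}
```

Keeping row construction separate from encoding makes the column offset (`2` once the `user` column is added) easy to check in isolation, which is the part of the diff that changes between `SHOW STATS` versions.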


@@ -1,36 +1,47 @@
/// Handle clients by pretending to be a PostgreSQL server.
use bytes::{Buf, BufMut, BytesMut};
use log::{debug, error, trace};
use log::{debug, error, info, trace};
use std::collections::HashMap;
use tokio::io::{AsyncReadExt, BufReader};
use tokio::net::{
tcp::{OwnedReadHalf, OwnedWriteHalf},
TcpStream,
};
use tokio::io::{split, AsyncReadExt, BufReader, ReadHalf, WriteHalf};
use tokio::net::TcpStream;
use tokio::sync::broadcast::Receiver;
use crate::admin::handle_admin;
use crate::config::get_config;
use crate::admin::{generate_server_info_for_admin, handle_admin};
use crate::config::{get_config, Address};
use crate::constants::*;
use crate::errors::Error;
use crate::messages::*;
use crate::pool::{ClientServerMap, ConnectionPool};
use crate::pool::{get_pool, ClientServerMap, ConnectionPool};
use crate::query_router::{Command, QueryRouter};
use crate::server::Server;
use crate::stats::Reporter;
use crate::stats::{get_reporter, Reporter};
use crate::tls::Tls;
use tokio_rustls::server::TlsStream;
/// Type of connection received from client.
enum ClientConnectionType {
Startup,
Tls,
CancelQuery,
}
/// The client state. One of these is created per client.
pub struct Client {
pub struct Client<S, T> {
/// The reads are buffered (8K by default).
read: BufReader<OwnedReadHalf>,
read: BufReader<S>,
/// We buffer the writes ourselves because we know the protocol
/// better than a stock buffer.
write: OwnedWriteHalf,
write: T,
/// Internal buffer, where we place messages until we have to flush
/// them to the backend.
buffer: BytesMut,
/// Address
addr: std::net::SocketAddr,
/// The client was started with the sole reason to cancel another running query.
cancel_mode: bool,
@@ -61,130 +72,400 @@ pub struct Client {
/// Last server process id we talked to.
last_server_id: Option<i32>,
/// Name of the server pool for this client (This comes from the database name in the connection string)
target_pool_name: String,
/// Postgres user for this client (This comes from the user in the connection string)
target_user_name: String,
/// Used to notify clients about an impending shutdown
shutdown_event_receiver: Receiver<()>,
}
impl Client {
/// Perform client startup sequence.
/// See docs: <https://www.postgresql.org/docs/12/protocol-flow.html#id-1.10.5.7.3>
pub async fn startup(
mut stream: TcpStream,
client_server_map: ClientServerMap,
server_info: BytesMut,
stats: Reporter,
) -> Result<Client, Error> {
let config = get_config();
let transaction_mode = config.general.pool_mode.starts_with("t");
drop(config);
loop {
trace!("Waiting for StartupMessage");
/// Client entrypoint.
pub async fn client_entrypoint(
mut stream: TcpStream,
client_server_map: ClientServerMap,
shutdown_event_receiver: Receiver<()>,
) -> Result<(), Error> {
// Figure out if the client wants TLS or not.
let addr = stream.peer_addr().unwrap();
// Could be StartupMessage, SSLRequest or CancelRequest.
let len = match stream.read_i32().await {
Ok(len) => len,
Err(_) => return Err(Error::ClientBadStartup),
};
match get_startup::<TcpStream>(&mut stream).await {
// Client requested a TLS connection.
Ok((ClientConnectionType::Tls, _)) => {
let config = get_config();
let mut startup = vec![0u8; len as usize - 4];
// TLS settings are configured, will setup TLS now.
if config.general.tls_certificate != None {
debug!("Accepting TLS request");
match stream.read_exact(&mut startup).await {
Ok(_) => (),
Err(_) => return Err(Error::ClientBadStartup),
};
let mut yes = BytesMut::new();
yes.put_u8(b'S');
write_all(&mut stream, yes).await?;
let mut bytes = BytesMut::from(&startup[..]);
let code = bytes.get_i32();
// Negotiate TLS.
match startup_tls(stream, client_server_map, shutdown_event_receiver).await {
Ok(mut client) => {
info!("Client {:?} connected (TLS)", addr);
match code {
// Client wants SSL. We don't support it at the moment.
SSL_REQUEST_CODE => {
trace!("Rejecting SSLRequest");
let mut no = BytesMut::with_capacity(1);
no.put_u8(b'N');
write_all(&mut stream, no).await?;
client.handle().await
}
Err(err) => Err(err),
}
}
// TLS is not configured, we cannot offer it.
else {
// Rejecting client request for TLS.
let mut no = BytesMut::new();
no.put_u8(b'N');
write_all(&mut stream, no).await?;
// Regular startup message.
PROTOCOL_VERSION_NUMBER => {
trace!("Got StartupMessage");
// Attempting regular startup. Client can disconnect now
// if they choose.
match get_startup::<TcpStream>(&mut stream).await {
// Client accepted unencrypted connection.
Ok((ClientConnectionType::Startup, bytes)) => {
let (read, write) = split(stream);
// TODO: perform actual auth.
let parameters = parse_startup(bytes.clone())?;
// Continue with regular startup.
match Client::startup(
read,
write,
addr,
bytes,
client_server_map,
shutdown_event_receiver,
)
.await
{
Ok(mut client) => {
info!("Client {:?} connected (plain)", addr);
// Generate random backend ID and secret key
let process_id: i32 = rand::random();
let secret_key: i32 = rand::random();
client.handle().await
}
Err(err) => Err(err),
}
}
auth_ok(&mut stream).await?;
write_all(&mut stream, server_info).await?;
backend_key_data(&mut stream, process_id, secret_key).await?;
ready_for_query(&mut stream).await?;
trace!("Startup OK");
let database = parameters
.get("database")
.unwrap_or(parameters.get("user").unwrap());
let admin = ["pgcat", "pgbouncer"]
.iter()
.filter(|db| *db == &database)
.count()
== 1;
// Split the read and write streams
// so we can control buffering.
let (read, write) = stream.into_split();
return Ok(Client {
read: BufReader::new(read),
write: write,
buffer: BytesMut::with_capacity(8196),
cancel_mode: false,
transaction_mode: transaction_mode,
process_id: process_id,
secret_key: secret_key,
client_server_map: client_server_map,
parameters: parameters,
stats: stats,
admin: admin,
last_address_id: None,
last_server_id: None,
});
// Client probably disconnected rejecting our plain text connection.
_ => Err(Error::ProtocolSyncError),
}
// Query cancel request.
CANCEL_REQUEST_CODE => {
let (read, write) = stream.into_split();
let process_id = bytes.get_i32();
let secret_key = bytes.get_i32();
return Ok(Client {
read: BufReader::new(read),
write: write,
buffer: BytesMut::with_capacity(8196),
cancel_mode: true,
transaction_mode: transaction_mode,
process_id: process_id,
secret_key: secret_key,
client_server_map: client_server_map,
parameters: HashMap::new(),
stats: stats,
admin: false,
last_address_id: None,
last_server_id: None,
});
}
_ => {
return Err(Error::ProtocolSyncError);
}
};
}
}
// Client wants to use plain connection without encryption.
Ok((ClientConnectionType::Startup, bytes)) => {
let (read, write) = split(stream);
// Continue with regular startup.
match Client::startup(
read,
write,
addr,
bytes,
client_server_map,
shutdown_event_receiver,
)
.await
{
Ok(mut client) => {
info!("Client {:?} connected (plain)", addr);
client.handle().await
}
Err(err) => Err(err),
}
}
// Client wants to cancel a query.
Ok((ClientConnectionType::CancelQuery, bytes)) => {
let (read, write) = split(stream);
// Continue with cancel query request.
match Client::cancel(
read,
write,
addr,
bytes,
client_server_map,
shutdown_event_receiver,
)
.await
{
Ok(mut client) => {
info!("Client {:?} issued a cancel query request", addr);
client.handle().await
}
Err(err) => Err(err),
}
}
// Something failed, probably the socket.
Err(err) => Err(err),
}
}
/// Handle the first message the client sends.
async fn get_startup<S>(stream: &mut S) -> Result<(ClientConnectionType, BytesMut), Error>
where
S: tokio::io::AsyncRead + std::marker::Unpin + tokio::io::AsyncWrite,
{
// Get startup message length.
let len = match stream.read_i32().await {
Ok(len) => len,
Err(_) => return Err(Error::ClientBadStartup),
};
// Get the rest of the message.
let mut startup = vec![0u8; len as usize - 4];
match stream.read_exact(&mut startup).await {
Ok(_) => (),
Err(_) => return Err(Error::ClientBadStartup),
};
let mut bytes = BytesMut::from(&startup[..]);
let code = bytes.get_i32();
match code {
// Client is requesting SSL (TLS).
SSL_REQUEST_CODE => Ok((ClientConnectionType::Tls, bytes)),
// Client wants to use plain text, requesting regular startup.
PROTOCOL_VERSION_NUMBER => Ok((ClientConnectionType::Startup, bytes)),
// Client is requesting to cancel a running query (plain text connection).
CANCEL_REQUEST_CODE => Ok((ClientConnectionType::CancelQuery, bytes)),
// Something else, probably something is wrong and it's not our fault,
// e.g. badly implemented Postgres client.
_ => Err(Error::ProtocolSyncError),
}
}
/// Handle TLS connection negotiation.
pub async fn startup_tls(
stream: TcpStream,
client_server_map: ClientServerMap,
shutdown_event_receiver: Receiver<()>,
) -> Result<Client<ReadHalf<TlsStream<TcpStream>>, WriteHalf<TlsStream<TcpStream>>>, Error> {
// Negotiate TLS.
let tls = Tls::new()?;
let addr = stream.peer_addr().unwrap();
let mut stream = match tls.acceptor.accept(stream).await {
Ok(stream) => stream,
// TLS negotiation failed.
Err(err) => {
error!("TLS negotiation failed: {:?}", err);
return Err(Error::TlsError);
}
};
// TLS negotiation successful.
// Continue with regular startup using encrypted connection.
match get_startup::<TlsStream<TcpStream>>(&mut stream).await {
// Got good startup message, proceeding like normal except we
// are encrypted now.
Ok((ClientConnectionType::Startup, bytes)) => {
let (read, write) = split(stream);
Client::startup(
read,
write,
addr,
bytes,
client_server_map,
shutdown_event_receiver,
)
.await
}
// Bad Postgres client.
_ => Err(Error::ProtocolSyncError),
}
}
impl<S, T> Client<S, T>
where
S: tokio::io::AsyncRead + std::marker::Unpin,
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
/// Handle Postgres client startup after TLS negotiation is complete
/// or over plain text.
pub async fn startup(
mut read: S,
mut write: T,
addr: std::net::SocketAddr,
bytes: BytesMut, // The rest of the startup message.
client_server_map: ClientServerMap,
shutdown_event_receiver: Receiver<()>,
) -> Result<Client<S, T>, Error> {
let config = get_config();
let stats = get_reporter();
trace!("Got StartupMessage");
let parameters = parse_startup(bytes.clone())?;
let target_pool_name = match parameters.get("database") {
Some(db) => db,
None => return Err(Error::ClientError),
};
let target_user_name = match parameters.get("user") {
Some(user) => user,
None => return Err(Error::ClientError),
};
let admin = ["pgcat", "pgbouncer"]
.iter()
.filter(|db| *db == &target_pool_name)
.count()
== 1;
// Generate random backend ID and secret key
let process_id: i32 = rand::random();
let secret_key: i32 = rand::random();
// Perform MD5 authentication.
// TODO: Add SASL support.
let salt = md5_challenge(&mut write).await?;
let code = match read.read_u8().await {
Ok(p) => p,
Err(_) => return Err(Error::SocketError),
};
// PasswordMessage
if code as char != 'p' {
debug!("Expected p, got {}", code as char);
return Err(Error::ProtocolSyncError);
}
let len = match read.read_i32().await {
Ok(len) => len,
Err(_) => return Err(Error::SocketError),
};
let mut password_response = vec![0u8; (len - 4) as usize];
match read.read_exact(&mut password_response).await {
Ok(_) => (),
Err(_) => return Err(Error::SocketError),
};
let (transaction_mode, server_info) = if admin {
let correct_user = config.general.admin_username.as_str();
let correct_password = config.general.admin_password.as_str();
// Compare server and client hashes.
let password_hash = md5_hash_password(correct_user, correct_password, &salt);
if password_hash != password_response {
debug!("Password authentication failed");
wrong_password(&mut write, target_user_name).await?;
return Err(Error::ClientError);
}
(false, generate_server_info_for_admin())
} else {
let target_pool = match get_pool(target_pool_name.clone(), target_user_name.clone()) {
Some(pool) => pool,
None => {
error_response(
&mut write,
&format!(
"No pool configured for database: {:?}, user: {:?}",
target_pool_name, target_user_name
),
)
.await?;
return Err(Error::ClientError);
}
};
let transaction_mode = target_pool.settings.pool_mode == "transaction";
let server_info = target_pool.server_info();
// Compare server and client hashes.
let correct_password = target_pool.settings.user.password.as_str();
let password_hash = md5_hash_password(&target_user_name, correct_password, &salt);
if password_hash != password_response {
debug!("Password authentication failed");
wrong_password(&mut write, &target_user_name).await?;
return Err(Error::ClientError);
}
(transaction_mode, server_info)
};
debug!("Password authentication successful");
auth_ok(&mut write).await?;
write_all(&mut write, server_info).await?;
backend_key_data(&mut write, process_id, secret_key).await?;
ready_for_query(&mut write).await?;
trace!("Startup OK");
// Split the read and write streams
// so we can control buffering.
return Ok(Client {
read: BufReader::new(read),
write: write,
addr,
buffer: BytesMut::with_capacity(8196),
cancel_mode: false,
transaction_mode: transaction_mode,
process_id: process_id,
secret_key: secret_key,
client_server_map: client_server_map,
parameters: parameters.clone(),
stats: stats,
admin: admin,
last_address_id: None,
last_server_id: None,
target_pool_name: target_pool_name.clone(),
target_user_name: target_user_name.clone(),
shutdown_event_receiver: shutdown_event_receiver,
});
}
/// Handle cancel request.
pub async fn cancel(
read: S,
write: T,
addr: std::net::SocketAddr,
mut bytes: BytesMut, // The rest of the startup message.
client_server_map: ClientServerMap,
shutdown_event_receiver: Receiver<()>,
) -> Result<Client<S, T>, Error> {
let process_id = bytes.get_i32();
let secret_key = bytes.get_i32();
return Ok(Client {
read: BufReader::new(read),
write: write,
addr,
buffer: BytesMut::with_capacity(8196),
cancel_mode: true,
transaction_mode: false,
process_id: process_id,
secret_key: secret_key,
client_server_map: client_server_map,
parameters: HashMap::new(),
stats: get_reporter(),
admin: false,
last_address_id: None,
last_server_id: None,
target_pool_name: String::from("undefined"),
target_user_name: String::from("undefined"),
shutdown_event_receiver: shutdown_event_receiver,
});
}
/// Handle a connected and authenticated client.
pub async fn handle(&mut self, mut pool: ConnectionPool) -> Result<(), Error> {
pub async fn handle(&mut self) -> Result<(), Error> {
// The client wants to cancel a query it has issued previously.
if self.cancel_mode {
trace!("Sending CancelRequest");
@@ -215,47 +496,103 @@ impl Client {
return Ok(Server::cancel(&address, &port, process_id, secret_key).await?);
}
// The query router determines where the query is going to go,
// e.g. primary, replica, which shard.
let mut query_router = QueryRouter::new();
let mut round_robin = 0;
// Our custom protocol loop.
// We expect the client to either start a transaction with regular queries
// or issue commands for our sharding and server selection protocol.
loop {
trace!("Client idle, waiting for message");
trace!(
"Client idle, waiting for message, transaction mode: {}",
self.transaction_mode
);
// Read a complete message from the client, which normally would be
// either a `Q` (query) or `P` (prepare, extended protocol).
// We can parse it here before grabbing a server from the pool,
// in case the client is sending some custom protocol messages, e.g.
// SET SHARDING KEY TO 'bigint';
let mut message = read_message(&mut self.read).await?;
let mut message = tokio::select! {
_ = self.shutdown_event_receiver.recv() => {
error_response_terminal(&mut self.write, &format!("terminating connection due to administrator command")).await?;
return Ok(())
},
message_result = read_message(&mut self.read) => message_result?
};
// Avoid taking a server if the client just wants to disconnect.
if message[0] as char == 'X' {
trace!("Client disconnecting");
debug!("Client disconnecting");
return Ok(());
}
// Handle admin database queries.
if self.admin {
trace!("Handling admin command");
handle_admin(&mut self.write, message, pool.clone()).await?;
debug!("Handling admin command");
handle_admin(&mut self.write, message, self.client_server_map.clone()).await?;
continue;
}
// Get a pool instance referenced by the most up-to-date
// pointer. This ensures we always read the latest config
// when starting a query.
let pool = match get_pool(self.target_pool_name.clone(), self.target_user_name.clone())
{
Some(pool) => pool,
None => {
error_response(
&mut self.write,
&format!(
"No pool configured for database: {:?}, user: {:?}",
self.target_pool_name, self.target_user_name
),
)
.await?;
return Err(Error::ClientError);
}
};
query_router.update_pool_settings(pool.settings.clone());
let current_shard = query_router.shard();
// Handle all custom protocol commands, if any.
match query_router.try_execute_command(message.clone()) {
// Normal query, not a custom command.
None => {
// Attempt to infer which server we want to query, i.e. primary or replica.
if query_router.query_parser_enabled() && query_router.role() == None {
if query_router.query_parser_enabled() {
query_router.infer_role(message.clone());
}
}
// SET SHARD TO
Some((Command::SetShard, _)) => {
custom_protocol_response_ok(&mut self.write, "SET SHARD").await?;
// Selected shard is not configured.
if query_router.shard() >= pool.shards() {
// Set the shard back to what it was.
query_router.set_shard(current_shard);
error_response(
&mut self.write,
&format!(
"shard {} is more than configured {}, staying on shard {}",
query_router.shard(),
pool.shards(),
current_shard,
),
)
.await?;
} else {
custom_protocol_response_ok(&mut self.write, "SET SHARD").await?;
}
continue;
}
// SET PRIMARY READS TO
Some((Command::SetPrimaryReads, _)) => {
custom_protocol_response_ok(&mut self.write, "SET PRIMARY READS").await?;
continue;
}
@@ -282,27 +619,24 @@ impl Client {
show_response(&mut self.write, "shard", &value).await?;
continue;
}
};
// Make sure we selected a valid shard.
if query_router.shard() >= pool.shards() {
error_response(
&mut self.write,
&format!(
"shard {} is more than configured {}",
query_router.shard(),
pool.shards()
),
)
.await?;
continue;
}
// SHOW PRIMARY READS
Some((Command::ShowPrimaryReads, value)) => {
show_response(&mut self.write, "primary reads", &value).await?;
continue;
}
};
debug!("Waiting for connection from pool");
// Grab a server from the pool.
let connection = match pool
.get(query_router.shard(), query_router.role(), self.process_id)
.get(
query_router.shard(),
query_router.role(),
self.process_id,
round_robin,
)
.await
{
Ok(conn) => {
@@ -321,6 +655,8 @@ impl Client {
let address = connection.1;
let server = &mut *reference;
round_robin += 1;
// Server is assigned to the client in case the client wants to
// cancel a query later.
server.claim(self.process_id, self.secret_key);
@@ -338,10 +674,18 @@ impl Client {
debug!(
"Client {:?} talking to server {:?}",
self.write.peer_addr().unwrap(),
self.addr,
server.address()
);
// Set application_name if any.
// TODO: investigate other parameters and set them too.
if self.parameters.contains_key("application_name") {
server
.set_name(&self.parameters["application_name"])
.await?;
}
// Transaction loop. Multiple queries can be issued by the client here.
// The connection belongs to the client until the transaction is over,
// or until the client disconnects if we are in session mode.
@@ -361,6 +705,7 @@ impl Client {
if server.in_transaction() {
server.query("ROLLBACK").await?;
server.query("DISCARD ALL").await?;
server.set_name("pgcat").await?;
}
return Err(err);
@@ -386,12 +731,26 @@ impl Client {
'Q' => {
debug!("Sending query to server");
server.send(original).await?;
self.send_server_message(
server,
original,
&address,
query_router.shard(),
&pool,
)
.await?;
// Read all data the server has to offer, which can be multiple messages
// buffered in 8196 bytes chunks.
loop {
let response = server.recv().await?;
let response = self
.receive_server_message(
server,
&address,
query_router.shard(),
&pool,
)
.await?;
// Send server reply to the client.
match write_all_half(&mut self.write, response).await {
@@ -432,8 +791,11 @@ impl Client {
if server.in_transaction() {
server.query("ROLLBACK").await?;
server.query("DISCARD ALL").await?;
server.set_name("pgcat").await?;
}
self.release();
return Ok(());
}
@@ -468,14 +830,28 @@ impl Client {
self.buffer.put(&original[..]);
server.send(self.buffer.clone()).await?;
self.send_server_message(
server,
self.buffer.clone(),
&address,
query_router.shard(),
&pool,
)
.await?;
self.buffer.clear();
// Read all data the server has to offer, which can be multiple messages
// buffered in 8196 bytes chunks.
loop {
let response = server.recv().await?;
let response = self
.receive_server_message(
server,
&address,
query_router.shard(),
&pool,
)
.await?;
match write_all_half(&mut self.write, response).await {
Ok(_) => (),
@@ -509,15 +885,31 @@ impl Client {
'd' => {
// Forward the data to the server,
// don't buffer it since it can be rather large.
server.send(original).await?;
self.send_server_message(
server,
original,
&address,
query_router.shard(),
&pool,
)
.await?;
}
// CopyDone or CopyFail
// Copy is done, successfully or not.
'c' | 'f' => {
server.send(original).await?;
self.send_server_message(
server,
original,
&address,
query_router.shard(),
&pool,
)
.await?;
let response = server.recv().await?;
let response = self
.receive_server_message(server, &address, query_router.shard(), &pool)
.await?;
match write_all_half(&mut self.write, response).await {
Ok(_) => (),
@@ -559,10 +951,46 @@ impl Client {
let mut guard = self.client_server_map.lock();
guard.remove(&(self.process_id, self.secret_key));
}
async fn send_server_message(
&self,
server: &mut Server,
message: BytesMut,
address: &Address,
shard: usize,
pool: &ConnectionPool,
) -> Result<(), Error> {
match server.send(message).await {
Ok(_) => Ok(()),
Err(err) => {
pool.ban(address, shard, self.process_id);
Err(err)
}
}
}
async fn receive_server_message(
&self,
server: &mut Server,
address: &Address,
shard: usize,
pool: &ConnectionPool,
) -> Result<BytesMut, Error> {
match server.recv().await {
Ok(message) => Ok(message),
Err(err) => {
pool.ban(address, shard, self.process_id);
Err(err)
}
}
}
}
impl Drop for Client {
impl<S, T> Drop for Client<S, T> {
fn drop(&mut self) {
let mut guard = self.client_server_map.lock();
guard.remove(&(self.process_id, self.secret_key));
// Update statistics.
if let Some(address_id) = self.last_address_id {
self.stats.client_disconnecting(self.process_id, address_id);
@@ -571,5 +999,7 @@ impl Drop for Client {
self.stats.server_idle(process_id, address_id);
}
}
// self.release();
}
}
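
The new `get_startup` function above reads a length-prefixed first message and dispatches on its 32-bit code (SSL request, regular startup, or cancel request). The sketch below shows the same classification with only the standard library, using a blocking `std::io::Read` in place of tokio's async reader. The function and type names are hypothetical, and the constant values are taken from the PostgreSQL wire protocol and are assumed to match pgcat's `constants` module. Unlike the diff, this version checks the length before subtracting 4, since `len as usize - 4` underflows for lengths below 4, and it applies an illustrative upper bound (`MAX_STARTUP_LEN`, an assumption) to avoid large allocations.

```rust
use std::io::Read;

// Values from the PostgreSQL frontend/backend protocol.
const PROTOCOL_VERSION_NUMBER: i32 = 196608; // 3.0, i.e. 3 << 16
const SSL_REQUEST_CODE: i32 = 80877103; // (1234 << 16) | 5679
const CANCEL_REQUEST_CODE: i32 = 80877102; // (1234 << 16) | 5678

// Illustrative cap on the first message size (an assumption of this sketch).
const MAX_STARTUP_LEN: i32 = 10_000;

#[derive(Debug, PartialEq)]
enum FirstMessage {
    Tls,
    Startup,
    CancelQuery,
}

#[derive(Debug, PartialEq)]
enum StartupError {
    BadStartup,
    ProtocolSync,
}

/// Read one big-endian i32 from the stream.
fn read_i32<R: Read>(stream: &mut R) -> Result<i32, StartupError> {
    let mut buf = [0u8; 4];
    stream
        .read_exact(&mut buf)
        .map_err(|_| StartupError::BadStartup)?;
    Ok(i32::from_be_bytes(buf))
}

/// Classify the first message a client sends and return the bytes that
/// follow the 4-byte code (startup parameters, or process id + secret key).
fn classify_first_message<R: Read>(
    stream: &mut R,
) -> Result<(FirstMessage, Vec<u8>), StartupError> {
    // The length counts itself (4 bytes) and must leave room for the code.
    let len = read_i32(stream)?;
    if len < 8 || len > MAX_STARTUP_LEN {
        return Err(StartupError::BadStartup);
    }

    let mut rest = vec![0u8; len as usize - 4];
    stream
        .read_exact(&mut rest)
        .map_err(|_| StartupError::BadStartup)?;

    let code = i32::from_be_bytes([rest[0], rest[1], rest[2], rest[3]]);
    let payload = rest[4..].to_vec();

    match code {
        SSL_REQUEST_CODE => Ok((FirstMessage::Tls, payload)),
        PROTOCOL_VERSION_NUMBER => Ok((FirstMessage::Startup, payload)),
        CANCEL_REQUEST_CODE => Ok((FirstMessage::CancelQuery, payload)),
        _ => Err(StartupError::ProtocolSync),
    }
}
```

Because the function is generic over the reader, the same code path can serve a plain socket and a TLS-wrapped one, which mirrors why the diff makes `get_startup` and `Client` generic over their stream types.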


@@ -1,21 +1,27 @@
/// Parse the configuration file.
use arc_swap::{ArcSwap, Guard};
use arc_swap::ArcSwap;
use log::{error, info};
use once_cell::sync::Lazy;
use serde_derive::Deserialize;
use serde_derive::{Deserialize, Serialize};
use std::collections::{HashMap, HashSet};
use std::hash::Hash;
use std::path::Path;
use std::sync::Arc;
use tokio::fs::File;
use tokio::io::AsyncReadExt;
use toml;
use crate::errors::Error;
use crate::tls::{load_certs, load_keys};
use crate::{ClientServerMap, ConnectionPool};
pub const VERSION: &str = env!("CARGO_PKG_VERSION");
/// Globally available configuration.
static CONFIG: Lazy<ArcSwap<Config>> = Lazy::new(|| ArcSwap::from_pointee(Config::default()));
/// Server role: primary or replica.
#[derive(Clone, PartialEq, Deserialize, Hash, std::cmp::Eq, Debug, Copy)]
#[derive(Clone, PartialEq, Serialize, Deserialize, Hash, std::cmp::Eq, Debug, Copy)]
pub enum Role {
Primary,
Replica,
@@ -55,6 +61,7 @@ pub struct Address {
pub host: String,
pub port: String,
pub shard: usize,
pub database: String,
pub role: Role,
pub replica_number: usize,
}
@@ -67,6 +74,7 @@ impl Default for Address {
port: String::from("5432"),
shard: 0,
replica_number: 0,
database: String::from("database"),
role: Role::Replica,
}
}
@@ -76,39 +84,50 @@ impl Address {
/// Address name (aka database) used in `SHOW STATS`, `SHOW DATABASES`, and `SHOW POOLS`.
pub fn name(&self) -> String {
match self.role {
Role::Primary => format!("shard_{}_primary", self.shard),
Role::Primary => format!("{}_shard_{}_primary", self.database, self.shard),
Role::Replica => format!("shard_{}_replica_{}", self.shard, self.replica_number),
Role::Replica => format!(
"{}_shard_{}_replica_{}",
self.database, self.shard, self.replica_number
),
}
}
}
/// PostgreSQL user.
#[derive(Clone, PartialEq, Hash, std::cmp::Eq, Deserialize, Debug)]
#[derive(Clone, PartialEq, Hash, std::cmp::Eq, Serialize, Deserialize, Debug)]
pub struct User {
pub name: String,
pub username: String,
pub password: String,
pub pool_size: u32,
}
impl Default for User {
fn default() -> User {
User {
name: String::from("postgres"),
username: String::from("postgres"),
password: String::new(),
pool_size: 15,
}
}
}
/// General configuration.
#[derive(Deserialize, Debug, Clone)]
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct General {
pub host: String,
pub port: i16,
pub pool_size: u32,
pub pool_mode: String,
pub enable_prometheus_exporter: Option<bool>,
pub connect_timeout: u64,
pub healthcheck_timeout: u64,
pub shutdown_timeout: u64,
pub healthcheck_delay: u64,
pub ban_time: i64,
pub autoreload: bool,
pub tls_certificate: Option<String>,
pub tls_private_key: Option<String>,
pub admin_username: String,
pub admin_password: String,
}
impl Default for General {
@@ -116,20 +135,49 @@ impl Default for General {
General {
host: String::from("localhost"),
port: 5432,
pool_size: 15,
pool_mode: String::from("transaction"),
enable_prometheus_exporter: Some(false),
connect_timeout: 5000,
healthcheck_timeout: 1000,
shutdown_timeout: 60000,
healthcheck_delay: 30000,
ban_time: 60,
autoreload: false,
tls_certificate: None,
tls_private_key: None,
admin_username: String::from("admin"),
admin_password: String::from("admin"),
}
}
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct Pool {
pub pool_mode: String,
pub default_role: String,
pub query_parser_enabled: bool,
pub primary_reads_enabled: bool,
pub sharding_function: String,
pub shards: HashMap<String, Shard>,
pub users: HashMap<String, User>,
}
impl Default for Pool {
fn default() -> Pool {
Pool {
pool_mode: String::from("transaction"),
shards: HashMap::from([(String::from("1"), Shard::default())]),
users: HashMap::default(),
default_role: String::from("any"),
query_parser_enabled: false,
primary_reads_enabled: true,
sharding_function: "pg_bigint_hash".to_string(),
}
}
}
/// Shard configuration.
#[derive(Deserialize, Debug, Clone)]
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct Shard {
pub servers: Vec<(String, u16, String)>,
pub database: String,
pub servers: Vec<(String, u16, String)>,
}
impl Default for Shard {
@@ -141,61 +189,88 @@ impl Default for Shard {
}
}
/// Query Router configuration.
#[derive(Deserialize, Debug, Clone)]
pub struct QueryRouter {
pub default_role: String,
pub query_parser_enabled: bool,
pub primary_reads_enabled: bool,
pub sharding_function: String,
}
impl Default for QueryRouter {
fn default() -> QueryRouter {
QueryRouter {
default_role: String::from("any"),
query_parser_enabled: false,
primary_reads_enabled: true,
sharding_function: "pg_bigint_hash".to_string(),
}
}
fn default_path() -> String {
String::from("pgcat.toml")
}
/// Configuration wrapper.
#[derive(Deserialize, Debug, Clone)]
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct Config {
pub path: Option<String>,
// Serializer maintains the order of fields in the struct
// so we should always put simple fields before nested fields
// in all serializable structs to avoid ValueAfterTable errors
// These errors occur when the toml serializer is about to produce
// ambiguous TOML structure like the one below
// [main]
// field1_under_main = 1
// field2_under_main = 2
// [main.subconf]
// field1_under_subconf = 1
// field3_under_main = 3 # This field will be interpreted as being under subconf and not under main
#[serde(default = "default_path")]
pub path: String,
pub general: General,
pub user: User,
pub shards: HashMap<String, Shard>,
pub query_router: QueryRouter,
pub pools: HashMap<String, Pool>,
}
impl Default for Config {
fn default() -> Config {
Config {
path: Some(String::from("pgcat.toml")),
path: String::from("pgcat.toml"),
general: General::default(),
user: User::default(),
shards: HashMap::from([(String::from("1"), Shard::default())]),
query_router: QueryRouter::default(),
pools: HashMap::default(),
}
}
}
impl From<&Config> for std::collections::HashMap<String, String> {
fn from(config: &Config) -> HashMap<String, String> {
HashMap::from([
let mut r: Vec<(String, String)> = config
.pools
.iter()
.flat_map(|(pool_name, pool)| {
[
(
format!("pools.{}.pool_mode", pool_name),
pool.pool_mode.clone(),
),
(
format!("pools.{}.primary_reads_enabled", pool_name),
pool.primary_reads_enabled.to_string(),
),
(
format!("pools.{}.query_parser_enabled", pool_name),
pool.query_parser_enabled.to_string(),
),
(
format!("pools.{}.default_role", pool_name),
pool.default_role.clone(),
),
(
format!("pools.{}.sharding_function", pool_name),
pool.sharding_function.clone(),
),
(
format!("pools.{:?}.shard_count", pool_name),
pool.shards.len().to_string(),
),
(
format!("pools.{:?}.users", pool_name),
pool.users
.iter()
.map(|(_username, user)| &user.username)
.cloned()
.collect::<Vec<String>>()
.join(", "),
),
]
})
.collect();
let mut static_settings = vec![
("host".to_string(), config.general.host.to_string()),
("port".to_string(), config.general.port.to_string()),
(
"pool_size".to_string(),
config.general.pool_size.to_string(),
),
(
"pool_mode".to_string(),
config.general.pool_mode.to_string(),
),
(
"connect_timeout".to_string(),
config.general.connect_timeout.to_string(),
@@ -204,48 +279,78 @@ impl From<&Config> for std::collections::HashMap<String, String> {
"healthcheck_timeout".to_string(),
config.general.healthcheck_timeout.to_string(),
),
(
"shutdown_timeout".to_string(),
config.general.shutdown_timeout.to_string(),
),
(
"healthcheck_delay".to_string(),
config.general.healthcheck_delay.to_string(),
),
("ban_time".to_string(), config.general.ban_time.to_string()),
(
"default_role".to_string(),
config.query_router.default_role.to_string(),
),
(
"query_parser_enabled".to_string(),
config.query_router.query_parser_enabled.to_string(),
),
(
"primary_reads_enabled".to_string(),
config.query_router.primary_reads_enabled.to_string(),
),
(
"sharding_function".to_string(),
config.query_router.sharding_function.to_string(),
),
])
];
r.append(&mut static_settings);
return r.iter().cloned().collect();
}
}
impl Config {
/// Print current configuration.
pub fn show(&self) {
info!("Pool size: {}", self.general.pool_size);
info!("Pool mode: {}", self.general.pool_mode);
info!("Ban time: {}s", self.general.ban_time);
info!(
"Healthcheck timeout: {}ms",
self.general.healthcheck_timeout
);
info!("Connection timeout: {}ms", self.general.connect_timeout);
info!("Sharding function: {}", self.query_router.sharding_function);
info!("Number of shards: {}", self.shards.len());
info!("Shutdown timeout: {}ms", self.general.shutdown_timeout);
info!("Healthcheck delay: {}ms", self.general.healthcheck_delay);
match self.general.tls_certificate.clone() {
Some(tls_certificate) => {
info!("TLS certificate: {}", tls_certificate);
match self.general.tls_private_key.clone() {
Some(tls_private_key) => {
info!("TLS private key: {}", tls_private_key);
info!("TLS support is enabled");
}
None => (),
}
}
None => {
info!("TLS support is disabled");
}
};
for (pool_name, pool_config) in &self.pools {
info!("--- Settings for pool {} ---", pool_name);
info!(
"Pool size from all users: {}",
pool_config
.users
.iter()
.map(|(_, user_cfg)| user_cfg.pool_size)
.sum::<u32>()
.to_string()
);
info!("Pool mode: {}", pool_config.pool_mode);
info!("Sharding function: {}", pool_config.sharding_function);
info!("Primary reads: {}", pool_config.primary_reads_enabled);
info!("Query router: {}", pool_config.query_parser_enabled);
info!("Number of shards: {}", pool_config.shards.len());
info!("Number of users: {}", pool_config.users.len());
}
}
}
/// Get a read-only instance of the configuration
/// from anywhere in the app.
/// ArcSwap makes this cheap and quick.
pub fn get_config() -> Guard<Arc<Config>> {
CONFIG.load()
pub fn get_config() -> Config {
(*(*CONFIG.load())).clone()
}
/// Parse the configuration file located at the path.
@@ -275,89 +380,122 @@ pub async fn parse(path: &str) -> Result<(), Error> {
}
};
match config.query_router.sharding_function.as_ref() {
"pg_bigint_hash" => (),
"sha1" => (),
_ => {
error!(
"Supported sharding functions are: 'pg_bigint_hash', 'sha1', got: '{}'",
config.query_router.sharding_function
);
return Err(Error::BadConfig);
// Validate TLS!
match config.general.tls_certificate.clone() {
Some(tls_certificate) => {
match load_certs(&Path::new(&tls_certificate)) {
Ok(_) => {
// Cert is okay, but what about the private key?
match config.general.tls_private_key.clone() {
Some(tls_private_key) => match load_keys(&Path::new(&tls_private_key)) {
Ok(_) => (),
Err(err) => {
error!("tls_private_key is incorrectly configured: {:?}", err);
return Err(Error::BadConfig);
}
},
None => {
error!("tls_certificate is set, but the tls_private_key is not");
return Err(Error::BadConfig);
}
};
}
Err(err) => {
error!("tls_certificate is incorrectly configured: {:?}", err);
return Err(Error::BadConfig);
}
}
}
None => (),
};
// Quick config sanity check.
for shard in &config.shards {
// We use addresses as unique identifiers,
// let's make sure they are unique in the config as well.
let mut dup_check = HashSet::new();
let mut primary_count = 0;
match shard.0.parse::<usize>() {
Ok(_) => (),
Err(_) => {
for (pool_name, pool) in &config.pools {
match pool.sharding_function.as_ref() {
"pg_bigint_hash" => (),
"sha1" => (),
_ => {
error!(
"Shard '{}' is not a valid number, shards must be numbered starting at 0",
shard.0
"Supported sharding functions are: 'pg_bigint_hash', 'sha1', got: '{}' in pool {} settings",
pool.sharding_function,
pool_name
);
return Err(Error::BadConfig);
}
};
if shard.1.servers.len() == 0 {
error!("Shard {} has no servers configured", shard.0);
return Err(Error::BadConfig);
}
match pool.default_role.as_ref() {
"any" => (),
"primary" => (),
"replica" => (),
other => {
error!(
"Query router default_role must be 'primary', 'replica', or 'any', got: '{}'",
other
);
return Err(Error::BadConfig);
}
};
for server in &shard.1.servers {
dup_check.insert(server);
for shard in &pool.shards {
// We use addresses as unique identifiers,
// let's make sure they are unique in the config as well.
let mut dup_check = HashSet::new();
let mut primary_count = 0;
// Check that we define only zero or one primary.
match server.2.as_ref() {
"primary" => primary_count += 1,
_ => (),
};
// Check role spelling.
match server.2.as_ref() {
"primary" => (),
"replica" => (),
_ => {
match shard.0.parse::<usize>() {
Ok(_) => (),
Err(_) => {
error!(
"Shard {} server role must be either 'primary' or 'replica', got: '{}'",
shard.0, server.2
"Shard '{}' is not a valid number, shards must be numbered starting at 0",
shard.0
);
return Err(Error::BadConfig);
}
};
}
if primary_count > 1 {
error!("Shard {} has more than one primary configured", &shard.0);
return Err(Error::BadConfig);
}
if shard.1.servers.len() == 0 {
error!("Shard {} has no servers configured", shard.0);
return Err(Error::BadConfig);
}
if dup_check.len() != shard.1.servers.len() {
error!("Shard {} contains duplicate server configs", &shard.0);
return Err(Error::BadConfig);
for server in &shard.1.servers {
dup_check.insert(server);
// Check that we define only zero or one primary.
match server.2.as_ref() {
"primary" => primary_count += 1,
_ => (),
};
// Check role spelling.
match server.2.as_ref() {
"primary" => (),
"replica" => (),
_ => {
error!(
"Shard {} server role must be either 'primary' or 'replica', got: '{}'",
shard.0, server.2
);
return Err(Error::BadConfig);
}
};
}
if primary_count > 1 {
error!("Shard {} has more than one primary configured", &shard.0);
return Err(Error::BadConfig);
}
if dup_check.len() != shard.1.servers.len() {
error!("Shard {} contains duplicate server configs", &shard.0);
return Err(Error::BadConfig);
}
}
}
match config.query_router.default_role.as_ref() {
"any" => (),
"primary" => (),
"replica" => (),
other => {
error!(
"Query router default_role must be 'primary', 'replica', or 'any', got: '{}'",
other
);
return Err(Error::BadConfig);
}
};
config.path = Some(path.to_string());
config.path = path.to_string();
// Update the configuration globally.
CONFIG.store(Arc::new(config.clone()));
@@ -365,6 +503,28 @@ pub async fn parse(path: &str) -> Result<(), Error> {
Ok(())
}
pub async fn reload_config(client_server_map: ClientServerMap) -> Result<bool, Error> {
let old_config = get_config();
match parse(&old_config.path).await {
Ok(()) => (),
Err(err) => {
error!("Config reload error: {:?}", err);
return Err(Error::BadConfig);
}
};
let new_config = get_config();
if old_config.pools != new_config.pools {
info!("Pool configuration changed, re-creating server pools");
ConnectionPool::from_config(client_server_map).await?;
Ok(true)
} else if old_config != new_config {
Ok(true)
} else {
Ok(false)
}
}
#[cfg(test)]
mod test {
use super::*;
@@ -372,11 +532,67 @@ mod test {
#[tokio::test]
async fn test_config() {
parse("pgcat.toml").await.unwrap();
assert_eq!(get_config().general.pool_size, 15);
assert_eq!(get_config().shards.len(), 3);
assert_eq!(get_config().shards["1"].servers[0].0, "127.0.0.1");
assert_eq!(get_config().shards["0"].servers[0].2, "primary");
assert_eq!(get_config().query_router.default_role, "any");
assert_eq!(get_config().path, Some("pgcat.toml".to_string()));
assert_eq!(get_config().path, "pgcat.toml".to_string());
assert_eq!(get_config().general.ban_time, 60);
assert_eq!(get_config().pools.len(), 2);
assert_eq!(get_config().pools["sharded_db"].shards.len(), 3);
assert_eq!(get_config().pools["simple_db"].shards.len(), 1);
assert_eq!(get_config().pools["sharded_db"].users.len(), 2);
assert_eq!(get_config().pools["simple_db"].users.len(), 1);
assert_eq!(
get_config().pools["sharded_db"].shards["0"].servers[0].0,
"127.0.0.1"
);
assert_eq!(
get_config().pools["sharded_db"].shards["1"].servers[0].2,
"primary"
);
assert_eq!(
get_config().pools["sharded_db"].shards["1"].database,
"shard1"
);
assert_eq!(
get_config().pools["sharded_db"].users["0"].username,
"sharding_user"
);
assert_eq!(
get_config().pools["sharded_db"].users["1"].password,
"other_user"
);
assert_eq!(get_config().pools["sharded_db"].users["1"].pool_size, 21);
assert_eq!(get_config().pools["sharded_db"].default_role, "any");
assert_eq!(
get_config().pools["simple_db"].shards["0"].servers[0].0,
"127.0.0.1"
);
assert_eq!(
get_config().pools["simple_db"].shards["0"].servers[0].1,
5432
);
assert_eq!(
get_config().pools["simple_db"].shards["0"].database,
"some_db"
);
assert_eq!(get_config().pools["simple_db"].default_role, "primary");
assert_eq!(
get_config().pools["simple_db"].users["0"].username,
"simple_user"
);
assert_eq!(
get_config().pools["simple_db"].users["0"].password,
"simple_user"
);
assert_eq!(get_config().pools["simple_db"].users["0"].pool_size, 5);
}
#[tokio::test]
async fn test_serialize_configs() {
parse("pgcat.toml").await.unwrap();
print!("{}", toml::to_string(&get_config()).unwrap());
}
}
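
The per-shard checks in `parse` above (numeric shard key, at least one server, role spelling, at most one primary, no duplicate addresses) can be sketched in isolation as follows. This is a minimal, std-only illustration and not the code from this change: `ServerEntry` and `validate_shard_servers` are hypothetical names, and errors are returned as plain strings instead of being logged and mapped to `Error::BadConfig`.

```rust
use std::collections::HashSet;

/// (host, port, role), mirroring the tuple shape of `Shard::servers`.
/// Hypothetical alias, introduced only for this sketch.
pub type ServerEntry = (String, u16, String);

/// Hypothetical helper: validate a single shard's server list.
pub fn validate_shard_servers(shard: &str, servers: &[ServerEntry]) -> Result<(), String> {
    // Shard keys must parse as non-negative integers.
    if shard.parse::<usize>().is_err() {
        return Err(format!("Shard '{}' is not a valid number", shard));
    }

    if servers.is_empty() {
        return Err(format!("Shard {} has no servers configured", shard));
    }

    // Addresses act as unique identifiers, so collect them into a set
    // and compare the set size with the list size afterwards.
    let mut seen = HashSet::new();
    let mut primary_count = 0;

    for server in servers {
        seen.insert(server);

        // Count primaries and reject misspelled roles in one pass.
        match server.2.as_str() {
            "primary" => primary_count += 1,
            "replica" => (),
            other => {
                return Err(format!(
                    "Shard {} server role must be either 'primary' or 'replica', got: '{}'",
                    shard, other
                ))
            }
        }
    }

    if primary_count > 1 {
        return Err(format!("Shard {} has more than one primary configured", shard));
    }

    if seen.len() != servers.len() {
        return Err(format!("Shard {} contains duplicate server configs", shard));
    }

    Ok(())
}
```

Folding the role-spelling check and the primary count into a single `match` keeps the loop to one pass over the servers; the diff above uses two separate `match` statements for the same effect.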

View File

@@ -14,6 +14,13 @@ pub const CANCEL_REQUEST_CODE: i32 = 80877102;
// AuthenticationMD5Password
pub const MD5_ENCRYPTED_PASSWORD: i32 = 5;
// SASL
pub const SASL: i32 = 10;
pub const SASL_CONTINUE: i32 = 11;
pub const SASL_FINAL: i32 = 12;
pub const SCRAM_SHA_256: &str = "SCRAM-SHA-256";
pub const NONCE_LENGTH: usize = 24;
// AuthenticationOk
pub const AUTHENTICATION_SUCCESSFUL: i32 = 0;

View File

@@ -9,4 +9,6 @@ pub enum Error {
ServerError,
BadConfig,
AllServersDown,
ClientError,
TlsError,
}

View File

@@ -1,4 +1,4 @@
// Copyright (c) 2022 Lev Kokotov <lev@levthe.dev>
// Copyright (c) 2022 Lev Kokotov <hi@levthe.dev>
// Permission is hereby granted, free of charge, to any person obtaining
// a copy of this software and associated documentation files (the
@@ -28,23 +28,27 @@ extern crate log;
extern crate md5;
extern crate num_cpus;
extern crate once_cell;
extern crate rustls_pemfile;
extern crate serde;
extern crate serde_derive;
extern crate sqlparser;
extern crate tokio;
extern crate tokio_rustls;
extern crate toml;
use log::{error, info};
use log::{debug, error, info};
use parking_lot::Mutex;
use tokio::net::TcpListener;
use tokio::{
signal,
signal::unix::{signal as unix_signal, SignalKind},
sync::mpsc,
};
use std::collections::HashMap;
use std::net::SocketAddr;
use std::str::FromStr;
use std::sync::Arc;
use tokio::sync::broadcast;
mod admin;
mod client;
@@ -53,19 +57,23 @@ mod constants;
mod errors;
mod messages;
mod pool;
mod prometheus;
mod query_router;
mod scram;
mod server;
mod sharding;
mod stats;
mod tls;
use config::get_config;
use pool::{ClientServerMap, ConnectionPool};
use stats::{Collector, Reporter};
use crate::config::{get_config, reload_config, VERSION};
use crate::pool::{ClientServerMap, ConnectionPool};
use crate::prometheus::start_metric_server;
use crate::stats::{Collector, Reporter, REPORTER};
#[tokio::main(worker_threads = 4)]
async fn main() {
env_logger::init();
info!("Welcome to PgCat! Meow.");
info!("Welcome to PgCat! Meow. (Version {})", VERSION);
if !query_router::QueryRouter::setup() {
error!("Could not setup query router");
@@ -89,6 +97,21 @@ async fn main() {
};
let config = get_config();
if let Some(true) = config.general.enable_prometheus_exporter {
let http_addr_str = format!("{}:{}", config.general.host, crate::prometheus::HTTP_PORT);
let http_addr = match SocketAddr::from_str(&http_addr_str) {
Ok(addr) => addr,
Err(err) => {
error!("Invalid http address: {}", err);
return;
}
};
tokio::task::spawn(async move {
start_metric_server(http_addr).await;
});
}
let addr = format!("{}:{}", config.general.host, config.general.port);
let listener = match TcpListener::bind(&addr).await {
@@ -108,76 +131,93 @@ async fn main() {
// Statistics reporting.
let (tx, rx) = mpsc::channel(100);
REPORTER.store(Arc::new(Reporter::new(tx.clone())));
// Connection pool that allows to query all shards and replicas.
let mut pool =
ConnectionPool::from_config(client_server_map.clone(), Reporter::new(tx.clone())).await;
// Statistics collector task.
let collector_tx = tx.clone();
let addresses = pool.databases();
tokio::task::spawn(async move {
let mut stats_collector = Collector::new(rx, collector_tx);
stats_collector.collect(addresses).await;
});
// Connect to all servers and validate their versions.
let server_info = match pool.validate().await {
Ok(info) => info,
match ConnectionPool::from_config(client_server_map.clone()).await {
Ok(_) => (),
Err(err) => {
error!("Could not validate connection pool: {:?}", err);
error!("Pool error: {:?}", err);
return;
}
};
// Statistics collector task.
let collector_tx = tx.clone();
// Save these for reloading
let reload_client_server_map = client_server_map.clone();
let autoreload_client_server_map = client_server_map.clone();
tokio::task::spawn(async move {
let mut stats_collector = Collector::new(rx, collector_tx);
stats_collector.collect().await;
});
info!("Waiting for clients");
let (shutdown_event_tx, mut shutdown_event_rx) = broadcast::channel::<()>(1);
let shutdown_event_tx_clone = shutdown_event_tx.clone();
// Client connection loop.
tokio::task::spawn(async move {
// Creates event subscriber for shutdown event, this is dropped when shutdown event is broadcast
let mut listener_shutdown_event_rx = shutdown_event_tx_clone.subscribe();
loop {
let pool = pool.clone();
let client_server_map = client_server_map.clone();
let server_info = server_info.clone();
let reporter = Reporter::new(tx.clone());
let (socket, addr) = match listener.accept().await {
Ok((socket, addr)) => (socket, addr),
Err(err) => {
error!("{:?}", err);
continue;
// Listen for shutdown event and client connection at the same time
let (socket, addr) = tokio::select! {
_ = listener_shutdown_event_rx.recv() => {
// Exits client connection loop which drops listener, listener_shutdown_event_rx and shutdown_event_tx_clone
break;
}
listener_response = listener.accept() => {
match listener_response {
Ok((socket, addr)) => (socket, addr),
Err(err) => {
error!("{:?}", err);
continue;
}
}
}
};
// Used to signal shutdown
let client_shutdown_handler_rx = shutdown_event_tx_clone.subscribe();
// Used to signal that the task has completed
let dummy_tx = shutdown_event_tx_clone.clone();
// Handle client.
tokio::task::spawn(async move {
let start = chrono::offset::Utc::now().naive_utc();
match client::Client::startup(socket, client_server_map, server_info, reporter)
.await
match client::client_entrypoint(
socket,
client_server_map,
client_shutdown_handler_rx,
)
.await
{
Ok(mut client) => {
info!("Client {:?} connected", addr);
match client.handle(pool).await {
Ok(()) => {
let duration = chrono::offset::Utc::now().naive_utc() - start;
Ok(_) => {
let duration = chrono::offset::Utc::now().naive_utc() - start;
info!(
"Client {:?} disconnected, session duration: {}",
addr,
format_duration(&duration)
);
}
Err(err) => {
error!("Client disconnected with error: {:?}", err);
client.release();
}
}
info!(
"Client {:?} disconnected, session duration: {}",
addr,
format_duration(&duration)
);
}
Err(err) => {
error!("Client failed to login: {:?}", err);
debug!("Client disconnected with error {:?}", err);
}
};
// Drop this transmitter so receiver knows that the task is completed
drop(dummy_tx);
});
}
});
@@ -189,26 +229,73 @@ async fn main() {
loop {
stream.recv().await;
info!("Reloading config");
match config::parse("pgcat.toml").await {
Ok(_) => {
get_config().show();
}
Err(err) => {
error!("{:?}", err);
return;
}
match reload_config(reload_client_server_map.clone()).await {
Ok(_) => (),
Err(_) => continue,
};
get_config().show();
}
});
// Exit on Ctrl-C (SIGINT) and SIGTERM.
if config.general.autoreload {
let mut interval = tokio::time::interval(tokio::time::Duration::from_millis(15_000));
tokio::task::spawn(async move {
info!("Config autoreloader started");
loop {
interval.tick().await;
match reload_config(autoreload_client_server_map.clone()).await {
Ok(changed) => {
if changed {
get_config().show()
}
}
Err(_) => (),
};
}
});
}
let mut term_signal = unix_signal(SignalKind::terminate()).unwrap();
let mut interrupt_signal = unix_signal(SignalKind::interrupt()).unwrap();
tokio::select! {
_ = signal::ctrl_c() => (),
// Initiate graceful shutdown sequence on SIGINT
_ = interrupt_signal.recv() => {
info!("Got SIGINT, waiting for client connection drain now");
// Broadcast that client tasks need to finish
shutdown_event_tx.send(()).unwrap();
// Closes transmitter
drop(shutdown_event_tx);
// This is in a loop because the first event the receiver gets is the shutdown event itself.
// That is not what we are waiting for: we want recv() to return an error, which happens
// once every sender has been dropped, i.e. after all client tasks have finished.
loop {
match tokio::time::timeout(
tokio::time::Duration::from_millis(config.general.shutdown_timeout),
shutdown_event_rx.recv(),
)
.await
{
Ok(res) => match res {
Ok(_) => {}
Err(_) => break,
},
Err(_) => {
info!("Timed out while waiting for clients to shutdown");
break;
}
}
}
},
_ = term_signal.recv() => (),
};
}
info!("Shutting down...");
}
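
The drain step of the graceful shutdown above relies on one channel property: once every sender clone has been dropped, the receiver gets an error instead of a value. Below is a minimal sketch of that idea using only `std` threads and `std::sync::mpsc`, which is an assumption made to keep the example dependency-free; the change itself uses `tokio::sync::broadcast` and `tokio::time::timeout`. `drain_workers` is a hypothetical helper, not part of this codebase.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: start `workers` threads that each run for `work`,
/// then wait up to `timeout` for all of them to finish.
/// Returns true if every worker finished before the timeout.
pub fn drain_workers(workers: usize, work: Duration, timeout: Duration) -> bool {
    // Nothing is ever sent on this channel; the senders only act as
    // "still running" guards, like `dummy_tx` in the client tasks.
    let (done_tx, done_rx) = mpsc::channel::<()>();

    for _ in 0..workers {
        let guard = done_tx.clone();
        thread::spawn(move || {
            thread::sleep(work); // stand-in for serving a client
            drop(guard); // signals completion by closing this sender
        });
    }

    // Drop the original sender so only the workers keep the channel open.
    drop(done_tx);

    // Disconnected means all guards are gone, i.e. all workers finished.
    // Timeout means at least one worker is still running.
    matches!(
        done_rx.recv_timeout(timeout),
        Err(mpsc::RecvTimeoutError::Disconnected)
    )
}
```

Because no message is ever sent here, a single `recv_timeout` call is enough. The loop in `main` exists because the broadcast receiver first sees the shutdown event itself and has to keep receiving until the channel reports that it is closed.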

View File

@@ -2,14 +2,12 @@
/// and handle TcpStream (TCP socket).
use bytes::{Buf, BufMut, BytesMut};
use md5::{Digest, Md5};
use tokio::io::{AsyncReadExt, AsyncWriteExt, BufReader};
use tokio::net::{
tcp::{OwnedReadHalf, OwnedWriteHalf},
TcpStream,
};
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;
use crate::errors::Error;
use std::collections::HashMap;
use std::mem;
/// Postgres data type mappings
/// used in RowDescription ('T') message.
@@ -30,7 +28,10 @@ impl From<&DataType> for i32 {
}
/// Tell the client that authentication handshake completed successfully.
pub async fn auth_ok(stream: &mut TcpStream) -> Result<(), Error> {
pub async fn auth_ok<S>(stream: &mut S) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
let mut auth_ok = BytesMut::with_capacity(9);
auth_ok.put_u8(b'R');
@@ -40,13 +41,39 @@ pub async fn auth_ok(stream: &mut TcpStream) -> Result<(), Error> {
Ok(write_all(stream, auth_ok).await?)
}
/// Generate md5 password challenge.
pub async fn md5_challenge<S>(stream: &mut S) -> Result<[u8; 4], Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
// let mut rng = rand::thread_rng();
let salt: [u8; 4] = [
rand::random(),
rand::random(),
rand::random(),
rand::random(),
];
let mut res = BytesMut::new();
res.put_u8(b'R');
res.put_i32(12);
res.put_i32(5); // MD5
res.put_slice(&salt[..]);
write_all(stream, res).await?;
Ok(salt)
}
/// Give the client the process_id and secret we generated
/// used in query cancellation.
pub async fn backend_key_data(
stream: &mut TcpStream,
pub async fn backend_key_data<S>(
stream: &mut S,
backend_id: i32,
secret_key: i32,
) -> Result<(), Error> {
) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
let mut key_data = BytesMut::from(&b"K"[..]);
key_data.put_i32(12);
key_data.put_i32(backend_id);
@@ -67,8 +94,13 @@ pub fn simple_query(query: &str) -> BytesMut {
}
/// Tell the client we're ready for another query.
pub async fn ready_for_query(stream: &mut TcpStream) -> Result<(), Error> {
let mut bytes = BytesMut::with_capacity(5);
pub async fn ready_for_query<S>(stream: &mut S) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
let mut bytes = BytesMut::with_capacity(
mem::size_of::<u8>() + mem::size_of::<i32>() + mem::size_of::<u8>(),
);
bytes.put_u8(b'Z');
bytes.put_i32(5);
@@ -160,14 +192,8 @@ pub fn parse_startup(bytes: BytesMut) -> Result<HashMap<String, String>, Error>
Ok(result)
}
/// Send password challenge response to the server.
/// This is the MD5 challenge.
pub async fn md5_password(
stream: &mut TcpStream,
user: &str,
password: &str,
salt: &[u8],
) -> Result<(), Error> {
/// Create md5 password hash given a salt.
pub fn md5_hash_password(user: &str, password: &str, salt: &[u8]) -> Vec<u8> {
let mut md5 = Md5::new();
// First pass
@@ -186,6 +212,22 @@ pub async fn md5_password(
.collect::<Vec<u8>>();
password.push(0);
password
}
/// Send password challenge response to the server.
/// This is the MD5 challenge.
pub async fn md5_password<S>(
stream: &mut S,
user: &str,
password: &str,
salt: &[u8],
) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
let password = md5_hash_password(user, password, salt);
let mut message = BytesMut::with_capacity(password.len() as usize + 5);
message.put_u8(b'p');
@@ -198,10 +240,10 @@ pub async fn md5_password(
/// Implements a response to our custom `SET SHARDING KEY`
/// and `SET SERVER ROLE` commands.
/// This tells the client we're ready for the next query.
pub async fn custom_protocol_response_ok(
stream: &mut OwnedWriteHalf,
message: &str,
) -> Result<(), Error> {
pub async fn custom_protocol_response_ok<S>(stream: &mut S, message: &str) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
let mut res = BytesMut::with_capacity(25);
let set_complete = BytesMut::from(&format!("{}\0", message)[..]);
@@ -212,18 +254,28 @@ pub async fn custom_protocol_response_ok(
res.put_i32(len);
res.put_slice(&set_complete[..]);
// ReadyForQuery (idle)
res.put_u8(b'Z');
res.put_i32(5);
res.put_u8(b'I');
write_all_half(stream, res).await
write_all_half(stream, res).await?;
ready_for_query(stream).await
}
/// Send a custom error message to the client.
/// Tell the client we are ready for the next query and no rollback is necessary.
/// Docs on error codes: <https://www.postgresql.org/docs/12/errcodes-appendix.html>.
pub async fn error_response(stream: &mut OwnedWriteHalf, message: &str) -> Result<(), Error> {
pub async fn error_response<S>(stream: &mut S, message: &str) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
error_response_terminal(stream, message).await?;
ready_for_query(stream).await
}
/// Send a custom error message to the client.
/// Unlike `error_response`, this does not send ReadyForQuery afterwards.
/// Docs on error codes: <https://www.postgresql.org/docs/12/errcodes-appendix.html>.
pub async fn error_response_terminal<S>(stream: &mut S, message: &str) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
let mut error = BytesMut::new();
// Error level
@@ -245,31 +297,57 @@ pub async fn error_response(stream: &mut OwnedWriteHalf, message: &str) -> Resul
// No more fields follow.
error.put_u8(0);
// Ready for query, no rollback needed (I = idle).
let mut ready_for_query = BytesMut::new();
// Compose the two message reply.
let mut res = BytesMut::with_capacity(error.len() + 5);
ready_for_query.put_u8(b'Z');
ready_for_query.put_i32(5);
ready_for_query.put_u8(b'I');
res.put_u8(b'E');
res.put_i32(error.len() as i32 + 4);
res.put(error);
Ok(write_all_half(stream, res).await?)
}
pub async fn wrong_password<S>(stream: &mut S, user: &str) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
let mut error = BytesMut::new();
// Error level
error.put_u8(b'S');
error.put_slice(&b"FATAL\0"[..]);
// Error level (non-translatable)
error.put_u8(b'V');
error.put_slice(&b"FATAL\0"[..]);
// Error code: not sure how much this matters.
error.put_u8(b'C');
error.put_slice(&b"28P01\0"[..]); // invalid_password, see Appendix A.
// The short error message.
error.put_u8(b'M');
error.put_slice(&format!("password authentication failed for user \"{}\"\0", user).as_bytes());
// No more fields follow.
error.put_u8(0);
// Compose the two message reply.
let mut res = BytesMut::with_capacity(error.len() + ready_for_query.len() + 5);
let mut res = BytesMut::new();
res.put_u8(b'E');
res.put_i32(error.len() as i32 + 4);
res.put(error);
res.put(ready_for_query);
Ok(write_all_half(stream, res).await?)
write_all(stream, res).await
}
/// Respond to a SHOW SHARD command.
pub async fn show_response(
stream: &mut OwnedWriteHalf,
name: &str,
value: &str,
) -> Result<(), Error> {
pub async fn show_response<S>(stream: &mut S, name: &str, value: &str) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
// A SELECT response consists of:
// 1. RowDescription
// 2. One or more DataRow
@@ -288,12 +366,8 @@ pub async fn show_response(
// CommandComplete
res.put(command_complete("SELECT 1"));
// ReadyForQuery
res.put_u8(b'Z');
res.put_i32(5);
res.put_u8(b'I');
write_all_half(stream, res).await
write_all_half(stream, res).await?;
ready_for_query(stream).await
}
pub fn row_description(columns: &Vec<(&str, DataType)>) -> BytesMut {
@@ -370,7 +444,10 @@ pub fn command_complete(command: &str) -> BytesMut {
}
/// Write all data in the buffer to the TcpStream.
pub async fn write_all(stream: &mut TcpStream, buf: BytesMut) -> Result<(), Error> {
pub async fn write_all<S>(stream: &mut S, buf: BytesMut) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
match stream.write_all(&buf).await {
Ok(_) => Ok(()),
Err(_) => return Err(Error::SocketError),
@@ -378,7 +455,10 @@ pub async fn write_all(stream: &mut TcpStream, buf: BytesMut) -> Result<(), Erro
}
/// Write all the data in the buffer to the TcpStream, write owned half (see mpsc).
pub async fn write_all_half(stream: &mut OwnedWriteHalf, buf: BytesMut) -> Result<(), Error> {
pub async fn write_all_half<S>(stream: &mut S, buf: BytesMut) -> Result<(), Error>
where
S: tokio::io::AsyncWrite + std::marker::Unpin,
{
match stream.write_all(&buf).await {
Ok(_) => Ok(()),
Err(_) => return Err(Error::SocketError),
@@ -386,7 +466,10 @@ pub async fn write_all_half(stream: &mut OwnedWriteHalf, buf: BytesMut) -> Resul
}
/// Read a complete message from the socket.
pub async fn read_message(stream: &mut BufReader<OwnedReadHalf>) -> Result<BytesMut, Error> {
pub async fn read_message<S>(stream: &mut S) -> Result<BytesMut, Error>
where
S: tokio::io::AsyncRead + std::marker::Unpin,
{
let code = match stream.read_u8().await {
Ok(code) => code,
Err(_) => return Err(Error::SocketError),
@@ -412,3 +495,20 @@ pub async fn read_message(stream: &mut BufReader<OwnedReadHalf>) -> Result<Bytes
Ok(bytes)
}
pub fn server_paramater_message(key: &str, value: &str) -> BytesMut {
let mut server_info = BytesMut::new();
let null_byte_size = 1;
let len: usize =
mem::size_of::<i32>() + key.len() + null_byte_size + value.len() + null_byte_size;
server_info.put_slice("S".as_bytes());
server_info.put_i32(len.try_into().unwrap());
server_info.put_slice(key.as_bytes());
server_info.put_bytes(0, 1);
server_info.put_slice(value.as_bytes());
server_info.put_bytes(0, 1);
return server_info;
}
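
The helpers in this file all build PostgreSQL wire messages the same way: a one-byte type code, a big-endian `Int32` length that counts itself but not the type byte, then the payload. The sketch below shows that layout for ReadyForQuery and ParameterStatus using a plain `Vec<u8>` instead of `BytesMut`, which is an assumption made so the example needs no external crates. `ready_for_query_idle` and `parameter_status` are hypothetical names for illustration.

```rust
/// ReadyForQuery: 'Z', Int32 length (4 for the length field + 1 status byte),
/// then the transaction status ('I' = idle).
pub fn ready_for_query_idle() -> Vec<u8> {
    let mut buf = Vec::with_capacity(6);
    buf.push(b'Z');
    buf.extend_from_slice(&5i32.to_be_bytes());
    buf.push(b'I');
    buf
}

/// ParameterStatus: 'S', Int32 length, then the key and the value,
/// each terminated by a null byte.
pub fn parameter_status(key: &str, value: &str) -> Vec<u8> {
    // Length covers the length field itself plus both null-terminated strings.
    let len = 4 + key.len() + 1 + value.len() + 1;

    let mut buf = Vec::with_capacity(1 + len);
    buf.push(b'S');
    buf.extend_from_slice(&(len as i32).to_be_bytes());
    buf.extend_from_slice(key.as_bytes());
    buf.push(0);
    buf.extend_from_slice(value.as_bytes());
    buf.push(0);
    buf
}
```

The full ReadyForQuery message is six bytes, which is why the capacity in `ready_for_query` above is now computed from the field sizes rather than hard-coded as 5.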

View File

@@ -1,119 +1,195 @@
/// Pooling, failover and banlist.
use arc_swap::ArcSwap;
use async_trait::async_trait;
use bb8::{ManageConnection, Pool, PooledConnection};
use bytes::BytesMut;
use chrono::naive::NaiveDateTime;
use log::{debug, error, info, warn};
use once_cell::sync::Lazy;
use parking_lot::{Mutex, RwLock};
use std::collections::HashMap;
use std::sync::Arc;
use std::time::Instant;
use crate::config::{get_config, Address, Role, User};
use crate::config::{get_config, Address, Role, Shard, User};
use crate::errors::Error;
use crate::server::Server;
use crate::stats::Reporter;
use crate::stats::{get_reporter, Reporter};
pub type BanList = Arc<RwLock<Vec<HashMap<Address, NaiveDateTime>>>>;
pub type ClientServerMap = Arc<Mutex<HashMap<(i32, i32), (i32, i32, String, String)>>>;
pub type PoolMap = HashMap<(String, String), ConnectionPool>;
/// The connection pool, globally available.
/// This is atomic and safe and read-optimized.
/// The pool is recreated dynamically when the config is reloaded.
pub static POOLS: Lazy<ArcSwap<PoolMap>> = Lazy::new(|| ArcSwap::from_pointee(HashMap::default()));
#[derive(Clone, Debug)]
pub struct PoolSettings {
pub pool_mode: String,
pub shards: HashMap<String, Shard>,
pub user: User,
pub default_role: String,
pub query_parser_enabled: bool,
pub primary_reads_enabled: bool,
pub sharding_function: String,
}
impl Default for PoolSettings {
fn default() -> PoolSettings {
PoolSettings {
pool_mode: String::from("transaction"),
shards: HashMap::from([(String::from("1"), Shard::default())]),
user: User::default(),
default_role: String::from("any"),
query_parser_enabled: false,
primary_reads_enabled: true,
sharding_function: "pg_bigint_hash".to_string(),
}
}
}
/// The globally accessible connection pool.
#[derive(Clone, Debug)]
#[derive(Clone, Debug, Default)]
pub struct ConnectionPool {
/// The pools handled internally by bb8.
databases: Vec<Vec<Pool<ServerPool>>>,
/// The addresses (host, port, role) to handle
/// failover and load balancing deterministically.
addresses: Vec<Vec<Address>>,
round_robin: usize,
/// List of banned addresses (see above)
/// that should not be queried.
banlist: BanList,
/// The statistics aggregator runs in a separate task
/// and receives stats from clients, servers, and the pool.
stats: Reporter,
/// The server information (K messages) has to be passed to the
/// clients on startup. We pre-connect to all shards and replicas
/// on pool creation and save the K messages here.
server_info: BytesMut,
pub settings: PoolSettings,
}
impl ConnectionPool {
/// Construct the connection pool from the configuration.
pub async fn from_config(
client_server_map: ClientServerMap,
stats: Reporter,
) -> ConnectionPool {
pub async fn from_config(client_server_map: ClientServerMap) -> Result<(), Error> {
let config = get_config();
let mut shards = Vec::new();
let mut addresses = Vec::new();
let mut banlist = Vec::new();
let mut new_pools = PoolMap::default();
let mut address_id = 0;
let mut shard_ids = config
.shards
.clone()
.into_keys()
.map(|x| x.to_string())
.collect::<Vec<String>>();
shard_ids.sort_by_key(|k| k.parse::<i64>().unwrap());
for (pool_name, pool_config) in &config.pools {
for (_user_index, user_info) in &pool_config.users {
let mut shards = Vec::new();
let mut addresses = Vec::new();
let mut banlist = Vec::new();
let mut shard_ids = pool_config
.shards
.clone()
.into_keys()
.map(|x| x.to_string())
.collect::<Vec<String>>();
for shard_idx in shard_ids {
let shard = &config.shards[&shard_idx];
let mut pools = Vec::new();
let mut servers = Vec::new();
let mut replica_number = 0;
// Sort by shard number to ensure consistency.
shard_ids.sort_by_key(|k| k.parse::<i64>().unwrap());
for server in shard.servers.iter() {
let role = match server.2.as_ref() {
"primary" => Role::Primary,
"replica" => Role::Replica,
_ => {
error!("Config error: server role can be 'primary' or 'replica', have: '{}'. Defaulting to 'replica'.", server.2);
Role::Replica
for shard_idx in shard_ids {
let shard = &pool_config.shards[&shard_idx];
let mut pools = Vec::new();
let mut servers = Vec::new();
let mut replica_number = 0;
for server in shard.servers.iter() {
let role = match server.2.as_ref() {
"primary" => Role::Primary,
"replica" => Role::Replica,
_ => {
error!("Config error: server role can be 'primary' or 'replica', have: '{}'. Defaulting to 'replica'.", server.2);
Role::Replica
}
};
let address = Address {
id: address_id,
database: pool_name.clone(),
host: server.0.clone(),
port: server.1.to_string(),
role: role,
replica_number,
shard: shard_idx.parse::<usize>().unwrap(),
};
address_id += 1;
if role == Role::Replica {
replica_number += 1;
}
let manager = ServerPool::new(
address.clone(),
user_info.clone(),
&shard.database,
client_server_map.clone(),
get_reporter(),
);
let pool = Pool::builder()
.max_size(user_info.pool_size)
.connection_timeout(std::time::Duration::from_millis(
config.general.connect_timeout,
))
.test_on_check_out(false)
.build(manager)
.await
.unwrap();
pools.push(pool);
servers.push(address);
}
};
let address = Address {
id: address_id,
host: server.0.clone(),
port: server.1.to_string(),
role: role,
replica_number,
shard: shard_idx.parse::<usize>().unwrap(),
};
address_id += 1;
if role == Role::Replica {
replica_number += 1;
shards.push(pools);
addresses.push(servers);
banlist.push(HashMap::new());
}
let manager = ServerPool::new(
address.clone(),
config.user.clone(),
&shard.database,
client_server_map.clone(),
stats.clone(),
);
assert_eq!(shards.len(), addresses.len());
let pool = Pool::builder()
.max_size(config.general.pool_size)
.connection_timeout(std::time::Duration::from_millis(
config.general.connect_timeout,
))
.test_on_check_out(false)
.build(manager)
.await
.unwrap();
let mut pool = ConnectionPool {
databases: shards,
addresses: addresses,
banlist: Arc::new(RwLock::new(banlist)),
stats: get_reporter(),
server_info: BytesMut::new(),
settings: PoolSettings {
pool_mode: pool_config.pool_mode.clone(),
shards: pool_config.shards.clone(),
user: user_info.clone(),
default_role: pool_config.default_role.clone(),
query_parser_enabled: pool_config.query_parser_enabled.clone(),
primary_reads_enabled: pool_config.primary_reads_enabled,
sharding_function: pool_config.sharding_function.clone(),
},
};
pools.push(pool);
servers.push(address);
// Connect to the servers to make sure pool configuration is valid
// before setting it globally.
match pool.validate().await {
Ok(_) => (),
Err(err) => {
error!("Could not validate connection pool: {:?}", err);
return Err(err);
}
};
new_pools.insert((pool_name.clone(), user_info.username.clone()), pool);
}
shards.push(pools);
addresses.push(servers);
banlist.push(HashMap::new());
}
assert_eq!(shards.len(), addresses.len());
let address_len = addresses.len();
POOLS.store(Arc::new(new_pools.clone()));
ConnectionPool {
databases: shards,
addresses: addresses,
round_robin: rand::random::<usize>() % address_len, // Start at a random replica
banlist: Arc::new(RwLock::new(banlist)),
stats: stats,
}
Ok(())
}
/// Connect to all shards and grab server information.
@@ -121,16 +197,18 @@ impl ConnectionPool {
/// when they connect.
/// This also warms up the pool for clients that connect when
/// the pooler starts up.
pub async fn validate(&mut self) -> Result<BytesMut, Error> {
async fn validate(&mut self) -> Result<(), Error> {
let mut server_infos = Vec::new();
let stats = self.stats.clone();
for shard in 0..self.shards() {
let mut round_robin = 0;
for _ in 0..self.servers(shard) {
// To keep stats consistent.
let fake_process_id = 0;
let connection = match self.get(shard, None, fake_process_id).await {
let connection = match self.get(shard, None, fake_process_id, round_robin).await {
Ok(conn) => conn,
Err(err) => {
error!("Shard {} down or misconfigured: {:?}", shard, err);
@@ -138,10 +216,9 @@ impl ConnectionPool {
}
};
let mut proxy = connection.0;
let proxy = connection.0;
let address = connection.1;
let server = &mut *proxy;
let server = &*proxy;
let server_info = server.server_info();
stats.client_disconnecting(fake_process_id, address.id);
@@ -157,6 +234,7 @@ impl ConnectionPool {
}
server_infos.push(server_info);
round_robin += 1;
}
}
@@ -166,15 +244,18 @@ impl ConnectionPool {
return Err(Error::AllServersDown);
}
Ok(server_infos[0].clone())
self.server_info = server_infos[0].clone();
Ok(())
}
/// Get a connection from the pool.
pub async fn get(
&mut self,
shard: usize,
role: Option<Role>,
process_id: i32,
&self,
shard: usize, // shard number
role: Option<Role>, // primary or replica
process_id: i32, // client id
mut round_robin: usize, // round robin offset
) -> Result<(PooledConnection<'_, ServerPool>, Address), Error> {
let now = Instant::now();
let addresses = &self.addresses[shard];
@@ -202,15 +283,16 @@ impl ConnectionPool {
return Err(Error::BadConfig);
}
let healthcheck_timeout = get_config().general.healthcheck_timeout;
let healthcheck_delay = get_config().general.healthcheck_delay as u128;
while allowed_attempts > 0 {
// Round-robin replicas.
self.round_robin += 1;
round_robin += 1;
let index = self.round_robin % addresses.len();
let index = round_robin % addresses.len();
let address = &addresses[index];
self.stats.client_waiting(process_id, address.id);
// Make sure you're getting a primary or a replica
// as per request. If no specific role is requested, the first
// available will be chosen.
@@ -220,16 +302,20 @@ impl ConnectionPool {
allowed_attempts -= 1;
// Don't attempt to connect to banned servers.
if self.is_banned(address, shard, role) {
continue;
}
// Indicate we're waiting on a server connection from a pool.
self.stats.client_waiting(process_id, address.id);
// Check if we can connect
let mut conn = match self.databases[shard][index].get().await {
Ok(conn) => conn,
Err(err) => {
error!("Banning replica {}, error: {:?}", index, err);
self.ban(address, shard);
self.ban(address, shard, process_id);
self.stats.client_disconnecting(process_id, address.id);
self.stats
.checkout_time(now.elapsed().as_micros(), process_id, address.id);
@@ -239,13 +325,25 @@ impl ConnectionPool {
// Check if this server is alive with a health check.
let server = &mut *conn;
let healthcheck_timeout = get_config().general.healthcheck_timeout;
// `elapsed()` returns an error only if the timestamp is later than the current system time, which should never happen.
let require_healthcheck =
server.last_activity().elapsed().unwrap().as_millis() > healthcheck_delay;
if !require_healthcheck {
self.stats
.checkout_time(now.elapsed().as_micros(), process_id, address.id);
self.stats.server_idle(conn.process_id(), address.id);
return Ok((conn, address.clone()));
}
debug!("Running health check on server {:?}", address);
self.stats.server_tested(server.process_id(), address.id);
match tokio::time::timeout(
tokio::time::Duration::from_millis(healthcheck_timeout),
server.query("SELECT 1"),
server.query(";"),
)
.await
{
@@ -265,10 +363,7 @@ impl ConnectionPool {
// Don't leave a bad connection in the pool.
server.mark_bad();
self.ban(address, shard);
self.stats.client_disconnecting(process_id, address.id);
self.stats
.checkout_time(now.elapsed().as_micros(), process_id, address.id);
self.ban(address, shard, process_id);
continue;
}
},
@@ -279,10 +374,7 @@ impl ConnectionPool {
// Don't leave a bad connection in the pool.
server.mark_bad();
self.ban(address, shard);
self.stats.client_disconnecting(process_id, address.id);
self.stats
.checkout_time(now.elapsed().as_micros(), process_id, address.id);
self.ban(address, shard, process_id);
continue;
}
}
@@ -294,7 +386,11 @@ impl ConnectionPool {
/// Ban an address (i.e. replica). It will no longer serve
/// traffic for any new transactions. Existing transactions on that replica
/// will finish successfully or error out to the clients.
pub fn ban(&self, address: &Address, shard: usize) {
pub fn ban(&self, address: &Address, shard: usize, process_id: i32) {
self.stats.client_disconnecting(process_id, address.id);
self.stats
.checkout_time(Instant::now().elapsed().as_micros(), process_id, address.id);
error!("Banning {:?}", address);
let now = chrono::offset::Utc::now().naive_utc();
let mut guard = self.banlist.write();
@@ -389,6 +485,10 @@ impl ConnectionPool {
pub fn address(&self, shard: usize, server: usize) -> &Address {
&self.addresses[shard][server]
}
pub fn server_info(&self) -> BytesMut {
self.server_info.clone()
}
}
/// Wrapper for the bb8 connection pool.
@@ -428,7 +528,7 @@ impl ManageConnection for ServerPool {
info!(
"Creating a new connection to {:?} using user {:?}",
self.address.name(),
self.user.name
self.user.username
);
// Put a temporary process_id into the stats
@@ -469,3 +569,22 @@ impl ManageConnection for ServerPool {
conn.is_bad()
}
}
/// Get the connection pool
pub fn get_pool(db: String, user: String) -> Option<ConnectionPool> {
match get_all_pools().get(&(db, user)) {
Some(pool) => Some(pool.clone()),
None => None,
}
}
pub fn get_number_of_addresses() -> usize {
get_all_pools()
.iter()
.map(|(_, pool)| pool.databases())
.sum()
}
pub fn get_all_pools() -> HashMap<(String, String), ConnectionPool> {
return (*(*POOLS.load())).clone();
}
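
The `get` changes above pass the round-robin offset in as a parameter, skip banned servers, and only run a health check when a connection has been idle longer than `healthcheck_delay`. A minimal sketch of those two decisions follows, assuming a plain slice of ban flags instead of the real ban list and ignoring role filtering; `next_server` and `needs_healthcheck` are hypothetical names.

```rust
use std::time::Duration;

/// Advance the caller-supplied offset and return the first server index
/// that is not banned, trying each server at most once.
/// Hypothetical helper; the real code also filters by role.
pub fn next_server(round_robin: usize, banned: &[bool]) -> Option<usize> {
    let n = banned.len();
    if n == 0 {
        return None;
    }
    let mut offset = round_robin;
    for _ in 0..n {
        // Increment first, then take the modulo, as in the loop above.
        offset = offset.wrapping_add(1);
        let index = offset % n;
        if !banned[index] {
            return Some(index);
        }
    }
    None
}

/// A health check is needed only when the connection has been idle
/// for strictly longer than the configured delay (in milliseconds).
pub fn needs_healthcheck(idle: Duration, healthcheck_delay_ms: u64) -> bool {
    idle.as_millis() > healthcheck_delay_ms as u128
}
```

Keeping the offset outside the pool is what lets `get` take `&self` instead of `&mut self`: each caller carries its own counter.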

212
src/prometheus.rs Normal file

@@ -0,0 +1,212 @@
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Method, Request, Response, Server, StatusCode};
use log::{error, info, warn};
use phf::phf_map;
use std::collections::HashMap;
use std::fmt;
use std::net::SocketAddr;
use crate::config::Address;
use crate::pool::get_all_pools;
use crate::stats::get_stats;
pub const HTTP_PORT: usize = 9930;
struct MetricHelpType {
help: &'static str,
ty: &'static str,
}
// reference for metric types: https://prometheus.io/docs/concepts/metric_types/
// counters only increase
// gauges can arbitrarily increase or decrease
static METRIC_HELP_AND_TYPES_LOOKUP: phf::Map<&'static str, MetricHelpType> = phf_map! {
"total_query_count" => MetricHelpType {
help: "Number of queries sent by all clients",
ty: "counter",
},
"total_query_time" => MetricHelpType {
help: "Total amount of time for queries to execute",
ty: "counter",
},
"total_received" => MetricHelpType {
help: "Number of bytes received from the server",
ty: "counter",
},
"total_sent" => MetricHelpType {
help: "Number of bytes sent to the server",
ty: "counter",
},
"total_xact_count" => MetricHelpType {
help: "Total number of transactions started by the client",
ty: "counter",
},
"total_xact_time" => MetricHelpType {
help: "Total amount of time for all transactions to execute",
ty: "counter",
},
"total_wait_time" => MetricHelpType {
help: "Total time client waited for a server connection",
ty: "counter",
},
"avg_query_count" => MetricHelpType {
help: "Average of total_query_count every 15 seconds",
ty: "gauge",
},
"avg_query_time" => MetricHelpType {
help: "Average time taken for queries to execute every 15 seconds",
ty: "gauge",
},
"avg_recv" => MetricHelpType {
help: "Average of total_received bytes every 15 seconds",
ty: "gauge",
},
"avg_sent" => MetricHelpType {
help: "Average of total_sent bytes every 15 seconds",
ty: "gauge",
},
"avg_xact_count" => MetricHelpType {
help: "Average of total_xact_count every 15 seconds",
ty: "gauge",
},
"avg_xact_time" => MetricHelpType {
help: "Average of total_xact_time every 15 seconds",
ty: "gauge",
},
"avg_wait_time" => MetricHelpType {
help: "Average of total_wait_time every 15 seconds",
ty: "gauge",
},
"maxwait_us" => MetricHelpType {
help: "The time a client waited for a server connection in microseconds",
ty: "gauge",
},
"maxwait" => MetricHelpType {
help: "The time a client waited for a server connection in seconds",
ty: "gauge",
},
"cl_waiting" => MetricHelpType {
help: "How many clients are waiting for a connection from the pool",
ty: "gauge",
},
"cl_active" => MetricHelpType {
help: "How many clients are actively communicating with a server",
ty: "gauge",
},
"cl_idle" => MetricHelpType {
help: "How many clients are idle",
ty: "gauge",
},
"sv_idle" => MetricHelpType {
help: "How many server connections are idle",
ty: "gauge",
},
"sv_active" => MetricHelpType {
help: "How many server connections are actively communicating with a client",
ty: "gauge",
},
"sv_login" => MetricHelpType {
help: "How many server connections are currently being created",
ty: "gauge",
},
"sv_tested" => MetricHelpType {
help: "How many server connections are currently waiting on a health check to succeed",
ty: "gauge",
},
};
struct PrometheusMetric {
name: String,
help: String,
ty: String,
labels: HashMap<&'static str, String>,
value: i64,
}
impl fmt::Display for PrometheusMetric {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let formatted_labels = self
.labels
.iter()
.map(|(key, value)| format!("{}=\"{}\"", key, value))
.collect::<Vec<_>>()
.join(",");
write!(
f,
"# HELP {name} {help}\n# TYPE {name} {ty}\n{name}{{{formatted_labels}}} {value}\n",
name = format_args!("pgcat_{}", self.name),
help = self.help,
ty = self.ty,
formatted_labels = formatted_labels,
value = self.value
)
}
}
impl PrometheusMetric {
fn new(address: &Address, name: &str, value: i64) -> Option<PrometheusMetric> {
let mut labels = HashMap::new();
labels.insert("host", address.host.clone());
labels.insert("shard", address.shard.to_string());
labels.insert("role", address.role.to_string());
labels.insert("database", address.database.to_string());
METRIC_HELP_AND_TYPES_LOOKUP
.get(name)
.map(|metric| PrometheusMetric {
name: name.to_owned(),
help: metric.help.to_owned(),
ty: metric.ty.to_owned(),
labels,
value,
})
}
}
async fn prometheus_stats(request: Request<Body>) -> Result<Response<Body>, hyper::http::Error> {
match (request.method(), request.uri().path()) {
(&Method::GET, "/metrics") => {
let stats = get_stats();
let mut lines = Vec::new();
for (_, pool) in get_all_pools() {
for shard in 0..pool.shards() {
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
if let Some(address_stats) = stats.get(&address.id) {
for (key, value) in address_stats.iter() {
if let Some(prometheus_metric) =
PrometheusMetric::new(address, key, *value)
{
lines.push(prometheus_metric.to_string());
} else {
warn!("Metric {} not implemented for {}", key, address.name());
}
}
}
}
}
}
Response::builder()
.header("content-type", "text/plain; version=0.0.4")
.body(lines.join("\n").into())
}
_ => Response::builder()
.status(StatusCode::NOT_FOUND)
.body("".into()),
}
}
pub async fn start_metric_server(http_addr: SocketAddr) {
let http_service_factory =
make_service_fn(|_conn| async { Ok::<_, hyper::Error>(service_fn(prometheus_stats)) });
let server = Server::bind(&http_addr.into()).serve(http_service_factory);
info!(
"Exposing prometheus metrics on http://{}/metrics.",
http_addr
);
if let Err(e) = server.await {
error!("Failed to run HTTP server: {}.", e);
}
}
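
The `Display` implementation above renders each metric as a `# HELP` line, a `# TYPE` line, and one sample line with labels. Here is a standard-library sketch of that rendering, assuming a `BTreeMap` for labels so the output order is deterministic (the code above uses a `HashMap`, whose iteration order is unspecified); `format_metric` is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Render one metric in the Prometheus text exposition format.
/// Hypothetical helper; the `pgcat_` prefix mirrors the code above.
pub fn format_metric(
    name: &str,
    help: &str,
    ty: &str,
    labels: &BTreeMap<&str, String>,
    value: i64,
) -> String {
    // Labels are rendered as key="value" pairs joined by commas.
    let formatted_labels = labels
        .iter()
        .map(|(key, value)| format!("{}=\"{}\"", key, value))
        .collect::<Vec<_>>()
        .join(",");
    let full_name = format!("pgcat_{}", name);

    // `{{` and `}}` are escaped braces, so the labels end up inside `{...}`.
    format!(
        "# HELP {name} {help}\n# TYPE {name} {ty}\n{name}{{{labels}}} {value}\n",
        name = full_name,
        help = help,
        ty = ty,
        labels = formatted_labels,
        value = value
    )
}
```

Label values are written as-is in this sketch; values containing quotes, backslashes, or newlines would need escaping before being exposed.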


@@ -8,16 +8,19 @@ use sqlparser::ast::Statement::{Query, StartTransaction};
use sqlparser::dialect::PostgreSqlDialect;
use sqlparser::parser::Parser;
use crate::config::{get_config, Role};
use crate::config::Role;
use crate::pool::PoolSettings;
use crate::sharding::{Sharder, ShardingFunction};
/// Regexes used to parse custom commands.
const CUSTOM_SQL_REGEXES: [&str; 5] = [
const CUSTOM_SQL_REGEXES: [&str; 7] = [
r"(?i)^ *SET SHARDING KEY TO '?([0-9]+)'? *;? *$",
r"(?i)^ *SET SHARD TO '?([0-9]+|ANY)'? *;? *$",
r"(?i)^ *SHOW SHARD *;? *$",
r"(?i)^ *SET SERVER ROLE TO '(PRIMARY|REPLICA|ANY|AUTO|DEFAULT)' *;? *$",
r"(?i)^ *SHOW SERVER ROLE *;? *$",
r"(?i)^ *SET PRIMARY READS TO '?(on|off|default)'? *;? *$",
r"(?i)^ *SHOW PRIMARY READS *;? *$",
];
/// Custom commands.
@@ -28,6 +31,8 @@ pub enum Command {
ShowShard,
SetServerRole,
ShowServerRole,
SetPrimaryReads,
ShowPrimaryReads,
}
/// Quickly test for match when a query is received.
@@ -38,27 +43,19 @@ static CUSTOM_SQL_REGEX_LIST: OnceCell<Vec<Regex>> = OnceCell::new();
/// The query router.
pub struct QueryRouter {
/// By default, queries go here, unless we have better information
/// about what the client wants.
default_server_role: Option<Role>,
/// Number of shards in the cluster.
shards: usize,
/// Which shard we should be talking to right now.
active_shard: Option<usize>,
/// Which server should we be talking to.
active_role: Option<Role>,
/// Should we try to parse queries to route them to replicas or primary automatically
query_parser_enabled: bool,
/// Include the primary into the replica pool for reads.
primary_reads_enabled: bool,
/// Should we try to parse queries to route them to replicas or primary automatically.
query_parser_enabled: bool,
/// Which sharding function we're using.
sharding_function: ShardingFunction,
pool_settings: PoolSettings,
}
impl QueryRouter {
@@ -95,33 +92,19 @@ impl QueryRouter {
/// Create a new instance of the query router. Each client gets its own.
pub fn new() -> QueryRouter {
let config = get_config();
let default_server_role = match config.query_router.default_role.as_ref() {
"any" => None,
"primary" => Some(Role::Primary),
"replica" => Some(Role::Replica),
_ => unreachable!(),
};
let sharding_function = match config.query_router.sharding_function.as_ref() {
"pg_bigint_hash" => ShardingFunction::PgBigintHash,
"sha1" => ShardingFunction::Sha1,
_ => unreachable!(),
};
QueryRouter {
default_server_role: default_server_role,
shards: config.shards.len(),
active_role: default_server_role,
active_shard: None,
primary_reads_enabled: config.query_router.primary_reads_enabled,
query_parser_enabled: config.query_router.query_parser_enabled,
sharding_function,
active_role: None,
query_parser_enabled: false,
primary_reads_enabled: false,
pool_settings: PoolSettings::default(),
}
}
pub fn update_pool_settings(&mut self, pool_settings: PoolSettings) {
self.pool_settings = pool_settings;
}
/// Try to parse a command and execute it.
pub fn try_execute_command(&mut self, mut buf: BytesMut) -> Option<(Command, String)> {
let code = buf.get_u8() as char;
@@ -146,21 +129,42 @@ impl QueryRouter {
let matches: Vec<_> = regex_set.matches(&query).into_iter().collect();
// This is not a custom query, try to infer which
// server it'll go to if the query parser is enabled.
if matches.len() != 1 {
debug!("Regular query, not a command");
return None;
}
let sharding_function = match self.pool_settings.sharding_function.as_ref() {
"pg_bigint_hash" => ShardingFunction::PgBigintHash,
"sha1" => ShardingFunction::Sha1,
_ => unreachable!(),
};
let default_server_role = match self.pool_settings.default_role.as_ref() {
"any" => None,
"primary" => Some(Role::Primary),
"replica" => Some(Role::Replica),
_ => unreachable!(),
};
let command = match matches[0] {
0 => Command::SetShardingKey,
1 => Command::SetShard,
2 => Command::ShowShard,
3 => Command::SetServerRole,
4 => Command::ShowServerRole,
5 => Command::SetPrimaryReads,
6 => Command::ShowPrimaryReads,
_ => unreachable!(),
};
let mut value = match command {
Command::SetShardingKey | Command::SetShard | Command::SetServerRole => {
Command::SetShardingKey
| Command::SetShard
| Command::SetServerRole
| Command::SetPrimaryReads => {
// Capture value. I know this re-runs the regex engine, but I haven't
// figured out a better way just yet. I think I can write a single Regex
// that matches all of the custom SQL patterns, but maybe that's not very legible?
@@ -187,11 +191,16 @@ impl QueryRouter {
}
}
},
Command::ShowPrimaryReads => match self.primary_reads_enabled {
true => String::from("on"),
false => String::from("off"),
},
};
match command {
Command::SetShardingKey => {
let sharder = Sharder::new(self.shards, self.sharding_function);
let sharder = Sharder::new(self.pool_settings.shards.len(), sharding_function);
let shard = sharder.shard(value.parse::<i64>().unwrap());
self.active_shard = Some(shard);
value = shard.to_string();
@@ -199,7 +208,7 @@ impl QueryRouter {
Command::SetShard => {
self.active_shard = match value.to_ascii_uppercase().as_ref() {
"ANY" => Some(rand::random::<usize>() % self.shards),
"ANY" => Some(rand::random::<usize>() % self.pool_settings.shards.len()),
_ => Some(value.parse::<usize>().unwrap()),
};
}
@@ -227,8 +236,8 @@ impl QueryRouter {
}
"default" => {
self.active_role = self.default_server_role;
self.query_parser_enabled = get_config().query_router.query_parser_enabled;
self.active_role = default_server_role;
self.query_parser_enabled = self.query_parser_enabled;
self.active_role
}
@@ -236,6 +245,19 @@ impl QueryRouter {
};
}
Command::SetPrimaryReads => {
if value == "on" {
debug!("Setting primary reads to on");
self.primary_reads_enabled = true;
} else if value == "off" {
debug!("Setting primary reads to off");
self.primary_reads_enabled = false;
} else if value == "default" {
debug!("Setting primary reads to default");
self.primary_reads_enabled = self.pool_settings.primary_reads_enabled;
}
}
_ => (),
}
@@ -244,6 +266,8 @@ impl QueryRouter {
/// Try to infer which server to connect to based on the contents of the query.
pub fn infer_role(&mut self, mut buf: BytesMut) -> bool {
debug!("Inferring role");
let code = buf.get_u8() as char;
let len = buf.get_i32() as usize;
@@ -330,27 +354,21 @@ impl QueryRouter {
}
}
/// Reset the router back to defaults.
/// This must be called at the end of every transaction in transaction mode.
pub fn _reset(&mut self) {
self.active_role = self.default_server_role;
self.active_shard = None;
pub fn set_shard(&mut self, shard: usize) {
self.active_shard = Some(shard);
}
/// Should we attempt to parse queries?
#[allow(dead_code)]
pub fn query_parser_enabled(&self) -> bool {
self.query_parser_enabled
}
/// Allows toggling primary reads in tests.
#[allow(dead_code)]
pub fn toggle_primary_reads(&mut self, value: bool) {
self.primary_reads_enabled = value;
}
}
#[cfg(test)]
mod test {
use std::collections::HashMap;
use super::*;
use crate::messages::simple_query;
use bytes::BufMut;
@@ -369,7 +387,8 @@ mod test {
let mut qr = QueryRouter::new();
assert!(qr.try_execute_command(simple_query("SET SERVER ROLE TO 'auto'")) != None);
assert_eq!(qr.query_parser_enabled(), true);
qr.toggle_primary_reads(false);
assert!(qr.try_execute_command(simple_query("SET PRIMARY READS TO off")) != None);
let queries = vec![
simple_query("SELECT * FROM items WHERE id = 5"),
@@ -410,7 +429,7 @@ mod test {
QueryRouter::setup();
let mut qr = QueryRouter::new();
let query = simple_query("SELECT * FROM items WHERE id = 5");
qr.toggle_primary_reads(true);
assert!(qr.try_execute_command(simple_query("SET PRIMARY READS TO on")) != None);
assert!(qr.infer_role(query));
assert_eq!(qr.role(), None);
@@ -421,7 +440,7 @@ mod test {
QueryRouter::setup();
let mut qr = QueryRouter::new();
qr.try_execute_command(simple_query("SET SERVER ROLE TO 'auto'"));
qr.toggle_primary_reads(false);
assert!(qr.try_execute_command(simple_query("SET PRIMARY READS TO off")) != None);
let prepared_stmt = BytesMut::from(
&b"WITH t AS (SELECT * FROM items WHERE name = $1) SELECT * FROM t WHERE id = $2\0"[..],
@@ -450,6 +469,10 @@ mod test {
"SET SERVER ROLE TO 'any'",
"SET SERVER ROLE TO 'auto'",
"SHOW SERVER ROLE",
"SET PRIMARY READS TO 'on'",
"SET PRIMARY READS TO 'off'",
"SET PRIMARY READS TO 'default'",
"SHOW PRIMARY READS",
// Lower case
"set sharding key to '1'",
"set shard to '1'",
@@ -459,9 +482,13 @@ mod test {
"set server role to 'any'",
"set server role to 'auto'",
"show server role",
"set primary reads to 'on'",
"set primary reads to 'OFF'",
"set primary reads to 'deFaUlt'",
// No quotes
"SET SHARDING KEY TO 11235",
"SET SHARD TO 15",
"SET PRIMARY READS TO off",
// Spaces and semicolon
" SET SHARDING KEY TO 11235 ; ",
" SET SHARD TO 15; ",
@@ -469,18 +496,23 @@ mod test {
" SET SERVER ROLE TO 'primary'; ",
" SET SERVER ROLE TO 'primary' ; ",
" SET SERVER ROLE TO 'primary' ;",
" SET PRIMARY READS TO 'off' ;",
];
// Which regexes it'll match to in the list
let matches = [
0, 1, 2, 3, 3, 3, 3, 4, 0, 1, 2, 3, 3, 3, 3, 4, 0, 1, 0, 1, 0, 3, 3, 3,
0, 1, 2, 3, 3, 3, 3, 4, 5, 5, 5, 6, 0, 1, 2, 3, 3, 3, 3, 4, 5, 5, 5, 0, 1, 5, 0, 1, 0,
3, 3, 3, 5,
];
let list = CUSTOM_SQL_REGEX_LIST.get().unwrap();
let set = CUSTOM_SQL_REGEX_SET.get().unwrap();
for (i, test) in tests.iter().enumerate() {
assert!(list[matches[i]].is_match(test));
if !list[matches[i]].is_match(test) {
println!("{} does not match {}", test, list[matches[i]]);
assert!(false);
}
assert_eq!(set.matches(test).into_iter().collect::<Vec<_>>().len(), 1);
}
@@ -503,9 +535,9 @@ mod test {
let query = simple_query("SET SHARDING KEY TO 13");
assert_eq!(
qr.try_execute_command(query),
Some((Command::SetShardingKey, String::from("1")))
Some((Command::SetShardingKey, String::from("0")))
);
assert_eq!(qr.shard(), 1);
assert_eq!(qr.shard(), 0);
// SetShard
let query = simple_query("SET SHARD TO '1'");
@@ -549,6 +581,26 @@ mod test {
Some((Command::ShowServerRole, String::from(*role)))
);
}
let primary_reads = ["on", "off", "default"];
let primary_reads_enabled = ["on", "off", "on"];
for (idx, primary_reads) in primary_reads.iter().enumerate() {
assert_eq!(
qr.try_execute_command(simple_query(&format!(
"SET PRIMARY READS TO {}",
primary_reads
))),
Some((Command::SetPrimaryReads, String::from(*primary_reads)))
);
assert_eq!(
qr.try_execute_command(simple_query("SHOW PRIMARY READS")),
Some((
Command::ShowPrimaryReads,
String::from(primary_reads_enabled[idx])
))
);
}
}
#[test]
@@ -556,7 +608,7 @@ mod test {
QueryRouter::setup();
let mut qr = QueryRouter::new();
let query = simple_query("SET SERVER ROLE TO 'auto'");
qr.toggle_primary_reads(false);
assert!(qr.try_execute_command(simple_query("SET PRIMARY READS TO off")) != None);
assert!(qr.try_execute_command(query) != None);
assert!(qr.query_parser_enabled());
@@ -573,6 +625,45 @@ mod test {
assert!(qr.query_parser_enabled());
let query = simple_query("SET SERVER ROLE TO 'default'");
assert!(qr.try_execute_command(query) != None);
assert!(!qr.query_parser_enabled());
assert!(qr.query_parser_enabled());
}
#[test]
fn test_update_from_pool_settings() {
QueryRouter::setup();
let pool_settings = PoolSettings {
pool_mode: "transaction".to_string(),
shards: HashMap::default(),
user: crate::config::User::default(),
default_role: Role::Replica.to_string(),
query_parser_enabled: true,
primary_reads_enabled: false,
sharding_function: "pg_bigint_hash".to_string(),
};
let mut qr = QueryRouter::new();
assert_eq!(qr.active_role, None);
assert_eq!(qr.active_shard, None);
assert_eq!(qr.query_parser_enabled, false);
assert_eq!(qr.primary_reads_enabled, false);
// Internal state must not be changed due to this, only defaults
qr.update_pool_settings(pool_settings.clone());
assert_eq!(qr.active_role, None);
assert_eq!(qr.active_shard, None);
assert_eq!(qr.query_parser_enabled, false);
assert_eq!(qr.primary_reads_enabled, false);
let q1 = simple_query("SET SERVER ROLE TO 'primary'");
assert!(qr.try_execute_command(q1) != None);
assert_eq!(qr.active_role.unwrap(), Role::Primary);
let q2 = simple_query("SET SERVER ROLE TO 'default'");
assert!(qr.try_execute_command(q2) != None);
assert_eq!(
qr.active_role.unwrap().to_string(),
pool_settings.clone().default_role
);
}
}
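
The new `SET PRIMARY READS TO ...` command accepts `on`, `off`, or `default`, with optional single quotes, surrounding spaces, and one trailing semicolon; `default` falls back to the pool's setting. The sketch below mimics that behavior with plain string operations instead of the `regex` crate used above, so it is an approximation and not the router's actual matcher; `PrimaryReads`, `parse_primary_reads`, and `apply_primary_reads` are hypothetical names.

```rust
/// Parsed value of a `SET PRIMARY READS TO ...` command (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum PrimaryReads {
    On,
    Off,
    Default,
}

/// Parse the command case-insensitively, allowing optional quotes,
/// surrounding spaces, and a single trailing semicolon.
pub fn parse_primary_reads(query: &str) -> Option<PrimaryReads> {
    let q = query.trim_matches(' ');
    let q = q.strip_suffix(';').unwrap_or(q).trim_end_matches(' ');
    let lower = q.to_ascii_lowercase();
    let rest = lower.strip_prefix("set primary reads to ")?;
    // Each quote is optional on its own, like `'?` in the pattern above.
    let rest = rest.strip_prefix('\'').unwrap_or(rest);
    let rest = rest.strip_suffix('\'').unwrap_or(rest);
    match rest {
        "on" => Some(PrimaryReads::On),
        "off" => Some(PrimaryReads::Off),
        "default" => Some(PrimaryReads::Default),
        _ => None,
    }
}

/// Resolve the setting: `Default` falls back to the pool-level value.
pub fn apply_primary_reads(setting: PrimaryReads, pool_default: bool) -> bool {
    match setting {
        PrimaryReads::On => true,
        PrimaryReads::Off => false,
        PrimaryReads::Default => pool_default,
    }
}
```

Splitting parsing from resolution keeps the per-client state separate from the pool defaults, which is the same separation `update_pool_settings` relies on.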

320
src/scram.rs Normal file
View File

@@ -0,0 +1,320 @@
// SCRAM-SHA-256 authentication. Heavily inspired by
// https://github.com/sfackler/rust-postgres/
// SASL implementation.
use bytes::BytesMut;
use hmac::{Hmac, Mac};
use rand::{self, Rng};
use sha2::digest::FixedOutput;
use sha2::{Digest, Sha256};
use std::fmt::Write;
use crate::constants::*;
use crate::errors::Error;
/// Normalize a password string. Postgres
/// passwords don't have to be UTF-8.
fn normalize(pass: &[u8]) -> Vec<u8> {
let pass = match std::str::from_utf8(pass) {
Ok(pass) => pass,
Err(_) => return pass.to_vec(),
};
match stringprep::saslprep(pass) {
Ok(pass) => pass.into_owned().into_bytes(),
Err(_) => pass.as_bytes().to_vec(),
}
}
/// Keep the SASL state through the exchange.
/// It takes 3 messages to complete the authentication.
pub struct ScramSha256 {
password: String,
salted_password: [u8; 32],
auth_message: String,
message: BytesMut,
nonce: String,
}
impl ScramSha256 {
/// Create the Scram state from a password. It'll automatically
/// generate a nonce.
pub fn new(password: &str) -> ScramSha256 {
let mut rng = rand::thread_rng();
let nonce = (0..NONCE_LENGTH)
.map(|_| {
let mut v = rng.gen_range(0x21u8..0x7e);
if v == 0x2c {
v = 0x7e
}
v as char
})
.collect::<String>();
Self::from_nonce(password, &nonce)
}
/// Used for testing.
pub fn from_nonce(password: &str, nonce: &str) -> ScramSha256 {
let message = BytesMut::from(&format!("{}n=,r={}", "n,,", nonce).as_bytes()[..]);
ScramSha256 {
password: password.to_string(),
nonce: String::from(nonce),
message,
salted_password: [0u8; 32],
auth_message: String::new(),
}
}
/// Get the current state of the SASL authentication.
pub fn message(&mut self) -> BytesMut {
self.message.clone()
}
/// Update the state with message received from server.
pub fn update(&mut self, message: &BytesMut) -> Result<BytesMut, Error> {
let server_message = Message::parse(message)?;
if !server_message.nonce.starts_with(&self.nonce) {
return Err(Error::ProtocolSyncError);
}
let salt = match base64::decode(&server_message.salt) {
Ok(salt) => salt,
Err(_) => return Err(Error::ProtocolSyncError),
};
let salted_password = Self::hi(
&normalize(&self.password.as_bytes()[..]),
&salt,
server_message.iterations,
);
// Save for verification of final server message.
self.salted_password = salted_password;
let mut hmac = match Hmac::<Sha256>::new_from_slice(&salted_password) {
Ok(hmac) => hmac,
Err(_) => return Err(Error::ServerError),
};
hmac.update(b"Client Key");
let client_key = hmac.finalize().into_bytes();
let mut hash = Sha256::default();
hash.update(client_key.as_slice());
let stored_key = hash.finalize_fixed();
let mut cbind_input = vec![];
cbind_input.extend("n,,".as_bytes());
let cbind_input = base64::encode(&cbind_input);
self.message.clear();
// Start writing the client reply.
match write!(
&mut self.message,
"c={},r={}",
cbind_input, server_message.nonce
) {
Ok(_) => (),
Err(_) => return Err(Error::ServerError),
};
let auth_message = format!(
"n=,r={},{},{}",
self.nonce,
String::from_utf8_lossy(&message[..]),
String::from_utf8_lossy(&self.message[..])
);
let mut hmac = match Hmac::<Sha256>::new_from_slice(&stored_key) {
Ok(hmac) => hmac,
Err(_) => return Err(Error::ServerError),
};
hmac.update(auth_message.as_bytes());
// Save the auth message for server final message verification.
self.auth_message = auth_message;
let client_signature = hmac.finalize().into_bytes();
// Sign the client proof.
let mut client_proof = client_key;
for (proof, signature) in client_proof.iter_mut().zip(client_signature) {
*proof ^= signature;
}
match write!(&mut self.message, ",p={}", base64::encode(&*client_proof)) {
Ok(_) => (),
Err(_) => return Err(Error::ServerError),
};
Ok(self.message.clone())
}
/// Verify final server message.
pub fn finish(&mut self, message: &BytesMut) -> Result<(), Error> {
let final_message = FinalMessage::parse(message)?;
let verifier = match base64::decode(&final_message.value) {
Ok(verifier) => verifier,
Err(_) => return Err(Error::ProtocolSyncError),
};
let mut hmac = match Hmac::<Sha256>::new_from_slice(&self.salted_password) {
Ok(hmac) => hmac,
Err(_) => return Err(Error::ServerError),
};
hmac.update(b"Server Key");
let server_key = hmac.finalize().into_bytes();
let mut hmac = match Hmac::<Sha256>::new_from_slice(&server_key) {
Ok(hmac) => hmac,
Err(_) => return Err(Error::ServerError),
};
hmac.update(self.auth_message.as_bytes());
match hmac.verify_slice(&verifier) {
Ok(_) => Ok(()),
Err(_) => return Err(Error::ServerError),
}
}
/// Hash the password with the salt `i` times (the SCRAM `Hi` function).
fn hi(str: &[u8], salt: &[u8], i: u32) -> [u8; 32] {
let mut hmac =
Hmac::<Sha256>::new_from_slice(str).expect("HMAC is able to accept all key sizes");
hmac.update(salt);
hmac.update(&[0, 0, 0, 1]);
let mut prev = hmac.finalize().into_bytes();
let mut hi = prev;
for _ in 1..i {
let mut hmac = Hmac::<Sha256>::new_from_slice(str).expect("already checked above");
hmac.update(&prev);
prev = hmac.finalize().into_bytes();
for (hi, prev) in hi.iter_mut().zip(prev) {
*hi ^= prev;
}
}
hi.into()
}
}
/// Parse the server challenge.
struct Message {
nonce: String,
salt: String,
iterations: u32,
}
impl Message {
/// Parse the server SASL challenge.
fn parse(message: &BytesMut) -> Result<Message, Error> {
let parts = String::from_utf8_lossy(&message[..])
.split(",")
.map(|s| s.to_string())
.collect::<Vec<String>>();
if parts.len() != 3 {
return Err(Error::ProtocolSyncError);
}
let nonce = str::replace(&parts[0], "r=", "");
let salt = str::replace(&parts[1], "s=", "");
let iterations = match str::replace(&parts[2], "i=", "").parse::<u32>() {
Ok(iterations) => iterations,
Err(_) => return Err(Error::ProtocolSyncError),
};
Ok(Message {
nonce,
salt,
iterations,
})
}
}
/// Parse server final validation message.
struct FinalMessage {
value: String,
}
impl FinalMessage {
/// Parse the server final validation message.
pub fn parse(message: &BytesMut) -> Result<FinalMessage, Error> {
if !message.starts_with(b"v=") || message.len() < 4 {
return Err(Error::ProtocolSyncError);
}
Ok(FinalMessage {
value: String::from_utf8_lossy(&message[2..]).to_string(),
})
}
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn parse_server_first_message() {
let message = BytesMut::from(
&"r=fyko+d2lbbFgONRv9qkxdawL3rfcNHYJY1ZVvWVs7j,s=QSXCR+Q6sek8bf92,i=4096".as_bytes()[..],
);
let message = Message::parse(&message).unwrap();
assert_eq!(message.nonce, "fyko+d2lbbFgONRv9qkxdawL3rfcNHYJY1ZVvWVs7j");
assert_eq!(message.salt, "QSXCR+Q6sek8bf92");
assert_eq!(message.iterations, 4096);
}
#[test]
fn parse_server_last_message() {
let f = FinalMessage::parse(&BytesMut::from(
&"v=U+ppxD5XUKtradnv8e2MkeupiA8FU87Sg8CXzXHDAzw".as_bytes()[..],
))
.unwrap();
assert_eq!(
f.value,
"U+ppxD5XUKtradnv8e2MkeupiA8FU87Sg8CXzXHDAzw".to_string()
);
}
// recorded auth exchange from psql
#[test]
fn exchange() {
let password = "foobar";
let nonce = "9IZ2O01zb9IgiIZ1WJ/zgpJB";
let client_first = "n,,n=,r=9IZ2O01zb9IgiIZ1WJ/zgpJB";
let server_first =
"r=9IZ2O01zb9IgiIZ1WJ/zgpJBjx/oIRLs02gGSHcw1KEty3eY,s=fs3IXBy7U7+IvVjZ,i\
=4096";
let client_final =
"c=biws,r=9IZ2O01zb9IgiIZ1WJ/zgpJBjx/oIRLs02gGSHcw1KEty3eY,p=AmNKosjJzS3\
1NTlQYNs5BTeQjdHdk7lOflDo5re2an8=";
let server_final = "v=U+ppxD5XUKtradnv8e2MkeupiA8FU87Sg8CXzXHDAzw=";
let mut scram = ScramSha256::from_nonce(password, nonce);
let message = scram.message();
assert_eq!(std::str::from_utf8(&message).unwrap(), client_first);
let result = scram
.update(&BytesMut::from(&server_first.as_bytes()[..]))
.unwrap();
assert_eq!(std::str::from_utf8(&result).unwrap(), client_final);
scram
.finish(&BytesMut::from(&server_final.as_bytes()[..]))
.unwrap();
}
}
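
`Message::parse` above strips the `r=`, `s=` and `i=` markers with `str::replace`, which removes every occurrence of the pattern in a field rather than only the leading one. A client nonce or base64 salt that happens to contain `r=` or `s=` would therefore be altered. The following is a minimal, std-only sketch of a prefix-based parser. `ServerFirst` and `parse_server_first` are hypothetical names that do not appear in the diff, and the sketch returns `Option` instead of the crate's `Error` type so that it stays self-contained.

```rust
/// Hypothetical stand-in for the `Message` struct in the diff.
#[derive(Debug, PartialEq)]
pub struct ServerFirst {
    pub nonce: String,
    pub salt: String,
    pub iterations: u32,
}

/// Parse `r=<nonce>,s=<salt>,i=<iterations>`.
/// Each attribute marker is removed only when it is a prefix of its field,
/// so the same characters inside the value are left untouched.
pub fn parse_server_first(message: &str) -> Option<ServerFirst> {
    let mut parts = message.split(',');

    let nonce = parts.next()?.strip_prefix("r=")?;
    let salt = parts.next()?.strip_prefix("s=")?;
    let iterations = parts.next()?.strip_prefix("i=")?.parse::<u32>().ok()?;

    // Like the original, reject anything that is not exactly three fields.
    if parts.next().is_some() {
        return None;
    }

    Some(ServerFirst {
        nonce: nonce.to_string(),
        salt: salt.to_string(),
        iterations,
    })
}
```

Calling `parse_server_first` on the server-first string from the `exchange` test yields the same nonce, salt and iteration count as the existing parser; the two differ only on inputs where a marker also occurs inside a value.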

@@ -2,6 +2,7 @@
/// Here we are pretending to be a Postgres client.
use bytes::{Buf, BufMut, BytesMut};
use log::{debug, error, info, trace};
use std::time::SystemTime;
use tokio::io::{AsyncReadExt, BufReader};
use tokio::net::{
tcp::{OwnedReadHalf, OwnedWriteHalf},
@@ -12,6 +13,7 @@ use crate::config::{Address, User};
use crate::constants::*;
use crate::errors::Error;
use crate::messages::*;
use crate::scram::ScramSha256;
use crate::stats::Reporter;
use crate::ClientServerMap;
@@ -54,6 +56,12 @@ pub struct Server {
/// Reports various metrics, e.g. data sent & received.
stats: Reporter,
/// Application name using the server at the moment.
application_name: String,
// Last time that a successful server send or response happened
last_activity: SystemTime,
}
impl Server {
@@ -78,7 +86,7 @@ impl Server {
trace!("Sending StartupMessage");
// StartupMessage
startup(&mut stream, &user.name, database).await?;
startup(&mut stream, &user.username, database).await?;
let mut server_info = BytesMut::new();
let mut process_id: i32 = 0;
@@ -86,6 +94,8 @@ impl Server {
// We'll be handling multiple packets, but they will all be structured the same.
// We'll loop here until this exchange is complete.
let mut scram = ScramSha256::new(&user.password);
loop {
let code = match stream.read_u8().await {
Ok(code) => code as char,
@@ -121,12 +131,97 @@ impl Server {
Err(_) => return Err(Error::SocketError),
};
md5_password(&mut stream, &user.name, &user.password, &salt[..])
md5_password(&mut stream, &user.username, &user.password, &salt[..])
.await?;
}
AUTHENTICATION_SUCCESSFUL => (),
SASL => {
debug!("Starting SASL authentication");
let sasl_len = (len - 8) as usize;
let mut sasl_auth = vec![0u8; sasl_len];
match stream.read_exact(&mut sasl_auth).await {
Ok(_) => (),
Err(_) => return Err(Error::SocketError),
};
let sasl_type = String::from_utf8_lossy(&sasl_auth[..sasl_len - 2]);
if sasl_type == SCRAM_SHA_256 {
debug!("Using {}", SCRAM_SHA_256);
// Generate client message.
let sasl_response = scram.message();
// SASLInitialResponse (F)
let mut res = BytesMut::new();
res.put_u8(b'p');
// length + String length + length + length of sasl response
res.put_i32(
4 // i32 size
+ SCRAM_SHA_256.len() as i32 // length of SASL version string,
+ 1 // Null terminator for the SASL version string,
+ 4 // i32 size
+ sasl_response.len() as i32, // length of SASL response
);
res.put_slice(&format!("{}\0", SCRAM_SHA_256).as_bytes()[..]);
res.put_i32(sasl_response.len() as i32);
res.put(sasl_response);
write_all(&mut stream, res).await?;
} else {
error!("Unsupported SCRAM version: {}", sasl_type);
return Err(Error::ServerError);
}
}
SASL_CONTINUE => {
trace!("Continuing SASL");
let mut sasl_data = vec![0u8; (len - 8) as usize];
match stream.read_exact(&mut sasl_data).await {
Ok(_) => (),
Err(_) => return Err(Error::SocketError),
};
let msg = BytesMut::from(&sasl_data[..]);
let sasl_response = scram.update(&msg)?;
// SASLResponse
let mut res = BytesMut::new();
res.put_u8(b'p');
res.put_i32(4 + sasl_response.len() as i32);
res.put(sasl_response);
write_all(&mut stream, res).await?;
}
SASL_FINAL => {
trace!("Final SASL");
let mut sasl_final = vec![0u8; len as usize - 8];
match stream.read_exact(&mut sasl_final).await {
Ok(_) => (),
Err(_) => return Err(Error::SocketError),
};
match scram.finish(&BytesMut::from(&sasl_final[..])) {
Ok(_) => {
debug!("SASL authentication successful");
}
Err(err) => {
debug!("SASL authentication failed");
return Err(err);
}
};
}
_ => {
error!("Unsupported authentication mechanism: {}", auth_code);
return Err(Error::ServerError);
@@ -210,7 +305,7 @@ impl Server {
let (read, write) = stream.into_split();
return Ok(Server {
let mut server = Server {
address: address.clone(),
read: BufReader::new(read),
write: write,
@@ -224,7 +319,13 @@ impl Server {
client_server_map: client_server_map,
connected_at: chrono::offset::Utc::now().naive_utc(),
stats: stats,
});
application_name: String::new(),
last_activity: SystemTime::now(),
};
server.set_name("pgcat").await?;
return Ok(server);
}
// We have an unexpected message from the server during this exchange.
@@ -270,7 +371,11 @@ impl Server {
.data_sent(messages.len(), self.process_id, self.address.id);
match write_all_half(&mut self.write, messages).await {
Ok(_) => Ok(()),
Ok(_) => {
// Successfully sent to server
self.last_activity = SystemTime::now();
Ok(())
}
Err(err) => {
error!("Terminating server because of: {:?}", err);
self.bad = true;
@@ -317,7 +422,7 @@ impl Server {
self.in_transaction = false;
}
// Some error occured, the transaction was rolled back.
// Some error occurred, the transaction was rolled back.
'E' => {
self.in_transaction = true;
}
@@ -378,6 +483,9 @@ impl Server {
// Clear the buffer for next query.
self.buffer.clear();
// Successfully received data from server
self.last_activity = SystemTime::now();
// Pass the data back to the client.
Ok(bytes)
}
@@ -448,9 +556,14 @@ impl Server {
/// A shorthand for `SET application_name = $1`.
#[allow(dead_code)]
pub async fn set_name(&mut self, name: &str) -> Result<(), Error> {
Ok(self
.query(&format!("SET application_name = '{}'", name))
.await?)
if self.application_name != name {
self.application_name = name.to_string();
Ok(self
.query(&format!("SET application_name = '{}'", name))
.await?)
} else {
Ok(())
}
}
/// Get the server's address.
@@ -463,6 +576,11 @@ impl Server {
pub fn process_id(&self) -> i32 {
self.process_id
}
// Get server's latest response timestamp
pub fn last_activity(&self) -> SystemTime {
self.last_activity
}
}
impl Drop for Server {
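
`last_activity` is refreshed after every successful send and receive, and the new getter exposes it to callers. The commit list describes this as part of a configurable health check delay. The pool-side code is not shown in this diff, so the following is only a sketch of how such a timestamp might gate a health check. `needs_healthcheck` and its `delay` parameter are hypothetical.

```rust
use std::time::{Duration, SystemTime};

/// Decide whether a server should be health-checked before reuse.
/// A server that was active within `delay` is assumed to still be healthy.
pub fn needs_healthcheck(last_activity: SystemTime, now: SystemTime, delay: Duration) -> bool {
    match now.duration_since(last_activity) {
        // Idle for longer than the configured delay: run the check.
        Ok(idle) => idle > delay,
        // `SystemTime` is not monotonic; if the clock moved backwards,
        // err on the side of running the check.
        Err(_) => true,
    }
}
```

With a 30 second delay, a server used 5 seconds ago skips the check and one idle for 31 seconds does not. Because `SystemTime` can jump when the wall clock is adjusted, `std::time::Instant` may be a better fit where only elapsed time matters.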

@@ -1,9 +1,15 @@
use arc_swap::ArcSwap;
/// Statistics and reporting.
use log::info;
use once_cell::sync::Lazy;
use parking_lot::Mutex;
use std::collections::HashMap;
use tokio::sync::mpsc::{Receiver, Sender};
use tokio::sync::mpsc::{channel, Receiver, Sender};
use crate::pool::get_number_of_addresses;
pub static REPORTER: Lazy<ArcSwap<Reporter>> =
Lazy::new(|| ArcSwap::from_pointee(Reporter::default()));
/// Latest stats updated every second; used in SHOW STATS and other admin commands.
static LATEST_STATS: Lazy<Mutex<HashMap<usize, HashMap<String, i64>>>> =
@@ -60,6 +66,13 @@ pub struct Reporter {
tx: Sender<Event>,
}
impl Default for Reporter {
fn default() -> Reporter {
let (tx, _rx) = channel(5);
Reporter { tx }
}
}
impl Reporter {
/// Create a new Reporter instance.
pub fn new(tx: Sender<Event>) -> Reporter {
@@ -274,22 +287,23 @@ impl Collector {
/// The statistics collection handler. It will collect statistics
/// for `address_id`s starting at 0 up to `addresses`.
pub async fn collect(&mut self, addresses: usize) {
pub async fn collect(&mut self) {
info!("Events reporter started");
let stats_template = HashMap::from([
("total_query_count", 0),
("total_xact_count", 0),
("total_sent", 0),
("total_received", 0),
("total_xact_time", 0),
("total_query_time", 0),
("total_received", 0),
("total_sent", 0),
("total_xact_count", 0),
("total_xact_time", 0),
("total_wait_time", 0),
("avg_xact_time", 0),
("avg_query_count", 0),
("avg_query_time", 0),
("avg_xact_count", 0),
("avg_recv", 0),
("avg_sent", 0),
("avg_received", 0),
("avg_xact_count", 0),
("avg_xact_time", 0),
("avg_wait_time", 0),
("maxwait_us", 0),
("maxwait", 0),
@@ -318,7 +332,8 @@ impl Collector {
tokio::time::interval(tokio::time::Duration::from_millis(STAT_PERIOD / 15));
loop {
interval.tick().await;
for address_id in 0..addresses {
let address_count = get_number_of_addresses();
for address_id in 0..address_count {
let _ = tx.try_send(Event {
name: EventName::UpdateStats,
value: 0,
@@ -335,7 +350,8 @@ impl Collector {
tokio::time::interval(tokio::time::Duration::from_millis(STAT_PERIOD));
loop {
interval.tick().await;
for address_id in 0..addresses {
let address_count = get_number_of_addresses();
for address_id in 0..address_count {
let _ = tx.try_send(Event {
name: EventName::UpdateAverages,
value: 0,
@@ -491,12 +507,18 @@ impl Collector {
// Calculate averages
for stat in &[
"avg_query_count",
"avgxact_count",
"avg_query_time",
"avg_recv",
"avg_sent",
"avg_received",
"avg_xact_time",
"avg_xact_count",
"avg_wait_time",
] {
let total_name = stat.replace("avg_", "total_");
let total_name = match stat {
&"avg_recv" => "total_received".to_string(), // PgBouncer names the average `avg_recv` but the total `total_received`
_ => stat.replace("avg_", "total_"),
};
let old_value = old_stats.entry(total_name.clone()).or_insert(0);
let new_value = stats.get(total_name.as_str()).unwrap_or(&0).to_owned();
let avg = (new_value - *old_value) / (STAT_PERIOD as i64 / 1_000); // Avg / second
@@ -515,3 +537,8 @@ impl Collector {
pub fn get_stats() -> HashMap<usize, HashMap<String, i64>> {
LATEST_STATS.lock().clone()
}
/// Get the statistics reporter used to update stats across the pools/clients.
pub fn get_reporter() -> Reporter {
(*(*REPORTER.load())).clone()
}
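
The averages loop above maps each `avg_*` stat to its `total_*` counterpart, with `avg_recv` special-cased to `total_received`, and divides the change since the previous period by the period length in seconds. A small std-only sketch of that calculation follows. `total_name`, `per_second_average` and the 15 second `STAT_PERIOD_MS` are assumptions made for illustration; the hunk shows neither the constant's value nor how the previous totals are stored back.

```rust
use std::collections::HashMap;

/// Assumed reporting period in milliseconds (not shown in the diff).
const STAT_PERIOD_MS: i64 = 15_000;

/// Map an average stat name to the total it is derived from.
pub fn total_name(avg_stat: &str) -> String {
    match avg_stat {
        // The average and the total use different suffixes for received bytes.
        "avg_recv" => "total_received".to_string(),
        _ => avg_stat.replace("avg_", "total_"),
    }
}

/// Compute the per-second average for one stat and remember the current
/// total so the next period measures only the new delta.
pub fn per_second_average(
    avg_stat: &str,
    stats: &HashMap<String, i64>,
    old_stats: &mut HashMap<String, i64>,
) -> i64 {
    let total = total_name(avg_stat);
    let new_value = stats.get(&total).copied().unwrap_or(0);
    let old_value = old_stats.entry(total).or_insert(0);
    let avg = (new_value - *old_value) / (STAT_PERIOD_MS / 1_000);
    *old_value = new_value;
    avg
}
```

Integer division truncates, so a delta smaller than the period length in seconds reports an average of zero.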

src/tls.rs

@@ -0,0 +1,57 @@
// Stream wrapper.
use rustls_pemfile::{certs, rsa_private_keys};
use std::path::Path;
use std::sync::Arc;
use tokio_rustls::rustls::{self, Certificate, PrivateKey};
use tokio_rustls::TlsAcceptor;
use crate::config::get_config;
use crate::errors::Error;
// TLS
pub fn load_certs(path: &Path) -> std::io::Result<Vec<Certificate>> {
certs(&mut std::io::BufReader::new(std::fs::File::open(path)?))
.map_err(|_| std::io::Error::new(std::io::ErrorKind::InvalidInput, "invalid cert"))
.map(|mut certs| certs.drain(..).map(Certificate).collect())
}
pub fn load_keys(path: &Path) -> std::io::Result<Vec<PrivateKey>> {
rsa_private_keys(&mut std::io::BufReader::new(std::fs::File::open(path)?))
.map_err(|_| std::io::Error::new(std::io::ErrorKind::InvalidInput, "invalid key"))
.map(|mut keys| keys.drain(..).map(PrivateKey).collect())
}
pub struct Tls {
pub acceptor: TlsAcceptor,
}
impl Tls {
pub fn new() -> Result<Self, Error> {
let config = get_config();
let certs = match load_certs(&Path::new(&config.general.tls_certificate.unwrap())) {
Ok(certs) => certs,
Err(_) => return Err(Error::TlsError),
};
let mut keys = match load_keys(&Path::new(&config.general.tls_private_key.unwrap())) {
Ok(keys) => keys,
Err(_) => return Err(Error::TlsError),
};
let config = match rustls::ServerConfig::builder()
.with_safe_defaults()
.with_no_client_auth()
.with_single_cert(certs, keys.remove(0))
.map_err(|err| std::io::Error::new(std::io::ErrorKind::InvalidInput, err))
{
Ok(c) => c,
Err(_) => return Err(Error::TlsError),
};
Ok(Tls {
acceptor: TlsAcceptor::from(Arc::new(config)),
})
}
}
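
`Tls::new` calls `unwrap()` on both configured paths and `keys.remove(0)` on the parsed keys, so a missing setting or a key file from which no keys were parsed would panic rather than return `Error::TlsError`. A hedged, std-only sketch of a non-panicking alternative is below. `TlsSetupError`, `required_path` and `first_key` are hypothetical helpers, and the `Option<String>` path type is an assumption about the config struct.

```rust
/// Hypothetical error type standing in for `Error::TlsError`.
#[derive(Debug, PartialEq)]
pub enum TlsSetupError {
    MissingPath,
    NoPrivateKey,
}

/// Turn an optional config value into an error instead of unwrapping it.
pub fn required_path(path: Option<String>) -> Result<String, TlsSetupError> {
    path.ok_or(TlsSetupError::MissingPath)
}

/// Take the first parsed key, or report that none were found.
/// `Vec::remove(0)` would panic on an empty vector.
pub fn first_key<K>(keys: Vec<K>) -> Result<K, TlsSetupError> {
    keys.into_iter().next().ok_or(TlsSetupError::NoPrivateKey)
}
```

Both helpers return a `Result`, so a caller can propagate failures with `?` in the same style as the `match` blocks already used for `load_certs` and `load_keys`.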

@@ -12,6 +12,8 @@
SET SHARD TO :shard;
SET SERVER ROLE TO 'auto';
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
@@ -26,3 +28,12 @@ INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :a
END;
SET SHARDING KEY TO :aid;
-- Read load balancing
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
SET SERVER ROLE TO 'replica';
-- Read load balancing
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

@@ -1 +1,2 @@
psycopg2==2.9.3
psutil==5.9.1

@@ -1,11 +1,167 @@
from typing import Tuple
import psycopg2
import psutil
import os
import signal
import time
conn = psycopg2.connect("postgres://random:password@127.0.0.1:6432/db")
cur = conn.cursor()
SHUTDOWN_TIMEOUT = 5
cur.execute("SELECT 1");
res = cur.fetchall()
PGCAT_HOST = "127.0.0.1"
PGCAT_PORT = "6432"
print(res)
# conn.commit()
def pgcat_start():
pg_cat_send_signal(signal.SIGTERM)
os.system("./target/debug/pgcat .circleci/pgcat.toml &")
def pg_cat_send_signal(signal: signal.Signals):
for proc in psutil.process_iter(["pid", "name"]):
if "pgcat" == proc.name():
os.kill(proc.pid, signal)
if signal == signal.SIGTERM:
# Give pgcat time to exit; pgrep returns 0 if a pgcat process still exists
time.sleep(2)
if not os.system('pgrep pgcat'):
raise Exception("pgcat not closed after SIGTERM")
def connect_normal_db(
autocommit: bool = False,
) -> Tuple[psycopg2.extensions.connection, psycopg2.extensions.cursor]:
conn = psycopg2.connect(
f"postgres://sharding_user:sharding_user@{PGCAT_HOST}:{PGCAT_PORT}/sharded_db?application_name=testing_pgcat"
)
conn.autocommit = autocommit
cur = conn.cursor()
return (conn, cur)
def cleanup_conn(conn: psycopg2.extensions.connection, cur: psycopg2.extensions.cursor):
cur.close()
conn.close()
def test_normal_db_access():
conn, cur = connect_normal_db()
cur.execute("SELECT 1")
res = cur.fetchall()
print(res)
cleanup_conn(conn, cur)
def test_admin_db_access():
conn = psycopg2.connect(
f"postgres://admin_user:admin_pass@{PGCAT_HOST}:{PGCAT_PORT}/pgcat"
)
conn.autocommit = True # BEGIN/COMMIT is not supported by admin db
cur = conn.cursor()
cur.execute("SHOW POOLS")
res = cur.fetchall()
print(res)
cleanup_conn(conn, cur)
def test_shutdown_logic():
##### NO ACTIVE QUERIES SIGINT HANDLING #####
# Start pgcat
pgcat_start()
# Wait for server to fully start up
time.sleep(2)
# Create client connection and send query (not in transaction)
conn, cur = connect_normal_db(True)
cur.execute("BEGIN;")
cur.execute("SELECT 1;")
cur.execute("COMMIT;")
# Send sigint to pgcat
pg_cat_send_signal(signal.SIGINT)
time.sleep(1)
# Check that any new queries fail after sigint since server should close with no active transactions
try:
cur.execute("SELECT 1;")
except psycopg2.OperationalError as e:
pass
else:
# Fail if query execution succeeded
raise Exception("Server not closed after sigint")
cleanup_conn(conn, cur)
pg_cat_send_signal(signal.SIGTERM)
##### END #####
##### HANDLE TRANSACTION WITH SIGINT #####
# Start pgcat
pgcat_start()
# Wait for server to fully start up
time.sleep(2)
# Create client connection and begin transaction
conn, cur = connect_normal_db(True)
cur.execute("BEGIN;")
cur.execute("SELECT 1;")
# Send sigint to pgcat while still in transaction
pg_cat_send_signal(signal.SIGINT)
time.sleep(1)
# Check that any new queries succeed after sigint since server should still allow transaction to complete
try:
cur.execute("SELECT 1;")
except psycopg2.OperationalError as e:
# Fail if query fails since server closed
raise Exception("Server closed while in transaction", e.pgerror)
cleanup_conn(conn, cur)
pg_cat_send_signal(signal.SIGTERM)
##### END #####
##### HANDLE SHUTDOWN TIMEOUT WITH SIGINT #####
# Start pgcat
pgcat_start()
# Wait for server to fully start up
time.sleep(3)
# Create client connection and begin transaction, which should prevent server shutdown unless shutdown timeout is reached
conn, cur = connect_normal_db(True)
cur.execute("BEGIN;")
cur.execute("SELECT 1;")
# Send sigint to pgcat while still in transaction
pg_cat_send_signal(signal.SIGINT)
# pgcat shutdown timeout is set to SHUTDOWN_TIMEOUT seconds, so we sleep for SHUTDOWN_TIMEOUT + 1 seconds
time.sleep(SHUTDOWN_TIMEOUT + 1)
# Check that any new queries fail after sigint since the shutdown timeout should have forced the server to close
try:
cur.execute("SELECT 1;")
except psycopg2.OperationalError as e:
pass
else:
# Fail if query execution succeeded
raise Exception("Server not closed after sigint and expected timeout")
cleanup_conn(conn, cur)
pg_cat_send_signal(signal.SIGTERM)
##### END #####
test_normal_db_access()
test_admin_db_access()
test_shutdown_logic()

@@ -1 +1,2 @@
2.7.1
3.0.0

@@ -3,3 +3,4 @@ source "https://rubygems.org"
gem "pg"
gem "activerecord"
gem "rubocop"
gem "toml", "~> 0.3.0"

@@ -1,24 +1,25 @@
GEM
remote: https://rubygems.org/
specs:
activemodel (7.0.2.2)
activesupport (= 7.0.2.2)
activerecord (7.0.2.2)
activemodel (= 7.0.2.2)
activesupport (= 7.0.2.2)
activesupport (7.0.2.2)
activemodel (7.0.3.1)
activesupport (= 7.0.3.1)
activerecord (7.0.3.1)
activemodel (= 7.0.3.1)
activesupport (= 7.0.3.1)
activesupport (7.0.3.1)
concurrent-ruby (~> 1.0, >= 1.0.2)
i18n (>= 1.6, < 2)
minitest (>= 5.1)
tzinfo (~> 2.0)
ast (2.4.2)
concurrent-ruby (1.1.9)
i18n (1.10.0)
concurrent-ruby (1.1.10)
i18n (1.11.0)
concurrent-ruby (~> 1.0)
minitest (5.15.0)
minitest (5.16.2)
parallel (1.22.1)
parser (3.1.2.0)
ast (~> 2.4.1)
parslet (2.0.0)
pg (1.3.2)
rainbow (3.1.1)
regexp_parser (2.3.1)
@@ -35,17 +36,21 @@ GEM
rubocop-ast (1.17.0)
parser (>= 3.1.1.0)
ruby-progressbar (1.11.0)
toml (0.3.0)
parslet (>= 1.8.0, < 3.0.0)
tzinfo (2.0.4)
concurrent-ruby (~> 1.0)
unicode-display_width (2.1.0)
PLATFORMS
arm64-darwin-21
x86_64-linux
DEPENDENCIES
activerecord
pg
rubocop
toml (~> 0.3.0)
BUNDLED WITH
2.3.7

@@ -2,6 +2,7 @@
require 'active_record'
require 'pg'
require 'toml'
$stdout.sync = true
@@ -15,7 +16,8 @@ ActiveRecord::Base.establish_connection(
port: 6432,
username: 'sharding_user',
password: 'sharding_user',
database: 'rails_dev',
database: 'sharded_db',
application_name: 'testing_pgcat',
prepared_statements: false, # Transaction mode
advisory_locks: false # Same
)
@@ -116,7 +118,7 @@ end
# Test evil clients
def poorly_behaved_client
conn = PG::connect("postgres://sharding_user:sharding_user@127.0.0.1:6432/rails_dev")
conn = PG::connect("postgres://sharding_user:sharding_user@127.0.0.1:6432/sharded_db?application_name=testing_pgcat")
conn.async_exec 'BEGIN'
conn.async_exec 'SELECT 1'
@@ -127,3 +129,75 @@ end
25.times do
poorly_behaved_client
end
def test_server_parameters
server_conn = PG::connect("postgres://sharding_user:sharding_user@127.0.0.1:6432/sharded_db?application_name=testing_pgcat")
raise StandardError, "Bad server version" if server_conn.server_version == 0
server_conn.close
admin_conn = PG::connect("postgres://admin_user:admin_pass@127.0.0.1:6432/pgcat")
raise StandardError, "Bad server version" if admin_conn.server_version == 0
admin_conn.close
puts 'Server parameters ok'
end
class ConfigEditor
def initialize
@original_config_text = File.read('../../.circleci/pgcat.toml')
text_to_load = @original_config_text.gsub("5432", "\"5432\"")
@original_configs = TOML.load(text_to_load)
end
def original_configs
TOML.load(TOML::Generator.new(@original_configs).body)
end
def with_modified_configs(new_configs)
text_to_write = TOML::Generator.new(new_configs).body
text_to_write = text_to_write.gsub("\"5432\"", "5432")
File.write('../../.circleci/pgcat.toml', text_to_write)
yield
ensure
File.write('../../.circleci/pgcat.toml', @original_config_text)
end
end
def test_reload_pool_recycling
admin_conn = PG::connect("postgres://admin_user:admin_pass@127.0.0.1:6432/pgcat")
server_conn = PG::connect("postgres://sharding_user:sharding_user@127.0.0.1:6432/sharded_db?application_name=testing_pgcat")
server_conn.async_exec("BEGIN")
conf_editor = ConfigEditor.new
new_configs = conf_editor.original_configs
# swap shards
new_configs["pools"]["sharded_db"]["shards"]["0"]["database"] = "shard1"
new_configs["pools"]["sharded_db"]["shards"]["1"]["database"] = "shard0"
raise StandardError if server_conn.async_exec("SELECT current_database();")[0]["current_database"] != 'shard0'
conf_editor.with_modified_configs(new_configs) { admin_conn.async_exec("RELOAD") }
raise StandardError if server_conn.async_exec("SELECT current_database();")[0]["current_database"] != 'shard0'
server_conn.async_exec("COMMIT;")
# Transaction finished, client should get new configs
raise StandardError if server_conn.async_exec("SELECT current_database();")[0]["current_database"] != 'shard1'
server_conn.close()
# New connection should get new configs
server_conn = PG::connect("postgres://sharding_user:sharding_user@127.0.0.1:6432/sharded_db?application_name=testing_pgcat")
raise StandardError if server_conn.async_exec("SELECT current_database();")[0]["current_database"] != 'shard1'
ensure
admin_conn.async_exec("RELOAD") # Go back to old state
admin_conn.close
server_conn.close
puts "Pool Recycling okay!"
end
test_reload_pool_recycling

@@ -1,11 +1,12 @@
DROP DATABASE IF EXISTS shard0;
DROP DATABASE IF EXISTS shard1;
DROP DATABASE IF EXISTS shard2;
DROP DATABASE IF EXISTS some_db;
CREATE DATABASE shard0;
CREATE DATABASE shard1;
CREATE DATABASE shard2;
CREATE DATABASE some_db;
\c shard0
@@ -41,21 +42,51 @@ CREATE TABLE data (
CREATE TABLE data_shard_2 PARTITION OF data FOR VALUES WITH (MODULUS 3, REMAINDER 2);
DROP ROLE IF EXISTS sharding_user;
CREATE ROLE sharding_user ENCRYPTED PASSWORD 'sharding_user' LOGIN;
GRANT CONNECT ON DATABASE shard0 TO sharding_user;
GRANT CONNECT ON DATABASE shard1 TO sharding_user;
GRANT CONNECT ON DATABASE shard2 TO sharding_user;
\c some_db
DROP TABLE IF EXISTS data CASCADE;
CREATE TABLE data (
id BIGINT,
value VARCHAR
);
DROP ROLE IF EXISTS sharding_user;
DROP ROLE IF EXISTS other_user;
DROP ROLE IF EXISTS simple_user;
CREATE ROLE sharding_user ENCRYPTED PASSWORD 'sharding_user' LOGIN;
CREATE ROLE other_user ENCRYPTED PASSWORD 'other_user' LOGIN;
CREATE ROLE simple_user ENCRYPTED PASSWORD 'simple_user' LOGIN;
GRANT CONNECT ON DATABASE shard0 TO sharding_user;
GRANT CONNECT ON DATABASE shard1 TO sharding_user;
GRANT CONNECT ON DATABASE shard2 TO sharding_user;
GRANT CONNECT ON DATABASE shard0 TO other_user;
GRANT CONNECT ON DATABASE shard1 TO other_user;
GRANT CONNECT ON DATABASE shard2 TO other_user;
GRANT CONNECT ON DATABASE some_db TO simple_user;
\c shard0
GRANT ALL ON SCHEMA public TO sharding_user;
GRANT ALL ON TABLE data TO sharding_user;
GRANT ALL ON SCHEMA public TO other_user;
GRANT ALL ON TABLE data TO other_user;
\c shard1
GRANT ALL ON SCHEMA public TO sharding_user;
GRANT ALL ON TABLE data TO sharding_user;
GRANT ALL ON SCHEMA public TO other_user;
GRANT ALL ON TABLE data TO other_user;
\c shard2
GRANT ALL ON SCHEMA public TO sharding_user;
GRANT ALL ON TABLE data TO sharding_user;
GRANT ALL ON SCHEMA public TO other_user;
GRANT ALL ON TABLE data TO other_user;
\c some_db
GRANT ALL ON SCHEMA public TO simple_user;
GRANT ALL ON TABLE data TO simple_user;

@@ -151,3 +151,12 @@ SELECT 1;
set server role to 'replica';
SeT SeRver Role TO 'PrImARY';
select 1;
SET PRIMARY READS TO 'on';
SELECT 1;
SET PRIMARY READS TO 'off';
SELECT 1;
SET PRIMARY READS TO 'default';
SELECT 1;