Compare commits

..

185 Commits

Author SHA1 Message Date
Jose Fernandez (magec)
16a2cece21 Allow failover for replicas 2024-11-12 17:43:36 +01:00
Mostafa
0ee59c0c40 Another no-op helm release (#853) 2024-11-08 06:07:12 -06:00
Mostafa
b61d2cc6f0 Use main branch for helm chart releases (#852) 2024-11-08 06:04:42 -06:00
Jose Fernández
c11418c083 Revert "Do not unban replicas if a primary is available" (#850)
Revert "Do not unban replicas if a primary is available (#843)"

This reverts commit cdcfa99fb9.
2024-11-07 22:00:43 +01:00
Jose Fernández
c9544bdff2 Fix default_role being ignored when query_parser_enabled was false (#847)
Fix default_role being ignored when query_parser_enabled was false
2024-11-07 11:11:49 -06:00
Jose Fernández
cdcfa99fb9 Do not unban replicas if a primary is available (#843)
Add `unban_replicas_when_all_banned` to control unbanning replicas behavior.
2024-11-07 11:11:11 -06:00
Víťa Tauer
f27dc6b483 Fixing invalid setting name in pgcat.toml (#849) 2024-11-07 06:17:09 -06:00
Mostafa
326efc22b3 Another no-op release for helm (#845)
Another no-op release
2024-11-02 18:05:41 -05:00
Mostafa
01c6afb2e5 Attempt a helm chart release (#844)
Attempt a release

Co-authored-by: Mostafa <no_reply@github.com>
2024-11-02 11:55:18 -05:00
Nicolas Vanelslande
a68071dd28 Bump bb8 from 0.8.1 to 0.8.6 (#709)
* Update bb8 to 0.8.6

To get https://github.com/djc/bb8/pull/186 and https://github.com/djc/bb8/pull/189
which fix potential deadlocks (https://github.com/djc/bb8/issues/154).

Also, this (https://github.com/djc/bb8/pull/225) was needed to prevent a connection
leak which was conveniently spotted in our integration tests.

* Ignore ./.bundle (created by dev console)

---------

Co-authored-by: Jose Fernandez (magec) <joseferper@gmail.com>
2024-10-28 06:49:36 -05:00
Mostafa
c27d801abf Rename a couple of variables (#839) 2024-10-23 06:38:07 -05:00
Javier Goday
186e72298f #829: read/write splitting on CTE mutable statements (#835) 2024-10-23 06:20:04 -05:00
Sebastian Serth
3935366d86 End Prometheus stats with a new line separator (#826)
End prometheus stats with a new line separator

According to the [OpenMetrics specification](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#overall-structure), each line MUST end with `\n`. Previously, the last line was not ending with `\n`, so that strict parsers had issues reading the Prometheus stats.
2024-09-22 17:14:04 -05:00
Sean McGivern
b575935b1d Improve documentation for connect_timeout and add min_pool_size (#822)
Currently, `connect_timeout` sounds like it should be for connections to
the Postgres server. It's actually used for obtaining a connection from
the pool.
2024-09-18 06:56:17 -05:00
Shijun Wang
efbab1c333 Helm chart improvements including allowing user password to be pulled from K8s secret (#753)
* Make user min_pool_size configurable

* Set user server_lifetime only if specified

* Increment chart version

* Use default instea of or

* Allow enabling server_tls

* statement_timeout default value

* Allow pulling password from existing secret

---------

Co-authored-by: Mostafa Abdelraouf <mostafa.mohmmed@gmail.com>
2024-09-14 09:57:17 -05:00
Mostafa Abdelraouf
9f12d7958e Fix Ruby tests (#819)
Build is failing with this error

Downloading activerecord-3.2.14 revealed dependencies not in the API or the
lockfile (activesupport (= 3.2.14), activemodel (= 3.2.14), arel (~> 3.0.2),
tzinfo (~> 0.3.29)).
Either installing with `--full-index` or running `bundle update activerecord`
should fix the problem.

After ActiveSupport was updated.

This PR fixes that
2024-09-13 20:02:38 -05:00
dependabot[bot]
e6634ef461 chore(deps): bump activesupport from 7.0.4.1 to 7.0.7.1 in /tests/ruby (#804)
Bumps [activesupport](https://github.com/rails/rails) from 7.0.4.1 to 7.0.7.1.
- [Release notes](https://github.com/rails/rails/releases)
- [Changelog](https://github.com/rails/rails/blob/v7.2.1/activesupport/CHANGELOG.md)
- [Commits](https://github.com/rails/rails/compare/v7.0.4.1...v7.0.7.1)

---
updated-dependencies:
- dependency-name: activesupport
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 19:43:26 -05:00
dependabot[bot]
dab2e58647 chore(deps): bump helm/chart-releaser-action from 1.5.0 to 1.6.0 (#812)
Bumps [helm/chart-releaser-action](https://github.com/helm/chart-releaser-action) from 1.5.0 to 1.6.0.
- [Release notes](https://github.com/helm/chart-releaser-action/releases)
- [Commits](be16258da8...a917fd15b2)

---
updated-dependencies:
- dependency-name: helm/chart-releaser-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 19:41:25 -05:00
dependabot[bot]
4aaa4378cf chore(deps): bump rexml from 3.2.8 to 3.3.6 in /tests/ruby (#803)
Bumps [rexml](https://github.com/ruby/rexml) from 3.2.8 to 3.3.6.
- [Release notes](https://github.com/ruby/rexml/releases)
- [Changelog](https://github.com/ruby/rexml/blob/master/NEWS.md)
- [Commits](https://github.com/ruby/rexml/compare/v3.2.8...v3.3.6)

---
updated-dependencies:
- dependency-name: rexml
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 19:19:30 -05:00
Andrew Jackson
670311daf9 Implement Trust Authentication (#805)
* Implement Trust Authentication

* Remove remaining LDAP stuff

* Reverted LDAP changes, Cleaned up tests

---------

Co-authored-by: Andrew Jackson <andrewjackson2988@gmail.com>
Co-authored-by: CommanderKeynes <andrewjackson947@gmail.coma>
2024-09-10 09:29:45 -05:00
dependabot[bot]
b9ec7f8036 chore(deps): bump actions/setup-python from 4.1.0 to 5.1.0 (#715)
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4.1.0 to 5.1.0.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v4.1.0...v5.1.0)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-07 12:21:21 -05:00
dependabot[bot]
d91d23848b chore(deps): bump helm/kind-action from 1.7.0 to 1.10.0 (#732)
Bumps [helm/kind-action](https://github.com/helm/kind-action) from 1.7.0 to 1.10.0.
- [Release notes](https://github.com/helm/kind-action/releases)
- [Commits](https://github.com/helm/kind-action/compare/v1.7.0...v1.10.0)

---
updated-dependencies:
- dependency-name: helm/kind-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-07 12:20:38 -05:00
dependabot[bot]
bbbc01a467 chore(deps): bump rexml from 3.2.5 to 3.2.8 in /tests/ruby (#743)
Bumps [rexml](https://github.com/ruby/rexml) from 3.2.5 to 3.2.8.
- [Release notes](https://github.com/ruby/rexml/releases)
- [Changelog](https://github.com/ruby/rexml/blob/master/NEWS.md)
- [Commits](https://github.com/ruby/rexml/compare/v3.2.5...v3.2.8)

---
updated-dependencies:
- dependency-name: rexml
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-07 12:20:01 -05:00
Sebastian Serth
9bb71ede9d Automatically build deb package on a new version tag (#801)
In #796, I noticed that the deb package was not build since an automation was missing.

With this PR, I add the missing automation.

I tested the workflow in my repo...

    when starting the workflow manually: https://github.com/MrSerth/pgcat/actions/runs/10737879151/job/29780286094
    when drafting a new release: https://github.com/MrSerth/pgcat/actions/runs/10737835796/job/29780146212

Obviously, both workflows failed since I cannot upload to the APT repo. However, the version substitution for the workflow is working correctly (as shown when collapsing the first line of the "Build and release package" step).
2024-09-06 09:11:52 -05:00
Sebastian Serth
88b2afb19b Automatically start systemd service if config file is present (#800)
Previously, upgrading the deb package stopped the service but didn't reenable it after a successful upgrade. This made upgrading the package more difficult and required a second step to restart the service. With this commit, the systemd service is automatically started when the default config file is present.
2024-09-06 09:07:01 -05:00
Mostafa Abdelraouf
f0865ca616 Improve Prometheus exporter output (#795)
* Prometheus metrics updates:

 * Add username label to deconflict metrics that would otherwise
   have duplicate labels across different pools.
 * Group metrics by name and only print HELP and TYPE once per
   metric name.
 * Sort labels for a deterministic output.

---------

Co-authored-by: Curtis Myzie <curtis.myzie@gmail.com>
Co-authored-by: Towhid Khan
2024-09-05 08:58:18 -05:00
Andrew Jackson
7d047c6c19 Implemented python tests with pytest (#790)
Currently the python tests act as scripts. A lot of output is generated to stdout which makes it very hard to figure out where problems were. Also if you want to run only a single test you basically need to comment out code in order to accomplish this.

This PR modifies the python tests to us the pytest python testing framework. This framework allows individual tests to be targeted via the command line, without touching the source code. It also suppressed stdout by default making the test output much easier to read. Also after the tests run it will provide a summary of what failed, what succeded, etc.


Co-authored-by: CommanderKeynes <andrewjackson947@gmail.coma>
Co-authored-by: Andrew Jackson <andrewjackson2988@gmail.com>
2024-09-05 08:16:45 -05:00
Andrew Jackson
f73d15f82c Fix CI script to allow consecutive runs locally (#793)
Co-authored-by: CommanderKeynes <andrewjackson947@gmail.coma>
2024-09-05 08:01:33 -05:00
Mostafa Abdelraouf
69af6cc5e5 Make iterating on integration tests easier (#789)
Writing and iterating on integration tests are cumbersome, having to wait 10 minutes for the test-suite to run just to see if your test works or not is unacceptable.

In this PR, I added a detailed workflow for writing tests that should shorten the feedback cycle of modifying tests to be as low as a few seconds.

It will involve opening a shell into a long-lived container that has all the setup and dependencies necessary and then running your desired tests directly there. I added a convenience script that bootstraps the environment and then opens an interactive shell into the container and you can then run tests immediately in an environment that is more or less identical to what we have running in CircleCI
2024-09-03 11:15:53 -05:00
Mostafa Abdelraouf
ca34597002 Fix broken integration test #740 (#787) 2024-08-31 17:15:13 -05:00
Mostafa Abdelraouf
2def40ea6a Add test case for issue 776 (#786)
I am adding a tiny test that uses the SQL statement that was reported to break an older version of SQL parser library

#776
2024-08-31 10:52:33 -05:00
Mostafa Abdelraouf
c05129018d Improve Prometheus stats + Add Grafana dashboard (#785)
We were missing some labels on metrics generated by the Prometheus exporter so I fixed that. There are still some gaps that I want to address with respect to the metrics we track but this seems like a good start.

I also created a Grafana Dashboard and exported it to JSON. It is designed with the same metric names the Prometheus exporter uses.
2024-08-31 08:18:57 -05:00
Mostafa Abdelraouf
4a7a6a8e7a Cut 1.2.0 release (#783) 2024-08-30 08:30:16 -05:00
Mostafa Abdelraouf
29a476e190 QueryRouter: route to primary when locks exists (select for update) (#782)
Authored-by: Javier Goday <jgoday@gmail.com>
2024-08-30 04:26:36 -05:00
KwongTN
81933b918d Add linux/arm64 docker image build support (#774) 2024-08-29 13:50:38 -05:00
Saraj Munjal
7cbc9178d8 Bump the hyper crate to v1.4.1 and rework prometheus server handling (#778)
Bump hyper to v1.4.1 and rework prometheus server handling
2024-08-29 09:47:58 -05:00
Mostafa Abdelraouf
2c8b2f0776 Fix CI image build step (#780)
The docker CI build image is failing due to this error

249.5     Finished release [optimized] target(s) in 2m 49s
249.5   Installing /home/circleci/.cargo/bin/rustfilt
249.5    Installed package `rustfilt v0.2.1` (executable `rustfilt`)
249.5 error: failed to compile `cargo-binutils v0.3.6`, intermediate artifacts can be found at `/tmp/cargo-installrWENQG`
249.5 
249.5 Caused by:
249.5   package `cargo-platform v0.1.8` cannot be built because it requires rustc 1.73 or newer, while the currently active rustc version is 1.67.1
249.5   Try re-running cargo install with `--locked`
249.5      Summary Successfully installed rustfilt! Failed to install cargo-binutils (see error(s) above).
249.5 error: some crates failed to install

So I am bumping the version up
2024-08-29 08:37:13 -05:00
Mostafa Abdelraouf
8f9a2b8e6f Fix a Panic in admin commands (#779)
We have a panic when we send SHOW or ;;;;;;;;;;;;;;;;; to admin database.

This PR fixes these panics and adds a couple of tests
2024-08-28 21:29:40 -05:00
brandonpike
cbf4d58144 Fix lint warnings for rust-1.79 (#769)
2 things that are recommended by rust-lang - implementing `std::fmt::Display` rather than ToString (1) and using clone_from (2).

[1] https://rust-lang.github.io/rust-clippy/master/index.html#/to_string_trait_impl
[2] https://rust-lang.github.io/rust-clippy/master/index.html#assigning_clones

Signed-off-by: Brandon Pike <pikebrandon@att.net>
2024-07-15 20:30:26 -07:00
Олег Дулецкий
731aa047ba Add ExecReload option to pgcat.service for configuration reloads (#760) 2024-06-24 08:57:58 -07:00
Adrian Garcia Badaracco
88dbcc21d1 update rust version in docker image (#762) 2024-06-24 08:51:38 -07:00
Adrian Garcia Badaracco
c34b15bddc Add STOPSIGNAL to Dockerfile (#758) 2024-06-20 23:23:41 -07:00
Andrey Stikheev
0b034a6831 Add TCP_NODELAY option to improve performance for large response queries (#749)
This commit adds the TCP_NODELAY option to the socket configuration in
`configure_socket` function. Without this option, we observed significant
performance issues when executing SELECT queries with large responses.

Before the fix:
postgres=> SELECT repeat('a', 1); SELECT repeat('a', 8153);
Time: 1.368 ms
Time: 41.364 ms

After the fix:
postgres=> SELECT repeat('a', 1); SELECT repeat('a', 8153);
Time: 1.332 ms
Time: 1.528 ms

By setting TCP_NODELAY, we eliminate the Nagle's algorithm delay, which
results in a substantial improvement in response times for large queries.

This problem was discussed in https://github.com/postgresml/pgcat/issues/616.
2024-05-26 14:47:21 -07:00
Mostafa Abdelraouf
966b8e093c Report checkout error when all servers are down (#736)
We shouldn't report checkout_success when we are going to return Error.
2024-05-08 12:18:27 -05:00
Horacio
c9270a47d4 Use rust:bullseye as base image (#725)
Use rust:bullseye base image

With the original rust:1.70-bullseye image, the container cannot be
built:

17.06   Installing /usr/local/cargo/bin/rustfilt
17.06    Installed package `rustfilt v0.2.1` (executable `rustfilt`)
17.06 error: failed to compile `cargo-binutils v0.3.6`, intermediate artifacts can be found at `/tmp/cargo-installrc6mPb`
17.06
17.06 Caused by:
17.06   package `cargo-platform v0.1.8` cannot be built because it requires rustc 1.73 or newer, while the currently active rustc version is 1.70.0
17.06   Try re-running cargo install with `--locked`
17.06      Summary Successfully installed rustfilt! Failed to install cargo-binutils (see error(s) above).
17.06 error: some crates failed to install

This is the same base image used on tests/docker/Dockerfile
2024-04-19 09:12:57 -07:00
Toby Hede
0d94d0b90a Update sqlparser to 0.41 (#666) 2024-04-12 22:12:37 -07:00
David ALEXANDRE
358724f7a9 feat: add helm chart (#619)
* add workflow

* feat: add pgcat helm chart

* fix: set the right include into configmap

Signed-off-by: David ALEXANDRE <david.alexandre@w6d.io>

* update values and config

* prettifying config

---------

Signed-off-by: David ALEXANDRE <david.alexandre@w6d.io>
2024-02-22 09:26:58 -08:00
Mostafa Abdelraouf
e1e4929d43 Report waiting time only for currently waiting clients (#678)
The pool maxwait metric currently operates differently from Pgbouncer.

The way it operates today is that we keep track of max_wait on each connected client, when SHOW POOLS query is made, we go over the connected clients and we get the max of max_wait times among clients. This means the pool maxwait will never reset, it will always be monotonically increasing until the client with the highest maxwait disconnects.

This PR changes this behavior, by keeping track of the wait_start time on each client, when a client goes into WAITING state, we record the time offset from connect_time. When we either successfully or unsuccessfully checkout a connection from the pool, we reset the wait_start time.

When SHOW POOLS query is made, we go over all connected clients and we only consider clients whose wait_start is non-zero, for clients that have non-zero wait times, we compare them and report the maximum waiting time as maxwait for the pool.
2024-01-18 11:57:28 -06:00
Lev Kokotov
dc4d6edf17 Revert max_wait changes (#658)
* Revert "Reset wait times when checked out successfully (#656)"

This reverts commit ec3920d60f.

* Revert "Not sure how this sneaked past CI"

This reverts commit 4c5498b915.

* Revert "only report wait times from clients currently waiting to match behavior of pgbouncer (#655)"

This reverts commit 0e8064b049.
2023-12-05 01:47:38 -08:00
Lev Kokotov
ec3920d60f Reset wait times when checked out successfully (#656) 2023-12-04 18:33:08 -08:00
Lev
4c5498b915 Not sure how this sneaked past CI 2023-12-04 18:30:03 -08:00
Daniel Babiak
0e8064b049 only report wait times from clients currently waiting to match behavior of pgbouncer (#655)
* Change maxwait to only report wait times from clients currently waiting to match behavior of pgbouncer

* Fix tests
2023-12-04 18:19:51 -08:00
Alec
4dbef49ec9 Require a reason when marking a server bad (#654)
When calling mark_bad require a reason so it can be logged rather than
the generic message
2023-12-04 16:09:41 -08:00
Lev Kokotov
bc07dc9c81 Broken blog link 2 2023-12-03 21:01:23 -08:00
Lev Kokotov
9b8166b313 Broken blog link (#652)
Update README.md
2023-12-03 20:58:39 -08:00
Lev Kokotov
e58d69f3de Fix deb build overwriting config (#651) 2023-12-03 20:27:44 -08:00
Lev Kokotov
e76d720ffb Dont cache prepared statement with errors (#647)
* Fix prepared statement not found when prepared stmt has error

* cleanup debug

* remove more debug msgs

* sure debugged this..

* version bump

* add rust tests
2023-11-28 21:13:30 -08:00
Calvin Hughes
998cc16a3c Expose clients maxwait time in SHOW CLIENTS response via admin (#639)
* Expose clients maxwait time in SHOW CLIENTS response via PgCat admin
Displays the maxwait via maxwait_seconds and maxwait_us columns for each client that can be used to track down the wait time per client in a case where the overall pool stats shows waiting time. The maxwait_us, similar to the pool stats setup, is configured to display as a remainder alongside the maxwait_seconds.

* Use maxwait instead of maxwait_seconds to match pools column name

---------

Co-authored-by: Calvin Hughes <9379992+calvinhughes@users.noreply.github.com>
2023-11-13 11:24:39 -08:00
Jakob Schultz-Falk
7c37da2fad Support unnamed prepared statements (#635)
* Add golang test suite to reproduce issue with unnamed parameterized prepared statements

* Allow caching of unnamed prepared statements

* Passthrough describe on portals

* Remove unneeded kill

* Update Dockerfile.ci with golang

* Move out update of Dockerfiles to separate PR
2023-11-08 16:36:45 -08:00
Jakob Schultz-Falk
b45c6b1d23 Update Dockerfile.ci with golang (#637) 2023-11-08 08:25:49 -08:00
Lev Kokotov
dae240d30c Add connet_timeout and idle_timeout to the user (#634)
* Add connect_timeout to the user

* Allow user to override connect timeout

* version

* lock

* Add both timeouts to the user
2023-11-06 12:18:52 -08:00
Lev Kokotov
b52ea8e7f1 bump version (#629) 2023-10-26 10:50:45 -07:00
Zain Kabani
7d3003a16a Reimplement prepared statements with LRU cache and statement deduplication (#618)
* Initial commit

* Cleanup and add stats

* Use an arc instead of full clones to store the parse packets

* Use mutex instead

* fmt

* clippy

* fmt

* fix?

* fix?

* fmt

* typo

* Update docs

* Refactor custom protocol

* fmt

* move custom protocol handling to before parsing

* Support describe

* Add LRU for server side statement cache

* rename variable

* Refactoring

* Move docs

* Fix test

* fix

* Update tests

* trigger build

* Add more tests

* Reorder handling sync

* Support when a named describe is sent along with Parse (go pgx) and expecting results

* don't talk to client if not needed when client sends Parse

* fmt :(

* refactor tests

* nit

* Reduce hashing

* Reducing work done to decode describe and parse messages

* minor refactor

* Merge branch 'main' into zain/reimplment-prepared-statements-with-global-lru-cache

* Rewrite extended and prepared protocol message handling to better support mocking response packets and close

* An attempt to better handle if there are DDL changes that might break cached plans with ideas about how to further improve it

* fix

* Minor stats fixed and cleanup

* Cosmetic fixes (#64)

* Cosmetic fixes

* fix test

* Change server drop for statement cache error to a `deallocate all`

* Updated comments and added new idea for handling DDL changes impacting cached plans

* fix test?

* Revert test change

* trigger build, flakey test

* Avoid potential race conditions by changing get_or_insert to promote for pool LRU

* remove ps enabled variable on the server in favor of using an option

* Add close to the Extended Protocol buffer

---------

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>
2023-10-25 15:11:57 -07:00
Zain Kabani
d37df43a90 Reduces the amount of time the get_pool operation takes (#625)
* Reduces the amount of time the get_pool operation takes

* trigger build

* Fix admin
2023-10-19 23:49:05 -07:00
Mohammad Dashti
2c7bf52c17 Removed unnecessary clippy overrides. (#614)
Removed unnecessary clippy overrides.
2023-10-11 10:13:23 -07:00
Mohammad Dashti
de8df29ca4 Added clippy to CI and fixed all clippy warnings (#613)
* Fixed all clippy warnings.

* Added `clippy` to CI.

* Reverted an unwanted change + Applied `cargo fmt`.

* Fixed the idiom version.

* Revert "Fixed the idiom version."

This reverts commit 6f78be0d42.

* Fixed clippy issues on CI.

* Revert "Fixed clippy issues on CI."

This reverts commit a9fa6ba189.

* Revert "Reverted an unwanted change + Applied `cargo fmt`."

This reverts commit 6bd37b6479.

* Revert "Fixed all clippy warnings."

This reverts commit d1f3b847e3.

* Removed Clippy

* Removed Lint

* `admin.rs` clippy fixes.

* Applied more clippy changes.

* Even more clippy changes.

* `client.rs` clippy fixes.

* `server.rs` clippy fixes.

* Revert "Removed Lint"

This reverts commit cb5042b144.

* Revert "Removed Clippy"

This reverts commit 6dec8bffb1.

* Applied lint.

* Revert "Revert "Fixed clippy issues on CI.""

This reverts commit 49164a733c.
2023-10-10 09:18:21 -07:00
Mohammad Dashti
c4fb72b9fc Added yj to dev Dockerfile (#612) 2023-10-05 18:13:22 -07:00
Mohammad Dashti
3371c01e0e Added a Plugin trait (#536)
* Improved logging

* Improved logging for more `Address` usages

* Fixed lint issues.

* Reverted the `Address` logging changes.

* Applied the PR comment by @levkk.

* Applied the PR comment by @levkk.

* Applied the PR comment by @levkk.

* Applied the PR comment by @levkk.
2023-10-03 13:13:21 -07:00
Mohammad Dashti
c2a483f36a Automatic sharding for INSERT, UPDATE, and DELETE statements. (#610)
Added support for INSERT, UPDATE, and DELETE for auto-sharding.
2023-10-03 09:36:13 -07:00
dependabot[bot]
51cd13b8b5 chore(deps): bump webpki from 0.22.0 to 0.22.2 in /tests/rust (#609)
Bumps [webpki](https://github.com/briansmith/webpki) from 0.22.0 to 0.22.2.
- [Commits](https://github.com/briansmith/webpki/commits)

---
updated-dependencies:
- dependency-name: webpki
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-02 15:30:42 -07:00
Nicolas Vanelslande
a054b454d2 Add psql to the container image. (#607)
It could be used to implement container health checks.
Example:
  PGPASSWORD="<some-password>" psql -U pgcat -p 6432 -h 127.0.0.1 -tA -c "show version;" -d pgcat >/dev/null
2023-09-27 09:03:39 -07:00
Kevin Elliott
04e9814770 Fix incorrect data output for plugin query_logger (#601)
Update query_logger.rs

Pool and user were incorrectly swapped and needed to be fixed.
2023-09-25 18:45:51 -07:00
Lev Kokotov
037d232fcd Mark admin clients as disconnected on error (#597) 2023-09-21 15:55:22 -07:00
Lev Kokotov
b2933762e7 Report maxwait for clients that end up not getting a connection (#596) 2023-09-21 14:50:18 -07:00
Mohammad Dashti
df8aa888f9 Add a cache layer to Docker for development (#594)
* Add a cache layer to Docker.

* Created a separate `dev` Docker file.

* Fixed `Docker.dev` to build in non-release mode.
2023-09-20 10:29:30 -07:00
Mohammad Dashti
7f5639c94a Include thread_id in the logs (#592)
Include `thread_id` in the logs.
2023-09-20 09:11:16 -07:00
Lev Kokotov
c0112f6f12 Revert "User-friendly error messages" (#587)
Revert "User-friendly error messages (#586)"

This reverts commit b7ceee2ddf.
2023-09-11 16:39:31 -07:00
Lev Kokotov
b7ceee2ddf User-friendly error messages (#586) 2023-09-11 16:39:11 -07:00
Mostafa Abdelraouf
0b01d70b55 Allow configuring routing decision when no shard is selected (#578)
The TL;DR for the change is that we allow QueryRouter to set the active shard to None. This signals to the Pool::get method that we have no shard selected. The get method follows a no_shard_specified_behavior config to know how to route the query.

Original PR description
Ruby-pg library makes a startup query to SET client_encoding to ... if Encoding.default_internal value is set (Code). This query is troublesome because we cannot possibly attach a routing comment to it. PgCat, by default, will route that query to the default shard.

Everything is fine until shard 0 has issues, Clients will all be attempting to send this query to shard0 which increases the connection latency significantly for all clients, even those not interested in shard0

This PR introduces no_shard_specified_behavior that defines the behavior in case we have routing-by-comment enabled but we get a query without a comment. The allowed behaviors are

random: Picks a shard at random
random_healthy: Picks a shard at random favoring shards with the least number of recent connection/checkout errors
shard_<number>: e.g. shard_0, shard_4, etc. picks a specific shard, everytime
In order to achieve this, this PR introduces an error_count on the Address Object that tracks the number of errors since the last checkout and uses that metric to sort shards by error count before making a routing decision.
I didn't want to use address stats to avoid introducing a routing dependency on internal stats (We might do that in the future but I prefer to avoid this for the time being.

I also made changes to the test environment to replace Ruby's TOML reader library, It appears to be abandoned and does not support mixed arrays (which we use in the config toml), and it also does not play nicely with single-quoted regular expressions. I opted for using yj which is a CLI tool that can convert from toml to JSON and back. So I refactor the tests to use that library.
2023-09-11 13:47:28 -05:00
hellower
33db0dffa8 stream.peer_addr() & auth_query (#575)
* Don't unwrap stream.peer_addr()

https://github.com/postgresml/pgcat/pull/562 (same code)
(another lines changed)

* auth_query (real sample)

# single quote need
auth_query="SELECT usename, passwd FROM pg_shadow WHERE usename='$1'"
2023-08-31 14:11:38 -07:00
hi019
7994a661d9 Fix Docker image runs erroring due to glibc incompatability (#572)
Fix Docker image builds breaking due to glibc incompatability
2023-08-30 16:51:31 -07:00
Tommy Li
9937193332 Allow pause/resuming all pools (#566)
support pausing all pools
2023-08-29 10:07:36 -07:00
Mostafa Abdelraouf
baa00ff546 Add yj to CI image (#568) 2023-08-28 21:20:53 -05:00
Zain Kabani
ffe820497f Don't unwrap stream.peer_addr() (#562) 2023-08-25 10:33:39 -07:00
Zain Kabani
be549f3faa Fixes try_execute_command message parsing bug (#560)
* Fixes try_execute_command message parsing bug

* Fix initial segment logic

* Add test
2023-08-24 11:25:43 -07:00
dependabot[bot]
4301ab0606 chore(deps): bump rustls-webpki from 0.100.1 to 0.100.2 (#555)
Bumps [rustls-webpki](https://github.com/rustls/webpki) from 0.100.1 to 0.100.2.
- [Release notes](https://github.com/rustls/webpki/releases)
- [Commits](https://github.com/rustls/webpki/compare/v/0.100.1...v/0.100.2)

---
updated-dependencies:
- dependency-name: rustls-webpki
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-22 11:41:09 -07:00
Cluas
5143500c9a docs: complete the missing general items (#553)
docs: complete the missing general items.
2023-08-20 19:14:19 -07:00
Zain Kabani
3255323bff Adds option to log which parameter status is changed by the client (#550) 2023-08-16 11:01:21 -07:00
Zain Kabani
bb27586758 Reset instead of discard all (#549)
* Use reset all instead of discard all

* Move 'X' handling to before admin handle

* fix tests
2023-08-16 10:08:48 -07:00
Lev Kokotov
4f0f45b576 Add pgcat user (#546)
* Add pgcat user

* warn

* dev
2023-08-10 12:25:43 -07:00
Zain Kabani
f94ce97ebc Handle and track startup parameters (#478)
* User server parameters struct instead of server info bytesmut

* Refactor to use hashmap for all params and add server parameters to client

* Sync parameters on client server checkout

* minor refactor

* update client side parameters when changed

* Move the SET statement logic from the C packet to the S packet.

* trigger build

* revert validation changes

* remove comment

* Try fix

* Reset cleanup state after sync

* fix server version test

* Track application name through client life for stats

* Add tests

* minor refactoring

* fmt

* fix

* fmt
2023-08-10 08:18:46 -07:00
Sebastian Webber
9ab128579d parse server error messages (#543)
This commit adds a parser to the Postgres error message, providing better
error messages.

Implemented based in:
  https://www.postgresql.org/docs/12/protocol-error-fields.html

Signed-off-by: Sebastian Webber <sebastian@swebber.me>
2023-08-09 09:14:05 -07:00
Lev Kokotov
1cde74f05e Revert "Preserve existing behavior" (#542)
Revert "Preserve existing behavior (#541)"

This reverts commit a4de6c1eb6.
2023-08-08 17:45:48 -07:00
Lev Kokotov
a4de6c1eb6 Preserve existing behavior (#541) 2023-08-08 13:48:52 -07:00
Zain Kabani
e14b283f0c Make infer role configurable and fix double parse bug (#533)
* Make infer role configurable and fix double parse bug

* Fix tests

* Enable infer_role_from query in toml for tests

* Fix test

* Add max length config, add logging for which application is failing to parse, and change config name

* fmt

* Update src/config.rs

---------

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>
2023-08-08 13:10:03 -07:00
Lev Kokotov
7c3c90c38e Add systemd service (#540) 2023-08-08 11:51:38 -07:00
Lev Kokotov
2ca21b2bec pgcat deb package (#539) 2023-08-08 11:08:46 -07:00
Matthias Pfeil
3986eaa4b2 Add github tag as tag to image (#537) 2023-08-04 10:20:56 -07:00
Lev Kokotov
1f2c6507f7 debug -> release 2023-08-01 17:47:34 -07:00
Lev Kokotov
aefcf4281c Fix for #534 and #535 2023-08-01 17:46:34 -07:00
Bertrand Paquet
9d1c46a3e9 Fix typo in the config documentation (#532) 2023-07-28 00:31:53 -07:00
Spindel Ljungmark
328108aeb5 Restore the ability to filter spammy log messages (#530)
* Move connection checkin log messages to their own target

Under heavy load they can happen thousands of times per second, and
should generally be considered a nuisance at best. This marks the state
discard as an info rather than a warning, and moves all the messages
into their own log-target, so they can be filtered separately from the
more relevant warnings.

Signed-off-by: D.S. Ljungmark <spider@skuggor.se>

* Remove left-over env_logger dependencies

When moving to tracing-subscriber for logging, the env_logger
dependencies were left around, this cuts them out as dead code.

Signed-off-by: D.S. Ljungmark <spider@skuggor.se>

* Restore ability to filter log messages at runtime

This restores the RUST_LOG filters from env_logger but now with the
tracing subscriber setup. The filters are chained so commandline options
mark the default in case either option is set, which should be the path
of least confusion for users.  ( RUST_LOG setting level to debug, and
commandline to warning is an odd user case, and I don't know what a user
who does that is expecting. )

It also bumps the version number as a fix to see which versions have
which behaviour.

Signed-off-by: D.S. Ljungmark <spider@skuggor.se>

---------

Signed-off-by: D.S. Ljungmark <spider@skuggor.se>
2023-07-27 08:51:23 -07:00
Lev Kokotov
4cf54a6122 Release 1.1 (#526) 2023-07-25 10:27:04 -07:00
Mostafa Abdelraouf
2a8f3653a6 Fix COPY FROM and add tests (#522)
* Fix COPY FROM and add tests

* E

* fmt
2023-07-20 23:06:01 -07:00
Sebastian Webber
19cb8a3022 add --no-color option to disable colors in the terminal (#518)
add --no-color option to disable colors

this commit adds a new option to disable colors in the terminal and also
moves the logger configuration to a different crate.

Signed-off-by: Sebastian Webber <sebastian@swebber.me>
2023-07-19 21:15:55 -07:00
Sebastian Webber
f85e5bd9e8 add support for multiple log formats (#517)
this commit adds the tracing-subscriber crate and use its formatters to
support multiple log formats.

More details in
https://github.com/postgresml/pgcat/issues/464#issuecomment-1641430299

Signed-off-by: Sebastian Webber <sebastian@swebber.me>
2023-07-18 23:07:13 -07:00
Sebastian Webber
7bdb4e5cd9 Add cmd line parser (#512)
This commit adds the clap library and configures the necessary args to
parse from the command line,  expanding the current option of a single
file and adding support for environment variables.

Signed-off-by: Sebastian Webber <sebastian@swebber.me>
2023-07-18 13:52:40 -07:00
Sebastian Webber
5d87e3781e push and build only in main and tags (#508)
this commit changes the CI behavior to only build and push when something is committed to main or is a new tag.
2023-07-14 10:30:49 -07:00
dependabot[bot]
3e08c6bd8d chore(deps): bump num_cpus from 1.15.0 to 1.16.0 (#507)
Bumps [num_cpus](https://github.com/seanmonstar/num_cpus) from 1.15.0 to 1.16.0.
- [Release notes](https://github.com/seanmonstar/num_cpus/releases)
- [Changelog](https://github.com/seanmonstar/num_cpus/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/num_cpus/compare/v1.15.0...v1.16.0)

---
updated-dependencies:
- dependency-name: num_cpus
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-14 07:58:11 -07:00
Sebastian Webber
15b6db8e4e add "show help" command (#505)
This commit adds a new function to handle notify and use it
in the SHOW HELP command, which displays the available options
in the admin console.

Also, adding Fabrízio as a co-author for all the help with the
protocol and the help to structure this PR.

Signed-off-by: Sebastian Webber <sebastian@swebber.me>
Co-authored-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
2023-07-13 22:40:04 -07:00
dependabot[bot]
b2e6dfd9bb chore(deps): bump rustls-pemfile from 1.0.2 to 1.0.3 (#504)
Bumps [rustls-pemfile](https://github.com/rustls/pemfile) from 1.0.2 to 1.0.3.
- [Commits](https://github.com/rustls/pemfile/commits)

---
updated-dependencies:
- dependency-name: rustls-pemfile
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-12 21:41:48 -07:00
Mostafa Abdelraouf
3c9565d351 Add support for tcp_user_timeout (#503)
* Add support for tcp_user_timeout

* option

* duration

* Some()

* docs

* fmt, compile
2023-07-12 11:24:30 -07:00
dependabot[bot]
67579c9af4 chore(deps): bump rustls from 0.21.1 to 0.21.5 (#501)
Bumps [rustls](https://github.com/rustls/rustls) from 0.21.1 to 0.21.5.
- [Release notes](https://github.com/rustls/rustls/releases)
- [Commits](https://github.com/rustls/rustls/compare/v/0.21.1...v/0.21.5)

---
updated-dependencies:
- dependency-name: rustls
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-12 05:46:31 -07:00
Cluas
cf7f6f35ab docs: fix general.autoreload description (#491)
* docs: fix autoreload description

Signed-off-by: Cluas <Cluas@live.cn>

* docs: add blank line

Signed-off-by: Cluas <Cluas@live.cn>

---------

Signed-off-by: Cluas <Cluas@live.cn>
2023-07-12 05:42:44 -07:00
Voldemarich
7205537b49 [BUG] Fix binding of NULL value parameters in prepared statements (#496)
Fix binding of NULL value parameters in prepared statements

Co-authored-by: anon <anon@non.existent>
2023-07-10 10:35:43 +02:00
Zain Kabani
1ed6e925ed Fixes the default for round robing in General (#488) 2023-06-23 09:15:44 -07:00
Lev Kokotov
4b78af9676 Implement Close for prepared statements (#482)
* Partial support for Close

* Close

* respect config value

* prepared spec

* Hmm

* Print cache size
2023-06-18 23:02:34 -07:00
Lev Kokotov
73500c0c96 Fix build (#481) 2023-06-17 09:09:54 -07:00
Lev Kokotov
b167de5aa3 fmt (#480) 2023-06-17 08:57:33 -07:00
Juraj Bubniak
473bb3d17d Log not implemented messages as debug in prometheus metrics. (#477) 2023-06-16 18:48:38 -07:00
Lev Kokotov
c7d6273037 Support for prepared statements (#474)
* Start prepared statements

* parse

* Ok

* optional

* dont rewrite anonymous prepared stmts

* Dont rewrite anonymous prep statements

* hm?

* prep statements

* I see!

* comment

* Print config value

* Rewrite bind and add sqlx test

* fmt

* ok

* Fix

* Fix stats

* its late

* clean up PREPARE
2023-06-16 12:57:44 -07:00
Jeff Chen
94c781881f Report min_pool_size correctly (#471) 2023-06-12 09:23:56 -07:00
dependabot[bot]
a8c81e5df6 chore(deps): bump pin-project from 1.0.12 to 1.1.0 (#440)
Bumps [pin-project](https://github.com/taiki-e/pin-project) from 1.0.12 to 1.1.0.
- [Release notes](https://github.com/taiki-e/pin-project/releases)
- [Changelog](https://github.com/taiki-e/pin-project/blob/main/CHANGELOG.md)
- [Commits](https://github.com/taiki-e/pin-project/compare/v1.0.12...v1.1.0)

---
updated-dependencies:
- dependency-name: pin-project
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-12 00:51:24 -07:00
dependabot[bot]
1d3746ec9e chore(deps): bump sqlparser from 0.33.0 to 0.34.0 (#448)
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.33.0 to 0.34.0.
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.33.0...v0.34.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-12 00:51:05 -07:00
dependabot[bot]
b5489dc1e6 chore(deps): bump regex from 1.8.1 to 1.8.4 (#466)
Bumps [regex](https://github.com/rust-lang/regex) from 1.8.1 to 1.8.4.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.8.1...1.8.4)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-12 00:50:46 -07:00
dependabot[bot]
557b425fb1 chore(deps): bump log from 0.4.17 to 0.4.19 (#470)
Bumps [log](https://github.com/rust-lang/log) from 0.4.17 to 0.4.19.
- [Release notes](https://github.com/rust-lang/log/releases)
- [Changelog](https://github.com/rust-lang/log/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/log/compare/0.4.17...0.4.19)

---
updated-dependencies:
- dependency-name: log
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-12 00:50:26 -07:00
Zain Kabani
aca9738821 Make queue strategy configurable and default to Fifo (#463)
* Change idle timeout default to 10 minutes

* Revert lifo for now while we investigate connection thrashing issues

* Make queue strategy configurable

* test revert idle time out

* Add pgcat start to python test
2023-06-09 11:35:20 -07:00
Zain Kabani
0bc453a771 Change default server lifetime and bump bb8 version to use LIFO correctly (#462)
Change default server lifetime and idle timeouts and bump bb8 version to use LIFO correctly
2023-05-31 08:25:42 -07:00
Zain Kabani
b67c33b6d0 Use latest bb8 and use Lifo as the queue strategy in the pool (#455)
* Use git bb8

* Use latest bb8 and change pool is use stack
2023-05-28 19:46:13 -07:00
Mostafa Abdelraouf
a8a30ad43b Refactor Pool Stats to be based off of Server/Client stats (#445)
What is wrong
Stats reported by SHOW POOLS seem to be leaking. We see lingering cl_idle , cl_waiting, and similarly for sv_idle , sv_active. We confirmed that these are reporting issues not actual lingering clients.

This behavior is readily reproducible by running

while true; do
    psql "postgres://sharding_user:sharding_user@localhost:6432/sharded_db" -c "SELECT 1" > /dev/null 2>&1  &
done

Why it happens
I wasn't able to get to figure our the reason for the bug but my best guess is that we have race conditions when updating pool-level stats. So even though individual update operations are atomic, we perform a check then update sequence which is not protected by a guard.
https://github.com/postgresml/pgcat/blob/main/src/stats/pool.rs#L174-L179

I am also suspecting that using Relaxed ordering might allow this behavior (I changed all operations to use Ordering::SeqCst but still got lingering clients)

How to fix
Since SHOW POOLS/SHOW SERVER/SHOW CLIENTS all show the current state of the proxy (as opposed to SHOW STATS which show aggregate values), this PR refactors SHOW POOLS to have it construct the results directly from SHOW SERVER and SHOW CLIENT datasets. This reduces the complexity of stat updates and eliminates the need for having locks when updating pool stats as we only care about updating individual client/server states.

This will change the semantics of maxwait, so instead of it holding the maxwait time ever encountered by a client (connected or disconnected), it will only consider connected clients which should be okay given PgCat tends to hold on to client connections more than Pgbouncer.
2023-05-23 08:44:49 -05:00
dependabot[bot]
d63be9b93a chore(deps): bump toml from 0.7.3 to 0.7.4 (#447)
Bumps [toml](https://github.com/toml-rs/toml) from 0.7.3 to 0.7.4.
- [Commits](https://github.com/toml-rs/toml/compare/toml-v0.7.3...toml-v0.7.4)

---
updated-dependencies:
- dependency-name: toml
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-19 08:58:13 -07:00
Lev Kokotov
100778670c Ensure data makes it to the client (#446)
* Ensure data makes it to the client

* flush all buffers
2023-05-18 16:41:22 -07:00
Lev Kokotov
37e3349c24 Optionally clean up server connections (#444)
* Optionally clean up server connections

* move setting to pool

* fix test

* Print setting to screen

* fmt

* Fix pool_settings override in tests
2023-05-18 10:46:55 -07:00
Zain Kabani
7f57a89d75 Fix time based average stats (#442)
* keep track of current stats and zero them after updating averages

* Try tests

* typo

* remove commented test stuff

* Avoid dividing by zero

* Fix test

* refactor, get rid of iterator. do it manually

* trigger build

* Fix
2023-05-17 21:38:10 -07:00
Lev Kokotov
0898461c01 Allow to deploy pools without checking (#438) 2023-05-12 12:48:37 -07:00
Lev Kokotov
52b1b43850 Prewarmer (#435)
* Prewarmer

* hmm

* Tests

* default

* fix test

* Correct configuration

* Added minimal config example

* remove connect_timeout
2023-05-12 09:50:52 -07:00
Zain Kabani
0907f1b77f Improve logging for connection cleanup (#428)
* initial commit

* fix

* fmt
2023-05-11 17:40:10 -07:00
Zain Kabani
73260690b0 Fixes average stats bug (#436)
* Add test

* Fix test

* Add fix
2023-05-11 17:37:58 -07:00
Mostafa Abdelraouf
5056cbe8ed Fix docker-compose dev stack for Apple silicon (#432)
The docker-compose dev setup is broken under Apple silicon, starting the stack fails with the following error. Switching to a different docker image fixes the issue.
2023-05-10 10:24:35 -05:00
Lev Kokotov
571b02e178 Calculate averages correctly and preserve totals like before (#429)
* Reset totals after avg calculation

* like it used to be
2023-05-08 10:06:16 -07:00
Andrew Tanner
159eb89bf0 First try with role reset (#427)
* First try with role rest

* update

* extra line

* Update src/server.rs

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>

* Update tests/ruby/misc_spec.rb

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>

---------

Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>
2023-05-05 15:31:27 -07:00
Lev Kokotov
389993bf3e Accurate log messages (#425) 2023-05-05 08:27:19 -07:00
Lev Kokotov
ba5243b6dd Optionally validate config on boot (#423) 2023-05-03 17:07:23 -07:00
Lev Kokotov
128ef72911 lowercase config query (#422)
* lowercase config query

* remove debug
2023-05-03 16:47:20 -07:00
Lev Kokotov
811885f464 Actually plugins (#421)
* more plugins

* clean up

* fix tests

* fix flakey test
2023-05-03 16:13:45 -07:00
dependabot[bot]
d5e329fec5 chore(deps): bump regex from 1.8.0 to 1.8.1 (#413)
Bumps [regex](https://github.com/rust-lang/regex) from 1.8.0 to 1.8.1.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/commits/1.8.1)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-03 10:00:05 -07:00
Lev Kokotov
09e54e1175 Plugins! (#420)
* Some queries

* Plugins!!

* cleanup

* actual names

* the actual plugins

* comment

* fix tests

* Tests

* unused errors

* Increase reaper rate to actually enforce settings

* ok
2023-05-03 09:13:05 -07:00
dependabot[bot]
23819c8549 chore(deps): bump rustls from 0.21.0 to 0.21.1 (#419)
Bumps [rustls](https://github.com/rustls/rustls) from 0.21.0 to 0.21.1.
- [Release notes](https://github.com/rustls/rustls/releases)
- [Changelog](https://github.com/rustls/rustls/blob/main/RELEASE_NOTES.md)
- [Commits](https://github.com/rustls/rustls/compare/v/0.21.0...v/0.21.1)

---
updated-dependencies:
- dependency-name: rustls
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-02 07:32:44 -07:00
Jose Fernández
7dfbd993f2 Add dns_cache for server addresses as in pgbouncer (#249)
* Add dns_cache so server addresses are cached and invalidated when DNS changes.

Adds a module to deal with dns_cache feature. It's
main struct is CachedResolver, which is a simple thread safe
hostname <-> Ips cache with the ability to refresh resolutions
every `dns_max_ttl` seconds. This way, a client can check whether its
ip address has changed.

* Allow reloading dns cached

* Add documentation for dns_cached
2023-05-02 10:26:40 +02:00
Lev Kokotov
3601130ba1 Readme update (#418)
* Readme update

* m

* wording
2023-04-30 09:44:25 -07:00
Lev Kokotov
0d504032b2 Server TLS (#417)
* Server TLS

* Finish up TLS

* thats it

* diff

* remove dead code

* maybe?

* dirty shutdown

* skip flakey test

* remove unused error

* fetch config once
2023-04-30 09:41:46 -07:00
Lev Kokotov
4a87b4807d Add more pool settings (#416)
* Add some pool settings

* fmt
2023-04-26 16:33:26 -07:00
Shawn
cb5ff40a59 fix typo (#415)
chore: typo
2023-04-26 08:28:54 -07:00
dependabot[bot]
62b2d994c1 chore(deps): bump regex from 1.7.3 to 1.8.0 (#411)
Bumps [regex](https://github.com/rust-lang/regex) from 1.7.3 to 1.8.0.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/commits)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-21 06:33:52 -07:00
Lev Kokotov
66805d7e77 README updates (#409)
* Better table

* add image

* promote auth passthrough to stable

* fmt
2023-04-20 07:53:55 -07:00
Lev Kokotov
4ccc1e7fa3 Fix CONFIG (#408)
Fix readme
2023-04-19 07:45:26 -07:00
Lev Kokotov
3dae3d0777 Separate server and client passwords optionally (#407)
* Separate server and user passwords

* config
2023-04-18 09:57:17 -07:00
dependabot[bot]
a18eb42df5 chore(deps): bump serde from 1.0.159 to 1.0.160 (#404)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.159 to 1.0.160.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.159...v1.0.160)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-14 10:25:00 -07:00
dependabot[bot]
6aacf1fa19 chore(deps): bump serde_derive from 1.0.159 to 1.0.160 (#403)
Bumps [serde_derive](https://github.com/serde-rs/serde) from 1.0.159 to 1.0.160.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.159...v1.0.160)

---
updated-dependencies:
- dependency-name: serde_derive
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-14 10:24:52 -07:00
dependabot[bot]
8e99e65215 chore(deps): bump sqlparser from 0.32.0 to 0.33.0 (#399)
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.32.0 to 0.33.0.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.32.0...v0.33.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-14 10:24:44 -07:00
dependabot[bot]
5dfbc102a9 chore(deps): bump hyper from 0.14.25 to 0.14.26 (#406)
Bumps [hyper](https://github.com/hyperium/hyper) from 0.14.25 to 0.14.26.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/v0.14.26/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.14.25...v0.14.26)

---
updated-dependencies:
- dependency-name: hyper
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-14 10:23:52 -07:00
Cluas
bae12fca99 feat: set keepalive for pgcat server itself (#402)
* feat: set keepalive for pgcat server self

* docs: note also set for client
2023-04-12 09:29:43 -07:00
Lev Kokotov
421c5d4b64 Load config on client connect (#401) 2023-04-11 10:32:48 -07:00
Kian-Meng Ang
d568739db9 Fix typos (#398)
Found via `typos --format brief`
2023-04-10 18:37:16 -07:00
Lev Kokotov
692353c839 A couple things (#397)
* Format cleanup

* fmt

* finally
2023-04-10 14:51:01 -07:00
Lev Kokotov
a62f6b0eea Fix port; add user pool mode (#395)
* Fix port; add user pool mode

* will probably break our session/transaction mode tests
2023-04-05 15:06:19 -07:00
dependabot[bot]
89e15f09b5 chore(deps): bump tokio-rustls from 0.23.4 to 0.24.0 (#394)
Bumps [tokio-rustls](https://github.com/tokio-rs/tls) from 0.23.4 to 0.24.0.
- [Release notes](https://github.com/tokio-rs/tls/releases)
- [Commits](https://github.com/tokio-rs/tls/commits)

---
updated-dependencies:
- dependency-name: tokio-rustls
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-02 23:00:09 -07:00
Mostafa Abdelraouf
7ddd23b514 Protocol-level test helpers (#393)
I needed to have granular control over protocol message testing. For example, being able to send protocol messages one-by-one and then be able to inspect the results.

In order to do that, I created this low-level ruby client that can be used to send protocol messages in any order without blocking and also allows inspection of response messages.
2023-04-01 15:27:57 -05:00
dependabot[bot]
faa9c1f64a chore(deps): bump futures from 0.3.27 to 0.3.28 (#392)
Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.27 to 0.3.28.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.27...0.3.28)

---
updated-dependencies:
- dependency-name: futures
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-31 09:35:01 -07:00
dependabot[bot]
9094988491 chore(deps): bump postgres-protocol from 0.6.4 to 0.6.5 (#391)
Bumps [postgres-protocol](https://github.com/sfackler/rust-postgres) from 0.6.4 to 0.6.5.
- [Release notes](https://github.com/sfackler/rust-postgres/releases)
- [Commits](https://github.com/sfackler/rust-postgres/compare/postgres-protocol-v0.6.4...postgres-protocol-v0.6.5)

---
updated-dependencies:
- dependency-name: postgres-protocol
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-31 09:34:51 -07:00
Jose Fernández
6f768a84ce Auth passthrough (auth_query) (#266)
* Add a new exec_simple_query method

This adds a new `exec_simple_query` method so we can make 'out of band'
queries to servers that don't interfere with pools at all.
In order to reuse startup code for making these simple queries,
we need to set the stats (`Reporter`) optional, so using these
simple queries wont interfere with stats.

* Add auth passthough (auth_query)

Adds a feature that allows setting auth passthrough for md5 auth.

It adds 3 new (general and pool) config parameters:

- `auth_query`: An string containing a query that will be executed on boot
to obtain the hash of a given user. This query have to use a placeholder `$1`,
so pgcat can replace it with the user its trying to fetch the hash from.
- `auth_query_user`: The user to use for connecting to the server and executing the
auth_query.
- `auth_query_password`: The password to use for connecting to the server and executing the
auth_query.

The configuration can be done either on the general config (so pools share them) or in a per-pool basis.

The behavior is, at boot time, when validating server connections, a hash is fetched per server
and stored in the pool. When new server connections are created, and no cleartext password is specified,
the obtained hash is used for creating them, if the hash could not be obtained for whatever reason, it retries
it.

When client authentication is tried, it uses cleartext passwords if specified, it not, it checks whether
we have query_auth set up, if so, it tries to use the obtained hash for making client auth. If there is no
hash (we could not obtain one when validating the connection), a new fetch is tried.

Once we have a hash, we authenticate using it against whathever the client has sent us, if there is a failure
we refetch the hash and retry auth (so password changes can be done).

The idea with this 'retrial' mechanism is to make it fault tolerant, so if for whatever reason hash could not be
obtained during connection validation, or the password has change, we can still connect later.

* Add documentation for Auth passthrough
2023-03-30 13:29:23 -07:00
dependabot[bot]
0757d7f3a0 chore(deps): bump serde from 1.0.158 to 1.0.159 (#386)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.158 to 1.0.159.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.158...v1.0.159)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-28 09:54:39 -07:00
dependabot[bot]
568f04feee chore(deps): bump serde_derive from 1.0.154 to 1.0.159 (#387)
Bumps [serde_derive](https://github.com/serde-rs/serde) from 1.0.154 to 1.0.159.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.154...v1.0.159)

---
updated-dependencies:
- dependency-name: serde_derive
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-28 09:54:31 -07:00
Jose Fernández
58ce76d9b9 Refactor stats to use atomics (#375)
* Refactor stats to use atomics

When we are dealing with a high number of connections, generated
stats cannot be consumed fast enough by the stats collector loop.
This makes the stats subsystem inconsistent and a log of
warning messages are thrown due to unregistered server/clients.

This change refactors the stats subsystem so it uses atomics:

- Now counters are handled using U64 atomics
- Event system is dropped and averages are calculated using a loop
  every 15 seconds.
- Now, instead of snapshots being generated ever second we keep track of servers/clients
  that have registered. Each pool/server/client has its own instance of the counter and
  makes changes directly, instead of adding an event that gets processed later.

* Manually mplement Hash/Eq in `config::Address` ignoring stats

* Add tests for client connection counters

* Allow connecting to dockerized dev pgcat from the host

* stats: Decrease cl_idle when idle socket disconnects
2023-03-28 17:19:37 +02:00
dependabot[bot]
9a2076a9eb chore(deps): bump futures from 0.3.26 to 0.3.27 (#356)
Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.26 to 0.3.27.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.26...0.3.27)

---
updated-dependencies:
- dependency-name: futures
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-27 09:13:45 -07:00
dependabot[bot]
e7e7118725 chore(deps): bump hyper from 0.14.24 to 0.14.25 (#358)
Bumps [hyper](https://github.com/hyperium/hyper) from 0.14.24 to 0.14.25.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/v0.14.25/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.14.24...v0.14.25)

---
updated-dependencies:
- dependency-name: hyper
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-27 09:13:36 -07:00
dependabot[bot]
99f790cacf chore(deps): bump toml from 0.7.2 to 0.7.3 (#360)
Bumps [toml](https://github.com/toml-rs/toml) from 0.7.2 to 0.7.3.
- [Release notes](https://github.com/toml-rs/toml/releases)
- [Commits](https://github.com/toml-rs/toml/compare/toml-v0.7.2...toml-v0.7.3)

---
updated-dependencies:
- dependency-name: toml
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-27 09:13:25 -07:00
dependabot[bot]
434b0bb69e chore(deps): bump serde from 1.0.154 to 1.0.158 (#376)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.154 to 1.0.158.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.154...v1.0.158)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-27 09:13:13 -07:00
dependabot[bot]
714e043ef0 chore(deps): bump async-trait from 0.1.66 to 0.1.68 (#382)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.66 to 0.1.68.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.66...0.1.68)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-27 09:12:54 -07:00
dependabot[bot]
863104aadd chore(deps): bump regex from 1.7.1 to 1.7.3 (#385)
Bumps [regex](https://github.com/rust-lang/regex) from 1.7.1 to 1.7.3.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.7.1...1.7.3)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-27 09:12:43 -07:00
Lev Kokotov
7dd96141e3 Update README.md 2023-03-26 00:33:05 -07:00
Lev Kokotov
0d5feac4b2 Contributors (#384) 2023-03-24 17:12:12 -07:00
Lev Kokotov
90aba9c011 V1 (#383) 2023-03-24 17:10:12 -07:00
Montana Low
0f34b49503 point CI at updated repo 2023-03-24 12:59:03 -07:00
Zain Kabani
ca4431b67e Add idle client in transaction configuration (#380)
* Add idle client in transaction configuration

* fmt

* Update docs

* trigger build

* Add tests

* Make the config dynamic from reloads

* fmt

* comments

* trigger build

* fix config.md

* remove error
2023-03-24 08:20:30 -07:00
118 changed files with 17179 additions and 3870 deletions

View File

@@ -9,7 +9,7 @@ jobs:
# Specify the execution environment. You can specify an image from Dockerhub or use one of our Convenience Images from CircleCI's Developer Hub.
# See: https://circleci.com/docs/2.0/configuration-reference/#docker-machine-macos-windows-executor
docker:
- image: ghcr.io/levkk/pgcat-ci:1.67
- image: ghcr.io/postgresml/pgcat-ci:latest
environment:
RUST_LOG: info
LLVM_PROFILE_FILE: /tmp/pgcat-%m-%p.profraw
@@ -46,6 +46,14 @@ jobs:
POSTGRES_PASSWORD: postgres
POSTGRES_INITDB_ARGS: --auth-local=scram-sha-256 --auth-host=scram-sha-256 --auth=scram-sha-256
- image: postgres:14
command: ["postgres", "-p", "10432", "-c", "shared_preload_libraries=pg_stat_statements"]
environment:
POSTGRES_USER: postgres
POSTGRES_DB: postgres
POSTGRES_PASSWORD: postgres
POSTGRES_INITDB_ARGS: --auth-local=md5 --auth-host=md5 --auth=md5
# Add steps to the job
# See: https://circleci.com/docs/2.0/configuration-reference/#steps
steps:
@@ -55,6 +63,9 @@ jobs:
- run:
name: "Lint"
command: "cargo fmt --check"
- run:
name: "Clippy"
command: "cargo clippy --all --all-targets -- -Dwarnings"
- run:
name: "Tests"
command: "cargo clean && cargo build && cargo test && bash .circleci/run_tests.sh && .circleci/generate_coverage.sh"

View File

@@ -39,7 +39,7 @@ log_client_connections = false
log_client_disconnections = false
# Reload config automatically if it changes.
autoreload = true
autoreload = 15000
# TLS
tls_certificate = ".circleci/server.cert"
@@ -59,6 +59,7 @@ admin_password = "admin_pass"
# session: one server connection per connected client
# transaction: one server connection per client transaction
pool_mode = "transaction"
prepared_statements_cache_size = 500
# If the client doesn't specify, route traffic to
# this role by default.
@@ -74,6 +75,10 @@ default_role = "any"
# we'll direct it to the primary.
query_parser_enabled = true
# If the query parser is enabled and this setting is enabled, we'll attempt to
# infer the role from the query itself.
query_parser_read_write_splitting = true
# If the query parser is enabled and this setting is enabled, the primary will be part of the pool of databases used for
# load balancing of read queries. Otherwise, the primary will only be used for write
# queries. The primary can always be explicitely selected with our custom protocol.
@@ -134,8 +139,10 @@ database = "shard2"
pool_mode = "session"
default_role = "primary"
query_parser_enabled = true
query_parser_read_write_splitting = true
primary_reads_enabled = true
sharding_function = "pg_bigint_hash"
prepared_statements_cache_size = 500
[pools.simple_db.users.0]
username = "simple_user"

View File

@@ -19,12 +19,14 @@ PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 5432 -U postgres -f tests/sharding/q
PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 7432 -U postgres -f tests/sharding/query_routing_setup.sql
PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 8432 -U postgres -f tests/sharding/query_routing_setup.sql
PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 9432 -U postgres -f tests/sharding/query_routing_setup.sql
PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 10432 -U postgres -f tests/sharding/query_routing_setup.sql
PGPASSWORD=sharding_user pgbench -h 127.0.0.1 -U sharding_user shard0 -i
PGPASSWORD=sharding_user pgbench -h 127.0.0.1 -U sharding_user shard1 -i
PGPASSWORD=sharding_user pgbench -h 127.0.0.1 -U sharding_user shard2 -i
# Start Toxiproxy
kill -9 $(pgrep toxiproxy) || true
LOG_LEVEL=error toxiproxy-server &
sleep 1
@@ -105,10 +107,26 @@ cd ../..
# These tests will start and stop the pgcat server so it will need to be restarted after the tests
#
pip3 install -r tests/python/requirements.txt
python3 tests/python/tests.py || exit 1
pytest || exit 1
#
# Go tests
# Starts its own pgcat server
#
pushd tests/go
/usr/local/go/bin/go test || exit 1
popd
start_pgcat "info"
#
# Rust tests
#
cd tests/rust
cargo run
cd ../../
# Admin tests
export PGPASSWORD=admin_pass
psql -U admin_user -e -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW STATS' > /dev/null
@@ -160,3 +178,6 @@ killall pgcat -s SIGINT
# Allow for graceful shutdown
sleep 1
kill -9 $(pgrep toxiproxy)
sleep 1

14
.editorconfig Normal file
View File

@@ -0,0 +1,14 @@
root = true
[*]
trim_trailing_whitespace = true
insert_final_newline = true
[*.rs]
indent_style = space
indent_size = 4
max_line_length = 120
[*.toml]
indent_style = space
indent_size = 2

View File

@@ -10,3 +10,7 @@ updates:
commit-message:
prefix: "chore(deps)"
open-pull-requests-limit: 10
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"

View File

@@ -1,6 +1,13 @@
name: Build and Push
on: push
on:
push:
paths:
- '!charts/**.md'
branches:
- main
tags:
- v*
env:
registry: ghcr.io
@@ -16,33 +23,40 @@ jobs:
steps:
- name: Checkout Repository
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
uses: docker/setup-buildx-action@v3
- name: Determine tags
id: metadata
uses: docker/metadata-action@v4
uses: docker/metadata-action@v5
with:
images: ${{ env.registry }}/${{ env.image-name }}
tags: |
type=sha,prefix=,format=long
type=schedule
type=ref,event=tag
type=ref,event=branch
type=ref,event=pr
type=raw,value=latest,enable={{ is_default_branch }}
- name: Log in to the Container registry
uses: docker/login-action@v2.1.0
uses: docker/login-action@v3
with:
registry: ${{ env.registry }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push ${{ env.image-name }}
uses: docker/build-push-action@v3
uses: docker/build-push-action@v6
with:
context: .
platforms: linux/amd64,linux/arm64
provenance: false
push: true
tags: ${{ steps.metadata.outputs.tags }}
labels: ${{ steps.metadata.outputs.labels }}

50
.github/workflows/chart-lint-test.yaml vendored Normal file
View File

@@ -0,0 +1,50 @@
name: Lint and Test Charts
on:
pull_request:
paths:
- charts/**
- '!charts/**.md'
jobs:
lint-test:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3.1.0
with:
fetch-depth: 0
- name: Set up Helm
uses: azure/setup-helm@v3
with:
version: v3.8.1
# Python is required because `ct lint` runs Yamale (https://github.com/23andMe/Yamale) and
# yamllint (https://github.com/adrienverge/yamllint) which require Python
- name: Set up Python
uses: actions/setup-python@v5.1.0
with:
python-version: 3.7
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.2.1
with:
version: v3.5.1
- name: Run chart-testing (list-changed)
id: list-changed
run: |
changed=$(ct list-changed --config ct.yaml)
if [[ -n "$changed" ]]; then
echo "changed=true" >> $GITHUB_OUTPUT
fi
- name: Run chart-testing (lint)
run: ct lint --config ct.yaml
- name: Create kind cluster
uses: helm/kind-action@v1.10.0
if: steps.list-changed.outputs.changed == 'true'
- name: Run chart-testing (install)
run: ct install --config ct.yaml

40
.github/workflows/chart-release.yaml vendored Normal file
View File

@@ -0,0 +1,40 @@
name: Release Charts
on:
push:
paths:
- charts/**
- '!**.md'
branches:
- main
jobs:
release:
runs-on: ubuntu-latest
permissions:
contents: write
steps:
- name: Checkout
uses: actions/checkout@8ade135a41bc03ea155e62e844d188df1ea18608 # v4.1.0
with:
fetch-depth: 0
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Install Helm
uses: azure/setup-helm@5119fcb9089d432beecbf79bb2c7915207344b78 # v3.5
with:
version: v3.13.0
- name: Run chart-releaser
uses: helm/chart-releaser-action@a917fd15b20e8b64b94d9158ad54cd6345335584 # v1.6.0
with:
charts_dir: charts
config: cr.yaml
env:
CR_TOKEN: "${{ secrets.GITHUB_TOKEN }}"

View File

@@ -0,0 +1,48 @@
name: '[CI/CD] Update README metadata'
on:
pull_request_target:
branches:
- main
paths:
- 'charts/*/values.yaml'
# Remove all permissions by default
permissions: {}
jobs:
update-readme-metadata:
runs-on: ubuntu-latest
permissions:
contents: write
steps:
- name: Install readme-generator-for-helm
run: npm install -g @bitnami/readme-generator-for-helm
- name: Checkout
uses: actions/checkout@8ade135a41bc03ea155e62e844d188df1ea18608
with:
path: charts
ref: ${{github.event.pull_request.head.ref}}
repository: ${{github.event.pull_request.head.repo.full_name}}
token: ${{ secrets.GITHUB_TOKEN }}
- name: Execute readme-generator-for-helm
env:
DIFF_URL: "${{github.event.pull_request.diff_url}}"
TEMP_FILE: "${{runner.temp}}/pr-${{github.event.number}}.diff"
run: |
# This request doesn't consume API calls.
curl -Lkso $TEMP_FILE $DIFF_URL
files_changed="$(sed -nr 's/[\-\+]{3} [ab]\/(.*)/\1/p' $TEMP_FILE | sort | uniq)"
# Adding || true to avoid "Process exited with code 1" errors
charts_dirs_changed="$(echo "$files_changed" | xargs dirname | grep -o "pgcat/[^/]*" | sort | uniq || true)"
for chart in ${charts_dirs_changed}; do
echo "Updating README.md for ${chart}"
readme-generator --values "charts/${chart}/values.yaml" --readme "charts/${chart}/README.md" --schema "/tmp/schema.json"
done
- name: Push changes
run: |
# Push all the changes
cd charts
if git status -s | grep pgcat; then
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
git add . && git commit -am "Update README.md with readme-generator-for-helm" --signoff && git push
fi

View File

@@ -15,6 +15,6 @@ jobs:
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build CI Docker image
run: |
docker build . -f Dockerfile.ci --tag ghcr.io/levkk/pgcat-ci:latest
docker run ghcr.io/levkk/pgcat-ci:latest
docker push ghcr.io/levkk/pgcat-ci:latest
docker build . -f Dockerfile.ci --tag ghcr.io/postgresml/pgcat-ci:latest
docker run ghcr.io/postgresml/pgcat-ci:latest
docker push ghcr.io/postgresml/pgcat-ci:latest

View File

@@ -0,0 +1,59 @@
name: pgcat package (deb)
on:
push:
tags:
- v*
workflow_dispatch:
inputs:
packageVersion:
default: "1.1.2-dev1"
jobs:
build:
strategy:
max-parallel: 1
fail-fast: false # Let the other job finish, or they can lock each other out
matrix:
os: ["buildjet-4vcpu-ubuntu-2204", "buildjet-4vcpu-ubuntu-2204-arm"]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v3
- name: Set package version
if: github.event_name == 'push' # For push event
run: |
TAG=${{ github.ref_name }}
echo "packageVersion=${TAG#v}" >> "$GITHUB_ENV"
- name: Set package version (manual dispatch)
if: github.event_name == 'workflow_dispatch' # For manual dispatch
run: echo "packageVersion=${{ github.event.inputs.packageVersion }}" >> "$GITHUB_ENV"
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
- name: Install dependencies
env:
DEBIAN_FRONTEND: noninteractive
TZ: Etc/UTC
run: |
curl -sLO https://github.com/deb-s3/deb-s3/releases/download/0.11.4/deb-s3-0.11.4.gem
sudo gem install deb-s3-0.11.4.gem
dpkg-deb --version
- name: Build and release package
env:
AWS_ACCESS_KEY_ID: ${{ vars.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: ${{ vars.AWS_DEFAULT_REGION }}
run: |
if [[ $(arch) == "x86_64" ]]; then
export ARCH=amd64
else
export ARCH=arm64
fi
bash utilities/deb.sh ${{ env.packageVersion }}
deb-s3 upload \
--lock \
--bucket apt.postgresml.org \
pgcat-${{ env.packageVersion }}-ubuntu22.04-${ARCH}.deb \
--codename $(lsb_release -cs)

3
.gitignore vendored
View File

@@ -10,3 +10,6 @@ lcov.info
dev/.bash_history
dev/cache
!dev/cache/.keepme
.venv
**/__pycache__
.bundle

2
.rustfmt.toml Normal file
View File

@@ -0,0 +1,2 @@
edition = "2021"
hard_tabs = false

224
CONFIG.md
View File

@@ -1,4 +1,4 @@
# PgCat Configurations
# PgCat Configurations
## `general` Section
### host
@@ -36,10 +36,11 @@ Port at which prometheus exporter listens on.
### connect_timeout
```
path: general.connect_timeout
default: 5000 # milliseconds
default: 1000 # milliseconds
```
How long to wait before aborting a server connection (ms).
How long the client waits to obtain a server connection before aborting (ms).
This is similar to PgBouncer's `query_wait_timeout`.
### idle_timeout
```
@@ -49,6 +50,54 @@ default: 30000 # milliseconds
How long an idle connection with a server is left open (ms).
### server_lifetime
```
path: general.server_lifetime
default: 86400000 # 24 hours
```
Max connection lifetime before it's closed, even if actively used.
### server_round_robin
```
path: general.server_round_robin
default: false
```
Whether to use round robin for server selection or not.
### server_tls
```
path: general.server_tls
default: false
```
Whether to use TLS for server connections or not.
### verify_server_certificate
```
path: general.verify_server_certificate
default: false
```
Whether to verify server certificate or not.
### verify_config
```
path: general.verify_config
default: true
```
Whether to verify config or not.
### idle_client_in_transaction_timeout
```
path: general.idle_client_in_transaction_timeout
default: 0 # milliseconds
```
How long a client is allowed to be idle while in a transaction (ms).
### healthcheck_timeout
```
path: general.healthcheck_timeout
@@ -100,10 +149,10 @@ If we should log client disconnections
### autoreload
```
path: general.autoreload
default: false
default: 15000 # milliseconds
```
When set to true, PgCat reloads configs if it detects a change in the config file.
When set, PgCat automatically reloads its configurations at the specified interval (in milliseconds) if it detects changes in the configuration file. The default interval is 15000 milliseconds or 15 seconds.
### worker_threads
```
@@ -135,7 +184,13 @@ path: general.tcp_keepalives_interval
default: 5
```
Number of seconds between keepalive packets.
### tcp_user_timeout
```
path: general.tcp_user_timeout
default: 10000
```
A linux-only parameters that defines the amount of time in milliseconds that transmitted data may remain unacknowledged or buffered data may remain untransmitted (due to zero window size) before TCP will forcibly disconnect
### tls_certificate
```
@@ -144,7 +199,7 @@ default: <UNSET>
example: "server.cert"
```
Path to TLS Certficate file to use for TLS connections
Path to TLS Certificate file to use for TLS connections
### tls_private_key
```
@@ -172,6 +227,55 @@ default: "admin_pass"
Password to access the virtual administrative database
### auth_query
```
path: general.auth_query
default: <UNSET>
example: "SELECT $1"
```
Query to be sent to servers to obtain the hash used for md5 authentication. The connection will be
established using the database configured in the pool. This parameter is inherited by every pool
and can be redefined in pool configuration.
### auth_query_user
```
path: general.auth_query_user
default: <UNSET>
example: "sharding_user"
```
User to be used for connecting to servers to obtain the hash used for md5 authentication by sending the query
specified in `auth_query_user`. The connection will be established using the database configured in the pool.
This parameter is inherited by every pool and can be redefined in pool configuration.
### auth_query_password
```
path: general.auth_query_password
default: <UNSET>
example: "sharding_user"
```
Password to be used for connecting to servers to obtain the hash used for md5 authentication by sending the query
specified in `auth_query_user`. The connection will be established using the database configured in the pool.
This parameter is inherited by every pool and can be redefined in pool configuration.
### dns_cache_enabled
```
path: general.dns_cache_enabled
default: false
```
When enabled, ip resolutions for server connections specified using hostnames will be cached
and checked for changes every `dns_max_ttl` seconds. If a change in the host resolution is found
old ip connections are closed (gracefully) and new connections will start using new ip.
### dns_max_ttl
```
path: general.dns_max_ttl
default: 30
```
Specifies how often (in seconds) cached ip addresses for servers are rechecked (see `dns_cache_enabled`).
## `pools.<pool_name>` Section
### pool_mode
@@ -192,7 +296,7 @@ default: "random"
Load balancing mode
`random` selects the server at random
`loc` selects the server with the least outstanding busy conncetions
`loc` selects the server with the least outstanding busy connections
### default_role
```
@@ -205,7 +309,25 @@ If the client doesn't specify, PgCat routes traffic to this role by default.
`replica` round-robin between replicas only without touching the primary,
`primary` all queries go to the primary unless otherwise specified.
### query_parser_enabled (experimental)
### replica_to_primary_failover_enabled
```
path: pools.<pool_name>.replica_to_primary_failover_enabled
default: "false"
```
If set to true, when the specified role is `replica` (either by setting `default_role` or manually)
and all replicas are banned, queries will be sent to the primary (until a replica is back online).
### prepared_statements_cache_size
```
path: general.prepared_statements_cache_size
default: 0
```
Size of the prepared statements cache. 0 means disabled.
TODO: update documentation
### query_parser_enabled
```
path: pools.<pool_name>.query_parser_enabled
default: true
@@ -226,7 +348,7 @@ If the query parser is enabled and this setting is enabled, the primary will be
load balancing of read queries. Otherwise, the primary will only be used for write
queries. The primary can always be explicitly selected with our custom protocol.
### sharding_key_regex (experimental)
### sharding_key_regex
```
path: pools.<pool_name>.sharding_key_regex
default: <UNSET>
@@ -248,7 +370,40 @@ Current options:
`pg_bigint_hash`: PARTITION BY HASH (Postgres hashing function)
`sha1`: A hashing function based on SHA1
### automatic_sharding_key (experimental)
### auth_query
```
path: pools.<pool_name>.auth_query
default: <UNSET>
example: "SELECT $1"
```
Query to be sent to servers to obtain the hash used for md5 authentication. The connection will be
established using the database configured in the pool. This parameter is inherited by every pool
and can be redefined in pool configuration.
### auth_query_user
```
path: pools.<pool_name>.auth_query_user
default: <UNSET>
example: "sharding_user"
```
User to be used for connecting to servers to obtain the hash used for md5 authentication by sending the query
specified in `auth_query_user`. The connection will be established using the database configured in the pool.
This parameter is inherited by every pool and can be redefined in pool configuration.
### auth_query_password
```
path: pools.<pool_name>.auth_query_password
default: <UNSET>
example: "sharding_user"
```
Password to be used for connecting to servers to obtain the hash used for md5 authentication by sending the query
specified in `auth_query_user`. The connection will be established using the database configured in the pool.
This parameter is inherited by every pool and can be redefined in pool configuration.
### automatic_sharding_key
```
path: pools.<pool_name>.automatic_sharding_key
default: <UNSET>
@@ -281,7 +436,8 @@ path: pools.<pool_name>.users.<user_index>.username
default: "sharding_user"
```
Postgresql username
PostgreSQL username used to authenticate the user and connect to the server
if `server_username` is not set.
### password
```
@@ -289,7 +445,26 @@ path: pools.<pool_name>.users.<user_index>.password
default: "sharding_user"
```
Postgresql password
PostgreSQL password used to authenticate the user and connect to the server
if `server_password` is not set.
### server_username
```
path: pools.<pool_name>.users.<user_index>.server_username
default: <UNSET>
example: "another_user"
```
PostgreSQL username used to connect to the server.
### server_password
```
path: pools.<pool_name>.users.<user_index>.server_password
default: <UNSET>
example: "another_password"
```
PostgreSQL password used to connect to the server.
### pool_size
```
@@ -297,10 +472,18 @@ path: pools.<pool_name>.users.<user_index>.pool_size
default: 9
```
Maximum number of server connections that can be established for this user
Maximum number of server connections that can be established for this user.
The maximum number of connection from a single Pgcat process to any database in the cluster
is the sum of pool_size across all users.
### min_pool_size
```
path: pools.<pool_name>.users.<user_index>.min_pool_size
default: 0
```
Minimum number of idle server connections to retain for this pool.
### statement_timeout
```
path: pools.<pool_name>.users.<user_index>.statement_timeout
@@ -310,6 +493,16 @@ default: 0
Maximum query duration. Dangerous, but protects against DBs that died in a non-obvious way.
0 means it is disabled.
### connect_timeout
```
path: pools.<pool_name>.users.<user_index>.connect_timeout
default: <UNSET> # milliseconds
```
How long the client waits to obtain a server connection before aborting (ms).
This is similar to PgBouncer's `query_wait_timeout`.
If unset, uses the `connect_timeout` defined globally.
## `pools.<pool_name>.shards.<shard_index>` Section
### servers
@@ -320,7 +513,7 @@ default: [["127.0.0.1", 5432, "primary"], ["localhost", 5432, "replica"]]
Array of servers in the shard, each server entry is an array of `[host, port, role]`
### mirrors (experimental)
### mirrors
```
path: pools.<pool_name>.shards.<shard_index>.mirrors
default: <UNSET>
@@ -337,4 +530,3 @@ default: "shard0"
```
Database name (e.g. "postgres")

View File

@@ -2,10 +2,36 @@
Thank you for contributing! Just a few tips here:
1. `cargo fmt` your code before opening up a PR
1. `cargo fmt` and `cargo clippy` your code before opening up a PR
2. Run the test suite (e.g. `pgbench`) to make sure everything still works. The tests are in `.circleci/run_tests.sh`.
3. Performance is important, make sure there are no regressions in your branch vs. `main`.
## How to run the integration tests locally and iterate on them
We have integration tests written in Ruby, Python, Go and Rust.
Below are the steps to run them in a developer-friendly way that allows iterating and quick turnaround.
Hear me out, this should be easy, it will involve opening a shell into a container with all the necessary dependancies available for you and you can modify the test code and immediately rerun your test in the interactive shell.
Quite simply, make sure you have docker installed and then run
`./start_test_env.sh`
That is it!
Within this test environment you can modify the file in your favorite IDE and rerun the tests without having to bootstrap the entire environment again.
Once the environment is ready, you can run the tests by running
Ruby: `cd /app/tests/ruby && bundle exec ruby <test_name>.rb --format documentation`
Python: `cd /app/ && pytest`
Rust: `cd /app/tests/rust && cargo run`
Go: `cd /app/tests/go && /usr/local/go/bin/go test`
You can also rebuild PgCat directly within the environment and the tests will run against the newly built binary
To rebuild PgCat, just run `cargo build` within the container under `/app`
![Animated gif showing how to run tests](https://github.com/user-attachments/assets/2258fde3-2aed-4efb-bdc5-e4f12dcd4d33)
Happy hacking!
## TODOs

1389
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,42 +1,60 @@
[package]
name = "pgcat"
version = "0.6.0-alpha1"
version = "1.2.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
tokio = { version = "1", features = ["full"] }
bytes = "1"
md-5 = "0.10"
bb8 = "0.8.0"
bb8 = "=0.8.6"
async-trait = "0.1"
rand = "0.8"
chrono = "0.4"
sha-1 = "0.10"
toml = "0.7"
serde = "1"
serde = { version = "1", features = ["derive"] }
serde_derive = "1"
regex = "1"
num_cpus = "1"
once_cell = "1"
sqlparser = "0.32.0"
sqlparser = { version = "0.41", features = ["visitor"] }
log = "0.4"
arc-swap = "1"
env_logger = "0.10"
parking_lot = "0.12.1"
hmac = "0.12"
sha2 = "0.10"
base64 = "0.21"
stringprep = "0.1"
tokio-rustls = "0.23"
tokio-rustls = "0.24"
rustls-pemfile = "1"
hyper = { version = "0.14", features = ["full"] }
http-body-util = "0.1.2"
hyper = { version = "1.4.1", features = ["full"] }
hyper-util = { version = "0.1.7", features = ["tokio"] }
phf = { version = "0.11.1", features = ["macros"] }
exitcode = "1.1.2"
futures = "0.3"
socket2 = { version = "0.4.7", features = ["all"] }
nix = "0.26.2"
atomic_enum = "0.2.0"
postgres-protocol = "0.6.5"
fallible-iterator = "0.2"
pin-project = "1"
webpki-roots = "0.23"
rustls = { version = "0.21", features = ["dangerous_configuration"] }
trust-dns-resolver = "0.22.0"
tokio-test = "0.4.2"
serde_json = "1"
itertools = "0.10"
clap = { version = "4.3.1", features = ["derive", "env"] }
tracing = "0.1.37"
tracing-subscriber = { version = "0.3.17", features = [
"json",
"env-filter",
"std",
] }
lru = "0.12.0"
[target.'cfg(not(target_env = "msvc"))'.dependencies]
jemallocator = "0.5.0"

View File

@@ -1,11 +1,22 @@
FROM rust:1 AS builder
FROM rust:1.79.0-slim-bookworm AS builder
RUN apt-get update && \
apt-get install -y build-essential
COPY . /app
WORKDIR /app
RUN cargo build --release
FROM debian:bullseye-slim
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -o Dpkg::Options::=--force-confdef -yq --no-install-recommends \
postgresql-client \
# Clean up layer
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
&& truncate -s 0 /var/log/*log
COPY --from=builder /app/target/release/pgcat /usr/bin/pgcat
COPY --from=builder /app/pgcat.toml /etc/pgcat/pgcat.toml
WORKDIR /etc/pgcat
ENV RUST_LOG=info
CMD ["pgcat"]
STOPSIGNAL SIGINT

View File

@@ -1,4 +1,6 @@
FROM cimg/rust:1.67.1
FROM cimg/rust:1.79.0
COPY --from=sclevine/yj /bin/yj /bin/yj
RUN /bin/yj -h
RUN sudo apt-get update && \
sudo apt-get install -y \
psmisc postgresql-contrib-14 postgresql-client-14 libpq-dev \
@@ -7,6 +9,9 @@ RUN sudo apt-get update && \
sudo apt-get upgrade curl && \
cargo install cargo-binutils rustfilt && \
rustup component add llvm-tools-preview && \
pip3 install psycopg2 && sudo gem install bundler && \
pip3 install psycopg2 && sudo gem install bundler && \
wget -O /tmp/toxiproxy-2.4.0.deb https://github.com/Shopify/toxiproxy/releases/download/v2.4.0/toxiproxy_2.4.0_linux_$(dpkg --print-architecture).deb && \
sudo dpkg -i /tmp/toxiproxy-2.4.0.deb
RUN wget -O /tmp/go1.21.3.linux-$(dpkg --print-architecture).tar.gz https://go.dev/dl/go1.21.3.linux-$(dpkg --print-architecture).tar.gz && \
sudo tar -C /usr/local -xzf /tmp/go1.21.3.linux-$(dpkg --print-architecture).tar.gz && \
rm /tmp/go1.21.3.linux-$(dpkg --print-architecture).tar.gz

25
Dockerfile.dev Normal file
View File

@@ -0,0 +1,25 @@
FROM lukemathwalker/cargo-chef:latest-rust-1 AS chef
RUN apt-get update && \
apt-get install -y build-essential
WORKDIR /app
FROM chef AS planner
COPY . .
RUN cargo chef prepare --recipe-path recipe.json
FROM chef AS builder
COPY --from=planner /app/recipe.json recipe.json
# Build dependencies - this is the caching Docker layer!
RUN cargo chef cook --release --recipe-path recipe.json
# Build application
COPY . .
RUN cargo build
FROM debian:bookworm-slim
COPY --from=builder /app/target/release/pgcat /usr/bin/pgcat
COPY --from=builder /app/pgcat.toml /etc/pgcat/pgcat.toml
WORKDIR /etc/pgcat
ENV RUST_LOG=info
CMD ["pgcat"]

View File

@@ -1,4 +1,4 @@
Copyright (c) 2022 Lev Kokotov <lev@levthe.dev>
Copyright (c) 2023 PgCat Contributors
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the

433
README.md
View File

@@ -1,33 +1,80 @@
##### PgCat: PostgreSQL at petabyte scale
## PgCat: Nextgen PostgreSQL Pooler
[![CircleCI](https://circleci.com/gh/levkk/pgcat/tree/main.svg?style=svg)](https://circleci.com/gh/levkk/pgcat/tree/main)
[![CircleCI](https://circleci.com/gh/postgresml/pgcat/tree/main.svg?style=svg)](https://circleci.com/gh/postgresml/pgcat/tree/main)
<a href="https://discord.gg/DmyJP3qJ7U" target="_blank">
<img src="https://img.shields.io/discord/1013868243036930099" alt="Join our Discord!" />
</a>
PostgreSQL pooler (like PgBouncer) with sharding, load balancing and failover support.
**Beta**: looking for beta testers, see [#35](https://github.com/levkk/pgcat/issues/35).
PostgreSQL pooler and proxy (like PgBouncer) with support for sharding, load balancing, failover and mirroring.
## Features
| **Feature** | **Status** | **Comments** |
|--------------------------------|-----------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
| Transaction pooling | :white_check_mark: | Identical to PgBouncer. |
| Session pooling | :white_check_mark: | Identical to PgBouncer. |
| `COPY` support | :white_check_mark: | Both `COPY TO` and `COPY FROM` are supported. |
| Query cancellation | :white_check_mark: | Supported both in transaction and session pooling modes. |
| Load balancing of read queries | :white_check_mark: | Using random between replicas. Primary is included when `primary_reads_enabled` is enabled (default). |
| Sharding | :white_check_mark: | Transactions are sharded using `SET SHARD TO` and `SET SHARDING KEY TO` syntax extensions; see examples below. |
| Failover | :white_check_mark: | Replicas are tested with a health check. If a health check fails, remaining replicas are attempted; see below for algorithm description and examples. |
| Statistics | :white_check_mark: | Statistics available in the admin database (`pgcat` and `pgbouncer`) with `SHOW STATS`, `SHOW POOLS` and others. |
| Live configuration reloading | :white_check_mark: | Reload supported settings with a `SIGHUP` to the process, e.g. `kill -s SIGHUP $(pgrep pgcat)` or `RELOAD` query issued to the admin database. |
| Client authentication | :white_check_mark: :wrench: | MD5 password authentication is supported, SCRAM is on the roadmap; one user is used to connect to Postgres with both SCRAM and MD5 supported. |
| Admin database | :white_check_mark: | The admin database, similar to PgBouncer's, allows to query for statistics and reload the configuration. |
| **Feature** | **Status** | **Comments** |
|-------------|------------|--------------|
| Transaction pooling | **Stable** | Identical to PgBouncer with notable improvements for handling bad clients and abandoned transactions. |
| Session pooling | **Stable** | Identical to PgBouncer. |
| Multi-threaded runtime | **Stable** | Using Tokio asynchronous runtime, the pooler takes advantage of multicore machines. |
| Load balancing of read queries | **Stable** | Queries are automatically load balanced between replicas and the primary. |
| Failover | **Stable** | Queries are automatically rerouted around broken replicas, validated by regular health checks. |
| Admin database statistics | **Stable** | Pooler statistics and administration via the `pgbouncer` and `pgcat` databases. |
| Prometheus statistics | **Stable** | Statistics are reported via a HTTP endpoint for Prometheus. |
| SSL/TLS | **Stable** | Clients can connect to the pooler using TLS. Pooler can connect to Postgres servers using TLS. |
| Client/Server authentication | **Stable** | Clients can connect using MD5 authentication, supported by `libpq` and all Postgres client drivers. PgCat can connect to Postgres using MD5 and SCRAM-SHA-256. |
| Live configuration reloading | **Stable** | Identical to PgBouncer; all settings can be reloaded dynamically (except `host` and `port`). |
| Auth passthrough | **Stable** | MD5 password authentication can be configured to use an `auth_query` so no cleartext passwords are needed in the config file.|
| Sharding using extended SQL syntax | **Experimental** | Clients can dynamically configure the pooler to route queries to specific shards. |
| Sharding using comments parsing/Regex | **Experimental** | Clients can include shard information (sharding key, shard ID) in the query comments. |
| Automatic sharding | **Experimental** | PgCat can parse queries, detect sharding keys automatically, and route queries to the correct shard. |
| Mirroring | **Experimental** | Mirror queries between multiple databases in order to test servers with realistic production traffic. |
## Status
PgCat is stable and used in production to serve hundreds of thousands of queries per second.
<table>
<tr>
<td>
<a href="https://tech.instacart.com/adopting-pgcat-a-nextgen-postgres-proxy-3cf284e68c2f">
<img src="./images/instacart.webp" height="70" width="auto">
</a>
</td>
<td>
<a href="https://postgresml.org/blog/scaling-postgresml-to-1-million-requests-per-second">
<img src="./images/postgresml.webp" height="70" width="auto">
</a>
</td>
<td>
<a href="https://onesignal.com">
<img src="./images/one_signal.webp" height="70" width="auto">
</a>
</td>
</tr>
<tr>
<td>
<a href="https://tech.instacart.com/adopting-pgcat-a-nextgen-postgres-proxy-3cf284e68c2f">
Instacart
</a>
</td>
<td>
<a href="https://postgresml.org/blog/scaling-postgresml-to-1-million-requests-per-second">
PostgresML
</a>
</td>
<td>
OneSignal
</td>
</tr>
</table>
Some features remain experimental and are being actively developed. They are optional and can be enabled through configuration.
## Deployment
See `Dockerfile` for example deployment using Docker. The pooler is configured to spawn 4 workers so 4 CPUs are recommended for optimal performance. That setting can be adjusted to spawn as many (or as little) workers as needed.
A Docker image is available from `docker pull ghcr.io/postgresml/pgcat:latest`. See our [Github packages repository](https://github.com/postgresml/pgcat/pkgs/container/pgcat).
For quick local example, use the Docker Compose environment provided:
```bash
@@ -39,9 +86,13 @@ PGPASSWORD=postgres psql -h 127.0.0.1 -p 6432 -U postgres -c 'SELECT 1'
### Config
See [Configurations page](https://github.com/levkk/pgcat/blob/main/CONFIG.md)
See **[Configuration](https://github.com/levkk/pgcat/blob/main/CONFIG.md)**.
## Local development
## Contributing
The project is being actively developed and looking for additional contributors and production deployments.
### Local development
1. Install Rust (latest stable will work great).
2. `cargo build --release` (to get better benchmarks).
@@ -51,7 +102,7 @@ See [Configurations page](https://github.com/levkk/pgcat/blob/main/CONFIG.md)
### Tests
Quickest way to test your changes is to use pgbench:
When making substantial modifications to the protocol implementation, make sure to test them with pgbench:
```
pgbench -i -h 127.0.0.1 -p 6432 && \
@@ -61,36 +112,26 @@ pgbench -t 1000 -p 6432 -h 127.0.0.1 --protocol extended
See [sharding README](./tests/sharding/README.md) for sharding logic testing.
Run `cargo test` to run Rust tests.
Additionally, all features are tested with Ruby, Python, and Rust unit and integration tests.
Run `cargo test` to run Rust unit tests.
Run the following commands to run Ruby and Python integration tests:
Run the following commands to run Integration tests locally.
```
cd tests/docker/
docker compose up --exit-code-from main # This will also produce coverage report under ./cov/
```
| **Feature** | **Tested in CI** | **Tested manually** | **Comments** |
|-----------------------|--------------------|---------------------|--------------------------------------------------------------------------------------------------------------------------|
| Transaction pooling | :white_check_mark: | :white_check_mark: | Used by default for all tests. |
| Session pooling | :white_check_mark: | :white_check_mark: | Tested by running pgbench with `--protocol prepared` which only works in session mode. |
| `COPY` | :white_check_mark: | :white_check_mark: | `pgbench -i` uses `COPY`. `COPY FROM` is tested as well. |
| Query cancellation | :white_check_mark: | :white_check_mark: | `psql -c 'SELECT pg_sleep(1000);'` and press `Ctrl-C`. |
| Load balancing | :white_check_mark: | :white_check_mark: | We could test this by emitting statistics for each replica and compare them. |
| Failover | :white_check_mark: | :white_check_mark: | Misconfigure a replica in `pgcat.toml` and watch it forward queries to spares. CI testing is using Toxiproxy. |
| Sharding | :white_check_mark: | :white_check_mark: | See `tests/sharding` and `tests/ruby` for an Rails/ActiveRecord example. |
| Statistics | :white_check_mark: | :white_check_mark: | Query the admin database with `psql -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW STATS'`. |
| Live config reloading | :white_check_mark: | :white_check_mark: | Run `kill -s SIGHUP $(pgrep pgcat)` and watch the config reload. |
### Docker-based local development
### Dev
Also, you can open a 'dev' environment where you can debug tests easier by running the following command:
You can open a Docker development environment where you can debug tests easier. Run the following command to spin it up:
```
./dev/script/console
```
This will open a terminal in an environment similar to that used in tests. In there you can compile, run tests, do some debugging with the test environment, etc. Objects
compiled inside the contaner (and bundled gems) will be placed in `dev/cache` so they don't interfere with what you have in your host.
This will open a terminal in an environment similar to that used in tests. In there, you can compile the pooler, run tests, do some debugging with the test environment, etc. Objects compiled inside the container (and bundled gems) will be placed in `dev/cache` so they don't interfere with what you have on your machine.
## Usage
@@ -105,11 +146,9 @@ In transaction mode, a client talks to one server for the duration of a single t
This mode is enabled by default.
### Load balancing of read queries
All queries are load balanced against the configured servers using the random algorithm. The most straight forward configuration example would be to put this pooler in front of several replicas and let it load balance all queries.
All queries are load balanced against the configured servers using either the random or least open connections algorithms. The most straightforward configuration example would be to put this pooler in front of several replicas and let it load balance all queries.
If the configuration includes a primary and replicas, the queries can be separated with the built-in query parser. The query parser will interpret the query and route all `SELECT` queries to a replica, while all other queries including explicit transactions will be routed to the primary.
The query parser is disabled by default.
If the configuration includes a primary and replicas, the queries can be separated with the built-in query parser. The query parser, implemented with the `sqlparser` crate, will interpret the query and route all `SELECT` queries to a replica, while all other queries including explicit transactions will be routed to the primary.
#### Query parser
The query parser will do its best to determine where the query should go, but sometimes that's not possible. In that case, the client can select which server it wants using this custom SQL syntax:
@@ -136,38 +175,14 @@ The setting will persist until it's changed again or the client disconnects.
By default, all queries are routed to the first available server; `default_role` setting controls this behavior.
### Failover
All servers are checked with a `SELECT 1` query before being given to a client. If the server is not reachable, it will be banned and cannot serve any more transactions for the duration of the ban. The queries are routed to the remaining servers. If all servers become banned, the ban list is cleared: this is a safety precaution against false positives. The primary can never be banned.
All servers are checked with a `;` (very fast) query before being given to a client. Additionally, the server health is monitored with every client query that it processes. If the server is not reachable, it will be banned and cannot serve any more transactions for the duration of the ban. The queries are routed to the remaining servers. If `replica_to_primary_failover_enabled` is set to true and all replicas become banned, the query will be routed to the primary. If `replica_to_primary_failover_enabled` is false and all servers (replicas) become banned, the ban list is cleared: this is a safety precaution against false positives. The primary can never be banned.
The ban time can be changed with `ban_time`. The default is 60 seconds.
Failover behavior can get pretty interesting (read complex) when multiple configurations and factors are involved. The table below will try to explain what PgCat does in each scenario:
| **Query** | **`SET SERVER ROLE TO`** | **`query_parser_enabled`** | **`primary_reads_enabled`** | **Target state** | **Outcome** |
|---------------------------|--------------------------|----------------------------|-----------------------------|------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Read query, i.e. `SELECT` | unset (any) | false | false | up | Query is routed to the first instance in the random loop. |
| Read query | unset (any) | true | false | up | Query is routed to the first replica instance in the random loop. |
| Read query | unset (any) | true | true | up | Query is routed to the first instance in the random loop. |
| Read query | replica | false | false | up | Query is routed to the first replica instance in the random loop. |
| Read query | primary | false | false | up | Query is routed to the primary. |
| Read query | unset (any) | false | false | down | First instance is banned for reads. Next target in the random loop is attempted. |
| Read query | unset (any) | true | false | down | First replica instance is banned. Next replica instance is attempted in the random loop. |
| Read query | unset (any) | true | true | down | First instance (even if primary) is banned for reads. Next instance is attempted in the random loop. |
| Read query | replica | false | false | down | First replica instance is banned. Next replica instance is attempted in the random loop. |
| Read query | primary | false | false | down | The query is attempted against the primary and fails. The client receives an error. |
| | | | | | |
| Write query e.g. `INSERT` | unset (any) | false | false | up | The query is attempted against the first available instance in the random loop. If the instance is a replica, the query fails and the client receives an error. |
| Write query | unset (any) | true | false | up | The query is routed to the primary. |
| Write query | unset (any) | true | true | up | The query is routed to the primary. |
| Write query | primary | false | false | up | The query is routed to the primary. |
| Write query | replica | false | false | up | The query is routed to the replica and fails. The client receives an error. |
| Write query | unset (any) | true | false | down | The query is routed to the primary and fails. The client receives an error. |
| Write query | unset (any) | true | true | down | The query is routed to the primary and fails. The client receives an error. |
| Write query | primary | false | false | down | The query is routed to the primary and fails. The client receives an error. |
| | | | | | |
### Sharding
We use the `PARTITION BY HASH` hashing function, the same as used by Postgres for declarative partitioning. This allows to shard the database using Postgres partitions and place the partitions on different servers (shards). Both read and write queries can be routed to the shards using this pooler.
#### Extended syntax
To route queries to a particular shard, we use this custom SQL syntax:
```sql
@@ -182,7 +197,8 @@ The active shard will last until it's changed again or the client disconnects. B
For hash function implementation, see `src/sharding.rs` and `tests/sharding/partition_hash_test_setup.sql`.
#### ActiveRecord/Rails
##### ActiveRecord/Rails
```ruby
class User < ActiveRecord::Base
@@ -210,7 +226,7 @@ User.connection.execute "SET SERVER ROLE TO 'auto'"
User.find_by_email("test@example.com")
```
#### Raw SQL
##### Raw SQL
```sql
-- Grab a bunch of users from shard 1
@@ -230,268 +246,47 @@ SET SERVER ROLE TO 'auto'; -- let the query router figure out where the query sh
SELECT * FROM users WHERE email = 'test@example.com'; -- shard setting lasts until set again; we are reading from the primary
```
#### With comments
Issuing queries to the pooler can cause additional latency. To reduce its impact, it's possible to include sharding information inside SQL comments sent via the query. This is reasonably easy to implement with ORMs like [ActiveRecord](https://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-annotate) and [SQLAlchemy](https://docs.sqlalchemy.org/en/20/core/events.html#sql-execution-and-connection-events).
```
/* shard_id: 5 */ SELECT * FROM foo WHERE id = 1234;
/* sharding_key: 1234 */ SELECT * FROM foo WHERE id = 1234;
```
#### Automatic query parsing
PgCat can use the `sqlparser` crate to parse SQL queries and extract the sharding key. This is configurable with the `automatic_sharding_key` setting. This feature is still experimental, but it's the ideal implementation for sharding, requiring no client modifications.
### Statistics reporting
The stats are very similar to what Pgbouncer reports and the names are kept to be comparable. They are accessible by querying the admin database `pgcat`, and `pgbouncer` for compatibility.
The stats are very similar to what PgBouncer reports and the names are kept to be comparable. They are accessible by querying the admin database `pgcat`, and `pgbouncer` for compatibility.
```
psql -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW DATABASES'
```
Additionally, Prometheus statistics are available at `/metrics` via HTTP.
We also have a [basic Grafana dashboard](https://github.com/postgresml/pgcat/blob/main/grafana_dashboard.json) based on Prometheus metrics that you can import into Grafana and build on it or use it for monitoring.
### Live configuration reloading
The config can be reloaded by sending a `kill -s SIGHUP` to the process or by querying `RELOAD` to the admin database. Not all settings are currently supported by live reload:
The config can be reloaded by sending a `kill -s SIGHUP` to the process or by querying `RELOAD` to the admin database. All settings except the `host` and `port` can be reloaded without restarting the pooler, including sharding and replicas configurations.
| **Config** | **Requires restart** |
|-------------------------|----------------------|
| `host` | yes |
| `port` | yes |
| `pool_mode` | no |
| `connect_timeout` | yes |
| `healthcheck_timeout` | no |
| `shutdown_timeout` | no |
| `healthcheck_delay` | no |
| `ban_time` | no |
| `user` | yes |
| `shards` | yes |
| `default_role` | no |
| `primary_reads_enabled` | no |
| `query_parser_enabled` | no |
### Mirroring
Mirroring allows to route queries to multiple databases at the same time. This is useful for prewarning replicas before placing them into the active configuration, or for testing different versions of Postgres with live traffic.
## Benchmarks
## License
You can setup PgBench locally through PgCat:
PgCat is free and open source, released under the MIT license.
```
pgbench -h 127.0.0.1 -p 6432 -i
```
## Contributors
Coincidenly, this uses `COPY` so you can test if that works. Additionally, we'll be running the following PgBench configurations:
Many thanks to our amazing contributors!
1. 16 clients, 2 threads
2. 32 clients, 2 threads
3. 64 clients, 2 threads
4. 128 clients, 2 threads
<a href = "https://github.com/postgresml/pgcat/graphs/contributors">
<img src = "https://contrib.rocks/image?repo=postgresml/pgcat"/>
</a>
All queries will be `SELECT` only (`-S`) just so disks don't get in the way, since the dataset will be effectively all in RAM.
My setup:
- 8 cores, 16 hyperthreaded (AMD Ryzen 5800X)
- 32GB RAM (doesn't matter for this benchmark, except to prove that Postgres will fit the whole dataset into RAM)
### PgBouncer
#### Config
```ini
[databases]
shard0 = host=localhost port=5432 user=sharding_user password=sharding_user
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
```
Everything else stays default.
#### Runs
```
$ pgbench -t 1000 -c 16 -j 2 -p 6432 -h 127.0.0.1 -S --protocol extended shard0
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 16
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 16000/16000
latency average = 0.155 ms
tps = 103417.377469 (including connections establishing)
tps = 103510.639935 (excluding connections establishing)
$ pgbench -t 1000 -c 32 -j 2 -p 6432 -h 127.0.0.1 -S --protocol extended shard0
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 32
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 32000/32000
latency average = 0.290 ms
tps = 110325.939785 (including connections establishing)
tps = 110386.513435 (excluding connections establishing)
$ pgbench -t 1000 -c 64 -j 2 -p 6432 -h 127.0.0.1 -S --protocol extended shard0
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 64
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 64000/64000
latency average = 0.692 ms
tps = 92470.427412 (including connections establishing)
tps = 92618.389350 (excluding connections establishing)
$ pgbench -t 1000 -c 128 -j 2 -p 6432 -h 127.0.0.1 -S --protocol extended shard0
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 128
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 128000/128000
latency average = 1.406 ms
tps = 91013.429985 (including connections establishing)
tps = 91067.583928 (excluding connections establishing)
```
### PgCat
#### Config
The only thing that matters here is the number of workers in the Tokio pool. Make sure to set it to < than the number of your CPU cores.
Also account for hyper-threading, so if you have that, take the number you got above and divide it by two, that way only "real" cores serving
requests.
My setup is 16 threads, 8 cores (`htop` shows as 16 CPUs), so I set the `max_workers` in Tokio to 4. Too many, and it starts conflicting with PgBench
which is also running on the same system.
#### Runs
```
$ pgbench -t 1000 -c 16 -j 2 -p 6432 -h 127.0.0.1 -S --protocol extended
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 16
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 16000/16000
latency average = 0.164 ms
tps = 97705.088232 (including connections establishing)
tps = 97872.216045 (excluding connections establishing)
$ pgbench -t 1000 -c 32 -j 2 -p 6432 -h 127.0.0.1 -S --protocol extended
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 32
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 32000/32000
latency average = 0.288 ms
tps = 111300.488119 (including connections establishing)
tps = 111413.107800 (excluding connections establishing)
$ pgbench -t 1000 -c 64 -j 2 -p 6432 -h 127.0.0.1 -S --protocol extended
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 64
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 64000/64000
latency average = 0.556 ms
tps = 115190.496139 (including connections establishing)
tps = 115247.521295 (excluding connections establishing)
$ pgbench -t 1000 -c 128 -j 2 -p 6432 -h 127.0.0.1 -S --protocol extended
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 128
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 128000/128000
latency average = 1.135 ms
tps = 112770.562239 (including connections establishing)
tps = 112796.502381 (excluding connections establishing)
```
### Direct Postgres
Always good to have a base line.
#### Runs
```
$ pgbench -t 1000 -c 16 -j 2 -p 5432 -h 127.0.0.1 -S --protocol extended shard0
Password:
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 16
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 16000/16000
latency average = 0.115 ms
tps = 139443.955722 (including connections establishing)
tps = 142314.859075 (excluding connections establishing)
$ pgbench -t 1000 -c 32 -j 2 -p 5432 -h 127.0.0.1 -S --protocol extended shard0
Password:
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 32
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 32000/32000
latency average = 0.212 ms
tps = 150644.840891 (including connections establishing)
tps = 152218.499430 (excluding connections establishing)
$ pgbench -t 1000 -c 64 -j 2 -p 5432 -h 127.0.0.1 -S --protocol extended shard0
Password:
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 64
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 64000/64000
latency average = 0.420 ms
tps = 152517.663404 (including connections establishing)
tps = 153319.188482 (excluding connections establishing)
$ pgbench -t 1000 -c 128 -j 2 -p 5432 -h 127.0.0.1 -S --protocol extended shard0
Password:
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: extended
number of clients: 128
number of threads: 2
number of transactions per client: 1000
number of transactions actually processed: 128000/128000
latency average = 0.854 ms
tps = 149818.594087 (including connections establishing)
tps = 150200.603049 (excluding connections establishing)
```

23
charts/pgcat/.helmignore Normal file
View File

@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

8
charts/pgcat/Chart.yaml Normal file
View File

@@ -0,0 +1,8 @@
apiVersion: v2
name: pgcat
description: A Helm chart for PgCat a PostgreSQL pooler and proxy (like PgBouncer) with support for sharding, load balancing, failover and mirroring.
maintainers:
- name: Wildcard
email: support@w6d.io
appVersion: "1.2.0"
version: 0.2.4

View File

@@ -0,0 +1,22 @@
1. Get the application URL by running these commands:
{{- if .Values.ingress.enabled }}
{{- range $host := .Values.ingress.hosts }}
{{- range .paths }}
http{{ if $.Values.ingress.tls }}s{{ end }}://{{ $host.host }}{{ .path }}
{{- end }}
{{- end }}
{{- else if contains "NodePort" .Values.service.type }}
export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "pgcat.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.service.type }}
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get --namespace {{ .Release.Namespace }} svc -w {{ include "pgcat.fullname" . }}'
export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "pgcat.fullname" . }} --template "{{"{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}"}}")
echo http://$SERVICE_IP:{{ .Values.service.port }}
{{- else if contains "ClusterIP" .Values.service.type }}
export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/name={{ include "pgcat.name" . }},app.kubernetes.io/instance={{ .Release.Name }}" -o jsonpath="{.items[0].metadata.name}")
export CONTAINER_PORT=$(kubectl get pod --namespace {{ .Release.Namespace }} $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8080:$CONTAINER_PORT
{{- end }}

View File

@@ -0,0 +1,3 @@
{{/*
Configuration template definition
*/}}

View File

@@ -0,0 +1,62 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "pgcat.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "pgcat.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "pgcat.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "pgcat.labels" -}}
helm.sh/chart: {{ include "pgcat.chart" . }}
{{ include "pgcat.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "pgcat.selectorLabels" -}}
app.kubernetes.io/name: {{ include "pgcat.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "pgcat.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "pgcat.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,66 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "pgcat.fullname" . }}
labels:
{{- include "pgcat.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "pgcat.selectorLabels" . | nindent 6 }}
template:
metadata:
annotations:
checksum/secret: {{ include (print $.Template.BasePath "/secret.yaml") . | sha256sum }}
{{- with .Values.podAnnotations }}
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "pgcat.selectorLabels" . | nindent 8 }}
spec:
{{- with .Values.image.pullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "pgcat.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
containers:
- name: {{ .Chart.Name }}
securityContext:
{{- toYaml .Values.containerSecurityContext | nindent 12 }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- name: pgcat
containerPort: {{ .Values.configuration.general.port }}
protocol: TCP
livenessProbe:
tcpSocket:
port: pgcat
readinessProbe:
tcpSocket:
port: pgcat
resources:
{{- toYaml .Values.resources | nindent 12 }}
volumeMounts:
- mountPath: /etc/pgcat
name: config
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
volumes:
- secret:
defaultMode: 420
secretName: {{ include "pgcat.fullname" . }}
name: config

View File

@@ -0,0 +1,61 @@
{{- if .Values.ingress.enabled -}}
{{- $fullName := include "pgcat.fullname" . -}}
{{- $svcPort := .Values.service.port -}}
{{- if and .Values.ingress.className (not (semverCompare ">=1.18-0" .Capabilities.KubeVersion.GitVersion)) }}
{{- if not (hasKey .Values.ingress.annotations "kubernetes.io/ingress.class") }}
{{- $_ := set .Values.ingress.annotations "kubernetes.io/ingress.class" .Values.ingress.className}}
{{- end }}
{{- end }}
{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1
{{- else if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ $fullName }}
labels:
{{- include "pgcat.labels" . | nindent 4 }}
{{- with .Values.ingress.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
{{- if and .Values.ingress.className (semverCompare ">=1.18-0" .Capabilities.KubeVersion.GitVersion) }}
ingressClassName: {{ .Values.ingress.className }}
{{- end }}
{{- if .Values.ingress.tls }}
tls:
{{- range .Values.ingress.tls }}
- hosts:
{{- range .hosts }}
- {{ . | quote }}
{{- end }}
secretName: {{ .secretName }}
{{- end }}
{{- end }}
rules:
{{- range .Values.ingress.hosts }}
- host: {{ .host | quote }}
http:
paths:
{{- range .paths }}
- path: {{ .path }}
{{- if and .pathType (semverCompare ">=1.18-0" $.Capabilities.KubeVersion.GitVersion) }}
pathType: {{ .pathType }}
{{- end }}
backend:
{{- if semverCompare ">=1.19-0" $.Capabilities.KubeVersion.GitVersion }}
service:
name: {{ $fullName }}
port:
number: {{ $svcPort }}
{{- else }}
serviceName: {{ $fullName }}
servicePort: {{ $svcPort }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,97 @@
apiVersion: v1
kind: Secret
metadata:
name: {{ include "pgcat.fullname" . }}
labels:
{{- include "pgcat.labels" . | nindent 4 }}
type: Opaque
stringData:
pgcat.toml: |
[general]
host = {{ .Values.configuration.general.host | quote }}
port = {{ .Values.configuration.general.port }}
enable_prometheus_exporter = {{ .Values.configuration.general.enable_prometheus_exporter }}
prometheus_exporter_port = {{ .Values.configuration.general.prometheus_exporter_port }}
connect_timeout = {{ .Values.configuration.general.connect_timeout }}
idle_timeout = {{ .Values.configuration.general.idle_timeout | int }}
server_lifetime = {{ .Values.configuration.general.server_lifetime | int }}
server_tls = {{ .Values.configuration.general.server_tls }}
idle_client_in_transaction_timeout = {{ .Values.configuration.general.idle_client_in_transaction_timeout | int }}
healthcheck_timeout = {{ .Values.configuration.general.healthcheck_timeout }}
healthcheck_delay = {{ .Values.configuration.general.healthcheck_delay }}
shutdown_timeout = {{ .Values.configuration.general.shutdown_timeout }}
ban_time = {{ .Values.configuration.general.ban_time }}
log_client_connections = {{ .Values.configuration.general.log_client_connections }}
log_client_disconnections = {{ .Values.configuration.general.log_client_disconnections }}
tcp_keepalives_idle = {{ .Values.configuration.general.tcp_keepalives_idle }}
tcp_keepalives_count = {{ .Values.configuration.general.tcp_keepalives_count }}
tcp_keepalives_interval = {{ .Values.configuration.general.tcp_keepalives_interval }}
{{- if and (ne .Values.configuration.general.tls_certificate "-") (ne .Values.configuration.general.tls_private_key "-") }}
tls_certificate = "{{ .Values.configuration.general.tls_certificate }}"
tls_private_key = "{{ .Values.configuration.general.tls_private_key }}"
{{- end }}
admin_username = {{ .Values.configuration.general.admin_username | quote }}
admin_password = {{ .Values.configuration.general.admin_password | quote }}
{{- if and .Values.configuration.general.auth_query_user .Values.configuration.general.auth_query_password .Values.configuration.general.auth_query }}
auth_query = {{ .Values.configuration.general.auth_query | quote }}
auth_query_user = {{ .Values.configuration.general.auth_query_user | quote }}
auth_query_password = {{ .Values.configuration.general.auth_query_password | quote }}
{{- end }}
{{- range $pool := .Values.configuration.pools }}
##
## pool for {{ $pool.name }}
##
[pools.{{ $pool.name | quote }}]
pool_mode = {{ default "transaction" $pool.pool_mode | quote }}
load_balancing_mode = {{ default "random" $pool.load_balancing_mode | quote }}
default_role = {{ default "any" $pool.default_role | quote }}
prepared_statements_cache_size = {{ default 500 $pool.prepared_statements_cache_size }}
query_parser_enabled = {{ default true $pool.query_parser_enabled }}
query_parser_read_write_splitting = {{ default true $pool.query_parser_read_write_splitting }}
primary_reads_enabled = {{ default true $pool.primary_reads_enabled }}
sharding_function = {{ default "pg_bigint_hash" $pool.sharding_function | quote }}
{{- range $index, $user := $pool.users }}
## pool {{ $pool.name }} user {{ $user.username | quote }}
##
[pools.{{ $pool.name | quote }}.users.{{ $index }}]
username = {{ $user.username | quote }}
{{- if $user.password }}
password = {{ $user.password | quote }}
{{- else if and $user.passwordSecret.name $user.passwordSecret.key }}
{{- $secret := (lookup "v1" "Secret" $.Release.Namespace $user.passwordSecret.name) }}
{{- if $secret }}
{{- $password := index $secret.data $user.passwordSecret.key | b64dec }}
password = {{ $password | quote }}
{{- end }}
{{- end }}
pool_size = {{ $user.pool_size }}
statement_timeout = {{ default 0 $user.statement_timeout }}
min_pool_size = {{ default 3 $user.min_pool_size }}
{{- if $user.server_lifetime }}
server_lifetime = {{ $user.server_lifetime }}
{{- end }}
{{- if and $user.server_username $user.server_password }}
server_username = {{ $user.server_username | quote }}
server_password = {{ $user.server_password | quote }}
{{- end }}
{{- end }}
{{- range $index, $shard := $pool.shards }}
## pool {{ $pool.name }} database {{ $shard.database }}
##
[pools.{{ $pool.name | quote }}.shards.{{ $index }}]
{{- if gt (len $shard.servers) 0}}
servers = [
{{- range $server := $shard.servers }}
[ {{ $server.host | quote }}, {{ $server.port }}, {{ $server.role | quote }} ],
{{- end }}
]
{{- end }}
database = {{ $shard.database | quote }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,15 @@
apiVersion: v1
kind: Service
metadata:
name: {{ include "pgcat.fullname" . }}
labels:
{{- include "pgcat.labels" . | nindent 4 }}
spec:
type: {{ .Values.service.type }}
ports:
- port: {{ .Values.service.port }}
targetPort: pgcat
protocol: TCP
name: pgcat
selector:
{{- include "pgcat.selectorLabels" . | nindent 4 }}

View File

@@ -0,0 +1,12 @@
{{- if .Values.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "pgcat.serviceAccountName" . }}
labels:
{{- include "pgcat.labels" . | nindent 4 }}
{{- with .Values.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}

374
charts/pgcat/values.yaml Normal file
View File

@@ -0,0 +1,374 @@
## String to partially override aspnet-core.fullname template (will maintain the release name)
## @param nameOverride String to partially override common.names.fullname
##
nameOverride: ""
## String to fully override aspnet-core.fullname template
## @param fullnameOverride String to fully override common.names.fullname
##
fullnameOverride: ""
## Number of PgCat replicas to deploy
## @param replicaCount Number of PgCat replicas to deploy
replicaCount: 1
## Bitnami PgCat image version
## ref: https://hub.docker.com/r/bitnami/kubewatch/tags/
##
## @param image.registry PgCat image registry
## @param image.repository PgCat image name
## @param image.tag PgCat image tag
## @param image.pullPolicy PgCat image tag
## @param image.pullSecrets Specify docker-registry secret names as an array
image:
repository: ghcr.io/postgresml/pgcat
# Overrides the image tag whose default is the chart appVersion.
tag: "main"
## Specify a imagePullPolicy
## Defaults to 'Always' if image tag is 'latest', else set to 'IfNotPresent'
## ref: http://kubernetes.io/docs/user-guide/images/#pre-pulling-images
##
pullPolicy: IfNotPresent
## Optionally specify an array of imagePullSecrets.
## Secrets must be manually created in the namespace.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
## Example:
## pullSecrets:
## - myRegistryKeySecretName
##
pullSecrets: []
## Specifies whether a ServiceAccount should be created
##
## @param serviceAccount.create Enable the creation of a ServiceAccount for PgCat pods
## @param serviceAccount.name Name of the created ServiceAccount
##
serviceAccount:
## Specifies whether a service account should be created
create: true
## Annotations to add to the service account
annotations: {}
## The name of the service account to use.
## If not set and create is true, a name is generated using the fullname template
name: ""
## Annotations for server pods.
## ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/
##
## @param podAnnotations Annotations for PgCat pods
##
podAnnotations: {}
## PgCat containers' SecurityContext
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod
##
## @param podSecurityContext.enabled Enabled PgCat pods' Security Context
## @param podSecurityContext.fsGroup Set PgCat pod's Security Context fsGroup
##
podSecurityContext: {}
# fsGroup: 2000
## PgCat pods' Security Context
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container
##
## @param containerSecurityContext.enabled Enabled PgCat containers' Security Context
## @param containerSecurityContext.runAsUser Set PgCat container's Security Context runAsUser
## @param containerSecurityContext.runAsNonRoot Set PgCat container's Security Context runAsNonRoot
##
containerSecurityContext: {}
# capabilities:
# drop:
# - ALL
# readOnlyRootFilesystem: true
# runAsNonRoot: true
# runAsUser: 1000
## PgCat service
##
## @param service.type PgCat service type
## @param service.port PgCat service port
service:
type: ClusterIP
port: 6432
ingress:
enabled: false
className: ""
annotations: {}
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
hosts:
- host: chart-example.local
paths:
- path: /
pathType: ImplementationSpecific
tls: []
# - secretName: chart-example-tls
# hosts:
# - chart-example.local
## PgCat resource requests and limits
## ref: http://kubernetes.io/docs/user-guide/compute-resources/
##
## @skip resources Optional description
## @disabled-param resources.limits The resources limits for the PgCat container
## @disabled-param resources.requests The requested resources for the PgCat container
##
resources:
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
limits: {}
# cpu: 100m
# memory: 128Mi
requests: {}
# cpu: 100m
# memory: 128Mi
## Node labels for pod assignment. Evaluated as a template.
## ref: https://kubernetes.io/docs/user-guide/node-selection/
##
## @param nodeSelector Node labels for pod assignment
##
nodeSelector: {}
## Tolerations for pod assignment. Evaluated as a template.
## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
##
## @param tolerations Tolerations for pod assignment
##
tolerations: []
## Affinity for pod assignment. Evaluated as a template.
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
## Note: podAffinityPreset, podAntiAffinityPreset, and nodeAffinityPreset will be ignored when it's set
##
## @param affinity Affinity for pod assignment
##
affinity: {}
## PgCat configuration
## @param configuration [object]
configuration:
## General pooler settings
## @param [object]
general:
## @param configuration.general.host What IP to run on, 0.0.0.0 means accessible from everywhere.
host: "0.0.0.0"
## @param configuration.general.port Port to run on, same as PgBouncer used in this example.
port: 6432
## @param configuration.general.enable_prometheus_exporter Whether to enable prometheus exporter or not.
enable_prometheus_exporter: false
## @param configuration.general.prometheus_exporter_port Port at which prometheus exporter listens on.
prometheus_exporter_port: 9930
# @param configuration.general.connect_timeout How long to wait before aborting a server connection (ms).
connect_timeout: 5000
# How long an idle connection with a server is left open (ms).
idle_timeout: 30000 # milliseconds
# Max connection lifetime before it's closed, even if actively used.
server_lifetime: 86400000 # 24 hours
# Whether to use TLS for server connections or not.
server_tls: false
# How long a client is allowed to be idle while in a transaction (ms).
idle_client_in_transaction_timeout: 0 # milliseconds
# @param configuration.general.healthcheck_timeout How much time to give `SELECT 1` health check query to return with a result (ms).
healthcheck_timeout: 1000
# @param configuration.general.healthcheck_delay How long to keep connection available for immediate re-use, without running a healthcheck query on it
healthcheck_delay: 30000
# @param configuration.general.shutdown_timeout How much time to give clients during shutdown before forcibly killing client connections (ms).
shutdown_timeout: 60000
# @param configuration.general.ban_time For how long to ban a server if it fails a health check (seconds).
ban_time: 60 # seconds
# @param configuration.general.log_client_connections If we should log client connections
log_client_connections: false
# @param configuration.general.log_client_disconnections If we should log client disconnections
log_client_disconnections: false
# TLS
# tls_certificate: "server.cert"
# tls_private_key: "server.key"
tls_certificate: "-"
tls_private_key: "-"
# Credentials to access the virtual administrative database (pgbouncer or pgcat)
# Connecting to that database allows running commands like `SHOW POOLS`, `SHOW DATABASES`, etc..
admin_username: "postgres"
admin_password: "postgres"
# Query to be sent to servers to obtain the hash used for md5 authentication. The connection will be
# established using the database configured in the pool. This parameter is inherited by every pool and
# can be redefined in pool configuration.
auth_query: null
# User to be used for connecting to servers to obtain the hash used for md5 authentication by sending
# the query specified in auth_query_user. The connection will be established using the database configured
# in the pool. This parameter is inherited by every pool and can be redefined in pool configuration.
#
# @param configuration.general.auth_query_user
auth_query_user: null
# Password to be used for connecting to servers to obtain the hash used for md5 authentication by sending
# the query specified in auth_query_user. The connection will be established using the database configured
# in the pool. This parameter is inherited by every pool and can be redefined in pool configuration.
#
# @param configuration.general.auth_query_password
auth_query_password: null
# Number of seconds of connection idleness to wait before sending a keepalive packet to the server.
tcp_keepalives_idle: 5
# Number of unacknowledged keepalive packets allowed before giving up and closing the connection.
tcp_keepalives_count: 5
# Number of seconds between keepalive packets.
tcp_keepalives_interval: 5
## pool
## configs are structured as pool.<pool_name>
## the pool_name is what clients use as database name when connecting
## For the example below a client can connect using "postgres://sharding_user:sharding_user@pgcat_host:pgcat_port/sharded"
## @param [object]
pools:
[{
name: "simple", pool_mode: "transaction",
users: [{username: "user", password: "pass", pool_size: 5, statement_timeout: 0}],
shards: [{
servers: [{host: "postgres", port: 5432, role: "primary"}],
database: "postgres"
}]
}]
# - ## default values
# ##
# ##
# ##
# name: "db"
# ## Pool mode (see PgBouncer docs for more).
# ## session: one server connection per connected client
# ## transaction: one server connection per client transaction
# ## @param configuration.poolsPostgres.pool_mode
# pool_mode: "transaction"
# ## Load balancing mode
# ## `random` selects the server at random
# ## `loc` selects the server with the least outstanding busy connections
# ##
# ## @param configuration.poolsPostgres.load_balancing_mode
# load_balancing_mode: "random"
# ## Prepared statements cache size.
# ## TODO: update documentation
# ##
# ## @param configuration.poolsPostgres.prepared_statements_cache_size
# prepared_statements_cache_size: 500
# ## If the client doesn't specify, route traffic to
# ## this role by default.
# ##
# ## any: round-robin between primary and replicas,
# ## replica: round-robin between replicas only without touching the primary,
# ## primary: all queries go to the primary unless otherwise specified.
# ## @param configuration.poolsPostgres.default_role
# default_role: "any"
# ## Query parser. If enabled, we'll attempt to parse
# ## every incoming query to determine if it's a read or a write.
# ## If it's a read query, we'll direct it to a replica. Otherwise, if it's a write,
# ## we'll direct it to the primary.
# ## @param configuration.poolsPostgres.query_parser_enabled
# query_parser_enabled: true
# ## If the query parser is enabled and this setting is enabled, we'll attempt to
# ## infer the role from the query itself.
# ## @param configuration.poolsPostgres.query_parser_read_write_splitting
# query_parser_read_write_splitting: true
# ## If the query parser is enabled and this setting is enabled, the primary will be part of the pool of databases used for
# ## load balancing of read queries. Otherwise, the primary will only be used for write
# ## queries. The primary can always be explicitly selected with our custom protocol.
# ## @param configuration.poolsPostgres.primary_reads_enabled
# primary_reads_enabled: true
# ## So what if you wanted to implement a different hashing function,
# ## or you've already built one and you want this pooler to use it?
# ##
# ## Current options:
# ##
# ## pg_bigint_hash: PARTITION BY HASH (Postgres hashing function)
# ## sha1: A hashing function based on SHA1
# ##
# ## @param configuration.poolsPostgres.sharding_function
# sharding_function: "pg_bigint_hash"
# ## Credentials for users that may connect to this cluster
# ## @param users [array]
# ## @param users[0].username Name of the env var (required)
# ## @param users[0].password Value for the env var (required) leave empty to use existing secret see passwordSecret.name and passwordSecret.key
# ## @param users[0].passwordSecret.name Name of the secret containing the password
# ## @param users[0].passwordSecret.key Key in the secret containing the password
# ## @param users[0].pool_size Maximum number of server connections that can be established for this user
# ## @param users[0].statement_timeout Maximum query duration. Dangerous, but protects against DBs that died in a non-obvious way.
# users: []
# # - username: "user"
# # password: "pass"
# #
# # # The maximum number of connection from a single Pgcat process to any database in the cluster
# # # is the sum of pool_size across all users.
# # pool_size: 9
# #
# # # Maximum query duration. Dangerous, but protects against DBs that died in a non-obvious way.
# # statement_timeout: 0
# #
# # # PostgreSQL username used to connect to the server.
# # server_username: "postgres
# #
# # # PostgreSQL password used to connect to the server.
# # server_password: "postgres
# ## @param shards [array]
# ## @param shards[0].server[0].host Host for this shard
# ## @param shards[0].server[0].port Port for this shard
# ## @param shards[0].server[0].role Role for this shard
# shards: []
# # [ host, port, role ]
# # - servers:
# # - host: "postgres"
# # port: 5432
# # role: "primary"
# # - host: "postgres"
# # port: 5432
# # role: "replica"
# # database: "postgres"
# # # [ host, port, role ]
# # - servers:
# # - host: "postgres"
# # port: 5432
# # role: "primary"
# # - host: "postgres"
# # port: 5432
# # role: "replica"
# # database: "postgres"
# # # [ host, port, role ]
# # - servers:
# # - host: "postgres"
# # port: 5432
# # role: "primary"
# # - host: "postgres"
# # port: 5432
# # role: "replica"
# # database: "postgres"

9
control Normal file
View File

@@ -0,0 +1,9 @@
Package: pgcat
Version: ${PACKAGE_VERSION}
Section: database
Priority: optional
Architecture: ${ARCH}
Maintainer: PostgresML <team@postgresml.org>
Homepage: https://postgresml.org
Description: PgCat - NextGen PostgreSQL Pooler
PostgreSQL pooler and proxy (like PgBouncer) with support for sharding, load balancing, failover and mirroring.

2
cr.yaml Normal file
View File

@@ -0,0 +1,2 @@
sign: false
pages_branch: main

5
ct.yaml Normal file
View File

@@ -0,0 +1,5 @@
remote: origin
target-branch: main
chart-dirs:
- charts

View File

@@ -1,6 +1,8 @@
FROM rust:bullseye
# Dependencies
COPY --from=sclevine/yj /bin/yj /bin/yj
RUN /bin/yj -h
RUN apt-get update -y \
&& apt-get install -y \
llvm-11 psmisc postgresql-contrib postgresql-client \

View File

@@ -25,7 +25,9 @@ x-common-env-pg:
services:
main:
image: kubernetes/pause
image: gcr.io/google_containers/pause:3.2
ports:
- 6432
pg1:
<<: *common-definition-pg
@@ -56,6 +58,13 @@ services:
POSTGRES_INITDB_ARGS: --auth-local=scram-sha-256 --auth-host=scram-sha-256 --auth=scram-sha-256
PGPORT: 9432
command: ["postgres", "-p", "9432", "-c", "shared_preload_libraries=pg_stat_statements", "-c", "pg_stat_statements.track=all", "-c", "pg_stat_statements.max=100000"]
pg5:
<<: *common-definition-pg
environment:
<<: *common-env-pg
POSTGRES_INITDB_ARGS: --auth-local=md5 --auth-host=md5 --auth=md5
PGPORT: 10432
command: ["postgres", "-p", "10432", "-c", "shared_preload_libraries=pg_stat_statements", "-c", "pg_stat_statements.track=all", "-c", "pg_stat_statements.max=100000"]
toxiproxy:
build: .
@@ -69,6 +78,7 @@ services:
- pg2
- pg3
- pg4
- pg5
pgcat-shell:
stdin_open: true

View File

@@ -38,9 +38,6 @@ log_client_connections = false
# If we should log client disconnections
log_client_disconnections = false
# Reload config automatically if it changes.
autoreload = false
# TLS
# tls_certificate = "server.cert"
# tls_private_key = "server.key"
@@ -74,9 +71,13 @@ default_role = "any"
# we'll direct it to the primary.
query_parser_enabled = true
# If the query parser is enabled and this setting is enabled, we'll attempt to
# infer the role from the query itself.
query_parser_read_write_splitting = true
# If the query parser is enabled and this setting is enabled, the primary will be part of the pool of databases used for
# load balancing of read queries. Otherwise, the primary will only be used for write
# queries. The primary can always be explicitely selected with our custom protocol.
# queries. The primary can always be explicitly selected with our custom protocol.
primary_reads_enabled = true
# So what if you wanted to implement a different hashing function,

2124
grafana_dashboard.json Normal file

File diff suppressed because it is too large Load Diff

BIN
images/instacart.webp Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.4 KiB

BIN
images/one_signal.webp Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 16 KiB

BIN
images/postgresml.webp Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 4.7 KiB

22
pgcat.minimal.toml Normal file
View File

@@ -0,0 +1,22 @@
# This is an example of the most basic config
# that will mimic what PgBouncer does in transaction mode with one server.
[general]
host = "0.0.0.0"
port = 6433
admin_username = "pgcat"
admin_password = "pgcat"
[pools.pgml.users.0]
username = "postgres"
password = "postgres"
pool_size = 10
min_pool_size = 1
pool_mode = "transaction"
[pools.pgml.shards.0]
servers = [
["127.0.0.1", 28815, "primary"]
]
database = "postgres"

17
pgcat.service Normal file
View File

@@ -0,0 +1,17 @@
[Unit]
Description=PgCat pooler
After=network.target
StartLimitIntervalSec=0
[Service]
User=pgcat
Type=simple
Restart=always
RestartSec=1
Environment=RUST_LOG=info
LimitNOFILE=65536
ExecStart=/usr/bin/pgcat /etc/pgcat.toml
ExecReload=/bin/kill -SIGHUP $MAINPID
[Install]
WantedBy=multi-user.target

View File

@@ -23,6 +23,12 @@ connect_timeout = 5000 # milliseconds
# How long an idle connection with a server is left open (ms).
idle_timeout = 30000 # milliseconds
# Max connection lifetime before it's closed, even if actively used.
server_lifetime = 86400000 # 24 hours
# How long a client is allowed to be idle while in a transaction (ms).
idle_client_in_transaction_timeout = 0 # milliseconds
# How much time to give the health check query to return with a result (ms).
healthcheck_timeout = 1000 # milliseconds
@@ -42,7 +48,7 @@ log_client_connections = false
log_client_disconnections = false
# When set to true, PgCat reloads configs if it detects a change in the config file.
autoreload = false
autoreload = 15000
# Number of worker threads the Runtime will use (4 by default).
worker_threads = 5
@@ -54,10 +60,16 @@ tcp_keepalives_count = 5
# Number of seconds between keepalive packets.
tcp_keepalives_interval = 5
# Path to TLS Certficate file to use for TLS connections
# tls_certificate = "server.cert"
# Path to TLS Certificate file to use for TLS connections
# tls_certificate = ".circleci/server.cert"
# Path to TLS private key file to use for TLS connections
# tls_private_key = "server.key"
# tls_private_key = ".circleci/server.key"
# Enable/disable server TLS
server_tls = false
# Verify server certificate is completely authentic.
verify_server_certificate = false
# User name to access the virtual administrative database (pgbouncer or pgcat)
# Connecting to that database allows running commands like `SHOW POOLS`, `SHOW DATABASES`, etc..
@@ -65,6 +77,58 @@ admin_username = "admin_user"
# Password to access the virtual administrative database
admin_password = "admin_pass"
# Default plugins that are configured on all pools.
[plugins]
# Prewarmer plugin that runs queries on server startup, before giving the connection
# to the client.
[plugins.prewarmer]
enabled = false
queries = [
"SELECT pg_prewarm('pgbench_accounts')",
]
# Log all queries to stdout.
[plugins.query_logger]
enabled = false
# Block access to tables that Postgres does not allow us to control.
[plugins.table_access]
enabled = false
tables = [
"pg_user",
"pg_roles",
"pg_database",
]
# Intercept user queries and give a fake reply.
[plugins.intercept]
enabled = true
[plugins.intercept.queries.0]
query = "select current_database() as a, current_schemas(false) as b"
schema = [
["a", "text"],
["b", "text"],
]
result = [
["${DATABASE}", "{public}"],
]
[plugins.intercept.queries.1]
query = "select current_database(), current_schema(), current_user"
schema = [
["current_database", "text"],
["current_schema", "text"],
["current_user", "text"],
]
result = [
["${DATABASE}", "public", "${USER}"],
]
# pool configs are structured as pool.<pool_name>
# the pool_name is what clients use as database name when connecting.
# For a pool named `sharded_db`, clients access that pool using connection string like
@@ -86,12 +150,20 @@ load_balancing_mode = "random"
# `primary` all queries go to the primary unless otherwise specified.
default_role = "any"
# Prepared statements cache size.
# TODO: update documentation
prepared_statements_cache_size = 500
# If Query Parser is enabled, we'll attempt to parse
# every incoming query to determine if it's a read or a write.
# If it's a read query, we'll direct it to a replica. Otherwise, if it's a write,
# we'll direct it to the primary.
query_parser_enabled = true
# If the query parser is enabled and this setting is enabled, we'll attempt to
# infer the role from the query itself.
query_parser_read_write_splitting = true
# If the query parser is enabled and this setting is enabled, the primary will be part of the pool of databases used for
# load balancing of read queries. Otherwise, the primary will only be used for write
# queries. The primary can always be explicitly selected with our custom protocol.
@@ -103,6 +175,12 @@ primary_reads_enabled = true
# shard_id_regex = '/\* shard_id: (\d+) \*/'
# regex_search_limit = 1000 # only look at the first 1000 characters of SQL statements
# Defines the behavior when no shard is selected in a sharded system.
# `random`: picks a shard at random
# `random_healthy`: picks a shard at random favoring shards with the least number of recent errors
# `shard_<number>`: e.g. shard_0, shard_4, etc. picks a specific shard, everytime
# default_shard = "shard_0"
# So what if you wanted to implement a different hashing function,
# or you've already built one and you want this pooler to use it?
# Current options:
@@ -110,6 +188,21 @@ primary_reads_enabled = true
# `sha1`: A hashing function based on SHA1
sharding_function = "pg_bigint_hash"
# Query to be sent to servers to obtain the hash used for md5 authentication. The connection will be
# established using the database configured in the pool. This parameter is inherited by every pool
# and can be redefined in pool configuration.
# auth_query="SELECT usename, passwd FROM pg_shadow WHERE usename='$1'"
# User to be used for connecting to servers to obtain the hash used for md5 authentication by sending the query
# specified in `auth_query_user`. The connection will be established using the database configured in the pool.
# This parameter is inherited by every pool and can be redefined in pool configuration.
# auth_query_user = "sharding_user"
# Password to be used for connecting to servers to obtain the hash used for md5 authentication by sending the query
# specified in `auth_query_user`. The connection will be established using the database configured in the pool.
# This parameter is inherited by every pool and can be redefined in pool configuration.
# auth_query_password = "sharding_user"
# Automatically parse this from queries and route queries to the right shard!
# automatic_sharding_key = "data.id"
@@ -119,18 +212,86 @@ idle_timeout = 40000
# Connect timeout can be overwritten in the pool
connect_timeout = 3000
# When enabled, ip resolutions for server connections specified using hostnames will be cached
# and checked for changes every `dns_max_ttl` seconds. If a change in the host resolution is found
# old ip connections are closed (gracefully) and new connections will start using new ip.
# dns_cache_enabled = false
# Specifies how often (in seconds) cached ip addresses for servers are rechecked (see `dns_cache_enabled`).
# dns_max_ttl = 30
# Plugins can be configured on a pool-per-pool basis. This overrides the global plugins setting,
# so all plugins have to be configured here again.
[pool.sharded_db.plugins]
[pools.sharded_db.plugins.prewarmer]
enabled = true
queries = [
"SELECT pg_prewarm('pgbench_accounts')",
]
[pools.sharded_db.plugins.query_logger]
enabled = false
[pools.sharded_db.plugins.table_access]
enabled = false
tables = [
"pg_user",
"pg_roles",
"pg_database",
]
[pools.sharded_db.plugins.intercept]
enabled = true
[pools.sharded_db.plugins.intercept.queries.0]
query = "select current_database() as a, current_schemas(false) as b"
schema = [
["a", "text"],
["b", "text"],
]
result = [
["${DATABASE}", "{public}"],
]
[pools.sharded_db.plugins.intercept.queries.1]
query = "select current_database(), current_schema(), current_user"
schema = [
["current_database", "text"],
["current_schema", "text"],
["current_user", "text"],
]
result = [
["${DATABASE}", "public", "${USER}"],
]
# User configs are structured as pool.<pool_name>.users.<user_index>
# This secion holds the credentials for users that may connect to this cluster
# This section holds the credentials for users that may connect to this cluster
[pools.sharded_db.users.0]
# Postgresql username
# PostgreSQL username used to authenticate the user and connect to the server
# if `server_username` is not set.
username = "sharding_user"
# Postgresql password
# PostgreSQL password used to authenticate the user and connect to the server
# if `server_password` is not set.
password = "sharding_user"
pool_mode = "transaction"
# PostgreSQL username used to connect to the server.
# server_username = "another_user"
# PostgreSQL password used to connect to the server.
# server_password = "another_password"
# Maximum number of server connections that can be established for this user
# The maximum number of connection from a single Pgcat process to any database in the cluster
# is the sum of pool_size across all users.
pool_size = 9
# Maximum query duration. Dangerous, but protects against DBs that died in a non-obvious way.
# 0 means it is disabled.
statement_timeout = 0
@@ -140,6 +301,8 @@ username = "other_user"
password = "other_user"
pool_size = 21
statement_timeout = 15000
connect_timeout = 1000
idle_timeout = 1000
# Shard configs are structured as pool.<pool_name>.shards.<shard_id>
# Each shard config contains a list of servers that make up the shard
@@ -175,6 +338,8 @@ sharding_function = "pg_bigint_hash"
username = "simple_user"
password = "simple_user"
pool_size = 5
min_pool_size = 3
server_lifetime = 60000
statement_timeout = 0
[pools.simple_db.shards.0]

13
postinst Normal file
View File

@@ -0,0 +1,13 @@
#!/bin/bash
set -e
systemctl daemon-reload
systemctl enable pgcat
if ! id pgcat 2> /dev/null; then
useradd -s /usr/bin/false pgcat
fi
if [ -f /etc/pgcat.toml ]; then
systemctl start pgcat
fi

4
postrm Normal file
View File

@@ -0,0 +1,4 @@
#!/bin/bash
set -e
systemctl daemon-reload

5
prerm Normal file
View File

@@ -0,0 +1,5 @@
#!/bin/bash
set -e
systemctl stop pgcat
systemctl disable pgcat

View File

@@ -1,32 +1,33 @@
use crate::pool::BanReason;
/// Admin database.
use crate::server::ServerParameters;
use crate::stats::pool::PoolStats;
use bytes::{Buf, BufMut, BytesMut};
use log::{error, info, trace};
use nix::sys::signal::{self, Signal};
use nix::unistd::Pid;
use std::collections::HashMap;
/// Admin database.
use std::sync::atomic::Ordering;
use std::time::{SystemTime, UNIX_EPOCH};
use tokio::time::Instant;
use crate::config::{get_config, reload_config, VERSION};
use crate::errors::Error;
use crate::messages::*;
use crate::pool::ClientServerMap;
use crate::pool::{get_all_pools, get_pool};
use crate::stats::{
get_address_stats, get_client_stats, get_pool_stats, get_server_stats, ClientState, ServerState,
};
use crate::ClientServerMap;
use crate::stats::{get_client_stats, get_server_stats, ClientState, ServerState};
pub fn generate_server_info_for_admin() -> BytesMut {
let mut server_info = BytesMut::new();
pub fn generate_server_parameters_for_admin() -> ServerParameters {
let mut server_parameters = ServerParameters::new();
server_info.put(server_parameter_message("application_name", ""));
server_info.put(server_parameter_message("client_encoding", "UTF8"));
server_info.put(server_parameter_message("server_encoding", "UTF8"));
server_info.put(server_parameter_message("server_version", VERSION));
server_info.put(server_parameter_message("DateStyle", "ISO, MDY"));
server_parameters.set_param("application_name".to_string(), "".to_string(), true);
server_parameters.set_param("client_encoding".to_string(), "UTF8".to_string(), true);
server_parameters.set_param("server_encoding".to_string(), "UTF8".to_string(), true);
server_parameters.set_param("server_version".to_string(), VERSION.to_string(), true);
server_parameters.set_param("DateStyle".to_string(), "ISO, MDY".to_string(), true);
server_info
server_parameters
}
/// Handle admin client.
@@ -54,7 +55,12 @@ where
let query_parts: Vec<&str> = query.trim_end_matches(';').split_whitespace().collect();
match query_parts[0].to_ascii_uppercase().as_str() {
match query_parts
.first()
.unwrap_or(&"")
.to_ascii_uppercase()
.as_str()
{
"BAN" => {
trace!("BAN");
ban(stream, query_parts).await
@@ -73,17 +79,26 @@ where
}
"PAUSE" => {
trace!("PAUSE");
pause(stream, query_parts[1]).await
pause(stream, query_parts).await
}
"RESUME" => {
trace!("RESUME");
resume(stream, query_parts[1]).await
resume(stream, query_parts).await
}
"SHUTDOWN" => {
trace!("SHUTDOWN");
shutdown(stream).await
}
"SHOW" => match query_parts[1].to_ascii_uppercase().as_str() {
"SHOW" => match query_parts
.get(1)
.unwrap_or(&"")
.to_ascii_uppercase()
.as_str()
{
"HELP" => {
trace!("SHOW HELP");
show_help(stream).await
}
"BANS" => {
trace!("SHOW BANS");
show_bans(stream).await
@@ -158,7 +173,14 @@ where
"free_clients".to_string(),
client_stats
.keys()
.filter(|client_id| client_stats.get(client_id).unwrap().state == ClientState::Idle)
.filter(|client_id| {
client_stats
.get(client_id)
.unwrap()
.state
.load(Ordering::Relaxed)
== ClientState::Idle
})
.count()
.to_string(),
]));
@@ -166,7 +188,14 @@ where
"used_clients".to_string(),
client_stats
.keys()
.filter(|client_id| client_stats.get(client_id).unwrap().state == ClientState::Active)
.filter(|client_id| {
client_stats
.get(client_id)
.unwrap()
.state
.load(Ordering::Relaxed)
== ClientState::Active
})
.count()
.to_string(),
]));
@@ -178,7 +207,14 @@ where
"free_servers".to_string(),
server_stats
.keys()
.filter(|server_id| server_stats.get(server_id).unwrap().state == ServerState::Idle)
.filter(|server_id| {
server_stats
.get(server_id)
.unwrap()
.state
.load(Ordering::Relaxed)
== ServerState::Idle
})
.count()
.to_string(),
]));
@@ -186,7 +222,14 @@ where
"used_servers".to_string(),
server_stats
.keys()
.filter(|server_id| server_stats.get(server_id).unwrap().state == ServerState::Active)
.filter(|server_id| {
server_stats
.get(server_id)
.unwrap()
.state
.load(Ordering::Relaxed)
== ServerState::Active
})
.count()
.to_string(),
]));
@@ -227,52 +270,50 @@ async fn show_pools<T>(stream: &mut T) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
let all_pool_stats = get_pool_stats();
let pool_lookup = PoolStats::construct_pool_lookup();
let mut res = BytesMut::new();
res.put(row_description(&PoolStats::generate_header()));
pool_lookup.iter().for_each(|(_identifier, pool_stats)| {
res.put(data_row(&pool_stats.generate_row()));
});
res.put(command_complete("SHOW"));
let columns = vec![
("database", DataType::Text),
("user", DataType::Text),
("pool_mode", DataType::Text),
("cl_idle", DataType::Numeric),
("cl_active", DataType::Numeric),
("cl_waiting", DataType::Numeric),
("cl_cancel_req", DataType::Numeric),
("sv_active", DataType::Numeric),
("sv_idle", DataType::Numeric),
("sv_used", DataType::Numeric),
("sv_tested", DataType::Numeric),
("sv_login", DataType::Numeric),
("maxwait", DataType::Numeric),
("maxwait_us", DataType::Numeric),
// ReadyForQuery
res.put_u8(b'Z');
res.put_i32(5);
res.put_u8(b'I');
write_all_half(stream, &res).await
}
/// Show all available options.
async fn show_help<T>(stream: &mut T) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
let mut res = BytesMut::new();
let detail_msg = [
"",
"SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|USERS|VERSION",
// "SHOW PEERS|PEER_POOLS", // missing PEERS|PEER_POOLS
// "SHOW FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM|STATE", // missing FDS|SOCKETS|ACTIVE_SOCKETS|MEM|STATE
"SHOW LISTS",
// "SHOW DNS_HOSTS|DNS_ZONES", // missing DNS_HOSTS|DNS_ZONES
"SHOW STATS", // missing STATS_TOTALS|STATS_AVERAGES|TOTALS
"SET key = arg",
"RELOAD",
"PAUSE [<db>, <user>]",
"RESUME [<db>, <user>]",
// "DISABLE <db>", // missing
// "ENABLE <db>", // missing
// "RECONNECT [<db>]", missing
// "KILL <db>",
// "SUSPEND",
"SHUTDOWN",
];
let mut res = BytesMut::new();
res.put(row_description(&columns));
for (user_pool, pool) in get_all_pools() {
let def = HashMap::default();
let pool_stats = all_pool_stats
.get(&(user_pool.db.clone(), user_pool.user.clone()))
.unwrap_or(&def);
let pool_config = &pool.settings;
let mut row = vec![
user_pool.db.clone(),
user_pool.user.clone(),
pool_config.pool_mode.to_string(),
];
for column in &columns[3..columns.len()] {
let value = match column.0 {
"maxwait" => (pool_stats.get("maxwait_us").unwrap_or(&0) / 1_000_000).to_string(),
"maxwait_us" => {
(pool_stats.get("maxwait_us").unwrap_or(&0) % 1_000_000).to_string()
}
_other_values => pool_stats.get(column.0).unwrap_or(&0).to_string(),
};
row.push(value);
}
res.put(data_row(&row));
}
res.put(notify("Console usage", detail_msg.join("\n\t")));
res.put(command_complete("SHOW"));
// ReadyForQuery
@@ -320,17 +361,17 @@ where
let paused = pool.paused();
res.put(data_row(&vec![
address.name(), // name
address.host.to_string(), // host
address.port.to_string(), // port
database_name.to_string(), // database
pool_config.user.username.to_string(), // force_user
pool_config.user.pool_size.to_string(), // pool_size
"0".to_string(), // min_pool_size
"0".to_string(), // reserve_pool
pool_config.pool_mode.to_string(), // pool_mode
pool_config.user.pool_size.to_string(), // max_connections
pool_state.connections.to_string(), // current_connections
address.name(), // name
address.host.to_string(), // host
address.port.to_string(), // port
database_name.to_string(), // database
pool_config.user.username.to_string(), // force_user
pool_config.user.pool_size.to_string(), // pool_size
pool_config.user.min_pool_size.unwrap_or(0).to_string(), // min_pool_size
"0".to_string(), // reserve_pool
pool_config.pool_mode.to_string(), // pool_mode
pool_config.user.pool_size.to_string(), // max_connections
pool_state.connections.to_string(), // current_connections
match paused {
// paused
true => "1".to_string(),
@@ -400,7 +441,7 @@ where
for (id, pool) in get_all_pools().iter() {
for address in pool.get_addresses_from_host(host) {
if !pool.is_banned(&address) {
pool.ban(&address, BanReason::AdminBan(duration_seconds), -1);
pool.ban(&address, BanReason::AdminBan(duration_seconds), None);
res.put(data_row(&vec![
id.db.clone(),
id.user.clone(),
@@ -617,7 +658,6 @@ where
("avg_wait_time", DataType::Numeric),
];
let all_stats = get_address_stats();
let mut res = BytesMut::new();
res.put(row_description(&columns));
@@ -625,15 +665,10 @@ where
for shard in 0..pool.shards() {
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
let stats = match all_stats.get(&address.id) {
Some(stats) => stats.clone(),
None => HashMap::new(),
};
let mut row = vec![address.name(), user_pool.db.clone(), user_pool.user.clone()];
for column in &columns[3..] {
row.push(stats.get(column.0).unwrap_or(&0).to_string());
}
let stats = address.stats.clone();
stats.populate_row(&mut row);
res.put(data_row(&row));
}
@@ -665,6 +700,8 @@ where
("query_count", DataType::Numeric),
("error_count", DataType::Numeric),
("age_seconds", DataType::Numeric),
("maxwait", DataType::Numeric),
("maxwait_us", DataType::Numeric),
];
let new_map = get_client_stats();
@@ -672,19 +709,22 @@ where
res.put(row_description(&columns));
for (_, client) in new_map {
let max_wait = client.max_wait_time.load(Ordering::Relaxed);
let row = vec![
format!("{:#010X}", client.client_id),
client.pool_name,
client.username,
client.application_name.clone(),
client.state.to_string(),
client.transaction_count.to_string(),
client.query_count.to_string(),
client.error_count.to_string(),
format!("{:#010X}", client.client_id()),
client.pool_name(),
client.username(),
client.application_name(),
client.state.load(Ordering::Relaxed).to_string(),
client.transaction_count.load(Ordering::Relaxed).to_string(),
client.query_count.load(Ordering::Relaxed).to_string(),
client.error_count.load(Ordering::Relaxed).to_string(),
Instant::now()
.duration_since(client.connect_time)
.duration_since(client.connect_time())
.as_secs()
.to_string(),
(max_wait / 1_000_000).to_string(),
(max_wait % 1_000_000).to_string(),
];
res.put(data_row(&row));
@@ -717,6 +757,10 @@ where
("bytes_sent", DataType::Numeric),
("bytes_received", DataType::Numeric),
("age_seconds", DataType::Numeric),
("prepare_cache_hit", DataType::Numeric),
("prepare_cache_miss", DataType::Numeric),
("prepare_cache_eviction", DataType::Numeric),
("prepare_cache_size", DataType::Numeric),
];
let new_map = get_server_stats();
@@ -724,21 +768,38 @@ where
res.put(row_description(&columns));
for (_, server) in new_map {
let application_name = server.application_name.read();
let row = vec![
format!("{:#010X}", server.server_id),
server.pool_name,
server.username,
server.address_name,
server.application_name,
server.state.to_string(),
server.transaction_count.to_string(),
server.query_count.to_string(),
server.bytes_sent.to_string(),
server.bytes_received.to_string(),
format!("{:#010X}", server.server_id()),
server.pool_name(),
server.username(),
server.address_name(),
application_name.clone(),
server.state.load(Ordering::Relaxed).to_string(),
server.transaction_count.load(Ordering::Relaxed).to_string(),
server.query_count.load(Ordering::Relaxed).to_string(),
server.bytes_sent.load(Ordering::Relaxed).to_string(),
server.bytes_received.load(Ordering::Relaxed).to_string(),
Instant::now()
.duration_since(server.connect_time)
.duration_since(server.connect_time())
.as_secs()
.to_string(),
server
.prepared_hit_count
.load(Ordering::Relaxed)
.to_string(),
server
.prepared_miss_count
.load(Ordering::Relaxed)
.to_string(),
server
.prepared_eviction_count
.load(Ordering::Relaxed)
.to_string(),
server
.prepared_cache_size
.load(Ordering::Relaxed)
.to_string(),
];
res.put(data_row(&row));
@@ -755,96 +816,128 @@ where
}
/// Pause a pool. It won't pass any more queries to the backends.
async fn pause<T>(stream: &mut T, query: &str) -> Result<(), Error>
async fn pause<T>(stream: &mut T, tokens: Vec<&str>) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
let parts: Vec<&str> = query.split(",").map(|part| part.trim()).collect();
let parts: Vec<&str> = match tokens.len() == 2 {
true => tokens[1].split(',').map(|part| part.trim()).collect(),
false => Vec::new(),
};
if parts.len() != 2 {
error_response(
stream,
"PAUSE requires a database and a user, e.g. PAUSE my_db,my_user",
)
.await
} else {
let database = parts[0];
let user = parts[1];
match get_pool(database, user) {
Some(pool) => {
match parts.len() {
0 => {
for (_, pool) in get_all_pools() {
pool.pause();
let mut res = BytesMut::new();
res.put(command_complete(&format!("PAUSE {},{}", database, user)));
// ReadyForQuery
res.put_u8(b'Z');
res.put_i32(5);
res.put_u8(b'I');
write_all_half(stream, &res).await
}
None => {
error_response(
stream,
&format!(
"No pool configured for database: {}, user: {}",
database, user
),
)
.await
let mut res = BytesMut::new();
res.put(command_complete("PAUSE"));
// ReadyForQuery
res.put_u8(b'Z');
res.put_i32(5);
res.put_u8(b'I');
write_all_half(stream, &res).await
}
2 => {
let database = parts[0];
let user = parts[1];
match get_pool(database, user) {
Some(pool) => {
pool.pause();
let mut res = BytesMut::new();
res.put(command_complete(&format!("PAUSE {},{}", database, user)));
// ReadyForQuery
res.put_u8(b'Z');
res.put_i32(5);
res.put_u8(b'I');
write_all_half(stream, &res).await
}
None => {
error_response(
stream,
&format!(
"No pool configured for database: {}, user: {}",
database, user
),
)
.await
}
}
}
_ => error_response(stream, "usage: PAUSE [db, user]").await,
}
}
/// Resume a pool. Queries are allowed again.
async fn resume<T>(stream: &mut T, query: &str) -> Result<(), Error>
async fn resume<T>(stream: &mut T, tokens: Vec<&str>) -> Result<(), Error>
where
T: tokio::io::AsyncWrite + std::marker::Unpin,
{
let parts: Vec<&str> = query.split(",").map(|part| part.trim()).collect();
let parts: Vec<&str> = match tokens.len() == 2 {
true => tokens[1].split(',').map(|part| part.trim()).collect(),
false => Vec::new(),
};
if parts.len() != 2 {
error_response(
stream,
"RESUME requires a database and a user, e.g. RESUME my_db,my_user",
)
.await
} else {
let database = parts[0];
let user = parts[1];
match get_pool(database, user) {
Some(pool) => {
match parts.len() {
0 => {
for (_, pool) in get_all_pools() {
pool.resume();
let mut res = BytesMut::new();
res.put(command_complete(&format!("RESUME {},{}", database, user)));
// ReadyForQuery
res.put_u8(b'Z');
res.put_i32(5);
res.put_u8(b'I');
write_all_half(stream, &res).await
}
None => {
error_response(
stream,
&format!(
"No pool configured for database: {}, user: {}",
database, user
),
)
.await
let mut res = BytesMut::new();
res.put(command_complete("RESUME"));
// ReadyForQuery
res.put_u8(b'Z');
res.put_i32(5);
res.put_u8(b'I');
write_all_half(stream, &res).await
}
2 => {
let database = parts[0];
let user = parts[1];
match get_pool(database, user) {
Some(pool) => {
pool.resume();
let mut res = BytesMut::new();
res.put(command_complete(&format!("RESUME {},{}", database, user)));
// ReadyForQuery
res.put_u8(b'Z');
res.put_i32(5);
res.put_u8(b'I');
write_all_half(stream, &res).await
}
None => {
error_response(
stream,
&format!(
"No pool configured for database: {}, user: {}",
database, user
),
)
.await
}
}
}
_ => error_response(stream, "usage: RESUME [db, user]").await,
}
}

138
src/auth_passthrough.rs Normal file
View File

@@ -0,0 +1,138 @@
use crate::config::AuthType;
use crate::errors::Error;
use crate::pool::ConnectionPool;
use crate::server::Server;
use log::debug;
#[derive(Clone, Debug)]
pub struct AuthPassthrough {
password: String,
query: String,
user: String,
}
impl AuthPassthrough {
/// Initializes an AuthPassthrough.
pub fn new(query: &str, user: &str, password: &str) -> Self {
AuthPassthrough {
password: password.to_string(),
query: query.to_string(),
user: user.to_string(),
}
}
/// Returns an AuthPassthrough given the pool configuration.
/// If any of required values is not set, None is returned.
pub fn from_pool_config(pool_config: &crate::config::Pool) -> Option<Self> {
if pool_config.is_auth_query_configured() {
return Some(AuthPassthrough::new(
pool_config.auth_query.as_ref().unwrap(),
pool_config.auth_query_user.as_ref().unwrap(),
pool_config.auth_query_password.as_ref().unwrap(),
));
}
None
}
/// Returns an AuthPassthrough given the pool settings.
/// If any of required values is not set, None is returned.
pub fn from_pool_settings(pool_settings: &crate::pool::PoolSettings) -> Option<Self> {
let pool_config = crate::config::Pool {
auth_query: pool_settings.auth_query.clone(),
auth_query_password: pool_settings.auth_query_password.clone(),
auth_query_user: pool_settings.auth_query_user.clone(),
..Default::default()
};
AuthPassthrough::from_pool_config(&pool_config)
}
/// Connects to server and executes auth_query for the specified address.
/// If the response is a row with two columns containing the username set in the address.
/// and its MD5 hash, the MD5 hash returned.
///
/// Note that the query is executed, changing $1 with the name of the user
/// this is so we only hold in memory (and transfer) the least amount of 'sensitive' data.
/// Also, it is compatible with pgbouncer.
///
/// # Arguments
///
/// * `address` - An Address of the server we want to connect to. The username for the hash will be obtained from this value.
///
/// # Examples
///
/// ```
/// use pgcat::auth_passthrough::AuthPassthrough;
/// use pgcat::config::Address;
/// let auth_passthrough = AuthPassthrough::new("SELECT * FROM public.user_lookup('$1');", "postgres", "postgres");
/// auth_passthrough.fetch_hash(&Address::default());
/// ```
///
pub async fn fetch_hash(&self, address: &crate::config::Address) -> Result<String, Error> {
let auth_user = crate::config::User {
username: self.user.clone(),
auth_type: AuthType::MD5,
password: Some(self.password.clone()),
server_username: None,
server_password: None,
pool_size: 1,
statement_timeout: 0,
pool_mode: None,
server_lifetime: None,
min_pool_size: None,
connect_timeout: None,
idle_timeout: None,
};
let user = &address.username;
debug!("Connecting to server to obtain auth hashes");
let auth_query = self.query.replace("$1", user);
match Server::exec_simple_query(address, &auth_user, &auth_query).await {
Ok(password_data) => {
if password_data.len() == 2 && password_data.first().unwrap() == user {
if let Some(stripped_hash) = password_data
.last()
.unwrap()
.to_string()
.strip_prefix("md5") {
Ok(stripped_hash.to_string())
}
else {
Err(Error::AuthPassthroughError(
"Obtained hash from auth_query does not seem to be in md5 format.".to_string(),
))
}
} else {
Err(Error::AuthPassthroughError(
"Data obtained from query does not follow the scheme 'user','hash'."
.to_string(),
))
}
}
Err(err) => {
Err(Error::AuthPassthroughError(
format!("Error trying to obtain password from auth_query, ignoring hash for user '{}'. Error: {:?}",
user, err))
)
}
}
}
}
pub async fn refetch_auth_hash(pool: &ConnectionPool) -> Result<String, Error> {
let address = pool.address(0, 0);
if let Some(apt) = AuthPassthrough::from_pool_settings(&pool.settings) {
let hash = apt.fetch_hash(address).await?;
return Ok(hash);
}
Err(Error::ClientError(format!(
"Could not obtain hash for {{ username: {:?}, database: {:?} }}. Auth passthrough not enabled.",
address.username, address.database
)))
}

File diff suppressed because it is too large Load Diff

36
src/cmd_args.rs Normal file
View File

@@ -0,0 +1,36 @@
use clap::{Parser, ValueEnum};
use tracing::Level;
/// PgCat: Nextgen PostgreSQL Pooler
#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
pub struct Args {
#[arg(default_value_t = String::from("pgcat.toml"), env)]
pub config_file: String,
#[arg(short, long, default_value_t = tracing::Level::INFO, env)]
pub log_level: Level,
#[clap(short='F', long, value_enum, default_value_t=LogFormat::Text, env)]
pub log_format: LogFormat,
#[arg(
short,
long,
default_value_t = false,
env,
help = "disable colors in the log output"
)]
pub no_color: bool,
}
pub fn parse() -> Args {
Args::parse()
}
#[derive(ValueEnum, Clone, Debug)]
pub enum LogFormat {
Text,
Structured,
Debug,
}

File diff suppressed because it is too large Load Diff

410
src/dns_cache.rs Normal file
View File

@@ -0,0 +1,410 @@
use crate::config::get_config;
use crate::errors::Error;
use arc_swap::ArcSwap;
use log::{debug, error, info, warn};
use once_cell::sync::Lazy;
use std::collections::{HashMap, HashSet};
use std::io;
use std::net::IpAddr;
use std::sync::Arc;
use std::sync::RwLock;
use tokio::time::{sleep, Duration};
use trust_dns_resolver::error::{ResolveError, ResolveResult};
use trust_dns_resolver::lookup_ip::LookupIp;
use trust_dns_resolver::TokioAsyncResolver;
/// Cached Resolver Globally available
pub static CACHED_RESOLVER: Lazy<ArcSwap<CachedResolver>> =
Lazy::new(|| ArcSwap::from_pointee(CachedResolver::default()));
// Ip addressed are returned as a set of addresses
// so we can compare.
#[derive(Clone, PartialEq, Debug)]
pub struct AddrSet {
set: HashSet<IpAddr>,
}
impl AddrSet {
fn new() -> AddrSet {
AddrSet {
set: HashSet::new(),
}
}
}
impl From<LookupIp> for AddrSet {
fn from(lookup_ip: LookupIp) -> Self {
let mut addr_set = AddrSet::new();
for address in lookup_ip.iter() {
addr_set.set.insert(address);
}
addr_set
}
}
///
/// A CachedResolver is a DNS resolution cache mechanism with customizable expiration time.
///
/// The system works as follows:
///
/// When a host is to be resolved, if we have not resolved it before, a new resolution is
/// executed and stored in the internal cache. Concurrently, every `dns_max_ttl` time, the
/// cache is refreshed.
///
/// # Example:
///
/// ```
/// use pgcat::dns_cache::{CachedResolverConfig, CachedResolver};
///
/// # tokio_test::block_on(async {
/// let config = CachedResolverConfig::default();
/// let resolver = CachedResolver::new(config, None).await.unwrap();
/// let addrset = resolver.lookup_ip("www.example.com.").await.unwrap();
/// # })
/// ```
///
/// // Now the ip resolution is stored in local cache and subsequent
/// // calls will be returned from cache. Also, the cache is refreshed
/// // and updated every 10 seconds.
///
/// // You can now check if an 'old' lookup differs from what it's currently
/// // store in cache by using `has_changed`.
/// resolver.has_changed("www.example.com.", addrset)
#[derive(Default)]
pub struct CachedResolver {
// The configuration of the cached_resolver.
config: CachedResolverConfig,
// This is the hash that contains the hash.
data: Option<RwLock<HashMap<String, AddrSet>>>,
// The resolver to be used for DNS queries.
resolver: Option<TokioAsyncResolver>,
// The RefreshLoop
refresh_loop: RwLock<Option<tokio::task::JoinHandle<()>>>,
}
///
/// Configuration
#[derive(Clone, Debug, Default, PartialEq)]
pub struct CachedResolverConfig {
/// Amount of time in secods that a resolved dns address is considered stale.
dns_max_ttl: u64,
/// Enabled or disabled? (this is so we can reload config)
enabled: bool,
}
impl CachedResolverConfig {
fn new(dns_max_ttl: u64, enabled: bool) -> Self {
CachedResolverConfig {
dns_max_ttl,
enabled,
}
}
}
impl From<crate::config::Config> for CachedResolverConfig {
fn from(config: crate::config::Config) -> Self {
CachedResolverConfig::new(config.general.dns_max_ttl, config.general.dns_cache_enabled)
}
}
impl CachedResolver {
///
/// Returns a new Arc<CachedResolver> based on passed configuration.
/// It also starts the loop that will refresh cache entries.
///
/// # Arguments:
///
/// * `config` - The `CachedResolverConfig` to be used to create the resolver.
///
/// # Example:
///
/// ```
/// use pgcat::dns_cache::{CachedResolverConfig, CachedResolver};
///
/// # tokio_test::block_on(async {
/// let config = CachedResolverConfig::default();
/// let resolver = CachedResolver::new(config, None).await.unwrap();
/// # })
/// ```
///
pub async fn new(
config: CachedResolverConfig,
data: Option<HashMap<String, AddrSet>>,
) -> Result<Arc<Self>, io::Error> {
// Construct a new Resolver with default configuration options
let resolver = Some(TokioAsyncResolver::tokio_from_system_conf()?);
let data = if let Some(hash) = data {
Some(RwLock::new(hash))
} else {
Some(RwLock::new(HashMap::new()))
};
let instance = Arc::new(Self {
config,
resolver,
data,
refresh_loop: RwLock::new(None),
});
if instance.enabled() {
info!("Scheduling DNS refresh loop");
let refresh_loop = tokio::task::spawn({
let instance = instance.clone();
async move {
instance.refresh_dns_entries_loop().await;
}
});
*(instance.refresh_loop.write().unwrap()) = Some(refresh_loop);
}
Ok(instance)
}
pub fn enabled(&self) -> bool {
self.config.enabled
}
// Schedules the refresher
async fn refresh_dns_entries_loop(&self) {
let resolver = TokioAsyncResolver::tokio_from_system_conf().unwrap();
let interval = Duration::from_secs(self.config.dns_max_ttl);
loop {
debug!("Begin refreshing cached DNS addresses.");
// To minimize the time we hold the lock, we first create
// an array with keys.
let mut hostnames: Vec<String> = Vec::new();
{
if let Some(ref data) = self.data {
for hostname in data.read().unwrap().keys() {
hostnames.push(hostname.clone());
}
}
}
for hostname in hostnames.iter() {
let addrset = self
.fetch_from_cache(hostname.as_str())
.expect("Could not obtain expected address from cache, this should not happen");
match resolver.lookup_ip(hostname).await {
Ok(lookup_ip) => {
let new_addrset = AddrSet::from(lookup_ip);
debug!(
"Obtained address for host ({}) -> ({:?})",
hostname, new_addrset
);
if addrset != new_addrset {
debug!(
"Addr changed from {:?} to {:?} updating cache.",
addrset, new_addrset
);
self.store_in_cache(hostname, new_addrset);
}
}
Err(err) => {
error!(
"There was an error trying to resolv {}: ({}).",
hostname, err
);
}
}
}
debug!("Finished refreshing cached DNS addresses.");
sleep(interval).await;
}
}
/// Returns a `AddrSet` given the specified hostname.
///
/// This method first tries to fetch the value from the cache, if it misses
/// then it is resolved and stored in the cache. TTL from records is ignored.
///
/// # Arguments
///
/// * `host` - A string slice referencing the hostname to be resolved.
///
/// # Example:
///
/// ```
/// use pgcat::dns_cache::{CachedResolverConfig, CachedResolver};
///
/// # tokio_test::block_on(async {
/// let config = CachedResolverConfig::default();
/// let resolver = CachedResolver::new(config, None).await.unwrap();
/// let response = resolver.lookup_ip("www.google.com.");
/// # })
/// ```
///
pub async fn lookup_ip(&self, host: &str) -> ResolveResult<AddrSet> {
debug!("Lookup up {} in cache", host);
match self.fetch_from_cache(host) {
Some(addr_set) => {
debug!("Cache hit!");
Ok(addr_set)
}
None => {
debug!("Not found, executing a dns query!");
if let Some(ref resolver) = self.resolver {
let addr_set = AddrSet::from(resolver.lookup_ip(host).await?);
debug!("Obtained: {:?}", addr_set);
self.store_in_cache(host, addr_set.clone());
Ok(addr_set)
} else {
Err(ResolveError::from("No resolver available"))
}
}
}
}
//
// Returns true if the stored host resolution differs from the AddrSet passed.
pub fn has_changed(&self, host: &str, addr_set: &AddrSet) -> bool {
if let Some(fetched_addr_set) = self.fetch_from_cache(host) {
return fetched_addr_set != *addr_set;
}
false
}
// Fetches an AddrSet from the inner cache adquiring the read lock.
fn fetch_from_cache(&self, key: &str) -> Option<AddrSet> {
if let Some(ref hash) = self.data {
if let Some(addr_set) = hash.read().unwrap().get(key) {
return Some(addr_set.clone());
}
}
None
}
// Sets up the global CACHED_RESOLVER static variable so we can globally use DNS
// cache.
pub async fn from_config() -> Result<(), Error> {
let cached_resolver = CACHED_RESOLVER.load();
let desired_config = CachedResolverConfig::from(get_config());
if cached_resolver.config != desired_config {
if let Some(ref refresh_loop) = *(cached_resolver.refresh_loop.write().unwrap()) {
warn!("Killing Dnscache refresh loop as its configuration is being reloaded");
refresh_loop.abort()
}
let new_resolver = if let Some(ref data) = cached_resolver.data {
let data = Some(data.read().unwrap().clone());
CachedResolver::new(desired_config, data).await
} else {
CachedResolver::new(desired_config, None).await
};
match new_resolver {
Ok(ok) => {
CACHED_RESOLVER.store(ok);
Ok(())
}
Err(err) => {
let message = format!("Error setting up cached_resolver. Error: {:?}, will continue without this feature.", err);
Err(Error::DNSCachedError(message))
}
}
} else {
Ok(())
}
}
// Stores the AddrSet in cache adquiring the write lock.
fn store_in_cache(&self, host: &str, addr_set: AddrSet) {
if let Some(ref data) = self.data {
data.write().unwrap().insert(host.to_string(), addr_set);
} else {
error!("Could not insert, Hash not initialized");
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use trust_dns_resolver::error::ResolveError;
#[tokio::test]
async fn new() {
let config = CachedResolverConfig {
dns_max_ttl: 10,
enabled: true,
};
let resolver = CachedResolver::new(config, None).await;
assert!(resolver.is_ok());
}
#[tokio::test]
async fn lookup_ip() {
let config = CachedResolverConfig {
dns_max_ttl: 10,
enabled: true,
};
let resolver = CachedResolver::new(config, None).await.unwrap();
let response = resolver.lookup_ip("www.google.com.").await;
assert!(response.is_ok());
}
#[tokio::test]
async fn has_changed() {
let config = CachedResolverConfig {
dns_max_ttl: 10,
enabled: true,
};
let resolver = CachedResolver::new(config, None).await.unwrap();
let hostname = "www.google.com.";
let response = resolver.lookup_ip(hostname).await;
let addr_set = response.unwrap();
assert!(!resolver.has_changed(hostname, &addr_set));
}
#[tokio::test]
async fn unknown_host() {
let config = CachedResolverConfig {
dns_max_ttl: 10,
enabled: true,
};
let resolver = CachedResolver::new(config, None).await.unwrap();
let hostname = "www.idontexists.";
let response = resolver.lookup_ip(hostname).await;
assert!(matches!(response, Err(ResolveError { .. })));
}
#[tokio::test]
async fn incorrect_address() {
let config = CachedResolverConfig {
dns_max_ttl: 10,
enabled: true,
};
let resolver = CachedResolver::new(config, None).await.unwrap();
let hostname = "w ww.idontexists.";
let response = resolver.lookup_ip(hostname).await;
assert!(matches!(response, Err(ResolveError { .. })));
assert!(!resolver.has_changed(hostname, &AddrSet::new()));
}
#[tokio::test]
// Ok, this test is based on the fact that google does DNS RR
// and does not responds with every available ip everytime, so
// if I cache here, it will miss after one cache iteration or two.
async fn thread() {
let config = CachedResolverConfig {
dns_max_ttl: 10,
enabled: true,
};
let resolver = CachedResolver::new(config, None).await.unwrap();
let hostname = "www.google.com.";
let response = resolver.lookup_ip(hostname).await;
let addr_set = response.unwrap();
assert!(!resolver.has_changed(hostname, &addr_set));
let resolver_for_refresher = resolver.clone();
let _thread_handle = tokio::task::spawn(async move {
resolver_for_refresher.refresh_dns_entries_loop().await;
});
assert!(!resolver.has_changed(hostname, &addr_set));
}
}

View File

@@ -1,18 +1,133 @@
/// Errors.
//! Errors.
/// Various errors.
#[derive(Debug, PartialEq)]
#[derive(Debug, PartialEq, Clone)]
pub enum Error {
SocketError(String),
ClientSocketError(String, ClientIdentifier),
ClientGeneralError(String, ClientIdentifier),
ClientAuthImpossible(String),
ClientAuthPassthroughError(String, ClientIdentifier),
ClientBadStartup,
ProtocolSyncError(String),
BadQuery(String),
ServerError,
ServerMessageParserError(String),
ServerStartupError(String, ServerIdentifier),
ServerAuthError(String, ServerIdentifier),
BadConfig,
AllServersDown,
ClientError(String),
TlsError,
StatementTimeout,
DNSCachedError(String),
ShuttingDown,
ParseBytesError(String),
AuthError(String),
AuthPassthroughError(String),
UnsupportedStatement,
QueryRouterParserError(String),
QueryRouterError(String),
InvalidShardId(usize),
PreparedStatementError,
}
#[derive(Clone, PartialEq, Debug)]
pub struct ClientIdentifier {
pub application_name: String,
pub username: String,
pub pool_name: String,
}
impl ClientIdentifier {
pub fn new(application_name: &str, username: &str, pool_name: &str) -> ClientIdentifier {
ClientIdentifier {
application_name: application_name.into(),
username: username.into(),
pool_name: pool_name.into(),
}
}
}
impl std::fmt::Display for ClientIdentifier {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(
f,
"{{ application_name: {}, username: {}, pool_name: {} }}",
self.application_name, self.username, self.pool_name
)
}
}
#[derive(Clone, PartialEq, Debug)]
pub struct ServerIdentifier {
pub username: String,
pub database: String,
}
impl ServerIdentifier {
pub fn new(username: &str, database: &str) -> ServerIdentifier {
ServerIdentifier {
username: username.into(),
database: database.into(),
}
}
}
impl std::fmt::Display for ServerIdentifier {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(
f,
"{{ username: {}, database: {} }}",
self.username, self.database
)
}
}
impl std::fmt::Display for Error {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
match &self {
&Error::ClientSocketError(error, client_identifier) => write!(
f,
"Error reading {} from client {}",
error, client_identifier
),
&Error::ClientGeneralError(error, client_identifier) => {
write!(f, "{} {}", error, client_identifier)
}
&Error::ClientAuthImpossible(username) => write!(
f,
"Client auth not possible, \
no cleartext password set for username: {} \
in config and auth passthrough (query_auth) \
is not set up.",
username
),
&Error::ClientAuthPassthroughError(error, client_identifier) => write!(
f,
"No cleartext password set, \
and no auth passthrough could not \
obtain the hash from server for {}, \
the error was: {}",
client_identifier, error
),
&Error::ServerStartupError(error, server_identifier) => write!(
f,
"Error reading {} on server startup {}",
error, server_identifier,
),
&Error::ServerAuthError(error, server_identifier) => {
write!(f, "{} for {}", error, server_identifier,)
}
// The rest can use Debug.
err => write!(f, "{:?}", err),
}
}
}
impl From<std::ffi::NulError> for Error {
fn from(err: std::ffi::NulError) -> Self {
Error::QueryRouterError(err.to_string())
}
}

View File

@@ -1,10 +1,18 @@
pub mod admin;
pub mod auth_passthrough;
pub mod client;
pub mod cmd_args;
pub mod config;
pub mod constants;
pub mod dns_cache;
pub mod errors;
pub mod logger;
pub mod messages;
pub mod mirrors;
pub mod multi_logger;
pub mod plugins;
pub mod pool;
pub mod prometheus;
pub mod query_router;
pub mod scram;
pub mod server;
pub mod sharding;

20
src/logger.rs Normal file
View File

@@ -0,0 +1,20 @@
use crate::cmd_args::{Args, LogFormat};
use tracing_subscriber;
use tracing_subscriber::EnvFilter;
pub fn init(args: &Args) {
// Iniitalize a default filter, and then override the builtin default "warning" with our
// commandline, (default: "info")
let filter = EnvFilter::from_default_env().add_directive(args.log_level.into());
let trace_sub = tracing_subscriber::fmt()
.with_thread_ids(true)
.with_env_filter(filter)
.with_ansi(!args.no_color);
match args.log_format {
LogFormat::Structured => trace_sub.json().init(),
LogFormat::Debug => trace_sub.pretty().init(),
_ => trace_sub.init(),
};
}

View File

@@ -23,7 +23,6 @@ extern crate arc_swap;
extern crate async_trait;
extern crate bb8;
extern crate bytes;
extern crate env_logger;
extern crate exitcode;
extern crate log;
extern crate md5;
@@ -36,6 +35,7 @@ extern crate sqlparser;
extern crate tokio;
extern crate tokio_rustls;
extern crate toml;
extern crate trust_dns_resolver;
#[cfg(not(target_env = "msvc"))]
use jemallocator::Jemalloc;
@@ -60,52 +60,32 @@ use std::str::FromStr;
use std::sync::Arc;
use tokio::sync::broadcast;
mod admin;
mod client;
mod config;
mod constants;
mod errors;
mod messages;
mod mirrors;
mod multi_logger;
mod pool;
mod prometheus;
mod query_router;
mod scram;
mod server;
mod sharding;
mod stats;
mod tls;
use crate::config::{get_config, reload_config, VERSION};
use crate::pool::{ClientServerMap, ConnectionPool};
use crate::prometheus::start_metric_server;
use crate::stats::{Collector, Reporter, REPORTER};
use pgcat::cmd_args;
use pgcat::config::{get_config, reload_config, VERSION};
use pgcat::dns_cache;
use pgcat::logger;
use pgcat::messages::configure_socket;
use pgcat::pool::{ClientServerMap, ConnectionPool};
use pgcat::prometheus::start_metric_server;
use pgcat::stats::{Collector, Reporter, REPORTER};
fn main() -> Result<(), Box<dyn std::error::Error>> {
multi_logger::MultiLogger::init().unwrap();
let args = cmd_args::parse();
logger::init(&args);
info!("Welcome to PgCat! Meow. (Version {})", VERSION);
if !query_router::QueryRouter::setup() {
if !pgcat::query_router::QueryRouter::setup() {
error!("Could not setup query router");
std::process::exit(exitcode::CONFIG);
}
let args = std::env::args().collect::<Vec<String>>();
let config_file = if args.len() == 2 {
args[1].to_string()
} else {
String::from("pgcat.toml")
};
// Create a transient runtime for loading the config for the first time.
{
let runtime = Builder::new_multi_thread().worker_threads(1).build()?;
runtime.block_on(async {
match config::parse(&config_file).await {
match pgcat::config::parse(args.config_file.as_str()).await {
Ok(_) => (),
Err(err) => {
error!("Config parse error: {:?}", err);
@@ -162,8 +142,13 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
let client_server_map: ClientServerMap = Arc::new(Mutex::new(HashMap::new()));
// Statistics reporting.
let (stats_tx, stats_rx) = mpsc::channel(500_000);
REPORTER.store(Arc::new(Reporter::new(stats_tx.clone())));
REPORTER.store(Arc::new(Reporter::default()));
// Starts (if enabled) dns cache before pools initialization
match dns_cache::CachedResolver::from_config().await {
Ok(_) => (),
Err(err) => error!("DNS cache initialization error: {:?}", err),
};
// Connection pool that allows to query all shards and replicas.
match ConnectionPool::from_config(client_server_map.clone()).await {
@@ -175,20 +160,23 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
};
tokio::task::spawn(async move {
let mut stats_collector = Collector::new(stats_rx, stats_tx.clone());
let mut stats_collector = Collector::default();
stats_collector.collect().await;
});
info!("Config autoreloader: {}", config.general.autoreload);
info!("Config autoreloader: {}", match config.general.autoreload {
Some(interval) => format!("{} ms", interval),
None => "disabled".into(),
});
let mut autoreload_interval = tokio::time::interval(tokio::time::Duration::from_millis(15_000));
let autoreload_client_server_map = client_server_map.clone();
if let Some(interval) = config.general.autoreload {
let mut autoreload_interval = tokio::time::interval(tokio::time::Duration::from_millis(interval));
let autoreload_client_server_map = client_server_map.clone();
tokio::task::spawn(async move {
loop {
autoreload_interval.tick().await;
if config.general.autoreload {
info!("Automatically reloading config");
tokio::task::spawn(async move {
loop {
autoreload_interval.tick().await;
debug!("Automatically reloading config");
if let Ok(changed) = reload_config(autoreload_client_server_map.clone()).await {
if changed {
@@ -196,8 +184,10 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
}
};
}
}
});
});
};
#[cfg(windows)]
let mut term_signal = win_signal::ctrl_close().unwrap();
@@ -282,18 +272,20 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
let drain_tx = drain_tx.clone();
let client_server_map = client_server_map.clone();
let tls_certificate = config.general.tls_certificate.clone();
let tls_certificate = get_config().general.tls_certificate.clone();
configure_socket(&socket);
tokio::task::spawn(async move {
let start = chrono::offset::Utc::now().naive_utc();
match client::client_entrypoint(
match pgcat::client::client_entrypoint(
socket,
client_server_map,
shutdown_rx,
drain_tx,
admin_only,
tls_certificate.clone(),
tls_certificate,
config.general.log_client_connections,
)
.await
@@ -301,7 +293,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
Ok(()) => {
let duration = chrono::offset::Utc::now().naive_utc() - start;
if config.general.log_client_disconnections {
if get_config().general.log_client_disconnections {
info!(
"Client {:?} disconnected, session duration: {}",
addr,
@@ -318,7 +310,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
Err(err) => {
match err {
errors::Error::ClientBadStartup => debug!("Client disconnected with error {:?}", err),
pgcat::errors::Error::ClientBadStartup => debug!("Client disconnected with error {:?}", err),
_ => warn!("Client disconnected with error {:?}", err),
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,11 +1,13 @@
use std::sync::Arc;
/// A mirrored PostgreSQL client.
/// Packets arrive to us through a channel from the main client and we send them to the server.
use bb8::Pool;
use bytes::{Bytes, BytesMut};
use parking_lot::RwLock;
use crate::config::{get_config, Address, Role, User};
use crate::pool::{ClientServerMap, ServerPool};
use crate::stats::get_reporter;
use log::{error, info, trace, warn};
use tokio::sync::mpsc::{channel, Receiver, Sender};
@@ -21,20 +23,27 @@ impl MirroredClient {
async fn create_pool(&self) -> Pool<ServerPool> {
let config = get_config();
let default = std::time::Duration::from_millis(10_000).as_millis() as u64;
let (connection_timeout, idle_timeout) = match config.pools.get(&self.address.pool_name) {
Some(cfg) => (
cfg.connect_timeout.unwrap_or(default),
cfg.idle_timeout.unwrap_or(default),
),
None => (default, default),
};
let (connection_timeout, idle_timeout, _cfg, prepared_statement_cache_size) =
match config.pools.get(&self.address.pool_name) {
Some(cfg) => (
cfg.connect_timeout.unwrap_or(default),
cfg.idle_timeout.unwrap_or(default),
cfg.clone(),
cfg.prepared_statements_cache_size,
),
None => (default, default, crate::config::Pool::default(), 0),
};
let manager = ServerPool::new(
self.address.clone(),
self.user.clone(),
self.database.as_str(),
ClientServerMap::default(),
get_reporter(),
Arc::new(RwLock::new(None)),
None,
true,
false,
prepared_statement_cache_size,
);
Pool::builder()
@@ -72,12 +81,13 @@ impl MirroredClient {
}
// Incoming data from server (we read to clear the socket buffer and discard the data)
recv_result = server.recv() => {
recv_result = server.recv(None) => {
match recv_result {
Ok(message) => trace!("Received from mirror: {} {:?}", String::from_utf8_lossy(&message[..]), address.clone()),
Err(err) => {
server.mark_bad();
error!("Failed to receive from mirror {:?} {:?}", err, address.clone());
server.mark_bad(
format!("Failed to send to mirror, Discarding message {:?}, {:?}", err, address.clone()).as_str()
);
}
}
}
@@ -89,8 +99,9 @@ impl MirroredClient {
match server.send(&BytesMut::from(&bytes[..])).await {
Ok(_) => trace!("Sent to mirror: {} {:?}", String::from_utf8_lossy(&bytes[..]), address.clone()),
Err(err) => {
server.mark_bad();
error!("Failed to send to mirror, Discarding message {:?}, {:?}", err, address.clone())
server.mark_bad(
format!("Failed to receive from mirror {:?} {:?}", err, address.clone()).as_str()
);
}
}
}
@@ -130,18 +141,18 @@ impl MirroringManager {
bytes_rx,
disconnect_rx: exit_rx,
};
exit_senders.push(exit_tx.clone());
byte_senders.push(bytes_tx.clone());
exit_senders.push(exit_tx);
byte_senders.push(bytes_tx);
client.start();
});
Self {
byte_senders: byte_senders,
byte_senders,
disconnect_senders: exit_senders,
}
}
pub fn send(self: &mut Self, bytes: &BytesMut) {
pub fn send(&mut self, bytes: &BytesMut) {
// We want to avoid performing an allocation if we won't be able to send the message
// There is a possibility of a race here where we check the capacity and then the channel is
// closed or the capacity is reduced to 0, but mirroring is best effort anyway
@@ -163,7 +174,7 @@ impl MirroringManager {
});
}
pub fn disconnect(self: &mut Self) {
pub fn disconnect(&mut self) {
self.disconnect_senders
.iter_mut()
.for_each(|sender| match sender.try_send(()) {

View File

@@ -1,80 +0,0 @@
use log::{Level, Log, Metadata, Record, SetLoggerError};
// This is a special kind of logger that allows sending logs to different
// targets depending on the log level.
//
// By default, if nothing is set, it acts as a regular env_log logger,
// it sends everything to standard error.
//
// If the Env variable `STDOUT_LOG` is defined, it will be used for
// configuring the standard out logger.
//
// The behavior is:
// - If it is an error, the message is written to standard error.
// - If it is not, and it matches the log level of the standard output logger (`STDOUT_LOG` env var), it will be send to standard output.
// - If the above is not true, it is sent to the stderr logger that will log it or not depending on the value
// of the RUST_LOG env var.
//
// So to summarize, if no `STDOUT_LOG` env var is present, the logger is the default logger. If `STDOUT_LOG` is set, everything
// but errors, that matches the log level set in the `STDOUT_LOG` env var is sent to stdout. You can have also some esoteric configuration
// where you set `RUST_LOG=debug` and `STDOUT_LOG=info`, in here, erros will go to stderr, warns and infos to stdout and debugs to stderr.
//
pub struct MultiLogger {
stderr_logger: env_logger::Logger,
stdout_logger: env_logger::Logger,
}
impl MultiLogger {
fn new() -> Self {
let stderr_logger = env_logger::builder().format_timestamp_micros().build();
let stdout_logger = env_logger::Builder::from_env("STDOUT_LOG")
.format_timestamp_micros()
.target(env_logger::Target::Stdout)
.build();
Self {
stderr_logger,
stdout_logger,
}
}
pub fn init() -> Result<(), SetLoggerError> {
let logger = Self::new();
log::set_max_level(logger.stderr_logger.filter());
log::set_boxed_logger(Box::new(logger))
}
}
impl Log for MultiLogger {
fn enabled(&self, metadata: &Metadata) -> bool {
self.stderr_logger.enabled(metadata) && self.stdout_logger.enabled(metadata)
}
fn log(&self, record: &Record) {
if record.level() == Level::Error {
self.stderr_logger.log(record);
} else {
if self.stdout_logger.matches(record) {
self.stdout_logger.log(record);
} else {
self.stderr_logger.log(record);
}
}
}
fn flush(&self) {
self.stderr_logger.flush();
self.stdout_logger.flush();
}
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn test_init() {
MultiLogger::init().unwrap();
}
}

120
src/plugins/intercept.rs Normal file
View File

@@ -0,0 +1,120 @@
//! The intercept plugin.
//!
//! It intercepts queries and returns fake results.
use async_trait::async_trait;
use bytes::{BufMut, BytesMut};
use serde::{Deserialize, Serialize};
use sqlparser::ast::Statement;
use log::debug;
use crate::{
config::Intercept as InterceptConfig,
errors::Error,
messages::{command_complete, data_row_nullable, row_description, DataType},
plugins::{Plugin, PluginOutput},
query_router::QueryRouter,
};
// TODO: use these structs for deserialization
#[derive(Serialize, Deserialize)]
pub struct Rule {
query: String,
schema: Vec<Column>,
result: Vec<Vec<String>>,
}
#[derive(Serialize, Deserialize)]
pub struct Column {
name: String,
data_type: String,
}
/// The intercept plugin.
pub struct Intercept<'a> {
pub enabled: bool,
pub config: &'a InterceptConfig,
}
#[async_trait]
impl<'a> Plugin for Intercept<'a> {
async fn run(
&mut self,
query_router: &QueryRouter,
ast: &Vec<Statement>,
) -> Result<PluginOutput, Error> {
if !self.enabled || ast.is_empty() {
return Ok(PluginOutput::Allow);
}
let mut config = self.config.clone();
config.substitute(
&query_router.pool_settings().db,
&query_router.pool_settings().user.username,
);
let mut result = BytesMut::new();
for q in ast {
// Normalization
let q = q.to_string().to_ascii_lowercase();
for (_, target) in config.queries.iter() {
if target.query.as_str() == q {
debug!("Intercepting query: {}", q);
let rd = target
.schema
.iter()
.map(|row| {
let name = &row[0];
let data_type = &row[1];
(
name.as_str(),
match data_type.as_str() {
"text" => DataType::Text,
"anyarray" => DataType::AnyArray,
"oid" => DataType::Oid,
"bool" => DataType::Bool,
"int4" => DataType::Int4,
_ => DataType::Any,
},
)
})
.collect::<Vec<(&str, DataType)>>();
result.put(row_description(&rd));
target.result.iter().for_each(|row| {
let row = row
.iter()
.map(|s| {
let s = s.as_str().to_string();
if s.is_empty() {
None
} else {
Some(s)
}
})
.collect::<Vec<Option<String>>>();
result.put(data_row_nullable(&row));
});
result.put(command_complete("SELECT"));
}
}
}
if !result.is_empty() {
result.put_u8(b'Z');
result.put_i32(5);
result.put_u8(b'I');
return Ok(PluginOutput::Intercept(result));
} else {
Ok(PluginOutput::Allow)
}
}
}

45
src/plugins/mod.rs Normal file
View File

@@ -0,0 +1,45 @@
//! The plugin ecosystem.
//!
//! Currently plugins only grant access or deny access to the database for a particual query.
//! Example use cases:
//! - block known bad queries
//! - block access to system catalogs
//! - block dangerous modifications like `DROP TABLE`
//! - etc
//!
pub mod intercept;
pub mod prewarmer;
pub mod query_logger;
pub mod table_access;
use crate::{errors::Error, query_router::QueryRouter};
use async_trait::async_trait;
use bytes::BytesMut;
use sqlparser::ast::Statement;
pub use intercept::Intercept;
pub use query_logger::QueryLogger;
pub use table_access::TableAccess;
#[derive(Clone, Debug, PartialEq)]
pub enum PluginOutput {
Allow,
Deny(String),
Overwrite(Vec<Statement>),
Intercept(BytesMut),
}
#[async_trait]
pub trait Plugin {
// Run before the query is sent to the server.
#[allow(clippy::ptr_arg)]
async fn run(
&mut self,
query_router: &QueryRouter,
ast: &Vec<Statement>,
) -> Result<PluginOutput, Error>;
// TODO: run after the result is returned
// async fn callback(&mut self, query_router: &QueryRouter);
}

28
src/plugins/prewarmer.rs Normal file
View File

@@ -0,0 +1,28 @@
//! Prewarm new connections before giving them to the client.
use crate::{errors::Error, server::Server};
use log::info;
pub struct Prewarmer<'a> {
pub enabled: bool,
pub server: &'a mut Server,
pub queries: &'a Vec<String>,
}
impl<'a> Prewarmer<'a> {
pub async fn run(&mut self) -> Result<(), Error> {
if !self.enabled {
return Ok(());
}
for query in self.queries {
info!(
"{} Prewarning with query: `{}`",
self.server.address(),
query
);
self.server.query(query).await?;
}
Ok(())
}
}

View File

@@ -0,0 +1,38 @@
//! Log all queries to stdout (or somewhere else, why not).
use crate::{
errors::Error,
plugins::{Plugin, PluginOutput},
query_router::QueryRouter,
};
use async_trait::async_trait;
use log::info;
use sqlparser::ast::Statement;
pub struct QueryLogger<'a> {
pub enabled: bool,
pub user: &'a str,
pub db: &'a str,
}
#[async_trait]
impl<'a> Plugin for QueryLogger<'a> {
async fn run(
&mut self,
_query_router: &QueryRouter,
ast: &Vec<Statement>,
) -> Result<PluginOutput, Error> {
if !self.enabled {
return Ok(PluginOutput::Allow);
}
let query = ast
.iter()
.map(|q| q.to_string())
.collect::<Vec<String>>()
.join("; ");
info!("[pool: {}][user: {}] {}", self.db, self.user, query);
Ok(PluginOutput::Allow)
}
}

View File

@@ -0,0 +1,59 @@
//! This query router plugin will check if the user can access a particular
//! table as part of their query. If they can't, the query will not be routed.
use async_trait::async_trait;
use sqlparser::ast::{visit_relations, Statement};
use crate::{
errors::Error,
plugins::{Plugin, PluginOutput},
query_router::QueryRouter,
};
use log::debug;
use core::ops::ControlFlow;
pub struct TableAccess<'a> {
pub enabled: bool,
pub tables: &'a Vec<String>,
}
#[async_trait]
impl<'a> Plugin for TableAccess<'a> {
async fn run(
&mut self,
_query_router: &QueryRouter,
ast: &Vec<Statement>,
) -> Result<PluginOutput, Error> {
if !self.enabled {
return Ok(PluginOutput::Allow);
}
let mut found = None;
visit_relations(ast, |relation| {
let relation = relation.to_string();
let parts = relation.split('.').collect::<Vec<&str>>();
let table_name = parts.last().unwrap();
if self.tables.contains(&table_name.to_string()) {
found = Some(table_name.to_string());
ControlFlow::<()>::Break(())
} else {
ControlFlow::<()>::Continue(())
}
});
if let Some(found) = found {
debug!("Blocking access to table \"{}\"", found);
Ok(PluginOutput::Deny(format!(
"permission for table \"{}\" denied",
found
)))
} else {
Ok(PluginOutput::Allow)
}
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,20 +1,41 @@
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Method, Request, Response, Server, StatusCode};
use log::{error, info, warn};
use http_body_util::Full;
use hyper::body;
use hyper::body::Bytes;
use hyper::server::conn::http1;
use hyper::service::service_fn;
use hyper::{Method, Request, Response, StatusCode};
use hyper_util::rt::TokioIo;
use log::{debug, error, info};
use phf::phf_map;
use std::collections::HashMap;
use std::fmt;
use std::net::SocketAddr;
use std::sync::atomic::Ordering;
use tokio::net::TcpListener;
use crate::config::Address;
use crate::pool::get_all_pools;
use crate::stats::{get_address_stats, get_pool_stats, get_server_stats, ServerInformation};
use crate::pool::{get_all_pools, PoolIdentifier};
use crate::stats::get_server_stats;
use crate::stats::pool::PoolStats;
struct MetricHelpType {
help: &'static str,
ty: &'static str,
}
struct ServerPrometheusStats {
bytes_received: u64,
bytes_sent: u64,
transaction_count: u64,
query_count: u64,
error_count: u64,
active_count: u64,
idle_count: u64,
login_count: u64,
tested_count: u64,
}
// reference for metric types: https://prometheus.io/docs/concepts/metric_types/
// counters only increase
// gauges can arbitrarily increase or decrease
@@ -117,22 +138,46 @@ static METRIC_HELP_AND_TYPES_LOOKUP: phf::Map<&'static str, MetricHelpType> = ph
},
"servers_bytes_received" => MetricHelpType {
help: "Volume in bytes of network traffic received by server",
ty: "gauge",
ty: "counter",
},
"servers_bytes_sent" => MetricHelpType {
help: "Volume in bytes of network traffic sent by server",
ty: "gauge",
ty: "counter",
},
"servers_transaction_count" => MetricHelpType {
help: "Number of transactions executed by server",
ty: "gauge",
ty: "counter",
},
"servers_query_count" => MetricHelpType {
help: "Number of queries executed by server",
ty: "gauge",
ty: "counter",
},
"servers_error_count" => MetricHelpType {
help: "Number of errors",
ty: "counter",
},
"servers_idle_count" => MetricHelpType {
help: "Number of server connection in idle state",
ty: "gauge",
},
"servers_active_count" => MetricHelpType {
help: "Number of server connection in active state",
ty: "gauge",
},
"servers_tested_count" => MetricHelpType {
help: "Number of server connection in tested state",
ty: "gauge",
},
"servers_login_count" => MetricHelpType {
help: "Number of server connection in login state",
ty: "gauge",
},
"servers_is_banned" => MetricHelpType {
help: "0 if server is not banned, 1 if server is banned",
ty: "gauge",
},
"servers_is_paused" => MetricHelpType {
help: "0 if server is not paused, 1 if server is paused",
ty: "gauge",
},
"databases_pool_size" => MetricHelpType {
@@ -155,18 +200,17 @@ struct PrometheusMetric<Value: fmt::Display> {
impl<Value: fmt::Display> fmt::Display for PrometheusMetric<Value> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let formatted_labels = self
.labels
let mut sorted_labels: Vec<_> = self.labels.iter().collect();
sorted_labels.sort_by_key(|&(key, _)| key);
let formatted_labels = sorted_labels
.iter()
.map(|(key, value)| format!("{}=\"{}\"", key, value))
.collect::<Vec<_>>()
.join(",");
write!(
f,
"# HELP {name} {help}\n# TYPE {name} {ty}\n{name}{{{formatted_labels}}} {value}\n",
"{name}{{{formatted_labels}}} {value}",
name = format_args!("pgcat_{}", self.name),
help = self.help,
ty = self.ty,
formatted_labels = formatted_labels,
value = self.value
)
@@ -200,7 +244,9 @@ impl<Value: fmt::Display> PrometheusMetric<Value> {
labels.insert("shard", address.shard.to_string());
labels.insert("role", address.role.to_string());
labels.insert("pool", address.pool_name.clone());
labels.insert("index", address.address_index.to_string());
labels.insert("database", address.database.to_string());
labels.insert("username", address.username.clone());
Self::from_name(&format!("databases_{}", name), value, labels)
}
@@ -215,32 +261,47 @@ impl<Value: fmt::Display> PrometheusMetric<Value> {
labels.insert("shard", address.shard.to_string());
labels.insert("role", address.role.to_string());
labels.insert("pool", address.pool_name.clone());
labels.insert("index", address.address_index.to_string());
labels.insert("database", address.database.to_string());
labels.insert("username", address.username.clone());
Self::from_name(&format!("servers_{}", name), value, labels)
}
fn from_address(address: &Address, name: &str, value: i64) -> Option<PrometheusMetric<i64>> {
fn from_address(address: &Address, name: &str, value: u64) -> Option<PrometheusMetric<u64>> {
let mut labels = HashMap::new();
labels.insert("host", address.host.clone());
labels.insert("shard", address.shard.to_string());
labels.insert("pool", address.pool_name.clone());
labels.insert("role", address.role.to_string());
labels.insert("index", address.address_index.to_string());
labels.insert("database", address.database.to_string());
labels.insert("username", address.username.clone());
Self::from_name(&format!("stats_{}", name), value, labels)
}
fn from_pool(pool: &(String, String), name: &str, value: i64) -> Option<PrometheusMetric<i64>> {
fn from_pool(pool_id: PoolIdentifier, name: &str, value: u64) -> Option<PrometheusMetric<u64>> {
let mut labels = HashMap::new();
labels.insert("pool", pool.0.clone());
labels.insert("user", pool.1.clone());
labels.insert("pool", pool_id.db);
labels.insert("user", pool_id.user);
Self::from_name(&format!("pools_{}", name), value, labels)
}
fn get_header(&self) -> String {
format!(
"\n# HELP {name} {help}\n# TYPE {name} {ty}",
name = format_args!("pgcat_{}", self.name),
help = self.help,
ty = self.ty,
)
}
}
async fn prometheus_stats(request: Request<Body>) -> Result<Response<Body>, hyper::http::Error> {
async fn prometheus_stats(
request: Request<body::Incoming>,
) -> Result<Response<Full<Bytes>>, hyper::http::Error> {
match (request.method(), request.uri().path()) {
(&Method::GET, "/metrics") => {
let mut lines = Vec::new();
@@ -248,6 +309,7 @@ async fn prometheus_stats(request: Request<Body>) -> Result<Response<Body>, hype
push_pool_stats(&mut lines);
push_server_stats(&mut lines);
push_database_stats(&mut lines);
lines.push("".to_string()); // Ensure to end the stats with a line terminator as required by the specification.
Response::builder()
.header("content-type", "text/plain; version=0.0.4")
@@ -261,40 +323,60 @@ async fn prometheus_stats(request: Request<Body>) -> Result<Response<Body>, hype
// Adds metrics shown in a SHOW STATS admin command.
fn push_address_stats(lines: &mut Vec<String>) {
let address_stats: HashMap<usize, HashMap<String, i64>> = get_address_stats();
let mut grouped_metrics: HashMap<String, Vec<PrometheusMetric<u64>>> = HashMap::new();
for (_, pool) in get_all_pools() {
for shard in 0..pool.shards() {
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
if let Some(address_stats) = address_stats.get(&address.id) {
for (key, value) in address_stats.iter() {
if let Some(prometheus_metric) =
PrometheusMetric::<i64>::from_address(address, key, *value)
{
lines.push(prometheus_metric.to_string());
} else {
warn!("Metric {} not implemented for {}", key, address.name());
}
let stats = &*address.stats;
for (key, value) in stats.clone() {
if let Some(prometheus_metric) =
PrometheusMetric::<u64>::from_address(address, &key, value)
{
grouped_metrics
.entry(key)
.or_default()
.push(prometheus_metric);
} else {
debug!("Metric {} not implemented for {}", key, address.name());
}
}
}
}
}
for (_key, metrics) in grouped_metrics {
if !metrics.is_empty() {
lines.push(metrics[0].get_header());
for metric in metrics {
lines.push(metric.to_string());
}
}
}
}
// Adds relevant metrics shown in a SHOW POOLS admin command.
fn push_pool_stats(lines: &mut Vec<String>) {
let pool_stats = get_pool_stats();
for (pool, stats) in pool_stats.iter() {
for (name, value) in stats.iter() {
if let Some(prometheus_metric) = PrometheusMetric::<i64>::from_pool(pool, name, *value)
let mut grouped_metrics: HashMap<String, Vec<PrometheusMetric<u64>>> = HashMap::new();
let pool_stats = PoolStats::construct_pool_lookup();
for (pool_id, stats) in pool_stats.iter() {
for (name, value) in stats.clone() {
if let Some(prometheus_metric) =
PrometheusMetric::<u64>::from_pool(pool_id.clone(), &name, value)
{
lines.push(prometheus_metric.to_string());
grouped_metrics
.entry(name)
.or_default()
.push(prometheus_metric);
} else {
warn!(
"Metric {} not implemented for ({},{})",
name, pool.0, pool.1
);
debug!("Metric {} not implemented for ({})", name, *pool_id);
}
}
}
for (_key, metrics) in grouped_metrics {
if !metrics.is_empty() {
lines.push(metrics[0].get_header());
for metric in metrics {
lines.push(metric.to_string());
}
}
}
@@ -302,13 +384,13 @@ fn push_pool_stats(lines: &mut Vec<String>) {
// Adds relevant metrics shown in a SHOW DATABASES admin command.
fn push_database_stats(lines: &mut Vec<String>) {
let mut grouped_metrics: HashMap<String, Vec<PrometheusMetric<u32>>> = HashMap::new();
for (_, pool) in get_all_pools() {
let pool_config = pool.settings.clone();
for shard in 0..pool.shards() {
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
let pool_state = pool.pool_state(shard, server);
let metrics = vec![
("pool_size", pool_config.user.pool_size),
("current_connections", pool_state.connections),
@@ -317,60 +399,132 @@ fn push_database_stats(lines: &mut Vec<String>) {
if let Some(prometheus_metric) =
PrometheusMetric::<u32>::from_database_info(address, key, value)
{
lines.push(prometheus_metric.to_string());
grouped_metrics
.entry(key.to_string())
.or_default()
.push(prometheus_metric);
} else {
warn!("Metric {} not implemented for {}", key, address.name());
debug!("Metric {} not implemented for {}", key, address.name());
}
}
}
}
}
for (_key, metrics) in grouped_metrics {
if !metrics.is_empty() {
lines.push(metrics[0].get_header());
for metric in metrics {
lines.push(metric.to_string());
}
}
}
}
// Adds relevant metrics shown in a SHOW SERVERS admin command.
fn push_server_stats(lines: &mut Vec<String>) {
let server_stats = get_server_stats();
let mut server_stats_by_addresses = HashMap::<String, ServerInformation>::new();
for (_, info) in server_stats {
server_stats_by_addresses.insert(info.address_name.clone(), info);
let mut prom_stats = HashMap::<String, ServerPrometheusStats>::new();
for (_, stats) in server_stats {
let entry = prom_stats
.entry(stats.address_name())
.or_insert(ServerPrometheusStats {
bytes_received: 0,
bytes_sent: 0,
transaction_count: 0,
query_count: 0,
error_count: 0,
active_count: 0,
idle_count: 0,
login_count: 0,
tested_count: 0,
});
entry.bytes_received += stats.bytes_received.load(Ordering::Relaxed);
entry.bytes_sent += stats.bytes_sent.load(Ordering::Relaxed);
entry.transaction_count += stats.transaction_count.load(Ordering::Relaxed);
entry.query_count += stats.query_count.load(Ordering::Relaxed);
entry.error_count += stats.error_count.load(Ordering::Relaxed);
match stats.state.load(Ordering::Relaxed) {
crate::stats::ServerState::Login => entry.login_count += 1,
crate::stats::ServerState::Active => entry.active_count += 1,
crate::stats::ServerState::Tested => entry.tested_count += 1,
crate::stats::ServerState::Idle => entry.idle_count += 1,
}
}
let mut grouped_metrics: HashMap<String, Vec<PrometheusMetric<u64>>> = HashMap::new();
for (_, pool) in get_all_pools() {
for shard in 0..pool.shards() {
for server in 0..pool.servers(shard) {
let address = pool.address(shard, server);
if let Some(server_info) = server_stats_by_addresses.get(&address.name()) {
if let Some(server_info) = prom_stats.get(&address.name()) {
let metrics = [
("bytes_received", server_info.bytes_received),
("bytes_sent", server_info.bytes_sent),
("transaction_count", server_info.transaction_count),
("query_count", server_info.query_count),
("error_count", server_info.error_count),
("idle_count", server_info.idle_count),
("active_count", server_info.active_count),
("login_count", server_info.login_count),
("tested_count", server_info.tested_count),
("is_banned", if pool.is_banned(address) { 1 } else { 0 }),
("is_paused", if pool.paused() { 1 } else { 0 }),
];
for (key, value) in metrics {
if let Some(prometheus_metric) =
PrometheusMetric::<u64>::from_server_info(address, key, value)
{
lines.push(prometheus_metric.to_string());
grouped_metrics
.entry(key.to_string())
.or_default()
.push(prometheus_metric);
} else {
warn!("Metric {} not implemented for {}", key, address.name());
debug!("Metric {} not implemented for {}", key, address.name());
}
}
}
}
}
}
for (_key, metrics) in grouped_metrics {
if !metrics.is_empty() {
lines.push(metrics[0].get_header());
for metric in metrics {
lines.push(metric.to_string());
}
}
}
}
pub async fn start_metric_server(http_addr: SocketAddr) {
let http_service_factory =
make_service_fn(|_conn| async { Ok::<_, hyper::Error>(service_fn(prometheus_stats)) });
let server = Server::bind(&http_addr).serve(http_service_factory);
let listener = TcpListener::bind(http_addr);
let listener = match listener.await {
Ok(listener) => listener,
Err(e) => {
error!("Failed to bind prometheus server to HTTP address: {}.", e);
return;
}
};
info!(
"Exposing prometheus metrics on http://{}/metrics.",
http_addr
);
if let Err(e) = server.await {
error!("Failed to run HTTP server: {}.", e);
loop {
let stream = match listener.accept().await {
Ok((stream, _)) => stream,
Err(e) => {
error!("Error accepting connection: {}", e);
continue;
}
};
let io = TokioIo::new(stream);
tokio::task::spawn(async move {
if let Err(err) = http1::Builder::new()
.serve_connection(io, service_fn(prometheus_stats))
.await
{
eprintln!("Error serving HTTP connection for metrics: {:?}", err);
}
});
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -79,12 +79,12 @@ impl ScramSha256 {
let server_message = Message::parse(message)?;
if !server_message.nonce.starts_with(&self.nonce) {
return Err(Error::ProtocolSyncError(format!("SCRAM")));
return Err(Error::ProtocolSyncError("SCRAM".to_string()));
}
let salt = match general_purpose::STANDARD.decode(&server_message.salt) {
Ok(salt) => salt,
Err(_) => return Err(Error::ProtocolSyncError(format!("SCRAM"))),
Err(_) => return Err(Error::ProtocolSyncError("SCRAM".to_string())),
};
let salted_password = Self::hi(
@@ -166,9 +166,9 @@ impl ScramSha256 {
pub fn finish(&mut self, message: &BytesMut) -> Result<(), Error> {
let final_message = FinalMessage::parse(message)?;
let verifier = match general_purpose::STANDARD.decode(&final_message.value) {
let verifier = match general_purpose::STANDARD.decode(final_message.value) {
Ok(verifier) => verifier,
Err(_) => return Err(Error::ProtocolSyncError(format!("SCRAM"))),
Err(_) => return Err(Error::ProtocolSyncError("SCRAM".to_string())),
};
let mut hmac = match Hmac::<Sha256>::new_from_slice(&self.salted_password) {
@@ -230,14 +230,14 @@ impl Message {
.collect::<Vec<String>>();
if parts.len() != 3 {
return Err(Error::ProtocolSyncError(format!("SCRAM")));
return Err(Error::ProtocolSyncError("SCRAM".to_string()));
}
let nonce = str::replace(&parts[0], "r=", "");
let salt = str::replace(&parts[1], "s=", "");
let iterations = match str::replace(&parts[2], "i=", "").parse::<u32>() {
Ok(iterations) => iterations,
Err(_) => return Err(Error::ProtocolSyncError(format!("SCRAM"))),
Err(_) => return Err(Error::ProtocolSyncError("SCRAM".to_string())),
};
Ok(Message {
@@ -257,7 +257,7 @@ impl FinalMessage {
/// Parse the server final validation message.
pub fn parse(message: &BytesMut) -> Result<FinalMessage, Error> {
if !message.starts_with(b"v=") || message.len() < 4 {
return Err(Error::ProtocolSyncError(format!("SCRAM")));
return Err(Error::ProtocolSyncError("SCRAM".to_string()));
}
Ok(FinalMessage {

File diff suppressed because it is too large Load Diff

View File

@@ -14,11 +14,11 @@ pub enum ShardingFunction {
Sha1,
}
impl ToString for ShardingFunction {
fn to_string(&self) -> String {
match *self {
ShardingFunction::PgBigintHash => "pg_bigint_hash".to_string(),
ShardingFunction::Sha1 => "sha1".to_string(),
impl std::fmt::Display for ShardingFunction {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
ShardingFunction::PgBigintHash => write!(f, "pg_bigint_hash"),
ShardingFunction::Sha1 => write!(f, "sha1"),
}
}
}
@@ -64,7 +64,7 @@ impl Sharder {
fn sha1(&self, key: i64) -> usize {
let mut hasher = Sha1::new();
hasher.update(&key.to_string().as_bytes());
hasher.update(key.to_string().as_bytes());
let result = hasher.finalize();
@@ -202,10 +202,10 @@ mod test {
#[test]
fn test_sha1_hash() {
let sharder = Sharder::new(12, ShardingFunction::Sha1);
let ids = vec![
let ids = [
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
];
let shards = vec![
let shards = [
4, 7, 8, 3, 6, 0, 0, 10, 3, 11, 1, 7, 4, 4, 11, 2, 5, 0, 8, 3,
];

File diff suppressed because it is too large Load Diff

226
src/stats/address.rs Normal file
View File

@@ -0,0 +1,226 @@
use std::sync::atomic::*;
use std::sync::Arc;
#[derive(Debug, Clone, Default)]
struct AddressStatFields {
xact_count: Arc<AtomicU64>,
query_count: Arc<AtomicU64>,
bytes_received: Arc<AtomicU64>,
bytes_sent: Arc<AtomicU64>,
xact_time: Arc<AtomicU64>,
query_time: Arc<AtomicU64>,
wait_time: Arc<AtomicU64>,
errors: Arc<AtomicU64>,
}
/// Internal address stats
#[derive(Debug, Clone, Default)]
pub struct AddressStats {
total: AddressStatFields,
current: AddressStatFields,
averages: AddressStatFields,
// Determines if the averages have been updated since the last time they were reported
pub averages_updated: Arc<AtomicBool>,
}
impl IntoIterator for AddressStats {
type Item = (String, u64);
type IntoIter = std::vec::IntoIter<Self::Item>;
fn into_iter(self) -> Self::IntoIter {
vec![
(
"total_xact_count".to_string(),
self.total.xact_count.load(Ordering::Relaxed),
),
(
"total_query_count".to_string(),
self.total.query_count.load(Ordering::Relaxed),
),
(
"total_received".to_string(),
self.total.bytes_received.load(Ordering::Relaxed),
),
(
"total_sent".to_string(),
self.total.bytes_sent.load(Ordering::Relaxed),
),
(
"total_xact_time".to_string(),
self.total.xact_time.load(Ordering::Relaxed),
),
(
"total_query_time".to_string(),
self.total.query_time.load(Ordering::Relaxed),
),
(
"total_wait_time".to_string(),
self.total.wait_time.load(Ordering::Relaxed),
),
(
"total_errors".to_string(),
self.total.errors.load(Ordering::Relaxed),
),
(
"avg_xact_count".to_string(),
self.averages.xact_count.load(Ordering::Relaxed),
),
(
"avg_query_count".to_string(),
self.averages.query_count.load(Ordering::Relaxed),
),
(
"avg_recv".to_string(),
self.averages.bytes_received.load(Ordering::Relaxed),
),
(
"avg_sent".to_string(),
self.averages.bytes_sent.load(Ordering::Relaxed),
),
(
"avg_errors".to_string(),
self.averages.errors.load(Ordering::Relaxed),
),
(
"avg_xact_time".to_string(),
self.averages.xact_time.load(Ordering::Relaxed),
),
(
"avg_query_time".to_string(),
self.averages.query_time.load(Ordering::Relaxed),
),
(
"avg_wait_time".to_string(),
self.averages.wait_time.load(Ordering::Relaxed),
),
]
.into_iter()
}
}
impl AddressStats {
pub fn xact_count_add(&self) {
self.total.xact_count.fetch_add(1, Ordering::Relaxed);
self.current.xact_count.fetch_add(1, Ordering::Relaxed);
}
pub fn query_count_add(&self) {
self.total.query_count.fetch_add(1, Ordering::Relaxed);
self.current.query_count.fetch_add(1, Ordering::Relaxed);
}
pub fn bytes_received_add(&self, bytes: u64) {
self.total
.bytes_received
.fetch_add(bytes, Ordering::Relaxed);
self.current
.bytes_received
.fetch_add(bytes, Ordering::Relaxed);
}
pub fn bytes_sent_add(&self, bytes: u64) {
self.total.bytes_sent.fetch_add(bytes, Ordering::Relaxed);
self.current.bytes_sent.fetch_add(bytes, Ordering::Relaxed);
}
pub fn xact_time_add(&self, time: u64) {
self.total.xact_time.fetch_add(time, Ordering::Relaxed);
self.current.xact_time.fetch_add(time, Ordering::Relaxed);
}
pub fn query_time_add(&self, time: u64) {
self.total.query_time.fetch_add(time, Ordering::Relaxed);
self.current.query_time.fetch_add(time, Ordering::Relaxed);
}
pub fn wait_time_add(&self, time: u64) {
self.total.wait_time.fetch_add(time, Ordering::Relaxed);
self.current.wait_time.fetch_add(time, Ordering::Relaxed);
}
pub fn error(&self) {
self.total.errors.fetch_add(1, Ordering::Relaxed);
self.current.errors.fetch_add(1, Ordering::Relaxed);
}
pub fn update_averages(&self) {
let stat_period_per_second = crate::stats::STAT_PERIOD / 1_000;
// xact_count
let current_xact_count = self.current.xact_count.load(Ordering::Relaxed);
let current_xact_time = self.current.xact_time.load(Ordering::Relaxed);
self.averages.xact_count.store(
current_xact_count / stat_period_per_second,
Ordering::Relaxed,
);
if current_xact_count == 0 {
self.averages.xact_time.store(0, Ordering::Relaxed);
} else {
self.averages
.xact_time
.store(current_xact_time / current_xact_count, Ordering::Relaxed);
}
// query_count
let current_query_count = self.current.query_count.load(Ordering::Relaxed);
let current_query_time = self.current.query_time.load(Ordering::Relaxed);
self.averages.query_count.store(
current_query_count / stat_period_per_second,
Ordering::Relaxed,
);
if current_query_count == 0 {
self.averages.query_time.store(0, Ordering::Relaxed);
} else {
self.averages
.query_time
.store(current_query_time / current_query_count, Ordering::Relaxed);
}
// bytes_received
let current_bytes_received = self.current.bytes_received.load(Ordering::Relaxed);
self.averages.bytes_received.store(
current_bytes_received / stat_period_per_second,
Ordering::Relaxed,
);
// bytes_sent
let current_bytes_sent = self.current.bytes_sent.load(Ordering::Relaxed);
self.averages.bytes_sent.store(
current_bytes_sent / stat_period_per_second,
Ordering::Relaxed,
);
// wait_time
let current_wait_time = self.current.wait_time.load(Ordering::Relaxed);
self.averages.wait_time.store(
current_wait_time / stat_period_per_second,
Ordering::Relaxed,
);
// errors
let current_errors = self.current.errors.load(Ordering::Relaxed);
self.averages
.errors
.store(current_errors / stat_period_per_second, Ordering::Relaxed);
}
pub fn reset_current_counts(&self) {
self.current.xact_count.store(0, Ordering::Relaxed);
self.current.xact_time.store(0, Ordering::Relaxed);
self.current.query_count.store(0, Ordering::Relaxed);
self.current.query_time.store(0, Ordering::Relaxed);
self.current.bytes_received.store(0, Ordering::Relaxed);
self.current.bytes_sent.store(0, Ordering::Relaxed);
self.current.wait_time.store(0, Ordering::Relaxed);
self.current.errors.store(0, Ordering::Relaxed);
}
pub fn populate_row(&self, row: &mut Vec<String>) {
for (_key, value) in self.clone() {
row.push(value.to_string());
}
}
}

204
src/stats/client.rs Normal file
View File

@@ -0,0 +1,204 @@
use super::{get_reporter, Reporter};
use atomic_enum::atomic_enum;
use std::sync::atomic::*;
use std::sync::Arc;
use tokio::time::Instant;
/// The various states that a client can be in
#[atomic_enum]
#[derive(PartialEq)]
pub enum ClientState {
Idle = 0,
Waiting,
Active,
}
impl std::fmt::Display for ClientState {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
match *self {
ClientState::Idle => write!(f, "idle"),
ClientState::Waiting => write!(f, "waiting"),
ClientState::Active => write!(f, "active"),
}
}
}
#[derive(Debug, Clone)]
/// Information we keep track of which can be queried by SHOW CLIENTS
pub struct ClientStats {
/// A random integer assigned to the client and used by stats to track the client
client_id: i32,
/// Data associated with the client, not writable, only set when we construct the ClientStat
application_name: String,
username: String,
pool_name: String,
connect_time: Instant,
reporter: Reporter,
/// Total time spent waiting for a connection from pool, measures in microseconds
pub total_wait_time: Arc<AtomicU64>,
/// Maximum time spent waiting for a connection from pool, measures in microseconds
pub max_wait_time: Arc<AtomicU64>,
// Time when the client started waiting for a connection from pool, measures in microseconds
// We use connect_time as the reference point for this value
// U64 can represent ~5850 centuries in microseconds, so we should be fine
pub wait_start_us: Arc<AtomicU64>,
/// Current state of the client
pub state: Arc<AtomicClientState>,
/// Number of transactions executed by this client
pub transaction_count: Arc<AtomicU64>,
/// Number of queries executed by this client
pub query_count: Arc<AtomicU64>,
/// Number of errors made by this client
pub error_count: Arc<AtomicU64>,
}
impl Default for ClientStats {
fn default() -> Self {
ClientStats {
client_id: 0,
connect_time: Instant::now(),
application_name: String::new(),
username: String::new(),
pool_name: String::new(),
total_wait_time: Arc::new(AtomicU64::new(0)),
max_wait_time: Arc::new(AtomicU64::new(0)),
wait_start_us: Arc::new(AtomicU64::new(0)),
state: Arc::new(AtomicClientState::new(ClientState::Idle)),
transaction_count: Arc::new(AtomicU64::new(0)),
query_count: Arc::new(AtomicU64::new(0)),
error_count: Arc::new(AtomicU64::new(0)),
reporter: get_reporter(),
}
}
}
impl ClientStats {
pub fn new(
client_id: i32,
application_name: &str,
username: &str,
pool_name: &str,
connect_time: Instant,
) -> Self {
Self {
client_id,
connect_time,
application_name: application_name.to_string(),
username: username.to_string(),
pool_name: pool_name.to_string(),
..Default::default()
}
}
/// Reports a client is disconnecting from the pooler and
/// update metrics on the corresponding pool.
pub fn disconnect(&self) {
self.reporter.client_disconnecting(self.client_id);
}
/// Register a client with the stats system. The stats system uses client_id
/// to track and aggregate statistics from all source that relate to that client
pub fn register(&self, stats: Arc<ClientStats>) {
self.reporter.client_register(self.client_id, stats);
self.state.store(ClientState::Idle, Ordering::Relaxed);
}
/// Reports a client is done querying the server and is no longer assigned a server connection
pub fn idle(&self) {
self.state.store(ClientState::Idle, Ordering::Relaxed);
}
/// Reports a client is waiting for a connection
pub fn waiting(&self) {
let wait_start = self.connect_time.elapsed().as_micros() as u64;
self.wait_start_us.store(wait_start, Ordering::Relaxed);
self.state.store(ClientState::Waiting, Ordering::Relaxed);
}
/// Reports a client is done waiting for a connection and is about to query the server.
pub fn active(&self) {
self.state.store(ClientState::Active, Ordering::Relaxed);
}
/// Reports a client has failed to obtain a connection from a connection pool
pub fn checkout_error(&self) {
self.state.store(ClientState::Idle, Ordering::Relaxed);
self.update_wait_times();
}
/// Reports a client has succeeded in obtaining a connection from a connection pool
pub fn checkout_success(&self) {
self.state.store(ClientState::Active, Ordering::Relaxed);
self.update_wait_times();
}
/// Reports a client has had the server assigned to it be banned
pub fn ban_error(&self) {
self.state.store(ClientState::Idle, Ordering::Relaxed);
self.error_count.fetch_add(1, Ordering::Relaxed);
}
fn update_wait_times(&self) {
if self.wait_start_us.load(Ordering::Relaxed) == 0 {
return;
}
let wait_time_us = self.get_current_wait_time_us();
self.total_wait_time
.fetch_add(wait_time_us, Ordering::Relaxed);
self.max_wait_time
.fetch_max(wait_time_us, Ordering::Relaxed);
self.wait_start_us.store(0, Ordering::Relaxed);
}
pub fn get_current_wait_time_us(&self) -> u64 {
let wait_start_us = self.wait_start_us.load(Ordering::Relaxed);
let microseconds_since_connection_epoch = self.connect_time.elapsed().as_micros() as u64;
if wait_start_us == 0 || microseconds_since_connection_epoch < wait_start_us {
return 0;
}
microseconds_since_connection_epoch - wait_start_us
}
/// Report a query executed by a client against a server
pub fn query(&self) {
self.query_count.fetch_add(1, Ordering::Relaxed);
}
/// Report a transaction executed by a client a server
/// we report each individual queries outside a transaction as a transaction
/// We only count the initial BEGIN as a transaction, all queries within do not
/// count as transactions
pub fn transaction(&self) {
self.transaction_count.fetch_add(1, Ordering::Relaxed);
}
// Helper methods for show clients
pub fn connect_time(&self) -> Instant {
self.connect_time
}
pub fn client_id(&self) -> i32 {
self.client_id
}
pub fn application_name(&self) -> String {
self.application_name.clone()
}
pub fn username(&self) -> String {
self.username.clone()
}
pub fn pool_name(&self) -> String {
self.pool_name.clone()
}
}

154
src/stats/pool.rs Normal file
View File

@@ -0,0 +1,154 @@
use log::debug;
use super::{ClientState, ServerState};
use crate::{config::PoolMode, messages::DataType, pool::PoolIdentifier};
use std::collections::HashMap;
use std::sync::atomic::*;
use crate::pool::get_all_pools;
#[derive(Debug, Clone)]
/// A struct that holds information about a Pool .
pub struct PoolStats {
pub identifier: PoolIdentifier,
pub mode: PoolMode,
pub cl_idle: u64,
pub cl_active: u64,
pub cl_waiting: u64,
pub cl_cancel_req: u64,
pub sv_active: u64,
pub sv_idle: u64,
pub sv_used: u64,
pub sv_tested: u64,
pub sv_login: u64,
pub maxwait: u64,
}
impl PoolStats {
pub fn new(identifier: PoolIdentifier, mode: PoolMode) -> Self {
PoolStats {
identifier,
mode,
cl_idle: 0,
cl_active: 0,
cl_waiting: 0,
cl_cancel_req: 0,
sv_active: 0,
sv_idle: 0,
sv_used: 0,
sv_tested: 0,
sv_login: 0,
maxwait: 0,
}
}
pub fn construct_pool_lookup() -> HashMap<PoolIdentifier, PoolStats> {
let mut map: HashMap<PoolIdentifier, PoolStats> = HashMap::new();
let client_map = super::get_client_stats();
let server_map = super::get_server_stats();
for (identifier, pool) in get_all_pools() {
map.insert(
identifier.clone(),
PoolStats::new(identifier, pool.settings.pool_mode),
);
}
for client in client_map.values() {
match map.get_mut(&PoolIdentifier {
db: client.pool_name(),
user: client.username(),
}) {
Some(pool_stats) => {
match client.state.load(Ordering::Relaxed) {
ClientState::Active => pool_stats.cl_active += 1,
ClientState::Idle => pool_stats.cl_idle += 1,
ClientState::Waiting => pool_stats.cl_waiting += 1,
}
let wait_start_us = client.wait_start_us.load(Ordering::Relaxed);
if wait_start_us > 0 {
let wait_time_us = client.get_current_wait_time_us();
pool_stats.maxwait = std::cmp::max(pool_stats.maxwait, wait_time_us);
}
}
None => debug!("Client from an obselete pool"),
}
}
for server in server_map.values() {
match map.get_mut(&PoolIdentifier {
db: server.pool_name(),
user: server.username(),
}) {
Some(pool_stats) => match server.state.load(Ordering::Relaxed) {
ServerState::Active => pool_stats.sv_active += 1,
ServerState::Idle => pool_stats.sv_idle += 1,
ServerState::Login => pool_stats.sv_login += 1,
ServerState::Tested => pool_stats.sv_tested += 1,
},
None => debug!("Server from an obselete pool"),
}
}
map
}
pub fn generate_header() -> Vec<(&'static str, DataType)> {
vec![
("database", DataType::Text),
("user", DataType::Text),
("pool_mode", DataType::Text),
("cl_idle", DataType::Numeric),
("cl_active", DataType::Numeric),
("cl_waiting", DataType::Numeric),
("cl_cancel_req", DataType::Numeric),
("sv_active", DataType::Numeric),
("sv_idle", DataType::Numeric),
("sv_used", DataType::Numeric),
("sv_tested", DataType::Numeric),
("sv_login", DataType::Numeric),
("maxwait", DataType::Numeric),
("maxwait_us", DataType::Numeric),
]
}
pub fn generate_row(&self) -> Vec<String> {
vec![
self.identifier.db.clone(),
self.identifier.user.clone(),
self.mode.to_string(),
self.cl_idle.to_string(),
self.cl_active.to_string(),
self.cl_waiting.to_string(),
self.cl_cancel_req.to_string(),
self.sv_active.to_string(),
self.sv_idle.to_string(),
self.sv_used.to_string(),
self.sv_tested.to_string(),
self.sv_login.to_string(),
(self.maxwait / 1_000_000).to_string(),
(self.maxwait % 1_000_000).to_string(),
]
}
}
impl IntoIterator for PoolStats {
type Item = (String, u64);
type IntoIter = std::vec::IntoIter<Self::Item>;
fn into_iter(self) -> Self::IntoIter {
vec![
("cl_idle".to_string(), self.cl_idle),
("cl_active".to_string(), self.cl_active),
("cl_waiting".to_string(), self.cl_waiting),
("cl_cancel_req".to_string(), self.cl_cancel_req),
("sv_active".to_string(), self.sv_active),
("sv_idle".to_string(), self.sv_idle),
("sv_used".to_string(), self.sv_used),
("sv_tested".to_string(), self.sv_tested),
("sv_login".to_string(), self.sv_login),
("maxwait".to_string(), self.maxwait / 1_000_000),
("maxwait_us".to_string(), self.maxwait % 1_000_000),
]
.into_iter()
}
}

229
src/stats/server.rs Normal file
View File

@@ -0,0 +1,229 @@
use super::AddressStats;
use super::{get_reporter, Reporter};
use crate::config::Address;
use atomic_enum::atomic_enum;
use parking_lot::RwLock;
use std::sync::atomic::*;
use std::sync::Arc;
use tokio::time::Instant;
/// The various states that a server can be in
#[atomic_enum]
#[derive(PartialEq)]
pub enum ServerState {
Login = 0,
Active,
Tested,
Idle,
}
impl std::fmt::Display for ServerState {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
match *self {
ServerState::Login => write!(f, "login"),
ServerState::Active => write!(f, "active"),
ServerState::Tested => write!(f, "tested"),
ServerState::Idle => write!(f, "idle"),
}
}
}
/// Information we keep track of which can be queried by SHOW SERVERS
#[derive(Debug, Clone)]
pub struct ServerStats {
/// A random integer assigned to the server and used by stats to track the server
server_id: i32,
/// Context information, only to be read
address: Address,
connect_time: Instant,
reporter: Reporter,
/// Data
pub application_name: Arc<RwLock<String>>,
pub state: Arc<AtomicServerState>,
pub bytes_sent: Arc<AtomicU64>,
pub bytes_received: Arc<AtomicU64>,
pub transaction_count: Arc<AtomicU64>,
pub query_count: Arc<AtomicU64>,
pub error_count: Arc<AtomicU64>,
pub prepared_hit_count: Arc<AtomicU64>,
pub prepared_miss_count: Arc<AtomicU64>,
pub prepared_eviction_count: Arc<AtomicU64>,
pub prepared_cache_size: Arc<AtomicU64>,
}
impl Default for ServerStats {
fn default() -> Self {
ServerStats {
server_id: 0,
application_name: Arc::new(RwLock::new(String::new())),
address: Address::default(),
connect_time: Instant::now(),
state: Arc::new(AtomicServerState::new(ServerState::Login)),
bytes_sent: Arc::new(AtomicU64::new(0)),
bytes_received: Arc::new(AtomicU64::new(0)),
transaction_count: Arc::new(AtomicU64::new(0)),
query_count: Arc::new(AtomicU64::new(0)),
error_count: Arc::new(AtomicU64::new(0)),
reporter: get_reporter(),
prepared_hit_count: Arc::new(AtomicU64::new(0)),
prepared_miss_count: Arc::new(AtomicU64::new(0)),
prepared_eviction_count: Arc::new(AtomicU64::new(0)),
prepared_cache_size: Arc::new(AtomicU64::new(0)),
}
}
}
impl ServerStats {
pub fn new(address: Address, connect_time: Instant) -> Self {
Self {
address,
connect_time,
server_id: rand::random::<i32>(),
..Default::default()
}
}
pub fn server_id(&self) -> i32 {
self.server_id
}
/// Register a server connection with the stats system. The stats system uses server_id
/// to track and aggregate statistics from all source that relate to that server
// Delegates to reporter
pub fn register(&self, stats: Arc<ServerStats>) {
self.reporter.server_register(self.server_id, stats);
self.login();
}
/// Reports a server connection is no longer assigned to a client
/// and is available for the next client to pick it up
pub fn idle(&self) {
self.state.store(ServerState::Idle, Ordering::Relaxed);
}
/// Reports a server connection is disconnecting from the pooler.
/// Also updates metrics on the pool regarding server usage.
pub fn disconnect(&self) {
self.reporter.server_disconnecting(self.server_id);
}
/// Reports a server connection is being tested before being given to a client.
pub fn tested(&self) {
self.set_undefined_application();
self.state.store(ServerState::Tested, Ordering::Relaxed);
}
/// Reports a server connection is attempting to login.
pub fn login(&self) {
self.state.store(ServerState::Login, Ordering::Relaxed);
self.set_undefined_application();
}
/// Reports a server connection has been assigned to a client that
/// is about to query the server
pub fn active(&self, application_name: String) {
self.state.store(ServerState::Active, Ordering::Relaxed);
self.set_application(application_name);
}
pub fn address_stats(&self) -> Arc<AddressStats> {
self.address.stats.clone()
}
pub fn check_address_stat_average_is_updated_status(&self) -> bool {
self.address.stats.averages_updated.load(Ordering::Relaxed)
}
pub fn set_address_stat_average_is_updated_status(&self, is_checked: bool) {
self.address
.stats
.averages_updated
.store(is_checked, Ordering::Relaxed);
}
// Helper methods for show_servers
pub fn pool_name(&self) -> String {
self.address.pool_name.clone()
}
pub fn username(&self) -> String {
self.address.username.clone()
}
pub fn address_name(&self) -> String {
self.address.name()
}
pub fn connect_time(&self) -> Instant {
self.connect_time
}
fn set_application(&self, name: String) {
let mut application_name = self.application_name.write();
*application_name = name;
}
fn set_undefined_application(&self) {
self.set_application(String::from("Undefined"))
}
pub fn checkout_time(&self, microseconds: u64, application_name: String) {
// Update server stats and address aggregation stats
self.set_application(application_name);
self.address.stats.wait_time_add(microseconds);
}
/// Report a query executed by a client against a server
pub fn query(&self, milliseconds: u64, application_name: &str) {
self.set_application(application_name.to_string());
self.address.stats.query_count_add();
self.address.stats.query_time_add(milliseconds);
self.query_count.fetch_add(1, Ordering::Relaxed);
}
/// Report a transaction executed by a client a server
/// we report each individual queries outside a transaction as a transaction
/// We only count the initial BEGIN as a transaction, all queries within do not
/// count as transactions
pub fn transaction(&self, application_name: &str) {
self.set_application(application_name.to_string());
self.transaction_count.fetch_add(1, Ordering::Relaxed);
self.address.stats.xact_count_add();
}
/// Report data sent to a server
pub fn data_sent(&self, amount_bytes: usize) {
self.bytes_sent
.fetch_add(amount_bytes as u64, Ordering::Relaxed);
self.address.stats.bytes_sent_add(amount_bytes as u64);
}
/// Report data received from a server
pub fn data_received(&self, amount_bytes: usize) {
self.bytes_received
.fetch_add(amount_bytes as u64, Ordering::Relaxed);
self.address.stats.bytes_received_add(amount_bytes as u64);
}
/// Report a prepared statement that already exists on the server.
pub fn prepared_cache_hit(&self) {
self.prepared_hit_count.fetch_add(1, Ordering::Relaxed);
}
/// Report a prepared statement that does not exist on the server yet.
pub fn prepared_cache_miss(&self) {
self.prepared_miss_count.fetch_add(1, Ordering::Relaxed);
}
pub fn prepared_cache_add(&self) {
self.prepared_cache_size.fetch_add(1, Ordering::Relaxed);
}
pub fn prepared_cache_remove(&self) {
self.prepared_eviction_count.fetch_add(1, Ordering::Relaxed);
self.prepared_cache_size.fetch_sub(1, Ordering::Relaxed);
}
}

View File

@@ -4,7 +4,12 @@ use rustls_pemfile::{certs, read_one, Item};
use std::iter;
use std::path::Path;
use std::sync::Arc;
use tokio_rustls::rustls::{self, Certificate, PrivateKey};
use std::time::SystemTime;
use tokio_rustls::rustls::{
self,
client::{ServerCertVerified, ServerCertVerifier},
Certificate, PrivateKey, ServerName,
};
use tokio_rustls::TlsAcceptor;
use crate::config::get_config;
@@ -64,3 +69,19 @@ impl Tls {
})
}
}
pub struct NoCertificateVerification;
impl ServerCertVerifier for NoCertificateVerification {
fn verify_server_cert(
&self,
_end_entity: &Certificate,
_intermediates: &[Certificate],
_server_name: &ServerName,
_scts: &mut dyn Iterator<Item = &[u8]>,
_ocsp_response: &[u8],
_now: SystemTime,
) -> Result<ServerCertVerified, rustls::Error> {
Ok(ServerCertVerified::assertion())
}
}

34
start_test_env.sh Executable file
View File

@@ -0,0 +1,34 @@
GREEN="\033[0;32m"
RED="\033[0;31m"
BLUE="\033[0;34m"
RESET="\033[0m"
cd tests/docker/
docker compose kill main || true
docker compose build main
docker compose down
docker compose up -d
# wait for the container to start
while ! docker compose exec main ls; do
echo "Waiting for test environment to start"
sleep 1
done
echo "==================================="
docker compose exec -e LOG_LEVEL=error -d main toxiproxy-server
docker compose exec --workdir /app main cargo build
docker compose exec -d --workdir /app main ./target/debug/pgcat ./.circleci/pgcat.toml
docker compose exec --workdir /app/tests/ruby main bundle install
docker compose exec --workdir /app/tests/python main pip3 install -r requirements.txt
echo "Interactive test environment ready"
echo "To run integration tests, you can use the following commands:"
echo -e " ${BLUE}Ruby: ${RED}cd /app/tests/ruby && bundle exec ruby tests.rb --format documentation${RESET}"
echo -e " ${BLUE}Python: ${RED}cd /app/ && pytest ${RESET}"
echo -e " ${BLUE}Rust: ${RED}cd /app/tests/rust && cargo run ${RESET}"
echo -e " ${BLUE}Go: ${RED}cd /app/tests/go && /usr/local/go/bin/go test${RESET}"
echo "the source code for tests are directly linked to the source code in the container so you can modify the code and run the tests again"
echo "You can rebuild PgCat from within the container by running"
echo -e " ${GREEN}cargo build${RESET}"
echo "and then run the tests again"
echo "==================================="
docker compose exec --workdir /app/tests main bash

View File

@@ -1,8 +1,13 @@
FROM rust:bullseye
COPY --from=sclevine/yj /bin/yj /bin/yj
RUN /bin/yj -h
RUN apt-get update && apt-get install llvm-11 psmisc postgresql-contrib postgresql-client ruby ruby-dev libpq-dev python3 python3-pip lcov curl sudo iproute2 -y
RUN cargo install cargo-binutils rustfilt
RUN rustup component add llvm-tools-preview
RUN sudo gem install bundler
RUN wget -O toxiproxy-2.4.0.deb https://github.com/Shopify/toxiproxy/releases/download/v2.4.0/toxiproxy_2.4.0_linux_$(dpkg --print-architecture).deb && \
sudo dpkg -i toxiproxy-2.4.0.deb
RUN wget -O go1.21.3.linux-$(dpkg --print-architecture).tar.gz https://go.dev/dl/go1.21.3.linux-$(dpkg --print-architecture).tar.gz && \
sudo tar -C /usr/local -xzf go1.21.3.linux-$(dpkg --print-architecture).tar.gz && \
rm go1.21.3.linux-$(dpkg --print-architecture).tar.gz

View File

@@ -1,4 +1,3 @@
version: "3"
services:
pg1:
image: postgres:14
@@ -36,9 +35,20 @@ services:
POSTGRES_PASSWORD: postgres
POSTGRES_INITDB_ARGS: --auth-local=scram-sha-256 --auth-host=scram-sha-256 --auth=scram-sha-256
command: ["postgres", "-p", "9432", "-c", "shared_preload_libraries=pg_stat_statements", "-c", "pg_stat_statements.track=all", "-c", "pg_stat_statements.max=100000"]
pg5:
image: postgres:14
network_mode: "service:main"
environment:
POSTGRES_USER: postgres
POSTGRES_DB: postgres
POSTGRES_PASSWORD: postgres
POSTGRES_INITDB_ARGS: --auth-local=md5 --auth-host=md5 --auth=md5
command: ["postgres", "-c", "shared_preload_libraries=pg_stat_statements", "-c", "pg_stat_statements.track=all", "-p", "10432"]
main:
build: .
command: ["bash", "/app/tests/docker/run.sh"]
environment:
- INTERACTIVE_TEST_ENVIRONMENT=true
volumes:
- ../../:/app/
- /app/target/

View File

@@ -5,6 +5,38 @@ rm /app/*.profraw || true
rm /app/pgcat.profdata || true
rm -rf /app/cov || true
# Prepares the interactive test environment
#
if [ -n "$INTERACTIVE_TEST_ENVIRONMENT" ]; then
ports=(5432 7432 8432 9432 10432)
for port in "${ports[@]}"; do
is_it_up=0
attempts=0
while [ $is_it_up -eq 0 ]; do
PGPASSWORD=postgres psql -h 127.0.0.1 -p $port -U postgres -c '\q' > /dev/null 2>&1
if [ $? -eq 0 ]; then
echo "PostgreSQL on port $port is up."
is_it_up=1
else
attempts=$((attempts+1))
if [ $attempts -gt 10 ]; then
echo "PostgreSQL on port $port is down, giving up."
exit 1
fi
echo "PostgreSQL on port $port is down, waiting for it to start."
sleep 1
fi
done
done
PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 5432 -U postgres -f /app/tests/sharding/query_routing_setup.sql
PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 7432 -U postgres -f /app/tests/sharding/query_routing_setup.sql
PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 8432 -U postgres -f /app/tests/sharding/query_routing_setup.sql
PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 9432 -U postgres -f /app/tests/sharding/query_routing_setup.sql
PGPASSWORD=postgres psql -e -h 127.0.0.1 -p 10432 -U postgres -f /app/tests/sharding/query_routing_setup.sql
sleep 100000000000000000
exit 0
fi
export LLVM_PROFILE_FILE="/app/pgcat-%m-%p.profraw"
export RUSTC_BOOTSTRAP=1
export CARGO_INCREMENTAL=0

5
tests/go/go.mod Normal file
View File

@@ -0,0 +1,5 @@
module pgcat
go 1.21
require github.com/lib/pq v1.10.9

2
tests/go/go.sum Normal file
View File

@@ -0,0 +1,2 @@
github.com/lib/pq v1.10.9 h1:YXG7RB+JIjhP29X+OtkiDnYaXQwpS4JEWq7dtCCRUEw=
github.com/lib/pq v1.10.9/go.mod h1:AlVN5x4E4T544tWzH6hKfbfQvm3HdbOxrmggDNAPY9o=

162
tests/go/pgcat.toml Normal file
View File

@@ -0,0 +1,162 @@
#
# PgCat config example.
#
#
# General pooler settings
[general]
# What IP to run on, 0.0.0.0 means accessible from everywhere.
host = "0.0.0.0"
# Port to run on, same as PgBouncer used in this example.
port = "${PORT}"
# Whether to enable prometheus exporter or not.
enable_prometheus_exporter = true
# Port at which prometheus exporter listens on.
prometheus_exporter_port = 9930
# How long to wait before aborting a server connection (ms).
connect_timeout = 1000
# How much time to give the health check query to return with a result (ms).
healthcheck_timeout = 1000
# How long to keep connection available for immediate re-use, without running a healthcheck query on it
healthcheck_delay = 30000
# How much time to give clients during shutdown before forcibly killing client connections (ms).
shutdown_timeout = 5000
# For how long to ban a server if it fails a health check (seconds).
ban_time = 60 # Seconds
# If we should log client connections
log_client_connections = false
# If we should log client disconnections
log_client_disconnections = false
# Reload config automatically if it changes.
autoreload = 15000
server_round_robin = false
# TLS
tls_certificate = "../../.circleci/server.cert"
tls_private_key = "../../.circleci/server.key"
# Credentials to access the virtual administrative database (pgbouncer or pgcat)
# Connecting to that database allows running commands like `SHOW POOLS`, `SHOW DATABASES`, etc..
admin_username = "admin_user"
admin_password = "admin_pass"
# pool
# configs are structured as pool.<pool_name>
# the pool_name is what clients use as database name when connecting
# For the example below a client can connect using "postgres://sharding_user:sharding_user@pgcat_host:pgcat_port/sharded_db"
[pools.sharded_db]
# Pool mode (see PgBouncer docs for more).
# session: one server connection per connected client
# transaction: one server connection per client transaction
pool_mode = "transaction"
# If the client doesn't specify, route traffic to
# this role by default.
#
# any: round-robin between primary and replicas,
# replica: round-robin between replicas only without touching the primary,
# primary: all queries go to the primary unless otherwise specified.
default_role = "any"
# Query parser. If enabled, we'll attempt to parse
# every incoming query to determine if it's a read or a write.
# If it's a read query, we'll direct it to a replica. Otherwise, if it's a write,
# we'll direct it to the primary.
query_parser_enabled = true
# If the query parser is enabled and this setting is enabled, we'll attempt to
# infer the role from the query itself.
query_parser_read_write_splitting = true
# If the query parser is enabled and this setting is enabled, the primary will be part of the pool of databases used for
# load balancing of read queries. Otherwise, the primary will only be used for write
# queries. The primary can always be explicitely selected with our custom protocol.
primary_reads_enabled = true
# So what if you wanted to implement a different hashing function,
# or you've already built one and you want this pooler to use it?
#
# Current options:
#
# pg_bigint_hash: PARTITION BY HASH (Postgres hashing function)
# sha1: A hashing function based on SHA1
#
sharding_function = "pg_bigint_hash"
# Prepared statements cache size.
prepared_statements_cache_size = 500
# Credentials for users that may connect to this cluster
[pools.sharded_db.users.0]
username = "sharding_user"
password = "sharding_user"
# Maximum number of server connections that can be established for this user
# The maximum number of connection from a single Pgcat process to any database in the cluster
# is the sum of pool_size across all users.
pool_size = 5
statement_timeout = 0
[pools.sharded_db.users.1]
username = "other_user"
password = "other_user"
pool_size = 21
statement_timeout = 30000
# Shard 0
[pools.sharded_db.shards.0]
# [ host, port, role ]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ]
]
# Database name (e.g. "postgres")
database = "shard0"
[pools.sharded_db.shards.1]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ],
]
database = "shard1"
[pools.sharded_db.shards.2]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ],
]
database = "shard2"
[pools.simple_db]
pool_mode = "session"
default_role = "primary"
query_parser_enabled = true
query_parser_read_write_splitting = true
primary_reads_enabled = true
sharding_function = "pg_bigint_hash"
[pools.simple_db.users.0]
username = "simple_user"
password = "simple_user"
pool_size = 5
statement_timeout = 30000
[pools.simple_db.shards.0]
servers = [
[ "127.0.0.1", 5432, "primary" ],
[ "localhost", 5432, "replica" ]
]
database = "some_db"

52
tests/go/prepared_test.go Normal file
View File

@@ -0,0 +1,52 @@
package pgcat
import (
"context"
"database/sql"
"fmt"
_ "github.com/lib/pq"
"testing"
)
func Test(t *testing.T) {
t.Cleanup(setup(t))
t.Run("Named parameterized prepared statement works", namedParameterizedPreparedStatement)
t.Run("Unnamed parameterized prepared statement works", unnamedParameterizedPreparedStatement)
}
func namedParameterizedPreparedStatement(t *testing.T) {
db, err := sql.Open("postgres", fmt.Sprintf("host=localhost port=%d database=sharded_db user=sharding_user password=sharding_user sslmode=disable", port))
if err != nil {
t.Fatalf("could not open connection: %+v", err)
}
stmt, err := db.Prepare("SELECT $1")
if err != nil {
t.Fatalf("could not prepare: %+v", err)
}
for i := 0; i < 100; i++ {
rows, err := stmt.Query(1)
if err != nil {
t.Fatalf("could not query: %+v", err)
}
_ = rows.Close()
}
}
func unnamedParameterizedPreparedStatement(t *testing.T) {
db, err := sql.Open("postgres", fmt.Sprintf("host=localhost port=%d database=sharded_db user=sharding_user password=sharding_user sslmode=disable", port))
if err != nil {
t.Fatalf("could not open connection: %+v", err)
}
for i := 0; i < 100; i++ {
// Under the hood QueryContext generates an unnamed parameterized prepared statement
rows, err := db.QueryContext(context.Background(), "SELECT $1", 1)
if err != nil {
t.Fatalf("could not query: %+v", err)
}
_ = rows.Close()
}
}

81
tests/go/setup.go Normal file
View File

@@ -0,0 +1,81 @@
package pgcat
import (
"context"
"database/sql"
_ "embed"
"fmt"
"math/rand"
"os"
"os/exec"
"strings"
"testing"
"time"
)
//go:embed pgcat.toml
var pgcatCfg string
var port = rand.Intn(32760-20000) + 20000
func setup(t *testing.T) func() {
cfg, err := os.CreateTemp("/tmp", "pgcat_cfg_*.toml")
if err != nil {
t.Fatalf("could not create temp file: %+v", err)
}
pgcatCfg = strings.Replace(pgcatCfg, "\"${PORT}\"", fmt.Sprintf("%d", port), 1)
_, err = cfg.Write([]byte(pgcatCfg))
if err != nil {
t.Fatalf("could not write temp file: %+v", err)
}
commandPath := "../../target/debug/pgcat"
if os.Getenv("CARGO_TARGET_DIR") != "" {
commandPath = os.Getenv("CARGO_TARGET_DIR") + "/debug/pgcat"
}
cmd := exec.Command(commandPath, cfg.Name())
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
go func() {
err = cmd.Run()
if err != nil {
t.Errorf("could not run pgcat: %+v", err)
}
}()
deadline, cancelFunc := context.WithDeadline(context.Background(), time.Now().Add(5*time.Second))
defer cancelFunc()
for {
select {
case <-deadline.Done():
break
case <-time.After(50 * time.Millisecond):
db, err := sql.Open("postgres", fmt.Sprintf("host=localhost port=%d database=pgcat user=admin_user password=admin_pass sslmode=disable", port))
if err != nil {
continue
}
rows, err := db.QueryContext(deadline, "SHOW STATS")
if err != nil {
continue
}
_ = rows.Close()
_ = db.Close()
break
}
break
}
return func() {
err := cmd.Process.Signal(os.Interrupt)
if err != nil {
t.Fatalf("could not interrupt pgcat: %+v", err)
}
err = os.Remove(cfg.Name())
if err != nil {
t.Fatalf("could not remove temp file: %+v", err)
}
}
}

View File

@@ -36,4 +36,4 @@ SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
SET SERVER ROLE TO 'replica';
-- Read load balancing
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

View File

@@ -1,2 +1,3 @@
pytest
psycopg2==2.9.3
psutil==5.9.1
psutil==5.9.1

71
tests/python/test_auth.py Normal file
View File

@@ -0,0 +1,71 @@
import utils
import signal
class TestTrustAuth:
@classmethod
def setup_method(cls):
config= """
[general]
host = "0.0.0.0"
port = 6432
admin_username = "admin_user"
admin_password = ""
admin_auth_type = "trust"
[pools.sharded_db.users.0]
username = "sharding_user"
password = "sharding_user"
auth_type = "trust"
pool_size = 10
min_pool_size = 1
pool_mode = "transaction"
[pools.sharded_db.shards.0]
servers = [
[ "127.0.0.1", 5432, "primary" ],
]
database = "shard0"
"""
utils.pgcat_generic_start(config)
@classmethod
def teardown_method(self):
utils.pg_cat_send_signal(signal.SIGTERM)
def test_admin_trust_auth(self):
conn, cur = utils.connect_db_trust(admin=True)
cur.execute("SHOW POOLS")
res = cur.fetchall()
print(res)
utils.cleanup_conn(conn, cur)
def test_normal_trust_auth(self):
conn, cur = utils.connect_db_trust(autocommit=False)
cur.execute("SELECT 1")
res = cur.fetchall()
print(res)
utils.cleanup_conn(conn, cur)
class TestMD5Auth:
@classmethod
def setup_method(cls):
utils.pgcat_start()
@classmethod
def teardown_method(self):
utils.pg_cat_send_signal(signal.SIGTERM)
def test_normal_db_access(self):
conn, cur = utils.connect_db(autocommit=False)
cur.execute("SELECT 1")
res = cur.fetchall()
print(res)
utils.cleanup_conn(conn, cur)
def test_admin_db_access(self):
conn, cur = utils.connect_db(admin=True)
cur.execute("SHOW POOLS")
res = cur.fetchall()
print(res)
utils.cleanup_conn(conn, cur)

View File

@@ -1,83 +1,12 @@
from typing import Tuple
import psycopg2
import psutil
import os
import signal
import time
import psycopg2
import utils
SHUTDOWN_TIMEOUT = 5
PGCAT_HOST = "127.0.0.1"
PGCAT_PORT = "6432"
def pgcat_start():
pg_cat_send_signal(signal.SIGTERM)
os.system("./target/debug/pgcat .circleci/pgcat.toml &")
time.sleep(2)
def pg_cat_send_signal(signal: signal.Signals):
try:
for proc in psutil.process_iter(["pid", "name"]):
if "pgcat" == proc.name():
os.kill(proc.pid, signal)
except Exception as e:
# The process can be gone when we send this signal
print(e)
if signal == signal.SIGTERM:
# Returns 0 if pgcat process exists
time.sleep(2)
if not os.system('pgrep pgcat'):
raise Exception("pgcat not closed after SIGTERM")
def connect_db(
autocommit: bool = True,
admin: bool = False,
) -> Tuple[psycopg2.extensions.connection, psycopg2.extensions.cursor]:
if admin:
user = "admin_user"
password = "admin_pass"
db = "pgcat"
else:
user = "sharding_user"
password = "sharding_user"
db = "sharded_db"
conn = psycopg2.connect(
f"postgres://{user}:{password}@{PGCAT_HOST}:{PGCAT_PORT}/{db}?application_name=testing_pgcat",
connect_timeout=2,
)
conn.autocommit = autocommit
cur = conn.cursor()
return (conn, cur)
def cleanup_conn(conn: psycopg2.extensions.connection, cur: psycopg2.extensions.cursor):
cur.close()
conn.close()
def test_normal_db_access():
conn, cur = connect_db(autocommit=False)
cur.execute("SELECT 1")
res = cur.fetchall()
print(res)
cleanup_conn(conn, cur)
def test_admin_db_access():
conn, cur = connect_db(admin=True)
cur.execute("SHOW POOLS")
res = cur.fetchall()
print(res)
cleanup_conn(conn, cur)
def test_shutdown_logic():
@@ -85,17 +14,17 @@ def test_shutdown_logic():
# NO ACTIVE QUERIES SIGINT HANDLING
# Start pgcat
pgcat_start()
utils.pgcat_start()
# Create client connection and send query (not in transaction)
conn, cur = connect_db()
conn, cur = utils.connect_db()
cur.execute("BEGIN;")
cur.execute("SELECT 1;")
cur.execute("COMMIT;")
# Send sigint to pgcat
pg_cat_send_signal(signal.SIGINT)
utils.pg_cat_send_signal(signal.SIGINT)
time.sleep(1)
# Check that any new queries fail after sigint since server should close with no active transactions
@@ -107,18 +36,18 @@ def test_shutdown_logic():
# Fail if query execution succeeded
raise Exception("Server not closed after sigint")
cleanup_conn(conn, cur)
pg_cat_send_signal(signal.SIGTERM)
utils.cleanup_conn(conn, cur)
utils.pg_cat_send_signal(signal.SIGTERM)
# - - - - - - - - - - - - - - - - - -
# NO ACTIVE QUERIES ADMIN SHUTDOWN COMMAND
# Start pgcat
pgcat_start()
utils.pgcat_start()
# Create client connection and begin transaction
conn, cur = connect_db()
admin_conn, admin_cur = connect_db(admin=True)
conn, cur = utils.connect_db()
admin_conn, admin_cur = utils.connect_db(admin=True)
cur.execute("BEGIN;")
cur.execute("SELECT 1;")
@@ -137,24 +66,24 @@ def test_shutdown_logic():
# Fail if query execution succeeded
raise Exception("Server not closed after sigint")
cleanup_conn(conn, cur)
cleanup_conn(admin_conn, admin_cur)
pg_cat_send_signal(signal.SIGTERM)
utils.cleanup_conn(conn, cur)
utils.cleanup_conn(admin_conn, admin_cur)
utils.pg_cat_send_signal(signal.SIGTERM)
# - - - - - - - - - - - - - - - - - -
# HANDLE TRANSACTION WITH SIGINT
# Start pgcat
pgcat_start()
utils.pgcat_start()
# Create client connection and begin transaction
conn, cur = connect_db()
conn, cur = utils.connect_db()
cur.execute("BEGIN;")
cur.execute("SELECT 1;")
# Send sigint to pgcat while still in transaction
pg_cat_send_signal(signal.SIGINT)
utils.pg_cat_send_signal(signal.SIGINT)
time.sleep(1)
# Check that any new queries succeed after sigint since server should still allow transaction to complete
@@ -164,18 +93,18 @@ def test_shutdown_logic():
# Fail if query fails since server closed
raise Exception("Server closed while in transaction", e.pgerror)
cleanup_conn(conn, cur)
pg_cat_send_signal(signal.SIGTERM)
utils.cleanup_conn(conn, cur)
utils.pg_cat_send_signal(signal.SIGTERM)
# - - - - - - - - - - - - - - - - - -
# HANDLE TRANSACTION WITH ADMIN SHUTDOWN COMMAND
# Start pgcat
pgcat_start()
utils.pgcat_start()
# Create client connection and begin transaction
conn, cur = connect_db()
admin_conn, admin_cur = connect_db(admin=True)
conn, cur = utils.connect_db()
admin_conn, admin_cur = utils.connect_db(admin=True)
cur.execute("BEGIN;")
cur.execute("SELECT 1;")
@@ -193,30 +122,30 @@ def test_shutdown_logic():
# Fail if query fails since server closed
raise Exception("Server closed while in transaction", e.pgerror)
cleanup_conn(conn, cur)
cleanup_conn(admin_conn, admin_cur)
pg_cat_send_signal(signal.SIGTERM)
utils.cleanup_conn(conn, cur)
utils.cleanup_conn(admin_conn, admin_cur)
utils.pg_cat_send_signal(signal.SIGTERM)
# - - - - - - - - - - - - - - - - - -
# NO NEW NON-ADMIN CONNECTIONS DURING SHUTDOWN
# Start pgcat
pgcat_start()
utils.pgcat_start()
# Create client connection and begin transaction
transaction_conn, transaction_cur = connect_db()
transaction_conn, transaction_cur = utils.connect_db()
transaction_cur.execute("BEGIN;")
transaction_cur.execute("SELECT 1;")
# Send sigint to pgcat while still in transaction
pg_cat_send_signal(signal.SIGINT)
utils.pg_cat_send_signal(signal.SIGINT)
time.sleep(1)
start = time.perf_counter()
try:
conn, cur = connect_db()
conn, cur = utils.connect_db()
cur.execute("SELECT 1;")
cleanup_conn(conn, cur)
utils.cleanup_conn(conn, cur)
except psycopg2.OperationalError as e:
time_taken = time.perf_counter() - start
if time_taken > 0.1:
@@ -226,49 +155,49 @@ def test_shutdown_logic():
else:
raise Exception("Able connect to database during shutdown")
cleanup_conn(transaction_conn, transaction_cur)
pg_cat_send_signal(signal.SIGTERM)
utils.cleanup_conn(transaction_conn, transaction_cur)
utils.pg_cat_send_signal(signal.SIGTERM)
# - - - - - - - - - - - - - - - - - -
# ALLOW NEW ADMIN CONNECTIONS DURING SHUTDOWN
# Start pgcat
pgcat_start()
utils.pgcat_start()
# Create client connection and begin transaction
transaction_conn, transaction_cur = connect_db()
transaction_conn, transaction_cur = utils.connect_db()
transaction_cur.execute("BEGIN;")
transaction_cur.execute("SELECT 1;")
# Send sigint to pgcat while still in transaction
pg_cat_send_signal(signal.SIGINT)
utils.pg_cat_send_signal(signal.SIGINT)
time.sleep(1)
try:
conn, cur = connect_db(admin=True)
conn, cur = utils.connect_db(admin=True)
cur.execute("SHOW DATABASES;")
cleanup_conn(conn, cur)
utils.cleanup_conn(conn, cur)
except psycopg2.OperationalError as e:
raise Exception(e)
cleanup_conn(transaction_conn, transaction_cur)
pg_cat_send_signal(signal.SIGTERM)
utils.cleanup_conn(transaction_conn, transaction_cur)
utils.pg_cat_send_signal(signal.SIGTERM)
# - - - - - - - - - - - - - - - - - -
# ADMIN CONNECTIONS CONTINUING TO WORK AFTER SHUTDOWN
# Start pgcat
pgcat_start()
utils.pgcat_start()
# Create client connection and begin transaction
transaction_conn, transaction_cur = connect_db()
transaction_conn, transaction_cur = utils.connect_db()
transaction_cur.execute("BEGIN;")
transaction_cur.execute("SELECT 1;")
admin_conn, admin_cur = connect_db(admin=True)
admin_conn, admin_cur = utils.connect_db(admin=True)
admin_cur.execute("SHOW DATABASES;")
# Send sigint to pgcat while still in transaction
pg_cat_send_signal(signal.SIGINT)
utils.pg_cat_send_signal(signal.SIGINT)
time.sleep(1)
try:
@@ -276,24 +205,24 @@ def test_shutdown_logic():
except psycopg2.OperationalError as e:
raise Exception("Could not execute admin command:", e)
cleanup_conn(transaction_conn, transaction_cur)
cleanup_conn(admin_conn, admin_cur)
pg_cat_send_signal(signal.SIGTERM)
utils.cleanup_conn(transaction_conn, transaction_cur)
utils.cleanup_conn(admin_conn, admin_cur)
utils.pg_cat_send_signal(signal.SIGTERM)
# - - - - - - - - - - - - - - - - - -
# HANDLE SHUTDOWN TIMEOUT WITH SIGINT
# Start pgcat
pgcat_start()
utils.pgcat_start()
# Create client connection and begin transaction, which should prevent server shutdown unless shutdown timeout is reached
conn, cur = connect_db()
conn, cur = utils.connect_db()
cur.execute("BEGIN;")
cur.execute("SELECT 1;")
# Send sigint to pgcat while still in transaction
pg_cat_send_signal(signal.SIGINT)
utils.pg_cat_send_signal(signal.SIGINT)
# pgcat shutdown timeout is set to SHUTDOWN_TIMEOUT seconds, so we sleep for SHUTDOWN_TIMEOUT + 1 seconds
time.sleep(SHUTDOWN_TIMEOUT + 1)
@@ -307,12 +236,7 @@ def test_shutdown_logic():
# Fail if query execution succeeded
raise Exception("Server not closed after sigint and expected timeout")
cleanup_conn(conn, cur)
pg_cat_send_signal(signal.SIGTERM)
utils.cleanup_conn(conn, cur)
utils.pg_cat_send_signal(signal.SIGTERM)
# - - - - - - - - - - - - - - - - - -
test_normal_db_access()
test_admin_db_access()
test_shutdown_logic()

110
tests/python/utils.py Normal file
View File

@@ -0,0 +1,110 @@
import os
import signal
import time
from typing import Tuple
import tempfile
import psutil
import psycopg2
PGCAT_HOST = "127.0.0.1"
PGCAT_PORT = "6432"
def _pgcat_start(config_path: str):
pg_cat_send_signal(signal.SIGTERM)
os.system(f"./target/debug/pgcat {config_path} &")
time.sleep(2)
def pgcat_start():
_pgcat_start(config_path='.circleci/pgcat.toml')
def pgcat_generic_start(config: str):
tmp = tempfile.NamedTemporaryFile()
with open(tmp.name, 'w') as f:
f.write(config)
_pgcat_start(config_path=tmp.name)
def glauth_send_signal(signal: signal.Signals):
try:
for proc in psutil.process_iter(["pid", "name"]):
if proc.name() == "glauth":
os.kill(proc.pid, signal)
except Exception as e:
# The process can be gone when we send this signal
print(e)
if signal == signal.SIGTERM:
# Returns 0 if pgcat process exists
time.sleep(2)
if not os.system('pgrep glauth'):
raise Exception("glauth not closed after SIGTERM")
def pg_cat_send_signal(signal: signal.Signals):
try:
for proc in psutil.process_iter(["pid", "name"]):
if "pgcat" == proc.name():
os.kill(proc.pid, signal)
except Exception as e:
# The process can be gone when we send this signal
print(e)
if signal == signal.SIGTERM:
# Returns 0 if pgcat process exists
time.sleep(2)
if not os.system('pgrep pgcat'):
raise Exception("pgcat not closed after SIGTERM")
def connect_db(
autocommit: bool = True,
admin: bool = False,
) -> Tuple[psycopg2.extensions.connection, psycopg2.extensions.cursor]:
if admin:
user = "admin_user"
password = "admin_pass"
db = "pgcat"
else:
user = "sharding_user"
password = "sharding_user"
db = "sharded_db"
conn = psycopg2.connect(
f"postgres://{user}:{password}@{PGCAT_HOST}:{PGCAT_PORT}/{db}?application_name=testing_pgcat",
connect_timeout=2,
)
conn.autocommit = autocommit
cur = conn.cursor()
return (conn, cur)
def connect_db_trust(
autocommit: bool = True,
admin: bool = False,
) -> Tuple[psycopg2.extensions.connection, psycopg2.extensions.cursor]:
if admin:
user = "admin_user"
db = "pgcat"
else:
user = "sharding_user"
db = "sharded_db"
conn = psycopg2.connect(
f"postgres://{user}@{PGCAT_HOST}:{PGCAT_PORT}/{db}?application_name=testing_pgcat",
connect_timeout=2,
)
conn.autocommit = autocommit
cur = conn.cursor()
return (conn, cur)
def cleanup_conn(conn: psycopg2.extensions.connection, cur: psycopg2.extensions.cursor):
cur.close()
conn.close()

View File

@@ -1,22 +1,33 @@
GEM
remote: https://rubygems.org/
specs:
activemodel (7.0.4.1)
activesupport (= 7.0.4.1)
activerecord (7.0.4.1)
activemodel (= 7.0.4.1)
activesupport (= 7.0.4.1)
activesupport (7.0.4.1)
activemodel (7.1.4)
activesupport (= 7.1.4)
activerecord (7.1.4)
activemodel (= 7.1.4)
activesupport (= 7.1.4)
timeout (>= 0.4.0)
activesupport (7.1.4)
base64
bigdecimal
concurrent-ruby (~> 1.0, >= 1.0.2)
connection_pool (>= 2.2.5)
drb
i18n (>= 1.6, < 2)
minitest (>= 5.1)
mutex_m
tzinfo (~> 2.0)
ast (2.4.2)
concurrent-ruby (1.1.10)
base64 (0.2.0)
bigdecimal (3.1.8)
concurrent-ruby (1.3.4)
connection_pool (2.4.1)
diff-lcs (1.5.0)
i18n (1.12.0)
drb (2.2.1)
i18n (1.14.5)
concurrent-ruby (~> 1.0)
minitest (5.17.0)
minitest (5.25.1)
mutex_m (0.2.0)
parallel (1.22.1)
parser (3.1.2.0)
ast (~> 2.4.1)
@@ -24,7 +35,8 @@ GEM
pg (1.3.2)
rainbow (3.1.1)
regexp_parser (2.3.1)
rexml (3.2.5)
rexml (3.3.6)
strscan
rspec (3.11.0)
rspec-core (~> 3.11.0)
rspec-expectations (~> 3.11.0)
@@ -50,10 +62,12 @@ GEM
rubocop-ast (1.17.0)
parser (>= 3.1.1.0)
ruby-progressbar (1.11.0)
strscan (3.1.0)
timeout (0.4.1)
toml (0.3.0)
parslet (>= 1.8.0, < 3.0.0)
toxiproxy (2.0.1)
tzinfo (2.0.5)
tzinfo (2.0.6)
concurrent-ruby (~> 1.0)
unicode-display_width (2.1.0)

View File

@@ -11,282 +11,6 @@ describe "Admin" do
processes.pgcat.shutdown
end
describe "SHOW STATS" do
context "clients connect and make one query" do
it "updates *_query_time and *_wait_time" do
connection = PG::connect("#{pgcat_conn_str}?application_name=one_query")
connection.async_exec("SELECT pg_sleep(0.25)")
connection.async_exec("SELECT pg_sleep(0.25)")
connection.async_exec("SELECT pg_sleep(0.25)")
connection.close
# wait for averages to be calculated, we shouldn't do this too often
sleep(15.5)
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW STATS")[0]
admin_conn.close
expect(results["total_query_time"].to_i).to be_within(200).of(750)
expect(results["avg_query_time"].to_i).to_not eq(0)
expect(results["total_wait_time"].to_i).to_not eq(0)
expect(results["avg_wait_time"].to_i).to_not eq(0)
end
end
end
describe "SHOW POOLS" do
context "bad credentials" do
it "does not change any stats" do
bad_passsword_url = URI(pgcat_conn_str)
bad_passsword_url.password = "wrong"
expect { PG::connect("#{bad_passsword_url.to_s}?application_name=bad_password") }.to raise_error(PG::ConnectionBad)
sleep(1)
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_idle cl_active cl_waiting cl_cancel_req sv_active sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["sv_idle"]).to eq("1")
end
end
context "bad database name" do
it "does not change any stats" do
bad_db_url = URI(pgcat_conn_str)
bad_db_url.path = "/wrong_db"
expect { PG::connect("#{bad_db_url.to_s}?application_name=bad_db") }.to raise_error(PG::ConnectionBad)
sleep(1)
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_idle cl_active cl_waiting cl_cancel_req sv_active sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["sv_idle"]).to eq("1")
end
end
context "client connects but issues no queries" do
it "only affects cl_idle stats" do
connections = Array.new(20) { PG::connect(pgcat_conn_str) }
sleep(1)
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_active cl_waiting cl_cancel_req sv_active sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["cl_idle"]).to eq("20")
expect(results["sv_idle"]).to eq("1")
connections.map(&:close)
sleep(1.1)
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_active cl_idle cl_waiting cl_cancel_req sv_active sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["sv_idle"]).to eq("1")
end
end
context "clients connect and make one query" do
it "only affects cl_idle, sv_idle stats" do
connections = Array.new(5) { PG::connect("#{pgcat_conn_str}?application_name=one_query") }
connections.each do |c|
Thread.new { c.async_exec("SELECT pg_sleep(2.5)") }
end
sleep(1.1)
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_idle cl_waiting cl_cancel_req sv_idle sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["cl_active"]).to eq("5")
expect(results["sv_active"]).to eq("5")
sleep(3)
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_active cl_waiting cl_cancel_req sv_active sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["cl_idle"]).to eq("5")
expect(results["sv_idle"]).to eq("5")
connections.map(&:close)
sleep(1)
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_idle cl_active cl_waiting cl_cancel_req sv_active sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["sv_idle"]).to eq("5")
end
end
context "client connects and opens a transaction and closes connection uncleanly" do
it "produces correct statistics" do
connections = Array.new(5) { PG::connect("#{pgcat_conn_str}?application_name=one_query") }
connections.each do |c|
Thread.new do
c.async_exec("BEGIN")
c.async_exec("SELECT pg_sleep(0.01)")
c.close
end
end
sleep(1.1)
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_idle cl_active cl_waiting cl_cancel_req sv_active sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["sv_idle"]).to eq("5")
end
end
context "client fail to checkout connection from the pool" do
it "counts clients as idle" do
new_configs = processes.pgcat.current_config
new_configs["general"]["connect_timeout"] = 500
new_configs["general"]["ban_time"] = 1
new_configs["general"]["shutdown_timeout"] = 1
new_configs["pools"]["sharded_db"]["users"]["0"]["pool_size"] = 1
processes.pgcat.update_config(new_configs)
processes.pgcat.reload_config
threads = []
connections = Array.new(5) { PG::connect("#{pgcat_conn_str}?application_name=one_query") }
connections.each do |c|
threads << Thread.new { c.async_exec("SELECT pg_sleep(1)") rescue PG::SystemError }
end
sleep(2)
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_active cl_waiting cl_cancel_req sv_active sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["cl_idle"]).to eq("5")
expect(results["sv_idle"]).to eq("1")
threads.map(&:join)
connections.map(&:close)
end
end
context "clients overwhelm server pools" do
let(:processes) { Helpers::Pgcat.single_instance_setup("sharded_db", 2) }
it "cl_waiting is updated to show it" do
threads = []
connections = Array.new(4) { PG::connect("#{pgcat_conn_str}?application_name=one_query") }
connections.each do |c|
threads << Thread.new { c.async_exec("SELECT pg_sleep(1.5)") }
end
sleep(1.1) # Allow time for stats to update
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_idle cl_cancel_req sv_idle sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["cl_waiting"]).to eq("2")
expect(results["cl_active"]).to eq("2")
expect(results["sv_active"]).to eq("2")
sleep(2.5) # Allow time for stats to update
results = admin_conn.async_exec("SHOW POOLS")[0]
%w[cl_active cl_waiting cl_cancel_req sv_active sv_used sv_tested sv_login maxwait].each do |s|
raise StandardError, "Field #{s} was expected to be 0 but found to be #{results[s]}" if results[s] != "0"
end
expect(results["cl_idle"]).to eq("4")
expect(results["sv_idle"]).to eq("2")
threads.map(&:join)
connections.map(&:close)
end
it "show correct max_wait" do
threads = []
connections = Array.new(4) { PG::connect("#{pgcat_conn_str}?application_name=one_query") }
connections.each do |c|
threads << Thread.new { c.async_exec("SELECT pg_sleep(1.5)") }
end
sleep(2.5) # Allow time for stats to update
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW POOLS")[0]
expect(results["maxwait"]).to eq("1")
expect(results["maxwait_us"].to_i).to be_within(200_000).of(500_000)
sleep(4.5) # Allow time for stats to update
results = admin_conn.async_exec("SHOW POOLS")[0]
expect(results["maxwait"]).to eq("0")
threads.map(&:join)
connections.map(&:close)
end
end
end
describe "SHOW CLIENTS" do
it "reports correct number and application names" do
conn_str = processes.pgcat.connection_string("sharded_db", "sharding_user")
connections = Array.new(20) { |i| PG::connect("#{conn_str}?application_name=app#{i % 5}") }
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
sleep(1) # Wait for stats to be updated
results = admin_conn.async_exec("SHOW CLIENTS")
expect(results.count).to eq(21) # count admin clients
expect(results.select { |c| c["application_name"] == "app3" || c["application_name"] == "app4" }.count).to eq(8)
expect(results.select { |c| c["database"] == "pgcat" }.count).to eq(1)
connections[0..5].map(&:close)
sleep(1) # Wait for stats to be updated
results = admin_conn.async_exec("SHOW CLIENTS")
expect(results.count).to eq(15)
connections[6..].map(&:close)
sleep(1) # Wait for stats to be updated
expect(admin_conn.async_exec("SHOW CLIENTS").count).to eq(1)
admin_conn.close
end
it "reports correct number of queries and transactions" do
conn_str = processes.pgcat.connection_string("sharded_db", "sharding_user")
connections = Array.new(2) { |i| PG::connect("#{conn_str}?application_name=app#{i}") }
connections.each do |c|
c.async_exec("SELECT 1")
c.async_exec("SELECT 2")
c.async_exec("SELECT 3")
c.async_exec("BEGIN")
c.async_exec("SELECT 4")
c.async_exec("SELECT 5")
c.async_exec("COMMIT")
end
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
sleep(1) # Wait for stats to be updated
results = admin_conn.async_exec("SHOW CLIENTS")
expect(results.count).to eq(3)
normal_client_results = results.reject { |r| r["database"] == "pgcat" }
expect(normal_client_results[0]["transaction_count"]).to eq("4")
expect(normal_client_results[1]["transaction_count"]).to eq("4")
expect(normal_client_results[0]["query_count"]).to eq("7")
expect(normal_client_results[1]["query_count"]).to eq("7")
admin_conn.close
connections.map(&:close)
end
end
describe "Manual Banning" do
let(:processes) { Helpers::Pgcat.single_shard_setup("sharded_db", 10) }
before do
@@ -357,7 +81,7 @@ describe "Admin" do
end
end
describe "SHOW users" do
describe "SHOW USERS" do
it "returns the right users" do
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW USERS")[0]
@@ -366,4 +90,49 @@ describe "Admin" do
expect(results["pool_mode"]).to eq("transaction")
end
end
[
"SHOW ME THE MONEY",
"SHOW ME THE WAY",
"SHOW UP",
"SHOWTIME",
"HAMMER TIME",
"SHOWN TO BE TRUE",
"SHOW ",
"SHOW ",
"SHOW 1",
";;;;;"
].each do |cmd|
describe "Bad command #{cmd}" do
it "does not panic and responds with PG::SystemError" do
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
expect { admin_conn.async_exec(cmd) }.to raise_error(PG::SystemError).with_message(/Unsupported/)
admin_conn.close
end
end
end
describe "PAUSE" do
it "pauses all pools" do
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
results = admin_conn.async_exec("SHOW DATABASES").to_a
expect(results.map{ |r| r["paused"] }.uniq).to eq(["0"])
admin_conn.async_exec("PAUSE")
results = admin_conn.async_exec("SHOW DATABASES").to_a
expect(results.map{ |r| r["paused"] }.uniq).to eq(["1"])
admin_conn.async_exec("RESUME")
results = admin_conn.async_exec("SHOW DATABASES").to_a
expect(results.map{ |r| r["paused"] }.uniq).to eq(["0"])
end
it "handles errors" do
admin_conn = PG::connect(processes.pgcat.admin_connection_string)
expect { admin_conn.async_exec("PAUSE foo").to_a }.to raise_error(PG::SystemError)
expect { admin_conn.async_exec("PAUSE foo,bar").to_a }.to raise_error(PG::SystemError)
end
end
end

View File

@@ -0,0 +1,215 @@
# frozen_string_literal: true
require_relative 'spec_helper'
require_relative 'helpers/auth_query_helper'
describe "Auth Query" do
let(:configured_instances) {[5432, 10432]}
let(:config_user) { { 'username' => 'sharding_user', 'password' => 'sharding_user' } }
let(:pg_user) { { 'username' => 'sharding_user', 'password' => 'sharding_user' } }
let(:processes) { Helpers::AuthQuery.single_shard_auth_query(pool_name: "sharded_db", pg_user: pg_user, config_user: config_user, extra_conf: config, wait_until_ready: wait_until_ready ) }
let(:config) { {} }
let(:wait_until_ready) { true }
after do
unless @failing_process
processes.all_databases.map(&:reset)
processes.pgcat.shutdown
end
@failing_process = false
end
context "when auth_query is not configured" do
context 'and cleartext passwords are set' do
it "uses local passwords" do
conn = PG.connect(processes.pgcat.connection_string("sharded_db", config_user['username'], config_user['password']))
expect(conn.async_exec("SELECT 1 + 2")).not_to be_nil
end
end
context 'and cleartext passwords are not set' do
let(:config_user) { { 'username' => 'sharding_user' } }
it "does not start because it is not possible to authenticate" do
@failing_process = true
expect { processes.pgcat }.to raise_error(StandardError, /You have to specify a user password for every pool if auth_query is not specified/)
end
end
end
context 'when auth_query is configured' do
context 'with global configuration' do
around(:example) do |example|
# Set up auth query
Helpers::AuthQuery.set_up_auth_query_for_user(
user: 'md5_auth_user',
password: 'secret'
);
example.run
# Drop auth query support
Helpers::AuthQuery.tear_down_auth_query_for_user(
user: 'md5_auth_user',
password: 'secret'
);
end
context 'with correct global parameters' do
let(:config) { { 'general' => { 'auth_query' => "SELECT * FROM public.user_lookup('$1');", 'auth_query_user' => 'md5_auth_user', 'auth_query_password' => 'secret' } } }
context 'and with cleartext passwords set' do
it 'it uses local passwords' do
conn = PG.connect(processes.pgcat.connection_string("sharded_db", pg_user['username'], pg_user['password']))
expect(conn.exec("SELECT 1 + 2")).not_to be_nil
end
end
context 'and with cleartext passwords not set' do
let(:config_user) { { 'username' => 'sharding_user', 'password' => 'sharding_user' } }
it 'it uses obtained passwords' do
connection_string = processes.pgcat.connection_string("sharded_db", pg_user['username'], pg_user['password'])
conn = PG.connect(connection_string)
expect(conn.async_exec("SELECT 1 + 2")).not_to be_nil
end
it 'allows passwords to be changed without closing existing connections' do
pgconn = PG.connect(processes.pgcat.connection_string("sharded_db", pg_user['username']))
expect(pgconn.exec("SELECT 1 + 2")).not_to be_nil
Helpers::AuthQuery.exec_in_instances(query: "ALTER USER #{pg_user['username']} WITH ENCRYPTED PASSWORD 'secret2';")
expect(pgconn.exec("SELECT 1 + 4")).not_to be_nil
Helpers::AuthQuery.exec_in_instances(query: "ALTER USER #{pg_user['username']} WITH ENCRYPTED PASSWORD '#{pg_user['password']}';")
end
it 'allows passwords to be changed and that new password is needed when reconnecting' do
pgconn = PG.connect(processes.pgcat.connection_string("sharded_db", pg_user['username']))
expect(pgconn.exec("SELECT 1 + 2")).not_to be_nil
Helpers::AuthQuery.exec_in_instances(query: "ALTER USER #{pg_user['username']} WITH ENCRYPTED PASSWORD 'secret2';")
newconn = PG.connect(processes.pgcat.connection_string("sharded_db", pg_user['username'], 'secret2'))
expect(newconn.exec("SELECT 1 + 2")).not_to be_nil
Helpers::AuthQuery.exec_in_instances(query: "ALTER USER #{pg_user['username']} WITH ENCRYPTED PASSWORD '#{pg_user['password']}';")
end
end
end
context 'with wrong parameters' do
let(:config) { { 'general' => { 'auth_query' => 'SELECT 1', 'auth_query_user' => 'wrong_user', 'auth_query_password' => 'wrong' } } }
context 'and with clear text passwords set' do
it "it uses local passwords" do
conn = PG.connect(processes.pgcat.connection_string("sharded_db", pg_user['username'], pg_user['password']))
expect(conn.async_exec("SELECT 1 + 2")).not_to be_nil
end
end
context 'and with cleartext passwords not set' do
let(:config_user) { { 'username' => 'sharding_user' } }
it "it fails to start as it cannot authenticate against servers" do
@failing_process = true
expect { PG.connect(processes.pgcat.connection_string("sharded_db", pg_user['username'], pg_user['password'])) }.to raise_error(StandardError, /Error trying to obtain password from auth_query/ )
end
context 'and we fix the issue and reload' do
let(:wait_until_ready) { false }
it 'fails in the beginning but starts working after reloading config' do
connection_string = processes.pgcat.connection_string("sharded_db", pg_user['username'], pg_user['password'])
while !(processes.pgcat.logs =~ /Waiting for clients/) do
sleep 0.5
end
expect { PG.connect(connection_string)}.to raise_error(PG::ConnectionBad)
expect(processes.pgcat.logs).to match(/Error trying to obtain password from auth_query/)
current_config = processes.pgcat.current_config
config = { 'general' => { 'auth_query' => "SELECT * FROM public.user_lookup('$1');", 'auth_query_user' => 'md5_auth_user', 'auth_query_password' => 'secret' } }
processes.pgcat.update_config(current_config.deep_merge(config))
processes.pgcat.reload_config
conn = nil
expect { conn = PG.connect(connection_string)}.not_to raise_error
expect(conn.async_exec("SELECT 1 + 2")).not_to be_nil
end
end
end
end
end
context 'with per pool configuration' do
around(:example) do |example|
# Set up auth query
Helpers::AuthQuery.set_up_auth_query_for_user(
user: 'md5_auth_user',
password: 'secret'
);
Helpers::AuthQuery.set_up_auth_query_for_user(
user: 'md5_auth_user1',
password: 'secret',
database: 'shard1'
);
example.run
# Tear down auth query
Helpers::AuthQuery.tear_down_auth_query_for_user(
user: 'md5_auth_user',
password: 'secret'
);
Helpers::AuthQuery.tear_down_auth_query_for_user(
user: 'md5_auth_user1',
password: 'secret',
database: 'shard1'
);
end
context 'with correct parameters' do
let(:processes) { Helpers::AuthQuery.two_pools_auth_query(pool_names: ["sharded_db0", "sharded_db1"], pg_user: pg_user, config_user: config_user, extra_conf: config ) }
let(:config) {
{ 'pools' =>
{
'sharded_db0' => {
'auth_query' => "SELECT * FROM public.user_lookup('$1');",
'auth_query_user' => 'md5_auth_user',
'auth_query_password' => 'secret'
},
'sharded_db1' => {
'auth_query' => "SELECT * FROM public.user_lookup('$1');",
'auth_query_user' => 'md5_auth_user1',
'auth_query_password' => 'secret'
},
}
}
}
context 'and with cleartext passwords set' do
it 'it uses local passwords' do
conn = PG.connect(processes.pgcat.connection_string("sharded_db0", pg_user['username'], pg_user['password']))
expect(conn.exec("SELECT 1 + 2")).not_to be_nil
conn = PG.connect(processes.pgcat.connection_string("sharded_db1", pg_user['username'], pg_user['password']))
expect(conn.exec("SELECT 1 + 2")).not_to be_nil
end
end
context 'and with cleartext passwords not set' do
let(:config_user) { { 'username' => 'sharding_user' } }
it 'it uses obtained passwords' do
connection_string = processes.pgcat.connection_string("sharded_db0", pg_user['username'], pg_user['password'])
conn = PG.connect(connection_string)
expect(conn.async_exec("SELECT 1 + 2")).not_to be_nil
connection_string = processes.pgcat.connection_string("sharded_db1", pg_user['username'], pg_user['password'])
conn = PG.connect(connection_string)
expect(conn.async_exec("SELECT 1 + 2")).not_to be_nil
end
end
end
end
end
end

BIN
tests/ruby/capture Normal file

Binary file not shown.

102
tests/ruby/copy_spec.rb Normal file
View File

@@ -0,0 +1,102 @@
# frozen_string_literal: true
require_relative 'spec_helper'
describe "COPY Handling" do
let(:processes) { Helpers::Pgcat.single_instance_setup("sharded_db", 5) }
before do
new_configs = processes.pgcat.current_config
# Allow connections in the pool to expire faster
new_configs["general"]["idle_timeout"] = 5
processes.pgcat.update_config(new_configs)
# We need to kill the old process that was using the default configs
processes.pgcat.stop
processes.pgcat.start
processes.pgcat.wait_until_ready
end
before do
processes.all_databases.first.with_connection do |conn|
conn.async_exec "CREATE TABLE copy_test_table (a TEXT,b TEXT,c TEXT,d TEXT)"
end
end
after do
processes.all_databases.first.with_connection do |conn|
conn.async_exec "DROP TABLE copy_test_table;"
end
end
after do
processes.all_databases.map(&:reset)
processes.pgcat.shutdown
end
describe "COPY FROM" do
context "within transaction" do
it "finishes within alloted time" do
conn = PG.connect(processes.pgcat.connection_string("sharded_db", "sharding_user"))
Timeout.timeout(3) do
conn.async_exec("BEGIN")
conn.copy_data "COPY copy_test_table FROM STDIN CSV" do
sleep 0.5
conn.put_copy_data "some,data,to,copy\n"
conn.put_copy_data "more,data,to,copy\n"
end
conn.async_exec("COMMIT")
end
res = conn.async_exec("SELECT * FROM copy_test_table").to_a
expect(res).to eq([
{"a"=>"some", "b"=>"data", "c"=>"to", "d"=>"copy"},
{"a"=>"more", "b"=>"data", "c"=>"to", "d"=>"copy"}
])
end
end
context "outside transaction" do
it "finishes within alloted time" do
conn = PG.connect(processes.pgcat.connection_string("sharded_db", "sharding_user"))
Timeout.timeout(3) do
conn.copy_data "COPY copy_test_table FROM STDIN CSV" do
sleep 0.5
conn.put_copy_data "some,data,to,copy\n"
conn.put_copy_data "more,data,to,copy\n"
end
end
res = conn.async_exec("SELECT * FROM copy_test_table").to_a
expect(res).to eq([
{"a"=>"some", "b"=>"data", "c"=>"to", "d"=>"copy"},
{"a"=>"more", "b"=>"data", "c"=>"to", "d"=>"copy"}
])
end
end
end
describe "COPY TO" do
before do
conn = PG.connect(processes.pgcat.connection_string("sharded_db", "sharding_user"))
conn.async_exec("BEGIN")
conn.copy_data "COPY copy_test_table FROM STDIN CSV" do
conn.put_copy_data "some,data,to,copy\n"
conn.put_copy_data "more,data,to,copy\n"
end
conn.async_exec("COMMIT")
conn.close
end
it "works" do
res = []
conn = PG.connect(processes.pgcat.connection_string("sharded_db", "sharding_user"))
conn.copy_data "COPY copy_test_table TO STDOUT CSV" do
while row=conn.get_copy_data
res << row
end
end
expect(res).to eq(["some,data,to,copy\n", "more,data,to,copy\n"])
end
end
end

View File

@@ -0,0 +1,173 @@
module Helpers
module AuthQuery
def self.single_shard_auth_query(
pg_user:,
config_user:,
pool_name:,
extra_conf: {},
log_level: 'debug',
wait_until_ready: true
)
user = {
"pool_size" => 10,
"statement_timeout" => 0,
}
pgcat = PgcatProcess.new(log_level)
pgcat_cfg = pgcat.current_config.deep_merge(extra_conf)
primary = PgInstance.new(5432, pg_user["username"], pg_user["password"], "shard0")
replica = PgInstance.new(10432, pg_user["username"], pg_user["password"], "shard0")
# Main proxy configs
pgcat_cfg["pools"] = {
"#{pool_name}" => {
"default_role" => "any",
"pool_mode" => "transaction",
"load_balancing_mode" => "random",
"primary_reads_enabled" => false,
"query_parser_enabled" => false,
"sharding_function" => "pg_bigint_hash",
"shards" => {
"0" => {
"database" => "shard0",
"servers" => [
["localhost", primary.port.to_i, "primary"],
["localhost", replica.port.to_i, "replica"],
]
},
},
"users" => { "0" => user.merge(config_user) }
}
}
pgcat_cfg["general"]["port"] = pgcat.port.to_i
pgcat.update_config(pgcat_cfg)
pgcat.start
pgcat.wait_until_ready(
pgcat.connection_string(
"sharded_db",
pg_user['username'],
pg_user['password']
)
) if wait_until_ready
OpenStruct.new.tap do |struct|
struct.pgcat = pgcat
struct.primary = primary
struct.replicas = [replica]
struct.all_databases = [primary]
end
end
def self.two_pools_auth_query(
pg_user:,
config_user:,
pool_names:,
extra_conf: {},
log_level: 'debug'
)
user = {
"pool_size" => 10,
"statement_timeout" => 0,
}
pgcat = PgcatProcess.new(log_level)
pgcat_cfg = pgcat.current_config
primary = PgInstance.new(5432, pg_user["username"], pg_user["password"], "shard0")
replica = PgInstance.new(10432, pg_user["username"], pg_user["password"], "shard0")
pool_template = Proc.new do |database|
{
"default_role" => "any",
"pool_mode" => "transaction",
"load_balancing_mode" => "random",
"primary_reads_enabled" => false,
"query_parser_enabled" => false,
"sharding_function" => "pg_bigint_hash",
"shards" => {
"0" => {
"database" => database,
"servers" => [
["localhost", primary.port.to_i, "primary"],
["localhost", replica.port.to_i, "replica"],
]
},
},
"users" => { "0" => user.merge(config_user) }
}
end
# Main proxy configs
pgcat_cfg["pools"] = {
"#{pool_names[0]}" => pool_template.call("shard0"),
"#{pool_names[1]}" => pool_template.call("shard1")
}
pgcat_cfg["general"]["port"] = pgcat.port
pgcat.update_config(pgcat_cfg.deep_merge(extra_conf))
pgcat.start
pgcat.wait_until_ready(pgcat.connection_string("sharded_db0", pg_user['username'], pg_user['password']))
OpenStruct.new.tap do |struct|
struct.pgcat = pgcat
struct.primary = primary
struct.replicas = [replica]
struct.all_databases = [primary]
end
end
def self.create_query_auth_function(user)
return <<-SQL
CREATE OR REPLACE FUNCTION public.user_lookup(in i_username text, out uname text, out phash text)
RETURNS record AS $$
BEGIN
SELECT usename, passwd FROM pg_catalog.pg_shadow
WHERE usename = i_username INTO uname, phash;
RETURN;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
GRANT EXECUTE ON FUNCTION public.user_lookup(text) TO #{user};
SQL
end
def self.exec_in_instances(query:, instance_ports: [ 5432, 10432 ], database: 'postgres', user: 'postgres', password: 'postgres')
instance_ports.each do |port|
c = PG.connect("postgres://#{user}:#{password}@localhost:#{port}/#{database}")
c.exec(query)
c.close
end
end
def self.set_up_auth_query_for_user(user:, password:, instance_ports: [ 5432, 10432 ], database: 'shard0' )
instance_ports.each do |port|
connection = PG.connect("postgres://postgres:postgres@localhost:#{port}/#{database}")
connection.exec(self.drop_query_auth_function(user)) rescue PG::UndefinedFunction
connection.exec("DROP ROLE #{user}") rescue PG::UndefinedObject
connection.exec("CREATE ROLE #{user} ENCRYPTED PASSWORD '#{password}' LOGIN;")
connection.exec(self.create_query_auth_function(user))
connection.close
end
end
def self.tear_down_auth_query_for_user(user:, password:, instance_ports: [ 5432, 10432 ], database: 'shard0' )
instance_ports.each do |port|
connection = PG.connect("postgres://postgres:postgres@localhost:#{port}/#{database}")
connection.exec(self.drop_query_auth_function(user)) rescue PG::UndefinedFunction
connection.exec("DROP ROLE #{user}")
connection.close
end
end
def self.drop_query_auth_function(user)
return <<-SQL
REVOKE ALL ON FUNCTION public.user_lookup(text) FROM public, #{user};
DROP FUNCTION public.user_lookup(in i_username text, out uname text, out phash text);
SQL
end
end
end

View File

@@ -7,10 +7,24 @@ class PgInstance
attr_reader :password
attr_reader :database_name
def self.mass_takedown(databases)
raise StandardError "block missing" unless block_given?
databases.each do |database|
database.toxiproxy.toxic(:limit_data, bytes: 1).toxics.each(&:save)
end
sleep 0.1
yield
ensure
databases.each do |database|
database.toxiproxy.toxics.each(&:destroy)
end
end
def initialize(port, username, password, database_name)
@original_port = port
@original_port = port.to_i
@toxiproxy_port = 10000 + port.to_i
@port = @toxiproxy_port
@port = @toxiproxy_port.to_i
@username = username
@password = password
@@ -48,9 +62,9 @@ class PgInstance
def take_down
if block_given?
Toxiproxy[@toxiproxy_name].toxic(:limit_data, bytes: 5).apply { yield }
Toxiproxy[@toxiproxy_name].toxic(:limit_data, bytes: 1).apply { yield }
else
Toxiproxy[@toxiproxy_name].toxic(:limit_data, bytes: 5).toxics.each(&:save)
Toxiproxy[@toxiproxy_name].toxic(:limit_data, bytes: 1).toxics.each(&:save)
end
end
@@ -89,6 +103,6 @@ class PgInstance
end
def count_select_1_plus_2
with_connection { |c| c.async_exec("SELECT SUM(calls) FROM pg_stat_statements WHERE query = 'SELECT $1 + $2'")[0]["sum"].to_i }
with_connection { |c| c.async_exec("SELECT SUM(calls) FROM pg_stat_statements WHERE query LIKE '%SELECT $1 + $2%'")[0]["sum"].to_i }
end
end

Some files were not shown because too many files have changed in this diff Show More