Commit Graph

25 Commits

Author SHA1 Message Date
Vita Tauer
a1690a76c5 Add 'Fail' mode when no shard selected 2024-11-11 12:04:15 +01:00
Lev Kokotov
e76d720ffb Dont cache prepared statement with errors (#647)
* Fix prepared statement not found when prepared stmt has error

* cleanup debug

* remove more debug msgs

* sure debugged this..

* version bump

* add rust tests
2023-11-28 21:13:30 -08:00
Mostafa Abdelraouf
0b01d70b55 Allow configuring routing decision when no shard is selected (#578)
The TL;DR for the change is that we allow QueryRouter to set the active shard to None. This signals to the Pool::get method that we have no shard selected. The get method follows a no_shard_specified_behavior config to know how to route the query.

Original PR description
Ruby-pg library makes a startup query to SET client_encoding to ... if Encoding.default_internal value is set (Code). This query is troublesome because we cannot possibly attach a routing comment to it. PgCat, by default, will route that query to the default shard.

Everything is fine until shard 0 has issues, Clients will all be attempting to send this query to shard0 which increases the connection latency significantly for all clients, even those not interested in shard0

This PR introduces no_shard_specified_behavior that defines the behavior in case we have routing-by-comment enabled but we get a query without a comment. The allowed behaviors are

random: Picks a shard at random
random_healthy: Picks a shard at random favoring shards with the least number of recent connection/checkout errors
shard_<number>: e.g. shard_0, shard_4, etc. picks a specific shard, everytime
In order to achieve this, this PR introduces an error_count on the Address Object that tracks the number of errors since the last checkout and uses that metric to sort shards by error count before making a routing decision.
I didn't want to use address stats to avoid introducing a routing dependency on internal stats (We might do that in the future but I prefer to avoid this for the time being.

I also made changes to the test environment to replace Ruby's TOML reader library, It appears to be abandoned and does not support mixed arrays (which we use in the config toml), and it also does not play nicely with single-quoted regular expressions. I opted for using yj which is a CLI tool that can convert from toml to JSON and back. So I refactor the tests to use that library.
2023-09-11 13:47:28 -05:00
Sebastian Webber
9ab128579d parse server error messages (#543)
This commit adds a parser to the Postgres error message, providing better
error messages.

Implemented based in:
  https://www.postgresql.org/docs/12/protocol-error-fields.html

Signed-off-by: Sebastian Webber <sebastian@swebber.me>
2023-08-09 09:14:05 -07:00
Lev Kokotov
c7d6273037 Support for prepared statements (#474)
* Start prepared statements

* parse

* Ok

* optional

* dont rewrite anonymous prepared stmts

* Dont rewrite anonymous prep statements

* hm?

* prep statements

* I see!

* comment

* Print config value

* Rewrite bind and add sqlx test

* fmt

* ok

* Fix

* Fix stats

* its late

* clean up PREPARE
2023-06-16 12:57:44 -07:00
Lev Kokotov
09e54e1175 Plugins! (#420)
* Some queries

* Plugins!!

* cleanup

* actual names

* the actual plugins

* comment

* fix tests

* Tests

* unused errors

* Increase reaper rate to actually enforce settings

* ok
2023-05-03 09:13:05 -07:00
Jose Fernández
7dfbd993f2 Add dns_cache for server addresses as in pgbouncer (#249)
* Add dns_cache so server addresses are cached and invalidated when DNS changes.

Adds a module to deal with dns_cache feature. It's
main struct is CachedResolver, which is a simple thread safe
hostname <-> Ips cache with the ability to refresh resolutions
every `dns_max_ttl` seconds. This way, a client can check whether its
ip address has changed.

* Allow reloading dns cached

* Add documentation for dns_cached
2023-05-02 10:26:40 +02:00
Lev Kokotov
692353c839 A couple things (#397)
* Format cleanup

* fmt

* finally
2023-04-10 14:51:01 -07:00
Jose Fernández
6f768a84ce Auth passthrough (auth_query) (#266)
* Add a new exec_simple_query method

This adds a new `exec_simple_query` method so we can make 'out of band'
queries to servers that don't interfere with pools at all.
In order to reuse startup code for making these simple queries,
we need to set the stats (`Reporter`) optional, so using these
simple queries wont interfere with stats.

* Add auth passthough (auth_query)

Adds a feature that allows setting auth passthrough for md5 auth.

It adds 3 new (general and pool) config parameters:

- `auth_query`: An string containing a query that will be executed on boot
to obtain the hash of a given user. This query have to use a placeholder `$1`,
so pgcat can replace it with the user its trying to fetch the hash from.
- `auth_query_user`: The user to use for connecting to the server and executing the
auth_query.
- `auth_query_password`: The password to use for connecting to the server and executing the
auth_query.

The configuration can be done either on the general config (so pools share them) or in a per-pool basis.

The behavior is, at boot time, when validating server connections, a hash is fetched per server
and stored in the pool. When new server connections are created, and no cleartext password is specified,
the obtained hash is used for creating them, if the hash could not be obtained for whatever reason, it retries
it.

When client authentication is tried, it uses cleartext passwords if specified, it not, it checks whether
we have query_auth set up, if so, it tries to use the obtained hash for making client auth. If there is no
hash (we could not obtain one when validating the connection), a new fetch is tried.

Once we have a hash, we authenticate using it against whathever the client has sent us, if there is a failure
we refetch the hash and retry auth (so password changes can be done).

The idea with this 'retrial' mechanism is to make it fault tolerant, so if for whatever reason hash could not be
obtained during connection validation, or the password has change, we can still connect later.

* Add documentation for Auth passthrough
2023-03-30 13:29:23 -07:00
Mostafa Abdelraouf
2cc6a09fba Add Manual host banning to PgCat (#340)
Sometimes we want an admin to be able to ban a host for some time to route traffic away from that host for reasons like partial outages, replication lag, and scheduled maintenance.

We can achieve this today using a configuration update but a quicker approach is to send a control command to PgCat that bans the replica for some specified duration.

This command does not change the current banning rules like

Primaries cannot be banned
When all replicas are banned, all replicas are unbanned
2023-03-06 06:10:59 -06:00
zainkabani
ca8901910c Removes message cloning operation required for query router (#285)
* Removes message cloning operation required for query router

* fmt

* flakey?

* ?
2023-01-19 07:19:49 -08:00
zainkabani
c62b86f4e6 Adds details to errors and fixes error propagation bug (#239) 2022-11-17 09:24:39 -08:00
Lev Kokotov
9d84d6f131 Graceful shutdown and refactor (#144)
* Graceful shutdown and refactor

* ok

* _Graceful_ shutdown

* Remove hardcoded setting

* clean up

* end

* timeout

* hmm

* hmm!

* bash

* bash

* hmm

* maybe maybe

* Adds tests and move non-admin connection rejection to startup (#145)

* Move error response

* Adds tests and removes unused variable

* Adds debug log

Co-authored-by: zainkabani <77307340+zainkabani@users.noreply.github.com>
2022-08-25 06:40:56 -07:00
Lev Kokotov
3285006440 Statement timeout + replica imbalance fix (#122)
* Statement timeout

* send error message too

* Correct error messages

* Fix replica inbalance

* disable stmt timeout by default

* Redundant mark_bad

* revert healthcheck delay

* tests

* set it to 0

* reload config again
2022-08-13 13:45:58 -07:00
Lev Kokotov
b974aacd71 check 2022-06-27 09:46:33 -07:00
Lev Kokotov
d3310a62c2 Client md5 auth and clean up scram (#77)
* client md5 auth and clean up scram

* add pw

* add user

* add user

* log
2022-06-20 06:15:54 -07:00
Lev Kokotov
df85139281 Update README. Comments. Version bump. (#60)
* update readme

* comments

* just a version bump
2022-03-10 01:33:29 -08:00
Lev Kokotov
a9b2a41a9b fixes to the banlist 2022-02-09 21:19:14 -08:00
Lev Kokotov
edf6a69ca4 warnings 2022-02-08 17:08:17 -08:00
Lev Kokotov
c27a7d30dc config support; started more sharding 2022-02-08 09:25:59 -08:00
Lev Kokotov
83daaf92d1 Less dirty servers & fix python 2022-02-03 18:02:50 -08:00
Lev Kokotov
2114cb2a97 healthchecks! 2022-02-03 17:32:04 -08:00
Lev Kokotov
e3ec5036d7 connection pool 2022-02-03 16:25:05 -08:00
Lev Kokotov
f4d647ce2f it works! 2022-02-03 15:17:04 -08:00
Lev Kokotov
c0b747ba34 working startup 2022-02-03 13:35:40 -08:00