This commit adds the TCP_NODELAY option to the socket configuration in
`configure_socket` function. Without this option, we observed significant
performance issues when executing SELECT queries with large responses.
Before the fix:
postgres=> SELECT repeat('a', 1); SELECT repeat('a', 8153);
Time: 1.368 ms
Time: 41.364 ms
After the fix:
postgres=> SELECT repeat('a', 1); SELECT repeat('a', 8153);
Time: 1.332 ms
Time: 1.528 ms
By setting TCP_NODELAY, we eliminate the Nagle's algorithm delay, which
results in a substantial improvement in response times for large queries.
This problem was discussed in https://github.com/postgresml/pgcat/issues/616.
The pool maxwait metric currently operates differently from Pgbouncer.
The way it operates today is that we keep track of max_wait on each connected client, when SHOW POOLS query is made, we go over the connected clients and we get the max of max_wait times among clients. This means the pool maxwait will never reset, it will always be monotonically increasing until the client with the highest maxwait disconnects.
This PR changes this behavior, by keeping track of the wait_start time on each client, when a client goes into WAITING state, we record the time offset from connect_time. When we either successfully or unsuccessfully checkout a connection from the pool, we reset the wait_start time.
When SHOW POOLS query is made, we go over all connected clients and we only consider clients whose wait_start is non-zero, for clients that have non-zero wait times, we compare them and report the maximum waiting time as maxwait for the pool.
* Revert "Reset wait times when checked out successfully (#656)"
This reverts commit ec3920d60f.
* Revert "Not sure how this sneaked past CI"
This reverts commit 4c5498b915.
* Revert "only report wait times from clients currently waiting to match behavior of pgbouncer (#655)"
This reverts commit 0e8064b049.
* Fix prepared statement not found when prepared stmt has error
* cleanup debug
* remove more debug msgs
* sure debugged this..
* version bump
* add rust tests
* Expose clients maxwait time in SHOW CLIENTS response via PgCat admin
Displays the maxwait via maxwait_seconds and maxwait_us columns for each client that can be used to track down the wait time per client in a case where the overall pool stats shows waiting time. The maxwait_us, similar to the pool stats setup, is configured to display as a remainder alongside the maxwait_seconds.
* Use maxwait instead of maxwait_seconds to match pools column name
---------
Co-authored-by: Calvin Hughes <9379992+calvinhughes@users.noreply.github.com>
* Add golang test suite to reproduce issue with unnamed parameterized prepared statements
* Allow caching of unnamed prepared statements
* Passthrough describe on portals
* Remove unneeded kill
* Update Dockerfile.ci with golang
* Move out update of Dockerfiles to separate PR
* Initial commit
* Cleanup and add stats
* Use an arc instead of full clones to store the parse packets
* Use mutex instead
* fmt
* clippy
* fmt
* fix?
* fix?
* fmt
* typo
* Update docs
* Refactor custom protocol
* fmt
* move custom protocol handling to before parsing
* Support describe
* Add LRU for server side statement cache
* rename variable
* Refactoring
* Move docs
* Fix test
* fix
* Update tests
* trigger build
* Add more tests
* Reorder handling sync
* Support when a named describe is sent along with Parse (go pgx) and expecting results
* don't talk to client if not needed when client sends Parse
* fmt :(
* refactor tests
* nit
* Reduce hashing
* Reducing work done to decode describe and parse messages
* minor refactor
* Merge branch 'main' into zain/reimplment-prepared-statements-with-global-lru-cache
* Rewrite extended and prepared protocol message handling to better support mocking response packets and close
* An attempt to better handle if there are DDL changes that might break cached plans with ideas about how to further improve it
* fix
* Minor stats fixed and cleanup
* Cosmetic fixes (#64)
* Cosmetic fixes
* fix test
* Change server drop for statement cache error to a `deallocate all`
* Updated comments and added new idea for handling DDL changes impacting cached plans
* fix test?
* Revert test change
* trigger build, flakey test
* Avoid potential race conditions by changing get_or_insert to promote for pool LRU
* remove ps enabled variable on the server in favor of using an option
* Add close to the Extended Protocol buffer
---------
Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>
* Improved logging
* Improved logging for more `Address` usages
* Fixed lint issues.
* Reverted the `Address` logging changes.
* Applied the PR comment by @levkk.
* Applied the PR comment by @levkk.
* Applied the PR comment by @levkk.
* Applied the PR comment by @levkk.
The TL;DR for the change is that we allow QueryRouter to set the active shard to None. This signals to the Pool::get method that we have no shard selected. The get method follows a no_shard_specified_behavior config to know how to route the query.
Original PR description
Ruby-pg library makes a startup query to SET client_encoding to ... if Encoding.default_internal value is set (Code). This query is troublesome because we cannot possibly attach a routing comment to it. PgCat, by default, will route that query to the default shard.
Everything is fine until shard 0 has issues, Clients will all be attempting to send this query to shard0 which increases the connection latency significantly for all clients, even those not interested in shard0
This PR introduces no_shard_specified_behavior that defines the behavior in case we have routing-by-comment enabled but we get a query without a comment. The allowed behaviors are
random: Picks a shard at random
random_healthy: Picks a shard at random favoring shards with the least number of recent connection/checkout errors
shard_<number>: e.g. shard_0, shard_4, etc. picks a specific shard, everytime
In order to achieve this, this PR introduces an error_count on the Address Object that tracks the number of errors since the last checkout and uses that metric to sort shards by error count before making a routing decision.
I didn't want to use address stats to avoid introducing a routing dependency on internal stats (We might do that in the future but I prefer to avoid this for the time being.
I also made changes to the test environment to replace Ruby's TOML reader library, It appears to be abandoned and does not support mixed arrays (which we use in the config toml), and it also does not play nicely with single-quoted regular expressions. I opted for using yj which is a CLI tool that can convert from toml to JSON and back. So I refactor the tests to use that library.
* User server parameters struct instead of server info bytesmut
* Refactor to use hashmap for all params and add server parameters to client
* Sync parameters on client server checkout
* minor refactor
* update client side parameters when changed
* Move the SET statement logic from the C packet to the S packet.
* trigger build
* revert validation changes
* remove comment
* Try fix
* Reset cleanup state after sync
* fix server version test
* Track application name through client life for stats
* Add tests
* minor refactoring
* fmt
* fix
* fmt
* Make infer role configurable and fix double parse bug
* Fix tests
* Enable infer_role_from query in toml for tests
* Fix test
* Add max length config, add logging for which application is failing to parse, and change config name
* fmt
* Update src/config.rs
---------
Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>
* Move connection checkin log messages to their own target
Under heavy load they can happen thousands of times per second, and
should generally be considered a nuisance at best. This marks the state
discard as an info rather than a warning, and moves all the messages
into their own log-target, so they can be filtered separately from the
more relevant warnings.
Signed-off-by: D.S. Ljungmark <spider@skuggor.se>
* Remove left-over env_logger dependencies
When moving to tracing-subscriber for logging, the env_logger
dependencies were left around, this cuts them out as dead code.
Signed-off-by: D.S. Ljungmark <spider@skuggor.se>
* Restore ability to filter log messages at runtime
This restores the RUST_LOG filters from env_logger but now with the
tracing subscriber setup. The filters are chained so commandline options
mark the default in case either option is set, which should be the path
of least confusion for users. ( RUST_LOG setting level to debug, and
commandline to warning is an odd user case, and I don't know what a user
who does that is expecting. )
It also bumps the version number as a fix to see which versions have
which behaviour.
Signed-off-by: D.S. Ljungmark <spider@skuggor.se>
---------
Signed-off-by: D.S. Ljungmark <spider@skuggor.se>
add --no-color option to disable colors
this commit adds a new option to disable colors in the terminal and also
moves the logger configuration to a different crate.
Signed-off-by: Sebastian Webber <sebastian@swebber.me>
This commit adds the clap library and configures the necessary args to
parse from the command line, expanding the current option of a single
file and adding support for environment variables.
Signed-off-by: Sebastian Webber <sebastian@swebber.me>
This commit adds a new function to handle notify and use it
in the SHOW HELP command, which displays the available options
in the admin console.
Also, adding Fabrízio as a co-author for all the help with the
protocol and the help to structure this PR.
Signed-off-by: Sebastian Webber <sebastian@swebber.me>
Co-authored-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>