pgcat

mirror of https://github.com/postgresml/pgcat.git synced 2026-03-23 01:16:30 +00:00

Author	SHA1	Message	Date
Zain Kabani	f94ce97ebc	Handle and track startup parameters (#478) * Use server parameters struct instead of server info bytesmut * Refactor to use hashmap for all params and add server parameters to client * Sync parameters on client server checkout * minor refactor * update client side parameters when changed * Move the SET statement logic from the C packet to the S packet. * trigger build * revert validation changes * remove comment * Try fix * Reset cleanup state after sync * fix server version test * Track application name through client life for stats * Add tests * minor refactoring * fmt * fix * fmt	2023-08-10 08:18:46 -07:00
Zain Kabani	e14b283f0c	Make infer role configurable and fix double parse bug (#533 ) * Make infer role configurable and fix double parse bug * Fix tests * Enable infer_role_from query in toml for tests * Fix test * Add max length config, add logging for which application is failing to parse, and change config name * fmt * Update src/config.rs --------- Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>	2023-08-08 13:10:03 -07:00
Mostafa Abdelraouf	2a8f3653a6	Fix COPY FROM and add tests (#522 ) * Fix COPY FROM and add tests * E * fmt	2023-07-20 23:06:01 -07:00
Lev Kokotov	4b78af9676	Implement Close for prepared statements (#482 ) * Partial support for Close * Close * respect config value * prepared spec * Hmm * Print cache size	2023-06-18 23:02:34 -07:00
Lev Kokotov	c7d6273037	Support for prepared statements (#474 ) * Start prepared statements * parse * Ok * optional * dont rewrite anonymous prepared stmts * Dont rewrite anonymous prep statements * hm? * prep statements * I see! * comment * Print config value * Rewrite bind and add sqlx test * fmt * ok * Fix * Fix stats * its late * clean up PREPARE	2023-06-16 12:57:44 -07:00
Mostafa Abdelraouf	a8a30ad43b	Refactor Pool Stats to be based off of Server/Client stats (#445) What is wrong: Stats reported by SHOW POOLS seem to be leaking. We see lingering cl_idle and cl_waiting, and similarly sv_idle and sv_active. We confirmed that these are reporting issues, not actual lingering clients. This behavior is readily reproducible by running while true; do psql "postgres://sharding_user:sharding_user@localhost:6432/sharded_db" -c "SELECT 1" > /dev/null 2>&1 & done Why it happens: I wasn't able to figure out the reason for the bug, but my best guess is that we have race conditions when updating pool-level stats. Even though individual update operations are atomic, we perform a check-then-update sequence which is not protected by a guard. https://github.com/postgresml/pgcat/blob/main/src/stats/pool.rs#L174-L179 I also suspect that using Relaxed ordering might allow this behavior (I changed all operations to use Ordering::SeqCst but still got lingering clients). How to fix: Since SHOW POOLS/SHOW SERVERS/SHOW CLIENTS all show the current state of the proxy (as opposed to SHOW STATS, which shows aggregate values), this PR refactors SHOW POOLS to construct its results directly from the SHOW SERVERS and SHOW CLIENTS datasets. This reduces the complexity of stat updates and eliminates the need for locks when updating pool stats, as we only care about updating individual client/server states. This will change the semantics of maxwait: instead of holding the maximum wait time ever encountered by a client (connected or disconnected), it will only consider connected clients, which should be okay given PgCat tends to hold on to client connections more than Pgbouncer.	2023-05-23 08:44:49 -05:00
Lev Kokotov	100778670c	Ensure data makes it to the client (#446 ) * Ensure data makes it to the client * flush all buffers	2023-05-18 16:41:22 -07:00
Lev Kokotov	09e54e1175	Plugins! (#420 ) * Some queries * Plugins!! * cleanup * actual names * the actual plugins * comment * fix tests * Tests * unused errors * Increase reaper rate to actually enforce settings * ok	2023-05-03 09:13:05 -07:00
Lev Kokotov	0d504032b2	Server TLS (#417 ) * Server TLS * Finish up TLS * thats it * diff * remove dead code * maybe? * dirty shutdown * skip flakey test * remove unused error * fetch config once	2023-04-30 09:41:46 -07:00
Lev Kokotov	3dae3d0777	Separate server and client passwords optionally (#407 ) * Separate server and user passwords * config	2023-04-18 09:57:17 -07:00
Lev Kokotov	692353c839	A couple things (#397 ) * Format cleanup * fmt * finally	2023-04-10 14:51:01 -07:00
Jose Fernández	6f768a84ce	Auth passthrough (auth_query) (#266) * Add a new exec_simple_query method This adds a new `exec_simple_query` method so we can make 'out of band' queries to servers that don't interfere with pools at all. In order to reuse startup code for making these simple queries, we need to make the stats (`Reporter`) optional, so using these simple queries won't interfere with stats. * Add auth passthrough (auth_query) Adds a feature that allows setting auth passthrough for md5 auth. It adds 3 new (general and pool) config parameters: - `auth_query`: A string containing a query that will be executed on boot to obtain the hash of a given user. This query has to use a placeholder `$1`, so pgcat can replace it with the user it's trying to fetch the hash for. - `auth_query_user`: The user to use for connecting to the server and executing the auth_query. - `auth_query_password`: The password to use for connecting to the server and executing the auth_query. The configuration can be done either in the general config (so pools share it) or on a per-pool basis. The behavior is: at boot time, when validating server connections, a hash is fetched per server and stored in the pool. When new server connections are created and no cleartext password is specified, the obtained hash is used for creating them; if the hash could not be obtained for whatever reason, the fetch is retried. When client authentication is attempted, it uses cleartext passwords if specified; if not, it checks whether auth_query is set up, and if so, it tries to use the obtained hash for client auth. If there is no hash (we could not obtain one when validating the connection), a new fetch is tried. Once we have a hash, we authenticate using it against whatever the client has sent us; if there is a failure, we refetch the hash and retry auth (so password changes can be done). The idea with this 'retrial' mechanism is to make it fault tolerant, so if for whatever reason the hash could not be obtained during connection validation, or the password has changed, we can still connect later. * Add documentation for Auth passthrough	2023-03-30 13:29:23 -07:00
Jose Fernández	58ce76d9b9	Refactor stats to use atomics (#375) * Refactor stats to use atomics When we are dealing with a high number of connections, generated stats cannot be consumed fast enough by the stats collector loop. This makes the stats subsystem inconsistent, and a lot of warning messages are thrown due to unregistered servers/clients. This change refactors the stats subsystem so it uses atomics: - Counters are now handled using U64 atomics - The event system is dropped and averages are calculated using a loop every 15 seconds. - Instead of snapshots being generated every second, we keep track of servers/clients that have registered. Each pool/server/client has its own instance of the counter and makes changes directly, instead of adding an event that gets processed later. * Manually implement Hash/Eq in `config::Address` ignoring stats * Add tests for client connection counters * Allow connecting to dockerized dev pgcat from the host * stats: Decrease cl_idle when idle socket disconnects	2023-03-28 17:19:37 +02:00
Zain Kabani	ca4431b67e	Add idle client in transaction configuration (#380 ) * Add idle client in transaction configuration * fmt * Update docs * trigger build * Add tests * Make the config dynamic from reloads * fmt * comments * trigger build * fix config.md * remove error	2023-03-24 08:20:30 -07:00
Lev Kokotov	b4baa86e8a	Extended query protocol sharding (#339 ) * Prepared stmt sharding s tests * len check * remove python test * latest rust * move that to debug for sure * Add the actual tests * latest image * Update tests/ruby/sharding_spec.rb	2023-03-10 07:55:22 -08:00
Mostafa Abdelraouf	2cc6a09fba	Add Manual host banning to PgCat (#340) Sometimes we want an admin to be able to ban a host for some time to route traffic away from that host for reasons like partial outages, replication lag, and scheduled maintenance. We can achieve this today using a configuration update, but a quicker approach is to send a control command to PgCat that bans the replica for some specified duration. This command does not change the current banning rules, such as: - Primaries cannot be banned - When all replicas are banned, all replicas are unbanned	2023-03-06 06:10:59 -06:00
zainkabani	d81a744154	Fix logging mistakes (#313 ) Mistakenly logging username as poolname and poolname as username	2023-02-07 14:16:28 -06:00
Lev Kokotov	24e79dcf05	Startup improvements & PAUSE/RESUME (#300 ) * Dont require servers to be online to start pooler * PAUSE/RESUME * fix * Refresh pool * Fixes * lint	2023-01-28 15:36:35 -08:00
zainkabani	ca8901910c	Removes message cloning operation required for query router (#285 ) * Removes message cloning operation required for query router * fmt * flakey? * ?	2023-01-19 07:19:49 -08:00
zainkabani	85ac3ef9a5	Buffer client CopyData messages (#284 ) Buffers CopyData messages and removes buffer clone for the sync message	2023-01-17 17:39:55 -08:00
zainkabani	c62b86f4e6	Adds details to errors and fixes error propagation bug (#239 )	2022-11-17 09:24:39 -08:00
zainkabani	fcd2cae4e1	Move get_config in startup to admin branch to scope down usage (#238 )	2022-11-17 09:22:12 -08:00
zainkabani	fe0b012832	Adds configuration for logging connections and removes get_config from entrypoint (#236 ) * Adds configuration for logging connections and removes get_config from entrypoint * typo * rename connection config var and add to toml files * update config log * fmt	2022-11-16 22:15:47 -08:00
zainkabani	b7e70b885c	Default to using username when database isn't present on startup (#234 )	2022-11-16 18:49:04 -08:00
Cluas	dfa26ec6f8	chore: make clippy lint happy (#225 ) * chore: make clippy happy * chore: cargo fmt * chore: cargo fmt	2022-11-09 10:04:31 -08:00
Lev Kokotov	0524787d31	Automatic sharding: part one of many (#194 ) Starting automatic sharding	2022-10-25 11:47:41 -07:00
zainkabani	19f635881a	Don't send discard all when state is changed in transaction (#186 ) * Don't send discard all when state is changed in transaction * Remove unnecessary clone * spelling * Move transaction check to SET command * Add test for set command in transaction * type * Update comments * Update comments * use moves instead of clones for initial message * don't make message mutable * Update unwrap * but i'm not a wrapper * Add set local test * change continue	2022-10-13 19:33:12 -07:00
Lev Kokotov	7987c5ffad	Replace a few types with more developer-friendly names (#182 ) * Replace a few types with more developer-friendly names * UserPool -> PoolIdentifier	2022-10-01 10:25:59 -07:00
Mostafa Abdelraouf	d126c7424d	Log failed client logins (#173 ) * Log failed client logins * more logging * remove clones * remove	2022-09-23 09:08:38 -07:00
zainkabani	f72dac420b	Add defaults for configs (#174 ) * add statement timeout to readme * Add defaults to various configs * primary read enabled default to false	2022-09-22 23:00:46 -07:00
Mostafa Abdelraouf	f7a951745c	Report Query times (#166 ) * Report avg and total query timing * Report query times * fmt	2022-09-15 02:21:45 -04:00
Mostafa Abdelraouf	4ae1bc8d32	Add SHOW CLIENTS / SHOW SERVERS + Stats refactor and tests (#159 ) * wip * Main Thread Panic when swarmed with clients * fix * fix * 1024 * fix * remove test * Add SHOW CLIENTS * revert * fmt * Refactor + tests * fmt * add test * Add SHOW SERVERS + Make PR unreviewable * prometheus * add state to clients and servers * fmt * Add application_name to server stats * Add tests for waiting clients * Docs * remove comment * comments * typo * cleanup * CI	2022-09-14 11:20:41 -04:00
Mostafa Abdelraouf	9514b3b2d1	Clean connection state up after protocol named prepared statement (#163 ) * Clean connection state up after protocol named prepared statement * Avoid cloning + add test * fmt	2022-09-07 20:37:17 -07:00
Mostafa Abdelraouf	744ceada86	Better logging for failure to get connection from pool (#161 )	2022-09-07 08:24:07 -07:00
Mostafa Abdelraouf	a5c8dd69b2	Avoid reporting ProtocolSyncError when admin session disconnects (#160 ) * Avoid reporting ProtocolSyncError when admin session disconnects * rebuild * rebuild	2022-09-06 22:22:31 -07:00
Mostafa Abdelraouf	976b406468	Main Thread Panic when swarmed with clients (#158 ) * Main Thread Panic when swarmed with clients * fix * fix * 1024 * fix * remove test * Update src/client.rs * Update src/main.rs * Update src/client.rs * Update src/main.rs Co-authored-by: Lev Kokotov <levkk@users.noreply.github.com>	2022-09-05 01:21:06 -07:00
zain-kabani	417358c35d	Patch graceful shutdown bug (#157 ) * Fixes non-admin client counting error * Add log when sigterm received and log number of active clients when shutdown timeout is reached	2022-09-05 01:02:49 -07:00
Mostafa Abdelraouf	23a642f4a4	Send DISCARD ALL even if client is not in transaction (#152 ) * Send DISCARD ALL even if client is not in transaction * fmt * Added tests + avoided sending extra discard all * Adds set name logic to beginning of handle client * fmt * refactor dead code handling * Refactor reading command tag * remove unnecessary trim * Removing debugging statement * typo * typo{ * documentation * edit text * un-unwrap * run ci * run ci Co-authored-by: Zain Kabani <zain.kabani@instacart.com>	2022-09-01 20:06:55 -07:00
Mostafa Abdelraouf	7f20dc3054	Better handling extended protocol messages in the event of busy pool (#155 ) * Better handling for checkout errors during extended protocol messages * Fix specs * comment	2022-09-01 15:02:39 -07:00
Lev Kokotov	9d84d6f131	Graceful shutdown and refactor (#144 ) * Graceful shutdown and refactor * ok * _Graceful_ shutdown * Remove hardcoded setting * clean up * end * timeout * hmm * hmm! * bash * bash * hmm * maybe maybe * Adds tests and move non-admin connection rejection to startup (#145) * Move error response * Adds tests and removes unused variable * Adds debug log Co-authored-by: zainkabani <77307340+zainkabani@users.noreply.github.com>	2022-08-25 06:40:56 -07:00
Mostafa Abdelraouf	c054ff068d	Avoid sending `Z` packet in the middle of extended protocol packet sequence if we fail to get connection from pool (#137 ) * Failing test * maybe * try fail * try * add message * pool size * correct user * more * debug * try fix * see stdout * stick? * fix configs * modify * types * m * maybe * make tests idempotent * hopefully fails * Add client fix * revert pgcat.toml change * Fix tests	2022-08-23 11:02:23 -07:00
Lev Kokotov	5a0cea6a24	Really fix idle servers (#141 )	2022-08-22 11:56:40 -07:00
Lev Kokotov	d0e8171b1b	Fix too many idle servers (#140 ) * Fix too many idle servers * oops	2022-08-22 11:52:34 -07:00
Mostafa Abdelraouf	5f5b5e2543	Random instance selection (#136 ) * wip * revert some' * revert more * poor-man's integration test * remove test * fmt * --workspace * fix build * fix integration test * another stab * log * run after integration * cargo test after integration * revert * revert more * Refactor + clean up * more clean up	2022-08-21 22:15:20 -07:00
zainkabani	5948fef6cf	Minor Refactoring of re-used code and server stat reporting (#129) * Minor changes to stats reporting and reduce re-used code * fmt	2022-08-18 05:12:38 -07:00
Lev Kokotov	3285006440	Statement timeout + replica imbalance fix (#122) * Statement timeout * send error message too * Correct error messages * Fix replica imbalance * disable stmt timeout by default * Redundant mark_bad * revert healthcheck delay * tests * set it to 0 * reload config again	2022-08-13 13:45:58 -07:00
zainkabani	f963b12821	Health check delay (#118 ) * initial commit of server check delay implementation * fmt * spelling * Update name to last_healthcheck and some comments * Moved server tested stat to after require_healthcheck check * Make health check delay configurable * Rename to last_activity * Fix typo * Add debug log for healthcheck * Add address to debug log	2022-08-11 14:42:40 -07:00
Mostafa Abdelraouf	7592339092	Prevent clients from sticking to old pools after config update (#113 ) * Re-acquire pool at the beginning of Protocol loop * Fix query router + add tests for recycling behavior	2022-08-09 12:18:27 -07:00
zainkabani	3719c22322	Implementing graceful shutdown (#105 ) * Initial commit for graceful shutdown * fmt * Add .vscode to gitignore * Updates shutdown logic to use channels * fmt * fmt * Adds shutdown timeout * Fmt and updates tomls * Updates readme * fmt and updates log levels * Update python tests to test shutdown * merge changes * Rename listener rx and update bash to be in line with master * Update python test bash script ordering * Adds error response message before shutdown * Add details on shutdown event loop * Fixes response length for error * Adds handler for sigterm * Uses ready for query function and fixes number of bytes * fmt	2022-08-08 16:01:24 -07:00
Mostafa Abdelraouf	1b648ca00e	Send proper server parameters to clients using admin db (#103 ) * Send proper server parameters to clients using admin db * clean up * fix python test * build * Add python * missing & * debug ls * fix tests * fix tests * fix * Fix warning * Address comments	2022-07-31 19:52:23 -07:00
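
Note on a8a30ad43b (#445): the commit message describes building SHOW POOLS results directly from the per-client and per-server datasets instead of maintaining separate pool-level counters. The following is only a minimal sketch of that idea, not the actual pgcat code. The names `ClientState`, `ClientRow`, `PoolRow`, and `pool_rows` are hypothetical, as are the `cl_active` column and the microsecond unit for wait times; only `cl_idle`, `cl_waiting`, and `maxwait` appear in the commit message.

```rust
use std::collections::BTreeMap;

// Hypothetical client states; the real pgcat types may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ClientState {
    Idle,
    Waiting,
    Active,
}

// One row of a hypothetical SHOW CLIENTS-style dataset.
#[derive(Clone, Debug)]
struct ClientRow {
    pool: String,
    state: ClientState,
    // How long this connected client has been waiting (assumed unit: microseconds).
    wait_us: u64,
}

// Per-pool totals derived from client rows.
#[derive(Debug, Default, PartialEq, Eq)]
struct PoolRow {
    cl_idle: u64,
    cl_waiting: u64,
    cl_active: u64,
    maxwait_us: u64,
}

// Build pool-level rows by scanning the current client rows.
// No pool-level counter is ever incremented or decremented, so there is
// no check-then-update sequence that could race between tasks.
fn pool_rows(clients: &[ClientRow]) -> BTreeMap<String, PoolRow> {
    let mut pools: BTreeMap<String, PoolRow> = BTreeMap::new();
    for client in clients {
        let row = pools.entry(client.pool.clone()).or_default();
        match client.state {
            ClientState::Idle => row.cl_idle += 1,
            ClientState::Waiting => {
                row.cl_waiting += 1;
                // Only currently connected, waiting clients contribute to maxwait.
                row.maxwait_us = row.maxwait_us.max(client.wait_us);
            }
            ClientState::Active => row.cl_active += 1,
        }
    }
    pools
}
```

Because the totals are recomputed from whatever clients are connected at query time, a disconnected client simply drops out of the result, which matches the changed `maxwait` semantics the commit message mentions.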


111 Commits