pgcat

mirror of https://github.com/postgresml/pgcat.git synced 2026-03-23 17:36:28 +00:00

Author	SHA1	Message	Date
Tommy Li	9937193332	Allow pause/resuming all pools (#566 ) support pausing all pools	2023-08-29 10:07:36 -07:00
Mostafa Abdelraouf	a8a30ad43b	Refactor Pool Stats to be based off of Server/Client stats (#445 ) What is wrong: Stats reported by SHOW POOLS seem to be leaking. We see lingering cl_idle, cl_waiting, and similarly for sv_idle, sv_active. We confirmed that these are reporting issues, not actual lingering clients. This behavior is readily reproducible by running while true; do psql "postgres://sharding_user:sharding_user@localhost:6432/sharded_db" -c "SELECT 1" > /dev/null 2>&1 & done Why it happens: I wasn't able to figure out the reason for the bug, but my best guess is that we have race conditions when updating pool-level stats. So even though individual update operations are atomic, we perform a check-then-update sequence which is not protected by a guard. https://github.com/postgresml/pgcat/blob/main/src/stats/pool.rs#L174-L179 I also suspect that using Relaxed ordering might allow this behavior (I changed all operations to use Ordering::SeqCst but still got lingering clients). How to fix: Since SHOW POOLS/SHOW SERVERS/SHOW CLIENTS all show the current state of the proxy (as opposed to SHOW STATS, which shows aggregate values), this PR refactors SHOW POOLS to construct its results directly from the SHOW SERVERS and SHOW CLIENTS datasets. This reduces the complexity of stat updates and eliminates the need for locks when updating pool stats, as we only care about updating individual client/server states. This will change the semantics of maxwait: instead of holding the maximum wait time ever encountered by any client (connected or disconnected), it will only consider connected clients, which should be okay given that PgCat tends to hold on to client connections more than Pgbouncer.	2023-05-23 08:44:49 -05:00
Zain Kabani	7f57a89d75	Fix time based average stats (#442 ) * keep track of current stats and zero them after updating averages * Try tests * typo * remove commented test stuff * Avoid dividing by zero * Fix test * refactor, get rid of iterator. do it manually * trigger build * Fix	2023-05-17 21:38:10 -07:00
Zain Kabani	73260690b0	Fixes average stats bug (#436 ) * Add test * Fix test * Add fix	2023-05-11 17:37:58 -07:00
Lev Kokotov	811885f464	Actually plugins (#421 ) * more plugins * clean up * fix tests * fix flakey test	2023-05-03 16:13:45 -07:00
Kian-Meng Ang	d568739db9	Fix typos (#398 ) Found via `typos --format brief`	2023-04-10 18:37:16 -07:00
Jose Fernández	58ce76d9b9	Refactor stats to use atomics (#375 ) * Refactor stats to use atomics When we are dealing with a high number of connections, generated stats cannot be consumed fast enough by the stats collector loop. This makes the stats subsystem inconsistent, and a lot of warning messages are thrown due to unregistered servers/clients. This change refactors the stats subsystem so it uses atomics: - Now counters are handled using U64 atomics - Event system is dropped and averages are calculated using a loop every 15 seconds. - Now, instead of snapshots being generated every second, we keep track of servers/clients that have registered. Each pool/server/client has its own instance of the counter and makes changes directly, instead of adding an event that gets processed later. * Manually implement Hash/Eq in `config::Address` ignoring stats * Add tests for client connection counters * Allow connecting to dockerized dev pgcat from the host * stats: Decrease cl_idle when idle socket disconnects	2023-03-28 17:19:37 +02:00
Mostafa Abdelraouf	2cc6a09fba	Add Manual host banning to PgCat (#340 ) Sometimes we want an admin to be able to ban a host for some time to route traffic away from that host for reasons like partial outages, replication lag, and scheduled maintenance. We can achieve this today using a configuration update, but a quicker approach is to send a control command to PgCat that bans the replica for some specified duration. This command does not change the current banning rules, such as: - Primaries cannot be banned - When all replicas are banned, all replicas are unbanned	2023-03-06 06:10:59 -06:00
Nicholas Dujay	37e1c5297a	implement show users (#329 ) * implement show users * fix compile errors * add basic ruby test * gitignore things	2023-02-21 13:08:43 -08:00
Mostafa Abdelraouf	f9134807d7	More Test coverage + fix some code coverage bugs (#321 ) Connection to the CI databases is viewed by Postgres as coming from localhost. The pg_hba.conf file generated by the docker image uses trust for these connections, which is why we had no test coverage on SASL and md5 branches. This PR fixes this issue. There was also an issue with under-reporting code coverage. This should be fixed now.	2023-02-16 23:09:22 -06:00
Mostafa Abdelraouf	3d33ccf4b0	Fix maxwait metric (#183 ) Max wait was being reported as 0 after #159. This PR fixes that and adds a test.	2022-10-05 21:41:09 -05:00
Mostafa Abdelraouf	af064ef447	Set client state to idle after error (#179 ) * Set client state to idle after error * fmt * spelling * clean up	2022-09-24 09:09:15 -07:00
Mostafa Abdelraouf	f7a951745c	Report Query times (#166 ) * Report avg and total query timing * Report query times * fmt	2022-09-15 02:21:45 -04:00
Mostafa Abdelraouf	4ae1bc8d32	Add SHOW CLIENTS / SHOW SERVERS + Stats refactor and tests (#159 ) * wip * Main Thread Panic when swarmed with clients * fix * fix * 1024 * fix * remove test * Add SHOW CLIENTS * revert * fmt * Refactor + tests * fmt * add test * Add SHOW SERVERS + Make PR unreviewable * prometheus * add state to clients and servers * fmt * Add application_name to server stats * Add tests for waiting clients * Docs * remove comment * comments * typo * cleanup * CI	2022-09-14 11:20:41 -04:00

14 Commits
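
The message for a8a30ad43b guesses that the lingering pool stats came from a check-then-update sequence on atomic counters that was not protected by a guard. The sketch below illustrates that general hazard in Rust. It is not pgcat code: `racy_decrement`, `guarded_decrement`, and `run_guarded` are hypothetical names invented for this example, and the commit itself did not apply this technique (it rebuilt SHOW POOLS from per-client and per-server state instead). The commit author also presents the race only as a best guess.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Check-then-update (hypothetical, for illustration only).
/// The load and the subtraction are two separate atomic operations, so
/// another thread can change the counter between them. Under contention
/// the counter can be decremented past zero and wrap around.
#[allow(dead_code)]
fn racy_decrement(counter: &AtomicU64) {
    if counter.load(Ordering::SeqCst) > 0 {
        counter.fetch_sub(1, Ordering::SeqCst);
    }
}

/// The check and the update happen as one atomic step via `fetch_update`,
/// which retries a compare-exchange until it succeeds or the closure
/// returns `None`. Returns true if the counter was decremented.
fn guarded_decrement(counter: &AtomicU64) -> bool {
    counter
        .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |v| v.checked_sub(1))
        .is_ok()
}

/// Hypothetical driver: `threads` workers each attempt `attempts`
/// guarded decrements on a counter that starts at `initial`.
/// Returns (final counter value, number of successful decrements).
fn run_guarded(initial: u64, threads: usize, attempts: usize) -> (u64, u64) {
    let counter = Arc::new(AtomicU64::new(initial));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                (0..attempts)
                    .filter(|_| guarded_decrement(&counter))
                    .count() as u64
            })
        })
        .collect();
    let successes: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    (counter.load(Ordering::SeqCst), successes)
}
```

With the guarded version, calling `run_guarded(100, 8, 50)` always ends with the counter at zero and exactly 100 successful decrements, however the threads interleave. The racy version gives no such guarantee once several threads share the counter, and a stronger memory ordering does not help, because the gap lies between two operations rather than inside either one. That matches the commit's observation that switching to `Ordering::SeqCst` did not remove the lingering clients.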