Flush stats (#38)

* flush stats

* stats

* refactor
Lev Kokotov
2022-02-22 18:10:30 -08:00
committed by GitHub
parent 3f16123cc5
commit af1716bcd7
3 changed files with 128 additions and 109 deletions

View File

@@ -11,14 +11,14 @@ Meow. PgBouncer rewritten in Rust, with sharding, load balancing and failover su
 ## Features

 | **Feature**                    | **Status**            | **Comments**                                                                                               |
 |--------------------------------|-----------------------|------------------------------------------------------------------------------------------------------------|
-| Transaction pooling            | :heavy_check_mark:    | Identical to PgBouncer.                                                                                    |
+| Transaction pooling            | :white_check_mark:    | Identical to PgBouncer.                                                                                    |
-| Session pooling                | :heavy_check_mark:    | Identical to PgBouncer.                                                                                    |
+| Session pooling                | :white_check_mark:    | Identical to PgBouncer.                                                                                    |
-| `COPY` support                 | :heavy_check_mark:    | Both `COPY TO` and `COPY FROM` are supported.                                                              |
+| `COPY` support                 | :white_check_mark:    | Both `COPY TO` and `COPY FROM` are supported.                                                              |
-| Query cancellation             | :heavy_check_mark:    | Supported both in transaction and session pooling modes.                                                   |
+| Query cancellation             | :white_check_mark:    | Supported both in transaction and session pooling modes.                                                   |
-| Load balancing of read queries | :heavy_check_mark:    | Using round-robin between replicas. Primary is included when `primary_reads_enabled` is enabled (default). |
+| Load balancing of read queries | :white_check_mark:    | Using round-robin between replicas. Primary is included when `primary_reads_enabled` is enabled (default). |
-| Sharding                       | :heavy_check_mark:    | Transactions are sharded using `SET SHARD TO` and `SET SHARDING KEY TO` syntax extensions; see examples below. |
+| Sharding                       | :white_check_mark:    | Transactions are sharded using `SET SHARD TO` and `SET SHARDING KEY TO` syntax extensions; see examples below. |
-| Failover                       | :heavy_check_mark:    | Replicas are tested with a health check. If a health check fails, remaining replicas are attempted; see below for algorithm description and examples. |
+| Failover                       | :white_check_mark:    | Replicas are tested with a health check. If a health check fails, remaining replicas are attempted; see below for algorithm description and examples. |
-| Statistics reporting           | :heavy_check_mark:    | Statistics similar to PgBouncers are reported via StatsD.                                                  |
+| Statistics reporting           | :white_check_mark:    | Statistics similar to PgBouncers are reported via StatsD.                                                  |
 | Live configuration reloading   | :construction_worker: | Reload config with a `SIGHUP` to the process, e.g. `kill -s SIGHUP $(pgrep pgcat)`. Not all settings can be reloaded without a restart. |
 | Client authentication          | :x: :wrench:          | On the roadmap; currently all clients are allowed to connect and one user is used to connect to Postgres.  |
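The round-robin read load balancing described in the table can be sketched as follows. This is an illustrative, dependency-free model only: `Replicas`, `next_read`, and the addresses are hypothetical names, not pgcat's actual internals.

```rust
// Hypothetical sketch of round-robin load balancing of read queries.
// Assumption: index 0 holds the primary, which only participates in the
// rotation when `primary_reads_enabled` is set (mirroring the README flag).
struct Replicas {
    addresses: Vec<String>,
    next: usize,
    primary_reads_enabled: bool,
}

impl Replicas {
    /// Pick the next server for a read query, round-robin.
    fn next_read(&mut self) -> &str {
        let start = if self.primary_reads_enabled { 0 } else { 1 };
        let len = self.addresses.len() - start;
        let idx = start + self.next % len;
        self.next = self.next.wrapping_add(1);
        &self.addresses[idx]
    }
}

fn main() {
    let mut replicas = Replicas {
        addresses: vec![
            "primary:5432".into(),
            "replica1:5432".into(),
            "replica2:5432".into(),
        ],
        next: 0,
        primary_reads_enabled: false,
    };
    // With primary reads disabled, only the replicas rotate.
    assert_eq!(replicas.next_read(), "replica1:5432");
    assert_eq!(replicas.next_read(), "replica2:5432");
    assert_eq!(replicas.next_read(), "replica1:5432");
}
```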
@@ -75,15 +75,15 @@ See [sharding README](./tests/sharding/README.md) for sharding logic testing.
 | **Feature**           | **Tested in CI**   | **Tested manually** | **Comments**                                                                             |
 |-----------------------|--------------------|---------------------|-------------------------------------------------------------------------------------------|
-| Transaction pooling   | :heavy_check_mark: | :heavy_check_mark:  | Used by default for all tests.                                                           |
+| Transaction pooling   | :white_check_mark: | :white_check_mark:  | Used by default for all tests.                                                           |
-| Session pooling       | :heavy_check_mark: | :heavy_check_mark:  | Tested by running pgbench with `--protocol prepared` which only works in session mode.   |
+| Session pooling       | :white_check_mark: | :white_check_mark:  | Tested by running pgbench with `--protocol prepared` which only works in session mode.   |
-| `COPY`                | :heavy_check_mark: | :heavy_check_mark:  | `pgbench -i` uses `COPY`. `COPY FROM` is tested as well.                                 |
+| `COPY`                | :white_check_mark: | :white_check_mark:  | `pgbench -i` uses `COPY`. `COPY FROM` is tested as well.                                 |
-| Query cancellation    | :heavy_check_mark: | :heavy_check_mark:  | `psql -c 'SELECT pg_sleep(1000);'` and press `Ctrl-C`.                                   |
+| Query cancellation    | :white_check_mark: | :white_check_mark:  | `psql -c 'SELECT pg_sleep(1000);'` and press `Ctrl-C`.                                   |
-| Load balancing        | :x:                | :heavy_check_mark:  | We could test this by emitting statistics for each replica and compare them.             |
+| Load balancing        | :x:                | :white_check_mark:  | We could test this by emitting statistics for each replica and compare them.             |
-| Failover              | :x:                | :heavy_check_mark:  | Misconfigure a replica in `pgcat.toml` and watch it forward queries to spares. CI testing could include using Toxiproxy. |
+| Failover              | :x:                | :white_check_mark:  | Misconfigure a replica in `pgcat.toml` and watch it forward queries to spares. CI testing could include using Toxiproxy. |
-| Sharding              | :heavy_check_mark: | :heavy_check_mark:  | See `tests/sharding` and `tests/ruby` for an Rails/ActiveRecord example.                 |
+| Sharding              | :white_check_mark: | :white_check_mark:  | See `tests/sharding` and `tests/ruby` for an Rails/ActiveRecord example.                 |
-| Statistics reporting  | :x:                | :heavy_check_mark:  | Run `nc -l -u 8125` and watch the stats come in every 15 seconds.                        |
+| Statistics reporting  | :x:                | :white_check_mark:  | Run `nc -l -u 8125` and watch the stats come in every 15 seconds.                        |
-| Live config reloading | :heavy_check_mark: | :heavy_check_mark:  | Run `kill -s SIGHUP $(pgrep pgcat)` and watch the config reload.                         |
+| Live config reloading | :white_check_mark: | :white_check_mark:  | Run `kill -s SIGHUP $(pgrep pgcat)` and watch the config reload.                         |

 ## Usage

View File

@@ -111,9 +111,9 @@ async fn main() {
     // Collect statistics and send them to StatsD
     let (tx, rx) = mpsc::channel(100);
+    let collector_tx = tx.clone();

     tokio::task::spawn(async move {
-        let mut stats_collector = Collector::new(rx);
+        let mut stats_collector = Collector::new(rx, collector_tx);
         stats_collector.collect().await;
     });
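The wiring above gives the collector a clone of its own channel's sender, so the collector can post events (such as the periodic flush) back to itself. A minimal std-only sketch of that pattern, with `std::sync::mpsc` standing in for tokio's channel and `collect_one` as a hypothetical stand-in for `collect()`:

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

// Sketch: a collector that holds both ends' handles — the receiver for
// incoming events and a sender clone for events it emits to itself.
struct Collector {
    rx: Receiver<&'static str>,
    tx: SyncSender<&'static str>,
}

impl Collector {
    fn new(rx: Receiver<&'static str>, tx: SyncSender<&'static str>) -> Collector {
        Collector { rx, tx }
    }

    fn collect_one(&mut self) -> &'static str {
        // In the real code a spawned timer task sends the flush event;
        // here we send it inline to show the self-send round trip.
        let _ = self.tx.try_send("flush");
        self.rx.recv().unwrap()
    }
}

fn main() {
    let (tx, rx) = sync_channel(100);
    let collector_tx = tx.clone(); // same pattern as `tx.clone()` in main.rs
    let mut collector = Collector::new(rx, collector_tx);
    assert_eq!(collector.collect_one(), "flush");
}
```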

View File

@@ -4,7 +4,6 @@ use statsd::Client;
 use tokio::sync::mpsc::{Receiver, Sender};
 use std::collections::HashMap;
-use std::time::Instant;
 use crate::config::get_config;
@@ -24,6 +23,7 @@ enum EventName {
     ServerTested,
     ServerLogin,
     ServerDisconnecting,
+    FlushStatsToStatsD,
 }

 #[derive(Debug)]
@@ -44,155 +44,167 @@ impl Reporter {
     }

     pub fn query(&self) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::Query,
             value: 1,
             process_id: None,
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn transaction(&self) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::Transaction,
             value: 1,
             process_id: None,
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn data_sent(&self, amount: usize) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::DataSent,
             value: amount as i64,
             process_id: None,
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn data_received(&self, amount: usize) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::DataReceived,
             value: amount as i64,
             process_id: None,
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn checkout_time(&self, ms: u128) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::CheckoutTime,
             value: ms as i64,
             process_id: None,
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn client_waiting(&self, process_id: i32) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::ClientWaiting,
             value: 1,
             process_id: Some(process_id),
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn client_active(&self, process_id: i32) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::ClientActive,
             value: 1,
             process_id: Some(process_id),
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn client_idle(&self, process_id: i32) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::ClientIdle,
             value: 1,
             process_id: Some(process_id),
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn client_disconnecting(&self, process_id: i32) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::ClientDisconnecting,
             value: 1,
             process_id: Some(process_id),
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn server_active(&self, process_id: i32) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::ServerActive,
             value: 1,
             process_id: Some(process_id),
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn server_idle(&self, process_id: i32) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::ServerIdle,
             value: 1,
             process_id: Some(process_id),
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn server_login(&self, process_id: i32) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::ServerLogin,
             value: 1,
             process_id: Some(process_id),
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn server_tested(&self, process_id: i32) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::ServerTested,
             value: 1,
             process_id: Some(process_id),
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

     pub fn server_disconnecting(&self, process_id: i32) {
-        let statistic = Event {
+        let event = Event {
             name: EventName::ServerDisconnecting,
             value: 1,
             process_id: Some(process_id),
         };

-        let _ = self.tx.try_send(statistic);
+        let _ = self.tx.try_send(event);
     }

+    // pub fn flush_to_statsd(&self) {
+    //     let event = Event {
+    //         name: EventName::FlushStatsToStatsD,
+    //         value: 0,
+    //         process_id: None,
+    //     };
+
+    //     let _ = self.tx.try_send(event);
+    // }
 }

 pub struct Collector {
     rx: Receiver<Event>,
+    tx: Sender<Event>,
     client: Client,
 }

 impl Collector {
-    pub fn new(rx: Receiver<Event>) -> Collector {
+    pub fn new(rx: Receiver<Event>, tx: Sender<Event>) -> Collector {
         Collector {
-            rx: rx,
+            rx,
+            tx,
             client: Client::new(&get_config().general.statsd_address, "pgcat").unwrap(),
         }
     }
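The `Reporter` methods above all call `try_send` and discard the result: stats reporting must never block or fail a client's query path, so when the bounded channel is full the event is simply dropped. A std-only sketch of that fire-and-forget semantics, with `std::sync::mpsc::sync_channel` standing in for tokio's bounded channel:

```rust
use std::sync::mpsc::sync_channel;

// Simplified stand-in for the Event struct in stats.rs.
#[derive(Debug)]
struct Event {
    name: &'static str,
    value: i64,
    process_id: Option<i32>,
}

fn main() {
    // Capacity 1 to make the overflow behavior observable.
    let (tx, rx) = sync_channel(1);
    let send = |name| {
        // `try_send` never blocks; a full channel returns Err, which is ignored.
        let _ = tx.try_send(Event { name, value: 1, process_id: None });
    };
    send("query");
    send("transaction"); // channel full: dropped silently, no panic, no wait
    assert_eq!(rx.recv().unwrap().name, "query");
    assert!(rx.try_recv().is_err()); // the second event was discarded
}
```

The trade-off is that under load some stat events are lost, which is acceptable for gauges that are re-derived from connection state on every flush.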
@@ -218,8 +230,19 @@ impl Collector {
         ]);

         let mut client_server_states: HashMap<i32, EventName> = HashMap::new();
-        let mut now = Instant::now();
+        let tx = self.tx.clone();
+        tokio::task::spawn(async move {
+            let mut interval = tokio::time::interval(tokio::time::Duration::from_millis(15000));
+            loop {
+                interval.tick().await;
+                let _ = tx.try_send(Event {
+                    name: EventName::FlushStatsToStatsD,
+                    value: 0,
+                    process_id: None,
+                });
+            }
+        });

         loop {
             let stat = match self.rx.recv().await {
@@ -284,65 +307,61 @@ impl Collector {
             EventName::ClientDisconnecting | EventName::ServerDisconnecting => {
                 client_server_states.remove(&stat.process_id.unwrap());
             }
+
+            EventName::FlushStatsToStatsD => {
+                for (_, state) in &client_server_states {
+                    match state {
+                        EventName::ClientActive => {
+                            let counter = stats.entry("cl_active").or_insert(0);
+                            *counter += 1;
+                        }
+
+                        EventName::ClientWaiting => {
+                            let counter = stats.entry("cl_waiting").or_insert(0);
+                            *counter += 1;
+                        }
+
+                        EventName::ClientIdle => {
+                            let counter = stats.entry("cl_idle").or_insert(0);
+                            *counter += 1;
+                        }
+
+                        EventName::ServerIdle => {
+                            let counter = stats.entry("sv_idle").or_insert(0);
+                            *counter += 1;
+                        }
+
+                        EventName::ServerActive => {
+                            let counter = stats.entry("sv_active").or_insert(0);
+                            *counter += 1;
+                        }
+
+                        EventName::ServerTested => {
+                            let counter = stats.entry("sv_tested").or_insert(0);
+                            *counter += 1;
+                        }
+
+                        EventName::ServerLogin => {
+                            let counter = stats.entry("sv_login").or_insert(0);
+                            *counter += 1;
+                        }
+
+                        _ => unreachable!(),
+                    };
+                }
+
+                info!("{:?}", stats);
+
+                let mut pipeline = self.client.pipeline();
+                for (key, value) in stats.iter_mut() {
+                    pipeline.gauge(key, *value as f64);
+                    *value = 0;
+                }
+
+                pipeline.send(&self.client);
+            }
             };
-
-            // It's been 15 seconds. If there is no traffic, it won't publish anything,
-            // but it also doesn't matter then.
-            if now.elapsed().as_secs() > 15 {
-                for (_, state) in &client_server_states {
-                    match state {
-                        EventName::ClientActive => {
-                            let counter = stats.entry("cl_active").or_insert(0);
-                            *counter += 1;
-                        }
-
-                        EventName::ClientWaiting => {
-                            let counter = stats.entry("cl_waiting").or_insert(0);
-                            *counter += 1;
-                        }
-
-                        EventName::ClientIdle => {
-                            let counter = stats.entry("cl_idle").or_insert(0);
-                            *counter += 1;
-                        }
-
-                        EventName::ServerIdle => {
-                            let counter = stats.entry("sv_idle").or_insert(0);
-                            *counter += 1;
-                        }
-
-                        EventName::ServerActive => {
-                            let counter = stats.entry("sv_active").or_insert(0);
-                            *counter += 1;
-                        }
-
-                        EventName::ServerTested => {
-                            let counter = stats.entry("sv_tested").or_insert(0);
-                            *counter += 1;
-                        }
-
-                        EventName::ServerLogin => {
-                            let counter = stats.entry("sv_login").or_insert(0);
-                            *counter += 1;
-                        }
-
-                        _ => unreachable!(),
-                    };
-                }
-
-                info!("{:?}", stats);
-
-                let mut pipeline = self.client.pipeline();
-                for (key, value) in stats.iter_mut() {
-                    pipeline.gauge(key, *value as f64);
-                    *value = 0;
-                }
-
-                pipeline.send(&self.client);
-                now = Instant::now();
-            }
         }
     }
 }
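The flush logic introduced above can be exercised standalone: on each `FlushStatsToStatsD` tick, per-connection states are tallied into gauges, a snapshot is reported, and the counters are zeroed for the next 15-second window. A dependency-free sketch (the `Vec` snapshot is a stand-in for the StatsD pipeline, and the names are simplified):

```rust
use std::collections::HashMap;

// Simplified stand-in for the subset of EventName tracked per connection.
#[derive(Clone, Copy)]
enum State {
    ClientActive,
    ClientWaiting,
    ServerIdle,
}

// Tally connection states into gauges, emit a snapshot, then reset to zero —
// the same shape as the FlushStatsToStatsD arm in stats.rs.
fn flush(
    states: &HashMap<i32, State>,
    stats: &mut HashMap<&'static str, i64>,
) -> Vec<(&'static str, i64)> {
    for state in states.values() {
        let key = match state {
            State::ClientActive => "cl_active",
            State::ClientWaiting => "cl_waiting",
            State::ServerIdle => "sv_idle",
        };
        *stats.entry(key).or_insert(0) += 1;
    }
    let snapshot: Vec<_> = stats.iter().map(|(k, v)| (*k, *v)).collect();
    for value in stats.values_mut() {
        *value = 0; // gauges restart from zero each window
    }
    snapshot
}

fn main() {
    let mut states = HashMap::new();
    states.insert(1, State::ClientActive);
    states.insert(2, State::ClientActive);
    states.insert(3, State::ServerIdle);

    let mut stats = HashMap::new();
    let mut snapshot = flush(&states, &mut stats);
    snapshot.sort();
    assert_eq!(snapshot, vec![("cl_active", 2), ("sv_idle", 1)]);
    // After the flush, all gauges are zeroed for the next window.
    assert!(stats.values().all(|v| *v == 0));
}
```

Because gauges are recomputed from `client_server_states` on every tick, dropped events only blur one window rather than corrupting the running totals.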