Hello Guys,
What is the problem?
I'm facing slow Grafana dashboard performance with Prometheus as the datasource, and I need to debug and understand where the bottleneck/slowness is.
What have I tried to improve performance?
1. Tried Trickster as a caching/accelerator layer between Prometheus and Grafana.
2. Increased some query limits (applied as shown in the Compose sketch after this list):
--query.max-concurrency=20
  Maximum number of queries executed concurrently.
--query.max-samples=50000000
  Maximum number of samples a single query can load into memory.
These helped reduce connection timeout issues, but did not improve the slow performance.
3. Checked system resource usage - it is good enough to handle the queries.
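For reference, here is a minimal sketch of how the raised limits from point 2 are applied to the Prometheus service in my Compose stack; the image tag, port and paths are placeholders rather than my exact file:

# docker-compose.yml (excerpt, placeholder paths/ports)
services:
  prometheus:
    image: prom/prometheus:v2.20.0
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      # Raised limits from point 2 above; fixed the timeouts, not the slowness.
      - --query.max-concurrency=20
      - --query.max-samples=50000000
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml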
What do I need to know?
1. I want to understand more about the timing stats below, which can be fetched from the Prometheus query logs (evalTotalTime, execQueueTime, execTotalTime, innerEvalTime, queryPreparationTime, resultSortTime); how the query log is enabled in my setup is sketched after this list:
"stats": {
"timings": {
"evalTotalTime": 0.000447452,
"execQueueTime": 7.599e-06,
"execTotalTime": 0.000461232,
"innerEvalTime": 0.000427033,
"queryPreparationTime": 1.4177e-05,
"resultSortTime": 6.48e-07
}
2. We're using Prometheus widely but haven't found a useful resource on performance tuning, so please flood this email chain with tunable options and ideas to improve Prometheus query performance, and guide me on how to narrow down the exact area that is contributing to the slowness (the recording-rule sketch further below shows the kind of tunable I mean).
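For context on point 1: the timings above are taken from the query log, which is enabled in my prometheus.yml roughly as below (the log path is a placeholder); if I remember correctly, the same timings can also be requested from the HTTP query API via its stats parameter.

# prometheus.yml (excerpt) - query logging enabled so per-query timings are written out
global:
  scrape_interval: 15s
  query_log_file: /prometheus/query.log   # placeholder path inside the container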
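As an illustration of the kind of tunable I mean, one idea I'd like opinions on is precomputing heavy dashboard expressions with recording rules, so panels read a cheap precomputed series instead of evaluating the full expression on every refresh; the metric and rule names below are hypothetical:

# rules.yml (hypothetical example)
groups:
  - name: dashboard-precompute
    interval: 30s
    rules:
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))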
Stack Details
OS: CentOS 7
Version: Prometheus 2.20
Deployment: Docker Compose stack (Prometheus, Grafana, Trickster)