Between the 15th and 25th second of every minute, I send about 200,000 to 300,000 requests to Prometheus's query_range API.
Prometheus sometimes returns data very slowly because the CPU is at 100% usage. The machine has 4 cores and 8 GB of RAM.
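For reference, here is a minimal sketch of what one of these requests looks like, using the parameters from the first log entry below. The base URL `http://localhost:9090` and the helper names are assumptions for illustration only. With a one-minute window and `step=60`, each request evaluates the expression at a single timestamp.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumption: Prometheus is reachable here; replace with the real address.
BASE_URL = "http://localhost:9090"


def rfc3339(ts: datetime) -> str:
    """Format a UTC datetime the way the query_range API accepts it."""
    return ts.strftime("%Y-%m-%dT%H:%M:%SZ")


def build_query_range_url(query, start, end, step_seconds, base_url=BASE_URL):
    """Build a GET URL for /api/v1/query_range (hypothetical helper)."""
    params = {
        "query": query,
        "start": rfc3339(start),
        "end": rfc3339(end),
        "step": step_seconds,
    }
    return f"{base_url}/api/v1/query_range?{urlencode(params)}"


def evaluation_points(start, end, step_seconds):
    """Number of timestamps evaluated: start, start+step, ... up to end."""
    return int((end - start).total_seconds() // step_seconds) + 1


# Values copied from the first log entry.
start = datetime(2022, 3, 30, 9, 35, 0, tzinfo=timezone.utc)
end = datetime(2022, 3, 30, 9, 35, 59, tzinfo=timezone.utc)
query = 'jvm_threads_current{appName="sunfire-selfmonitor-reduce.url4625", }'

print(build_query_range_url(query, start, end, 60))
print("evaluation points:", evaluation_points(start, end, 60))
```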
Here are some typical query logs:
{"error":"query was canceled in expression evaluation","httpRequest":{"clientIP":"172.17.0.42","method":"GET","path":"/api/v1/query_range"},"params":{"end":"2022-03-30T09:35:59.000Z","query":"jvm_threads_current{appName=\"sunfire-selfmonitor-reduce.url4625\", }","start":"2022-03-30T09:35:00.000Z","step":60},"stats":{"timings":{"evalTotalTime":5.672762408,"resultSortTime":0,"queryPreparationTime":5.464064693,"innerEvalTime":0,"execQueueTime":0.000004868,"execTotalTime":5.995925647}},"ts":"2022-03-30T09:35:22.686Z"}
{"error":"query was canceled in expression evaluation","httpRequest":{"clientIP":"172.17.0.126","method":"GET","path":"/api/v1/query_range"},"params":{"end":"2022-03-30T09:38:59.000Z","query":"jvm_memory_pool_bytes_max{appName=\"sunfire-selfmonitor-reduce.url1927\", pool=~\"Code Cache\"}","start":"2022-03-30T09:38:00.000Z","step":60},"stats":{"timings":{"evalTotalTime":3.5984797520000003,"resultSortTime":0,"queryPreparationTime":2.605931976,"innerEvalTime":0,"execQueueTime":0.911022524,"execTotalTime":5.196457487}},"ts":"2022-03-30T09:38:25.080Z"}
{"error":"query was canceled in expression evaluation","httpRequest":{"clientIP":"172.17.0.126","method":"GET","path":"/api/v1/query_range"},"params":{"end":"2022-03-30T09:38:59.000Z","query":"jvm_buffer_pool_used_bytes{appName=\"sunfire-selfmonitor-reduce.url132\", pool=~\"mapped\"} / jvm_buffer_pool_capacity_bytes{appName=\"sunfire-selfmonitor-reduce.url132\", pool=~\"mapped\"}","start":"2022-03-30T09:38:00.000Z","step":60},"stats":{"timings":{"evalTotalTime":3.5998245989999997,"resultSortTime":0,"queryPreparationTime":2.607460066,"innerEvalTime":0,"execQueueTime":0.910834909,"execTotalTime":5.20090463}},"ts":"2022-03-30T09:38:25.081Z"}
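To see where the time goes in these entries, the following sketch compares `queryPreparationTime` with `evalTotalTime`. The numbers are copied from the three log entries above; the function name is hypothetical. As far as I understand, preparation time covers selecting the series from storage, but treat that reading as an assumption.

```python
# Timing fields (in seconds) copied from the three log entries above.
LOG_TIMINGS = [
    {"evalTotalTime": 5.672762408, "queryPreparationTime": 5.464064693,
     "execQueueTime": 0.000004868, "execTotalTime": 5.995925647},
    {"evalTotalTime": 3.5984797520000003, "queryPreparationTime": 2.605931976,
     "execQueueTime": 0.911022524, "execTotalTime": 5.196457487},
    {"evalTotalTime": 3.5998245989999997, "queryPreparationTime": 2.607460066,
     "execQueueTime": 0.910834909, "execTotalTime": 5.20090463},
]


def preparation_share(timings: dict) -> float:
    """Fraction of evalTotalTime spent in queryPreparationTime."""
    return timings["queryPreparationTime"] / timings["evalTotalTime"]


for t in LOG_TIMINGS:
    print(
        f"prep share of eval: {preparation_share(t):.1%}, "
        f"queue wait: {t['execQueueTime']:.3f}s, "
        f"total: {t['execTotalTime']:.3f}s"
    )
```

In all three entries, preparation accounts for most of the evaluation time while `innerEvalTime` is 0, and every query was canceled after roughly 5 to 6 seconds in total.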
Prometheus is started with these flags:
- '--storage.tsdb.retention.time=5m'
- '--storage.tsdb.max-block-duration=5m'
- '--storage.tsdb.min-block-duration=5m'
I use these three flags because I want to keep memory and disk usage as low as possible (this should keep only 5 minutes of data in memory, right?).
I don't know whether these settings are what drives the CPU consumption so high, since the official documentation does not recommend setting them. If I shouldn't use these flags, is there another way to keep only 5-10 minutes of data in memory and on disk?