How to get rid of context deadline exceeded without changing the scrape_timeout and scrape_interval


suyog kulkarni

Mar 6, 2021, 1:54:37 AM3/6/21
to Prometheus Users
Hello ,

I have Prometheus deployed as a container, and I am currently trying to scrape 100 static/dynamic endpoints in a single job with a scrape interval of 20s and a timeout of 15s. With this interval and timeout, all my endpoints show context-deadline-exceeded. If I change the scrape interval and scrape timeout to 50s each, then all 100 endpoints are up. My target endpoints are node-exporter instances exposing a heavy list of metrics.
I don't want to change the scrape interval or scrape timeout. How do I get rid of context-deadline-exceeded without changing them?
Thanks in advance.

Regards,
Suyog Kulkarni

Julien Pivotto

Mar 6, 2021, 3:15:00 AM3/6/21
to suyog kulkarni, Prometheus Users
Can we know more about your setup? Which version of prometheus do you use? Do you use TLS? Do you directly target the node exporters or is there a proxy in the middle?

Thanks


Matthias Rampke

Mar 6, 2021, 3:29:11 AM3/6/21
to Julien Pivotto, suyog kulkarni, Prometheus Users
How long does it take when you request the metrics directly from node exporter? Can you exclude one collector at a time to see which one is the slow one?
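For example, checking both of these from a shell could look like the following. The target host is taken from your config; the collector name is only an example of node exporter's --no-collector.* flags, so substitute whichever collectors you actually have enabled:

```shell
# Time a direct scrape, bypassing Prometheus and the proxy
# (replace pce-9100 with one of your targets).
time curl -s -o /dev/null http://pce-9100:9100/metrics

# Restart node exporter with one collector disabled at a time to find
# the slow one; hwmon here is just an example collector name.
node_exporter --no-collector.hwmon
```

Comparing the curl timing with and without each collector should point at the one that pushes the scrape past 15s.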

Generally, node exporter doesn't "do" much other than collect metrics from the kernel; if it is slow most likely something else is too.

"context timeout exceeded" is Go's words for "this took longer than the configured timeout". The only way to make it go away is to make whatever is happening take less time than the timeout, or extend the timeout to be more than it takes.

/MR

suyog kulkarni

Mar 6, 2021, 4:00:20 AM3/6/21
to Prometheus Users
The current Prometheus version is 2.20. There are also proxy settings in place.
Here is my Prometheus config:

- job_name: spc-targets-node-exporter
  proxy_url: <Proxy-URL>
  scrape_interval: 15s
  scrape_timeout: 30s
  static_configs:
  - targets:
    - pce-9100:9100
    - pce-9101:9100
    - pce-9102:9100
    - pce-9103:9100
    - pce-9104:9100
    .
    .
    .
    - pce-N:9100

suyog kulkarni

Mar 6, 2021, 4:02:38 AM3/6/21
to Prometheus Users
node-exporter is currently set up as a DaemonSet in the same Kubernetes cluster where Prometheus is deployed.

Matthias Rampke

Mar 6, 2021, 4:16:41 AM3/6/21
to suyog kulkarni, Prometheus Users
Try removing the proxy; within a cluster, Prometheus should be able to connect to the pods directly. Consider using Kubernetes SD to discover the daemonset pods dynamically.
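A pod-role SD job could look roughly like this, replacing the static_configs and proxy_url. The app=node-exporter pod label is an assumption; match it to whatever labels your daemonset's pods actually carry:

```yaml
- job_name: spc-targets-node-exporter
  scrape_interval: 20s
  scrape_timeout: 15s
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Keep only the node-exporter daemonset's pods (label is an assumption).
  - source_labels: [__meta_kubernetes_pod_label_app]
    action: keep
    regex: node-exporter
  # Record which node each pod runs on, for readable target labels.
  - source_labels: [__meta_kubernetes_pod_node_name]
    target_label: node
```

This way new nodes are picked up automatically and the scrapes go pod-to-pod without a proxy in between.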

/MR

Julien Pivotto

Mar 6, 2021, 6:24:39 AM3/6/21
to Matthias Rampke, suyog kulkarni, Prometheus Users
Indeed, this seems to be an issue with the proxy rather than Prometheus.
