Beginner user: collector returning zero on occasion

Patrick Macdonald

Dec 12, 2020, 4:10:05 PM
to Prometheus Users
I'm just starting with Prometheus. I've set up a Python collector that I'm successfully scraping. The collector makes a few API queries against one of our products, which can take 5-10 seconds. Even if I set the scrape duration to greater than the maximum query time, I'm finding that the collector regularly returns zero as a value.
In the collector logs I'm getting the following error:
```
BrokenPipeError: [Errno 32] Broken pipe 
```
The main collector method is:

```
def collect(self):
    logging.info("collect : {}".format(self))
    g = GaugeMetricFamily("sg_one_time_query", 'Help text', labels=['site', 'field'])

    data = self.get_shotgun_API_response()
    logging.info("data : {}".format(data))
    if data:
        g.add_metric([_SG_SCRIPT_SITE, 'sg_one_time_query'], data['sg_one_time_query'])
        g.add_metric([_SG_SCRIPT_SITE, 'sg_one_time_update'], data['sg_one_time_update'])
        g.add_metric([_SG_SCRIPT_SITE, 'sg_open_ended_query'], data['sg_open_ended_query'])
        yield g
```

So what I think is happening is that Prometheus is scraping this collector before the get_shotgun_API_response method has returned a value.

Does anyone know how I can fix the collect() method so it gracefully handles situations when the next scrape happens before the get_shotgun_API_response method has returned a value to yield? 

Thanks in advance for your help! 
Cheers
Patrick

reForm Studios

Dec 12, 2020, 4:24:20 PM
to Prometheus Users
Here's the full stack trace. I'm not sure where the IP 172.20.0.2:59810 is coming from at all:

```
Exception happened during processing of request from ('172.20.0.2', 59810)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/socketserver.py", line 650, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/local/lib/python3.7/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python3.7/socketserver.py", line 720, in __init__
    self.handle()
  File "/usr/local/lib/python3.7/http/server.py", line 426, in handle
    self.handle_one_request()
  File "/usr/local/lib/python3.7/http/server.py", line 414, in handle_one_request
    method()
  File "/usr/local/lib/python3.7/site-packages/prometheus_client/exposition.py", line 159, in do_GET
    self.wfile.write(output)
  File "/usr/local/lib/python3.7/socketserver.py", line 799, in write
    self._sock.sendall(b)
BrokenPipeError: [Errno 32] Broken pipe
```


reForm Studios

Dec 12, 2020, 6:07:27 PM
to Prometheus Users
On further investigation, if I graph the data in Grafana, I see zero values correlating with the broken pipe errors. But if I view the data in a Grafana table, there are no zero values, so I'm beginning to think my problem is more a matter of how I expect Grafana to handle the granularity of the Prometheus time series data.
That being said, the missing data also shows up if I graph it in Prometheus:
image.png
And here's the same in Grafana:
image.png
And here's the table:
image.png

So I'm thinking the issue is more my lack of understanding of how the data is handled by Prometheus. E.g., given a scrape interval of x and a min step of y, where y is smaller than a gap in the data, I should expect to see the zero values in the graphs as shown above.

Which brings me back round to wondering why there are such gaps, and why I'm getting the Broken Pipe errors that cause them.

Ben Kochie

Dec 13, 2020, 6:14:30 AM
to pat...@reformstudios.com, Prometheus Users
Those are probably not zeros, but null values. You mention adjusting the scrape interval, but did you adjust the scrape timeout? The default scrape timeout is 10 seconds. Also note, the timeout can never be longer than the interval as a single Prometheus server will never concurrently scrape the same target. However, targets are expected to handle concurrent scrapes, as you can have multiple Prometheus servers scraping the same targets.
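
Besides raising the scrape timeout, one common way to keep a slow backend from colliding with the timeout (and to make concurrent scrapes cheap) is to move the API query out of the scrape path. The following is a minimal sketch of that idea, not something proposed in the thread: a background thread refreshes a cached result, and the scrape only reads the cache. The class name `CachedFetcher`, the `refresh_seconds` parameter, and the choice to keep the previous value when a fetch fails are all assumptions made for illustration.

```python
import logging
import threading


class CachedFetcher:
    """Runs a slow fetch function in the background and serves its last result."""

    def __init__(self, fetch, refresh_seconds=30.0):
        self._fetch = fetch  # assumed: a slow callable, e.g. the API query method
        self._refresh_seconds = refresh_seconds
        self._lock = threading.Lock()
        self._latest = None  # stays None until the first fetch succeeds
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        """Begin refreshing in a daemon thread."""
        self._thread.start()

    def stop(self):
        """Ask the refresh loop to exit; call only after start()."""
        self._stop.set()
        self._thread.join()

    def refresh(self):
        """Fetch once and store the result; keep the old value if the fetch fails."""
        try:
            data = self._fetch()
        except Exception:
            logging.exception("fetch failed; keeping previous value")
            return
        with self._lock:
            self._latest = data

    def _run(self):
        # Refresh, then sleep until the next cycle or until stop() is called.
        while not self._stop.is_set():
            self.refresh()
            self._stop.wait(self._refresh_seconds)

    def get(self):
        """Return the most recent result immediately, without calling the API."""
        with self._lock:
            return self._latest
```

With something like this in place, `collect()` would read `data = cache.get()` instead of calling `self.get_shotgun_API_response()` directly, and the existing `if data:` check would still skip the `yield` until the first fetch completes. The trade-off is that a scrape may see a value up to `refresh_seconds` old, and a failing API keeps serving the last good value rather than leaving a visible gap.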
