Question regarding Open Metrics format and how to handle errors


vivek kodira

Jun 9, 2020, 3:45:13 AM
to Prometheus Users
Hi Folks,

This is my first question here, and it is not specifically about Prometheus but about how metrics are logged to a file. I hope you can help.

We've implemented a service that logs metrics in the OpenMetrics format, as recommended on this page. Typical metric entries look like this:

app_disk_used_bytes 5.20941207552e+13 1588085025536
app_disk_free_bytes 7.1712562479104e+13 1588085025536
app_io_counters_read_bytes 2.4027136e+07 1588085025536

My question is: if an error occurs while gathering a metric, how do we log it in this file? Can the "value" be replaced with text indicating an error? Or is the recommendation that errors not be logged here and be recorded elsewhere?

I asked this question on the git repo and was advised to try asking here instead.


Stuart Clark

Jun 9, 2020, 4:03:59 AM
to vivek kodira, Prometheus Users

Have an additional metric that is a counter of errors, so that it is incremented whenever an error occurs. You can then display or alert on the increase.

Another common option is an "up" metric (e.g. mysql_up) which has a value of 0 if the metrics can't be fetched and 1 if all is OK.

For the broken metrics themselves you could leave them out (no metric is returned at all during that scrape) or possibly return NaN.
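
As a rough illustration of these three options together, here is a minimal Python sketch that uses only the standard library. The class `SnapshotWriter`, the metric names `app_collect_errors_total` and `app_up`, and the `use_nan` flag are all hypothetical names chosen for this example, not part of any Prometheus library. It assumes each metric is gathered by a callable that raises an exception on failure.

```python
import math
from typing import Callable, Dict, List


class SnapshotWriter:
    """Renders one snapshot of metrics as text lines (hypothetical helper)."""

    def __init__(self) -> None:
        # Running error total; a counter only ever goes up.
        self.error_count = 0

    def render(
        self,
        collectors: Dict[str, Callable[[], float]],
        timestamp_ms: int,
        use_nan: bool = False,
    ) -> List[str]:
        lines = []
        all_ok = True
        for name, collect in collectors.items():
            try:
                value = float(collect())
            except Exception:
                self.error_count += 1
                all_ok = False
                if not use_nan:
                    # Option A: leave the broken metric out of this snapshot.
                    continue
                # Option B: report the broken metric as NaN.
                value = math.nan
            text = "NaN" if math.isnan(value) else repr(value)
            lines.append(f"{name} {text} {timestamp_ms}")
        # Error counter (hypothetical name) to display or alert on.
        lines.append(f"app_collect_errors_total {self.error_count} {timestamp_ms}")
        # "up"-style metric (hypothetical name): 1 if everything was gathered, else 0.
        lines.append(f"app_up {1 if all_ok else 0} {timestamp_ms}")
        return lines
```

Calling `render` with a dict such as `{"app_disk_used_bytes": read_disk_used}` (where `read_disk_used` is your own function) returns the lines to write to the file. The metric value itself stays numeric in every case, so the file remains parseable.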

-- 
Stuart Clark

vivek kodira

Jun 11, 2020, 4:53:54 AM
to Prometheus Users
Thanks Stuart. For anyone else with this issue, there is indeed an example [here](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md) that I had missed:

The error is passed as a label.
`msdos_file_access_time_seconds{path="C:\\DIR\\FILE.TXT",error="Cannot find file:\n\"FILE.TXT\""} 1.458255915e9`
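
For anyone generating such lines by hand, the escaping in that example (backslash, double quote, and newline inside label values) can be sketched in Python as follows. The helper names `escape_label_value` and `format_sample` are made up for this illustration, and the value is passed in as an already formatted string.

```python
from typing import Dict


def escape_label_value(value: str) -> str:
    # Escape the backslash first so the backslashes added by the
    # later replacements are not escaped a second time.
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def format_sample(name: str, labels: Dict[str, str], value: str) -> str:
    # Build name{key="escaped value",...} value; omit braces when there are no labels.
    if not labels:
        return f"{name} {value}"
    body = ",".join(f'{k}="{escape_label_value(v)}"' for k, v in labels.items())
    return f"{name}{{{body}}} {value}"


line = format_sample(
    "msdos_file_access_time_seconds",
    {"path": "C:\\DIR\\FILE.TXT", "error": 'Cannot find file:\n"FILE.TXT"'},
    "1.458255915e9",
)
print(line)
```

Keep in mind that every distinct error text creates a separate time series, so a short, fixed set of error strings is safer than free-form messages.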