Question regarding Open Metrics format and how to handle errors

vivek kodira

Jun 9, 2020, 3:45:13 AM

to Prometheus Users

Hi Folks,

This is my first question here, and it is not specifically about Prometheus but about the way metrics are logged to a file. I hope you can help.

We've implemented a service that logs metrics in the OpenMetrics format, as recommended on this page. Typical metric entries look like this:

app_disk_used_bytes 5.20941207552e+13 1588085025536
app_disk_free_bytes 7.1712562479104e+13 1588085025536
app_io_counters_read_bytes 2.4027136e+07 1588085025536

My question is: if an error occurs while gathering a metric, how do we log it in this file? Can the "value" be replaced with text indicating an error? Or is the recommendation that errors not be logged here and be recorded elsewhere instead?

I asked this question on the Git repo and was advised to ask here instead.

Stuart Clark

Jun 9, 2020, 4:03:59 AM

to vivek kodira, Prometheus Users

Have an additional metric that is a counter of errors, so that it is incremented whenever an error occurs. You can then display or alert on the increase.

Another common option is an "up" metric (e.g. mysql_up), which has a value of 0 if the metrics can't be fetched and 1 if all is OK.

For the broken metrics themselves, you could leave them out (no metric is returned at all during that scrape) or possibly return NaN.
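
The three options above can be sketched together as follows. This is a minimal illustration rather than anything from the service in question: the function `render_metrics` and the metric names `app_collect_errors_total` and `app_up` are hypothetical, and timestamps are left out for brevity.

```python
def render_metrics(collectors, previous_errors=0, nan_on_error=False):
    """Render samples as text lines; return (text, running error count).

    collectors maps a metric name to a zero-argument callable returning a number.
    All metric names introduced here are illustrative assumptions.
    """
    lines = []
    errors = previous_errors
    for name, collect in collectors.items():
        try:
            value = float(collect())
        except Exception:
            # Count the failure instead of writing error text as the value.
            errors += 1
            if nan_on_error:
                lines.append(f"{name} NaN")  # option: report NaN
            continue  # default option: omit the broken metric entirely
        lines.append(f"{name} {value!r}")
    failed = errors > previous_errors
    # Counter of collection errors (hypothetical name); alert on its increase.
    lines.append(f"app_collect_errors_total {errors}")
    # "up"-style gauge (hypothetical name): 1 if every collector worked, else 0.
    lines.append(f"app_up {0 if failed else 1}")
    return "\n".join(lines) + "\n", errors
```

The caller would pass the returned error count back in as `previous_errors` on the next run, so that the counter only ever increases.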

-- 
Stuart Clark

vivek kodira

Jun 11, 2020, 4:53:54 AM

to Prometheus Users

Thanks, Stuart. For anyone else with this issue, there is indeed an example [here](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md) that I had missed.

In that example, the error is passed as a label:

`msdos_file_access_time_seconds{path="C:\\DIR\\FILE.TXT",error="Cannot find file:\n\"FILE.TXT\""} 1.458255915e9`
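
A line like this needs its label values escaped: in the text exposition format, a backslash, a double quote, and a line feed inside a label value are written as `\\`, `\"`, and `\n`. The following is a minimal sketch of that escaping; the helper names `escape_label_value` and `format_sample` are hypothetical and not part of any Prometheus library.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline for a label value."""
    # Backslashes must be handled first so later escapes are not doubled.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name: str, labels: dict, value: str) -> str:
    """Build one sample line: name{label="value",...} value (hypothetical helper)."""
    label_text = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in labels.items()
    )
    return f"{name}{{{label_text}}} {value}"


line = format_sample(
    "msdos_file_access_time_seconds",
    {"path": "C:\\DIR\\FILE.TXT", "error": 'Cannot find file:\n"FILE.TXT"'},
    "1.458255915e9",
)
print(line)
```

One design point to weigh: each distinct label value produces a separate time series, so free-form error text in a label can multiply the number of series. A small fixed set of error codes may be easier to manage than raw messages.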
