/federate returns illegally formatted metrics?


JAE HOON KO

Sep 7, 2016, 7:56:38 PM
to Prometheus Developers
Hi,

It seems that Prometheus (v1.1.0) returns illegally formatted metrics when the content encoding is text/plain.
According to the documentation, Prometheus prescribes that a header line (starting with # TYPE) should precede the metric's samples (https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details).
However, federating a Prometheus like this: curl -g 'localhost:9090/federate?match[]={__name__=~"..*"}'
returns:

# TYPE hermes_req_proc_time untyped
hermes_req_proc_time{statsdtype="timers",gs_app="someapp",job="aggregator_10s",instance="172.24.0.7:9080",gs_dc="us-east-1",gs_svcuser="svcuser2",gs_statistics="lower",gs_service="hermes",gs_hostname="someapp-01",gs_env="dev",gs_instanceid="8ef31bcd",id="prom-1"} 156 1473292292860
# TYPE netdev untyped
netdev{dev="eth0",gs_svcuser="svcuser2",gs_env="dev",gs_instanceid="aabd3924",type="byteout",instance="container-metrics:9092",gs_dc="us-east-1",job="promclient_triton_5s",id="prom-1"} 2.198199005e+09 1473292299525
# TYPE hermes_req_proc_time untyped
hermes_req_proc_time{gs_dc="us-east-1",job="aggregator_10s",gs_statistics="sum",gs_service="hermes",gs_instanceid="8ef31bcd",gs_svcuser="svcuser2",statsdtype="timers",gs_env="dev",instance="172.24.0.7:9080",gs_app="someapp",gs_hostname="someapp-01",id="prom-1"} 6778 1473292292860
# TYPE http_request_duration_microseconds untyped
http_request_duration_microseconds{job="promclient_triton_5s",gs_dc="us-east-1",instance="container-metrics:9091",gs_env="dev",handler="prometheus",quantile="0.5",id="prom-1"} 1242.205 1473292300837
# TYPE go_gc_duration_seconds untyped
go_gc_duration_seconds{quantile="0.75",gs_dc="us-east-1",gs_env="dev",job="promclient_triton_5s",instance="container-metrics:9092",id="prom-1"} 0.00040349 1473292299525
# TYPE hermes_req_proc_time untyped
hermes_req_proc_time{gs_service="hermes",gs_app="someapp",gs_instanceid="8ef31bcd",gs_svcuser="svcuser1",gs_env="dev",gs_hostname="someapp-01",gs_statistics="lower",statsdtype="timers",gs_dc="us-east-1",job="aggregator_10s",instance="172.24.0.7:9080",id="prom-1"} 77 1473292292860

As you can see, every time series is preceded by a header, so identical headers are repeated whenever more than one time series belongs to the same metric.
Prometheus-to-Prometheus federation works well; I suspect this is because protobuf encoding is used.

I'm not sure whether this is a bug or a feature, but I'd really like the output to be well formatted. I'm building a proxy between two Prometheus servers which tweaks the query.

Thanks

Brian Brazil

Sep 7, 2016, 10:55:10 PM
to JAE HOON KO, Prometheus Developers
This is acceptable output, as each of the time series is unique. Both parsers will handle this.

What are you trying to do? Federation is intended for transferring aggregated stats, not all the data of a Prometheus.

Brian

 


Tobias Schmidt

Sep 7, 2016, 11:19:55 PM
to Brian Brazil, JAE HOON KO, Prometheus Developers
The issue is that time series in the generated text output are grouped by metric family. Here is a better example:

http://prometheus/federate?match[]={job=%22prometheus%22}
...
# TYPE http_request_duration_microseconds untyped
http_request_duration_microseconds{job="prometheus",instance="localhost:9090",handler="query",quantile="0.99",owner="prodeng"} NaN 1473304381655
# TYPE http_request_duration_microseconds untyped
http_request_duration_microseconds{job="prometheus",instance="localhost:9090",handler="alerts",quantile="0.5",owner="prodeng"} NaN 1473304381655
...

Having more than one time series per metric is common in aggregated data as well.


 


Tobias Schmidt

Sep 7, 2016, 11:25:02 PM
to Brian Brazil, JAE HOON KO, Prometheus Developers
On Wed, Sep 7, 2016 at 11:19 PM Tobias Schmidt <tob...@gmail.com> wrote:
The issue is that time series in the generated text output are grouped by metric family.

*not

JAE HOON KO

Sep 8, 2016, 12:13:43 AM
to Prometheus Developers
Sorry, I should have been more specific.

If I scrape a Prometheus using /federate like this: curl -L -g 'http://localhost:9090/federate?match[]={__name__="cpu"}'

It responds with:

# TYPE cpu untyped
cpu{job="promclient_triton_5s",instance="container-metrics:9092",gs_instanceid="aabd3924",gs_svcuser="svcuser2",type="idle",gs_dc="us-east-1",gs_env="dev",id="prom-1"} 0.0886400014181114 1473307588759
# TYPE cpu untyped
cpu{instance="container-metrics:9092",gs_instanceid="aabd3924",gs_svcuser="svcuser2",type="wait",gs_dc="us-east-1",gs_env="dev",job="promclient_triton_5s",id="prom-1"} 0.6554035022739826 1473307588759
# TYPE cpu untyped
cpu{instance="container-metrics:9093",job="promclient_triton_5s",gs_dc="us-east-1",gs_env="dev",gs_instanceid="231deaa1",gs_svcuser="svcuser2",type="sys",id="prom-1"} 0.3062938875174634 1473307586226
# TYPE cpu untyped
cpu{job="promclient_triton_5s",gs_instanceid="fb310edz",gs_svcuser="svcuser1",type="user",gs_dc="us-east-1",instance="container-metrics:9091",gs_env="dev",id="prom-1"} 0.5589919972685718 1473307585070
...

But the Prometheus documentation says: "Only one TYPE line may exist for the same metric name. The TYPE line for a metric name has to appear before the first sample is reported for that metric name." (https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details)
To me, the response violates this rule!

If I set up a dummy HTTP server that returns the above response and have a Prometheus scrape it, Prometheus complains: text format parsing error in line 3: second TYPE line for metric name "cpu", or TYPE reported after samples

If I have that Prometheus scrape another Prometheus instead, it works. The only difference is that the dummy HTTP server responds in text/plain, while Prometheus encodes the federation response as protobuf.
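The rule quoted above can be checked mechanically. Here is a minimal sketch in Python of the check the Go text parser appears to apply (an illustration only, not the actual prometheus/common parser; the sample-name parsing is deliberately simplistic):

```python
# Sketch of the text-format rule: at most one "# TYPE" line per metric
# name, and it must appear before that metric's first sample.
def check_type_lines(text):
    typed = set()    # metric names for which a TYPE line was seen
    sampled = set()  # metric names for which a sample was seen
    errors = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if line.startswith("# TYPE "):
            name = line.split()[2]
            if name in typed or name in sampled:
                errors.append(
                    'line %d: second TYPE line for metric name "%s", '
                    "or TYPE reported after samples" % (lineno, name)
                )
            typed.add(name)
        elif not line.startswith("#"):
            # Naive extraction of the metric name from a sample line.
            sampled.add(line.split("{")[0].split()[0])
    return errors
```

Run against the federated cpu output above, this reports an error at line 3, matching the parser's complaint.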



Brian Brazil

Sep 8, 2016, 12:23:53 AM
to JAE HOON KO, Prometheus Developers
On 8 September 2016 at 05:13, JAE HOON KO <roen...@gmail.com> wrote:
Sorry, I had to be more specific.

If I scrape a prometheus using /federate like: curl -L -g 'http://localhost:9090/federate?match[]={__name__="cpu"}'

It responds with: [federated output snipped; see previous message]

But according to Prometheus doc, it says: "Only one TYPE line may exist for the same metric name. The TYPE line for a metric name has to appear before the first sample is reported for that metric name." (https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details)
To me, the response is violating this rule!

If I set up a dummy HTTP server that returns the above response, and have a prometheus scrape from it, prometheus complains like: text format parsing error in line 3: second TYPE line for metric name "cpu", or TYPE reported after samples


That would be a bug in the Go text parser then. Filed https://github.com/prometheus/common/issues/54

If you strip off all the TYPE lines it should work.
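A quick filter along those lines might look like this (a sketch; it relies on the fact that samples without a TYPE line are treated as untyped, which is valid text exposition format):

```python
def strip_type_lines(text):
    # Drop every "# TYPE" comment line; the remaining samples are
    # then implicitly untyped, which the text format allows.
    return "\n".join(
        line for line in text.splitlines()
        if not line.startswith("# TYPE ")
    )
```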

Brian 



Richard Hartmann

Sep 8, 2016, 1:32:05 AM
to Brian Brazil, JAE HOON KO, Prometheus Developers
On Thu, Sep 8, 2016 at 6:23 AM, Brian Brazil
<brian....@robustperception.io> wrote:

> That would be a bug in the Go text parser then. Filed
> https://github.com/prometheus/common/issues/54

Interesting; I ran into this earlier this year and changed my (text)
exporters to print TYPE and HELP only once per metric name.

Depending on the cardinality of the metric names, wouldn't it make
sense to keep a best-effort list of already-exposed metric names? At
least in the text format, the repeated headers can increase network
load by a small but noticeable proportion.


Richard